#Content Moderation
Latest news articles tagged with "Content Moderation". Follow the timeline of events, related topics, and entities.
Articles (17)
-
🇺🇸 Reddit is moving on from r/all
[USA]
Reddit is deprecating r/all, one of its feeds that shows popular posts on the platform, as part of "ongoing efforts to simplify Reddit and improve Home feed personalization." Reddit has offered both r...
Related: #Platform Changes -
🇬🇧 AI videos of sexualised black women removed from TikTok after BBC investigation
[United Kingdom]
Dozens of Instagram and TikTok accounts have used AI avatars to promote explicit content, the BBC finds.
Related: #AI Ethics -
🇺🇸 Beyond Accuracy: An Explainability-Driven Analysis of Harmful Content Detection
[USA]
arXiv:2603.18015v1 Announce Type: cross Abstract: Although automated harmful content detection systems are frequently used to monitor online platforms, moderators and end users frequently cannot unde...
Related: #AI Explainability -
🇺🇸 Meta to cut back on third-party vendors in favor of AI for content enforcement
[USA]
Meta is beginning a multiyear rollout of more advanced AI systems that will handle content enforcement-related tasks like catching scams and illegal media.
Related: #AI Integration -
🇺🇸 Harm or Humor: A Multimodal, Multilingual Benchmark for Overt and Covert Harmful Humor
[USA]
arXiv:2603.17759v1 Announce Type: cross Abstract: Dark humor often relies on subtle cultural nuances and implicit cues that require contextual reasoning to interpret, posing safety challenges that cu...
Related: #AI Ethics, #Multimodal Analysis -
🇺🇸 OpenAI’s adult mode will reportedly be smutty, not pornographic
[USA]
OpenAI's delayed "adult mode" for ChatGPT is expected to support saucy text conversations at launch, but not the chatbot's ability to generate images, voice, or video. Speaking to The Wall Street Jour...
Related: #AI Ethics -
🇺🇸 Understanding LLM Behavior When Encountering User-Supplied Harmful Content in Harmless Tasks
[USA]
arXiv:2603.11914v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly trained to align with human values, primarily focusing on task level, i.e., refusing to execute directl...
Related: #AI Safety -
🇺🇸 YouTube expands AI deepfake detection to politicians, government officials, and journalists
[USA]
YouTube's AI deepfake detection tool is becoming available to politicians, journalists, and officials, letting them flag unauthorized likenesses for removal.
Related: #Technology, #Misinformation, #Digital Privacy -
🇺🇸 Meta’s deepfake moderation isn’t good enough, says Oversight Board
[USA]
Meta's methods for identifying deepfakes are "not robust or comprehensive enough" to handle how quickly misinformation spreads during armed conflicts like the Iran war. That's according to the Meta Ov...
Related: #AI Misinformation -
🇺🇸 X probes offensive posts by xAI’s Grok chatbot, Sky News reports
[USA]
Related: #AI Safety
-
🇺🇸 BBC Reveals How It Came To Air The N-Word At BAFTA Film Awards, And How The Slur Remained On iPlayer For So Long
[USA]
The BBC Director General has for the first time set out in detail the corporation’s version of events that led to the BAFTA broadcast debacle, while his content chief has told staff “the incident has ...
Related: #Broadcasting Error -
🇺🇸 BBC Removes BAFTA Film Awards From iPlayer After Leaving N-Word Outburst in Tape-Delayed Broadcast
[USA]
The BBC has removed the BAFTA Film Awards from being available to stream on iPlayer after not cutting a racial slur from its tape-delayed broadcast on Sunday night. The outburst came from John Davidso...
Related: #Broadcasting Ethics, #Disability Awareness, #Cultural Sensitivity -
🇺🇸 Grok faces more scrutiny over deepfakes
[USA]
X faces a new EU privacy investigation after its Grok chatbot generated nonconsensual deepfake images on the platform.
Related: #AI Ethics, #Privacy Regulation, #Tech Accountability -
🇺🇸 Bielik Guard: Efficient Polish Language Safety Classifiers for LLM Content Moderation
[USA]
arXiv:2602.07954v3 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) become increasingly deployed in Polish language applications, the need for efficient and accurate content saf...
Related: #Artificial Intelligence, #Natural Language Processing -
🇺🇸 UpScrolled’s social network is struggling to moderate hate speech after fast growth
[USA]
UpScrolled, a social network that surged in the wake of the U.S. TikTok deal, has seen an uptick in harmful content, including user names and hashtags that contain racial slurs.
Related: #Technology, #Social Media -
🇺🇸 xList-Hate: A Checklist-Based Framework for Interpretable and Generalizable Hate Speech Detection
[USA]
arXiv:2602.05874v1 Announce Type: cross Abstract: Hate speech detection is commonly framed as a direct binary classification problem despite being a composite concept defined through multiple interac...
Related: #Artificial Intelligence, #Technology -
🇺🇸 gpt-oss-safeguard technical report
[USA]
gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are two open-weight reasoning models post-trained from the gpt-oss models and trained to reason from a provided policy in order to label content under ...
Related: #AI Safety, #Open-Weight Models, #Policy-Based Reasoning
Key Entities (20)
- Meta (2 news)
- AI safety (2 news)
- Artificial intelligence content detection (1 news)
- Oversight board (1 news)
- Artificial intelligence (1 news)
- TikTok (1 news)
- OpenAI (1 news)
- ChatGPT (1 news)
- Sky News (1 news)
- Misinformation (1 news)
- YouTube (1 news)
- Content moderation (1 news)
- Digital Services Act (1 news)
- Regulation of artificial intelligence (1 news)
- Reddit (1 news)
- Not safe for work (1 news)
- John Davidson (1 news)
- BBC (1 news)
- Tourette syndrome (1 news)
- British Academy Film Awards (1 news)
About the topic: Content Moderation
The topic "Content Moderation" aggregates 17+ news articles from various countries.