#Content Moderation

Latest news articles tagged with "Content Moderation". Follow the timeline of events, related topics, and entities.

Articles (17)

🇺🇸 Reddit is moving on from r/all — 02/04/2026 [USA]
Reddit is deprecating r/all, one of its feeds that shows popular posts on the platform, as part of "ongoing efforts to simplify Reddit and improve Home feed personalization." Reddit has offered both r...
Related: #Platform Changes
🇬🇧 AI videos of sexualised black women removed from TikTok after BBC investigation — 22/03/2026 [United Kingdom]
Dozens of Instagram and TikTok accounts have used AI avatars to promote explicit content, the BBC finds.
Related: #AI Ethics
🇺🇸 Beyond Accuracy: An Explainability-Driven Analysis of Harmful Content Detection — 20/03/2026 [USA]
arXiv:2603.18015v1. Abstract: Although automated harmful content detection systems are frequently used to monitor online platforms, moderators and end users frequently cannot unde...
Related: #AI Explainability
🇺🇸 Meta to cut back on third-party vendors in favor of AI for content enforcement — 19/03/2026 [USA]
Meta is beginning a multiyear rollout of more advanced AI systems that will handle content enforcement-related tasks like catching scams and illegal media.
Related: #AI Integration
🇺🇸 Harm or Humor: A Multimodal, Multilingual Benchmark for Overt and Covert Harmful Humor — 19/03/2026 [USA]
arXiv:2603.17759v1. Abstract: Dark humor often relies on subtle cultural nuances and implicit cues that require contextual reasoning to interpret, posing safety challenges that cu...
Related: #AI Ethics, #Multimodal Analysis
🇺🇸 OpenAI’s adult mode will reportedly be smutty, not pornographic — 16/03/2026 [USA]
OpenAI's delayed "adult mode" for ChatGPT is expected to support saucy text conversations at launch, but not the chatbot's ability to generate images, voice, or video. Speaking to The Wall Street Jour...
Related: #AI Ethics
🇺🇸 Understanding LLM Behavior When Encountering User-Supplied Harmful Content in Harmless Tasks — 13/03/2026 [USA]
arXiv:2603.11914v1. Abstract: Large Language Models (LLMs) are increasingly trained to align with human values, primarily focusing on task level, i.e., refusing to execute directl...
Related: #AI Safety
🇺🇸 YouTube expands AI deepfake detection to politicians, government officials, and journalists — 10/03/2026 [USA]
YouTube's AI deepfake detection tool is becoming available to politicians, journalists, and officials, letting them flag unauthorized likenesses for removal.
Related: #Technology, #Misinformation, #Digital Privacy
🇺🇸 Meta’s deepfake moderation isn’t good enough, says Oversight Board — 10/03/2026 [USA]
Meta's methods for identifying deepfakes are "not robust or comprehensive enough" to handle how quickly misinformation spreads during armed conflicts like the Iran war. That's according to the Meta Ov...
Related: #AI Misinformation
🇺🇸 X probes offensive posts by xAI’s Grok chatbot, Sky News reports — 08/03/2026 [USA]
Related: #AI Safety
🇺🇸 BBC Reveals How It Came To Air The N-Word At BAFTA Film Awards, And How The Slur Remained On iPlayer For So Long — 06/03/2026 [USA]
The BBC Director General has for the first time set out in detail the corporation’s version of events that led to the BAFTA broadcast debacle, while his content chief has told staff “the incident has ...
Related: #Broadcasting Error
🇺🇸 BBC Removes BAFTA Film Awards From iPlayer After Leaving N-Word Outburst in Tape-Delayed Broadcast — 23/02/2026 [USA]
The BBC has removed the BAFTA Film Awards from being available to stream on iPlayer after not cutting a racial slur from its tape-delayed broadcast on Sunday night. The outburst came from John Davidso...
Related: #Broadcasting Ethics, #Disability Awareness, #Cultural Sensitivity
🇺🇸 Grok faces more scrutiny over deepfakes — 17/02/2026 [USA]
X faces a new EU privacy investigation after its Grok chatbot generated nonconsensual deepfake images on the platform.
Related: #AI Ethics, #Privacy Regulation, #Tech Accountability
🇺🇸 Bielik Guard: Efficient Polish Language Safety Classifiers for LLM Content Moderation — 16/02/2026 [USA]
arXiv:2602.07954v3. Abstract: As Large Language Models (LLMs) become increasingly deployed in Polish language applications, the need for efficient and accurate content saf...
Related: #Artificial Intelligence, #Natural Language Processing
🇺🇸 UpScrolled’s social network is struggling to moderate hate speech after fast growth — 11/02/2026 [USA]
UpScrolled, a social network that surged in the wake of the U.S. TikTok deal, has seen an uptick in harmful content, including user names and hashtags that contain racial slurs.
Related: #Technology, #Social Media
🇺🇸 xList-Hate: A Checklist-Based Framework for Interpretable and Generalizable Hate Speech Detection — 07/02/2026 [USA]
arXiv:2602.05874v1. Abstract: Hate speech detection is commonly framed as a direct binary classification problem despite being a composite concept defined through multiple interac...
Related: #Artificial Intelligence, #Technology
🇺🇸 gpt-oss-safeguard technical report — 29/10/2025 [USA]
gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are two open-weight reasoning models post-trained from the gpt-oss models and trained to reason from a provided policy in order to label content under ...
Related: #AI Safety, #Open-Weight Models, #Policy-Based Reasoning

Key Entities (20)

Meta (2 news)
AI safety (2 news)
Artificial intelligence content detection (1 news)
Oversight board (1 news)
Artificial intelligence (1 news)
TikTok (1 news)
OpenAI (1 news)
ChatGPT (1 news)
Sky News (1 news)
Misinformation (1 news)
YouTube (1 news)
Content moderation (1 news)
Digital Services Act (1 news)
Regulation of artificial intelligence (1 news)
Reddit (1 news)
Not safe for work (1 news)
John Davidson (1 news)
BBC (1 news)
Tourette syndrome (1 news)
British Academy Film Awards (1 news)

About the topic: Content Moderation

The topic "Content Moderation" aggregates 17+ news articles from various countries.