BravenNow
Harm or Humor: A Multimodal, Multilingual Benchmark for Overt and Covert Harmful Humor


#harmful humor #multimodal benchmark #multilingual #AI detection #content moderation #covert humor #overt humor #cultural nuances

📌 Key Takeaways

  • Researchers introduce a new benchmark for detecting harmful humor in text and images across multiple languages.
  • The benchmark distinguishes between overt and covert harmful humor to improve detection accuracy.
  • It addresses challenges in AI's ability to understand context and cultural nuances in humor.
  • The tool aims to help platforms moderate content more effectively by identifying subtle harmful jokes.

📖 Full Retelling

arXiv:2603.17759v1 (announce type: cross). Abstract: Dark humor often relies on subtle cultural nuances and implicit cues that require contextual reasoning to interpret, posing safety challenges that current static benchmarks fail to capture. To address this, we introduce a novel multimodal, multilingual benchmark for detecting and understanding harmful and offensive humor. Our manually curated dataset comprises 3,000 texts and 6,000 images in English and Arabic, alongside 1,200 videos that span E[…]
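The abstract describes a dataset organized along three axes: modality (text, image, video), language (English, Arabic), and an overt-versus-covert harm distinction. A minimal sketch of how such a benchmark might be scored per category is shown below; the record fields, label names, and scoring function are illustrative assumptions, not the paper's actual schema or evaluation protocol.

```python
from dataclasses import dataclass
from collections import defaultdict

# Hypothetical record layout for a multimodal harmful-humor benchmark.
# Field values ("overt_harmful", "covert_harmful", etc.) are assumptions
# for illustration; the paper's actual schema is not given in the abstract.
@dataclass
class Sample:
    modality: str   # "text", "image", or "video"
    language: str   # "en" or "ar"
    label: str      # "benign", "overt_harmful", or "covert_harmful"

def per_label_accuracy(samples, predictions):
    """Accuracy broken out by gold label, so detection of overt versus
    covert harmful humor can be compared directly (covert humor is
    typically the harder class for models)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample, pred in zip(samples, predictions):
        total[sample.label] += 1
        if pred == sample.label:
            correct[sample.label] += 1
    return {label: correct[label] / total[label] for label in total}

# Tiny illustrative run with made-up gold labels and predictions.
gold = [
    Sample("text", "en", "overt_harmful"),
    Sample("text", "ar", "covert_harmful"),
    Sample("image", "en", "covert_harmful"),
    Sample("video", "ar", "benign"),
]
preds = ["overt_harmful", "benign", "covert_harmful", "benign"]
scores = per_label_accuracy(gold, preds)
print(scores)  # covert_harmful scores 0.5: one of two covert cases missed
```

Breaking accuracy out per label (rather than reporting a single aggregate) is what lets a benchmark like this expose the gap the paper targets: a model can look strong overall while failing specifically on the covert class.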

🏷️ Themes

AI Ethics, Content Moderation, Multimodal Analysis

📚 Related People & Topics

Artificial intelligence content detection

Software to detect AI-generated content

Artificial intelligence detection software aims to determine whether some content (text, image, video or audio) was generated using artificial intelligence (AI). This software is often unreliable.



Deep Analysis

Why It Matters

This research matters because harmful content disguised as humor can perpetuate stereotypes, discrimination, and psychological harm while evading content moderation systems. It affects social media platforms, AI developers, content moderators, and vulnerable communities who may be targeted through seemingly harmless content. The benchmark addresses a critical gap in AI's ability to detect subtle forms of harmful content that current systems often miss, which is essential for creating safer online environments and more ethical AI systems.

Context & Background

  • Current content moderation systems primarily focus on overtly harmful content like hate speech and explicit harassment
  • AI humor detection has historically struggled with cultural context, sarcasm, and implicit meaning
  • Research shows harmful humor can normalize prejudice while providing plausible deniability through the 'just joking' defense
  • Multimodal AI (combining text, images, audio) is becoming increasingly important for content analysis
  • Previous benchmarks have typically addressed either humor detection or harmful content separately, not their intersection

What Happens Next

Researchers will likely use this benchmark to develop more sophisticated AI models that can detect covert harmful humor across languages and media types. Social media platforms may integrate these detection capabilities into their moderation systems within 1-2 years. Expect increased academic research on the psychological and social impacts of harmful humor, and potential regulatory discussions about platform responsibility for subtle harmful content. The benchmark will also enable comparative studies of humor norms across different cultures and languages.

Frequently Asked Questions

What is the difference between overt and covert harmful humor?

Overt harmful humor is explicitly offensive and clearly targets groups or individuals, while covert harmful humor uses subtlety, irony, or plausible deniability to disguise harmful content as harmless jokes. Covert examples might include coded language, backhanded compliments, or humor that relies on harmful stereotypes without explicit statements.

Why does this benchmark include multiple languages and media types?

Harmful humor varies significantly across cultures and often combines text with images or audio for maximum impact. A multilingual, multimodal approach ensures the benchmark reflects real-world content where harmful humor crosses linguistic boundaries and uses multiple communication channels to convey harmful messages.

How could this research affect social media platforms?

Platforms could implement more sophisticated content moderation that detects subtle harmful humor currently slipping through filters. This might lead to updated community guidelines, improved user reporting systems, and better protection for targeted communities while raising questions about humor censorship and cultural sensitivity.

What are the main challenges in detecting harmful humor?

Key challenges include cultural context understanding, distinguishing between harmless edgy humor and genuinely harmful content, handling sarcasm and irony, and avoiding over-censorship of legitimate comedy. The subjective nature of humor makes consistent detection particularly difficult across diverse user bases.

How might this research impact AI development?

This benchmark will push AI developers to create more nuanced models that understand context, cultural references, and implicit meaning. It represents progress toward AI systems that can comprehend complex human communication beyond literal interpretation, with applications extending beyond content moderation to general natural language understanding.


Source

arxiv.org
