BravenNow
Harm or Humor: A Multimodal, Multilingual Benchmark for Overt and Covert Harmful Humor


#harmful humor #multimodal benchmark #multilingual #AI detection #content moderation #covert humor #overt humor #cultural nuances

📌 Key Takeaways

  • Researchers introduce a new benchmark for detecting harmful humor in text and images across multiple languages.
  • The benchmark distinguishes between overt and covert harmful humor to improve detection accuracy.
  • It addresses challenges in AI's ability to understand context and cultural nuances in humor.
  • The tool aims to help platforms moderate content more effectively by identifying subtle harmful jokes.

📖 Full Retelling

arXiv:2603.17759v1 (announce type: cross). Abstract: Dark humor often relies on subtle cultural nuances and implicit cues that require contextual reasoning to interpret, posing safety challenges that current static benchmarks fail to capture. To address this, we introduce a novel multimodal, multilingual benchmark for detecting and understanding harmful and offensive humor. Our manually curated dataset comprises 3,000 texts and 6,000 images in English and Arabic, alongside 1,200 videos that span E[…]
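The abstract describes a dataset organized along three axes: modality (text, image, video), language (English, Arabic), and an overt-versus-covert harm distinction. A minimal sketch of how such a benchmark might be scored per category is shown below; the record fields, label names, and scoring function are illustrative assumptions, not the paper's actual schema or evaluation protocol.

```python
from dataclasses import dataclass
from collections import defaultdict

# Hypothetical record layout for a multimodal harmful-humor benchmark.
# Field values ("overt_harmful", "covert_harmful", etc.) are assumptions
# for illustration; the paper's actual schema is not given in the abstract.
@dataclass
class Sample:
    modality: str   # "text", "image", or "video"
    language: str   # "en" or "ar"
    label: str      # "benign", "overt_harmful", or "covert_harmful"

def per_label_accuracy(samples, predictions):
    """Accuracy broken out by gold label, so detection of overt versus
    covert harmful humor can be compared directly (covert humor is
    typically the harder class for models)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample, pred in zip(samples, predictions):
        total[sample.label] += 1
        if pred == sample.label:
            correct[sample.label] += 1
    return {label: correct[label] / total[label] for label in total}

# Tiny illustrative run with made-up gold labels and predictions.
gold = [
    Sample("text", "en", "overt_harmful"),
    Sample("text", "ar", "covert_harmful"),
    Sample("image", "en", "covert_harmful"),
    Sample("video", "ar", "benign"),
]
preds = ["overt_harmful", "benign", "covert_harmful", "benign"]
scores = per_label_accuracy(gold, preds)
print(scores)  # covert_harmful scores 0.5: one of two covert cases missed
```

Breaking accuracy out per label (rather than reporting a single aggregate) is what lets a benchmark like this expose the gap the paper targets: a model can look strong overall while failing specifically on the covert class.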

🏷️ Themes

AI Ethics, Content Moderation, Multimodal Analysis

📚 Related People & Topics

Artificial intelligence content detection

Software to detect AI-generated content

Artificial intelligence detection software aims to determine whether some content (text, image, video or audio) was generated using artificial intelligence (AI). This software is often unreliable.



Deep Analysis

Why It Matters

This research matters because harmful content disguised as humor can perpetuate stereotypes, discrimination, and psychological harm while evading content moderation systems. It affects social media platforms, AI developers, content moderators, and vulnerable communities who may be targeted through seemingly harmless content. The benchmark addresses a critical gap in AI's ability to detect subtle forms of harmful content that current systems often miss, which is essential for creating safer online environments and more ethical AI systems.

Context & Background

  • Current content moderation systems primarily focus on overtly harmful content like hate speech and explicit harassment
  • AI humor detection has historically struggled with cultural context, sarcasm, and implicit meaning
  • Research shows harmful humor can normalize prejudice while providing plausible deniability through the 'just joking' defense
  • Multimodal AI (combining text, images, audio) is becoming increasingly important for content analysis
  • Previous benchmarks have typically addressed either humor detection or harmful content separately, not their intersection

What Happens Next

Researchers will likely use this benchmark to develop more sophisticated AI models that can detect covert harmful humor across languages and media types. Social media platforms may integrate these detection capabilities into their moderation systems within 1-2 years. Expect increased academic research on the psychological and social impacts of harmful humor, and potential regulatory discussions about platform responsibility for subtle harmful content. The benchmark will also enable comparative studies of humor norms across different cultures and languages.

Frequently Asked Questions

What is the difference between overt and covert harmful humor?

Overt harmful humor is explicitly offensive and clearly targets groups or individuals, while covert harmful humor uses subtlety, irony, or plausible deniability to disguise harmful content as harmless jokes. Covert examples might include coded language, backhanded compliments, or humor that relies on harmful stereotypes without explicit statements.

Why does this benchmark include multiple languages and media types?

Harmful humor varies significantly across cultures and often combines text with images or audio for maximum impact. A multilingual, multimodal approach ensures the benchmark reflects real-world content where harmful humor crosses linguistic boundaries and uses multiple communication channels to convey harmful messages.

How could this research affect social media platforms?

Platforms could implement more sophisticated content moderation that detects subtle harmful humor currently slipping through filters. This might lead to updated community guidelines, improved user reporting systems, and better protection for targeted communities while raising questions about humor censorship and cultural sensitivity.

What are the main challenges in detecting harmful humor?

Key challenges include cultural context understanding, distinguishing between harmless edgy humor and genuinely harmful content, handling sarcasm and irony, and avoiding over-censorship of legitimate comedy. The subjective nature of humor makes consistent detection particularly difficult across diverse user bases.

How might this research impact AI development?

This benchmark will push AI developers to create more nuanced models that understand context, cultural references, and implicit meaning. It represents progress toward AI systems that can comprehend complex human communication beyond literal interpretation, with applications extending beyond content moderation to general natural language understanding.


Source

arxiv.org
