#Deception Detection
Latest news articles tagged with "Deception Detection". Follow the timeline of events, related topics, and entities.
Articles (3)
- 🇺🇸 LieCraft: A Multi-Agent Framework for Evaluating Deceptive Capabilities in Language Models
[USA]
arXiv:2603.06874v1 (announce type: new). Abstract: Large Language Models (LLMs) exhibit impressive general-purpose capabilities but also introduce serious safety risks, particularly the potential for de...
Related: #AI Evaluation
- 🇺🇸 "Are You Sure?": An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems
[USA]
arXiv:2602.21127v1 (announce type: cross). Abstract: Large language model (LLM) agents are rapidly becoming trusted copilots in high-stakes domains like software development and healthcare. However, thi...
Related: #AI Security, #Human-AI Interaction
- 🇺🇸 The Obfuscation Atlas: Mapping Where Honesty Emerges in RLVR with Deception Probes
[USA]
arXiv:2602.15515v1 (announce type: cross). Abstract: Training against white-box deception detectors has been proposed as a way to make AI systems honest. However, such training risks models learning to ...
Related: #AI Honesty, #Reward Hacking, #Obfuscation in RLVR, #Safe AI Deployment
Key Entities (2)
- Are You Sure? (1 article)
- AI safety (1 article)
About the topic: Deception Detection
The topic "Deception Detection" aggregates 3+ news articles from various countries.