#AI safety and security

Latest news articles tagged with "AI safety and security". Follow the timeline of events, related topics, and entities.

Articles (2)

🇺🇸 Recursive language models for jailbreak detection: a procedural defense for tool-augmented agents — 19/02/2026 [USA]
arXiv:2602.16520v1 Announce Type: cross Abstract: Jailbreak prompts are a practical and evolving threat to large language models (LLMs), particularly in agentic systems that execute tools over untrus...
Related: #LLM jailbreak detection, #Agentic system safeguards, #Recursive language modeling, #Evasive prompt strategies
🇺🇸 Closing the Distribution Gap in Adversarial Training for LLMs — 18/02/2026 [USA]
arXiv:2602.15238v1 Announce Type: cross Abstract: Adversarial training for LLMs is one of the most promising methods to reliably improve robustness against adversaries. However, despite significant p...
Related: #Adversarial robustness, #Large language models, #Distribution shift, #Training methodology

The topic "AI safety and security" aggregates 2+ news articles from various countries.