# AI Security
Latest news articles tagged with "AI Security". Follow the timeline of events, related topics, and entities.
Articles (30)
- 🇺🇸 SALLIE: Safeguarding Against Latent Language & Image Exploits
[USA]
arXiv:2604.06247v1 Announce Type: cross Abstract: Large Language Models (LLMs) and Vision-Language Models (VLMs) remain highly vulnerable to textual and visual jailbreaks, as well as prompt injection...
Related: #Multimodal Models, #Research Innovation
- 🇺🇸 The Defense Trilemma: Why Prompt Injection Defense Wrappers Fail?
[USA]
arXiv:2604.06436v1 Announce Type: cross Abstract: We prove that no continuous, utility-preserving wrapper defense, a function $D: X \to X$ that preprocesses inputs before the model sees them, can make a...
Related: #Theoretical Computer Science, #LLM Vulnerabilities
- 🇺🇸 When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models
[USA]
arXiv:2603.19247v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly integrated into high-stakes applications, making robust safety guarantees a central practical and comme...
Related: #Prompt Engineering
- 🇺🇸 Trojan's Whisper: Stealthy Manipulation of OpenClaw through Injected Bootstrapped Guidance
[USA]
arXiv:2603.19974v1 Announce Type: cross Abstract: Autonomous coding agents are increasingly integrated into software development workflows, offering capabilities that extend beyond code suggestion to...
Related: #Cyber Threats
- 🇺🇸 From Masks to Pixels and Meaning: A New Taxonomy, Benchmark, and Metrics for VLM Image Tampering
[USA]
arXiv:2603.20193v1 Announce Type: cross Abstract: Existing tampering detection benchmarks largely rely on object masks, which severely misalign with the true edit signal: many pixels inside a mask ar...
Related: #Image Analysis
- 🇺🇸 A Framework for Formalizing LLM Agent Security
[USA]
arXiv:2603.19469v1 Announce Type: cross Abstract: Security in LLM agents is inherently contextual. For example, the same action taken by an agent may represent legitimate behavior or a security viola...
Related: #Formal Methods
- 🇺🇸 Evolving Jailbreaks: Automated Multi-Objective Long-Tail Attacks on Large Language Models
[USA]
arXiv:2603.20122v1 Announce Type: cross Abstract: Large Language Models (LLMs) have been widely deployed, especially through free Web-based applications that expose them to diverse user-generated inp...
Related: #Jailbreak Attacks
- 🇺🇸 The Autonomy Tax: Defense Training Breaks LLM Agents
[USA]
arXiv:2603.19423v1 Announce Type: cross Abstract: Large language model (LLM) agents increasingly rely on external tools (file operations, API calls, database transactions) to autonomously complete co...
Related: #Autonomy Trade-offs
- 🇺🇸 LISAA: A Framework for Large Language Model Information Security Awareness Assessment
[USA]
arXiv:2411.13207v3 Announce Type: replace-cross Abstract: The popularity of large language models (LLMs) continues to grow, and LLM-based assistants have become ubiquitous. Information security aware...
Related: #Assessment Framework
- 🇺🇸 Behavioral Fingerprints for LLM Endpoint Stability and Identity
[USA]
arXiv:2603.19022v1 Announce Type: new Abstract: The consistency of AI-native applications depends on the behavioral consistency of the model endpoints that power them. Traditional reliability metrics...
Related: #LLM Monitoring
- 🇺🇸 Functional Subspace Watermarking for Large Language Models
[USA]
arXiv:2603.18793v1 Announce Type: cross Abstract: Model watermarking utilizes internal representations to protect the ownership of large language models (LLMs). However, these features inevitably und...
Related: #Watermarking Technology
- 🇺🇸 Retrieval-Augmented LLMs for Security Incident Analysis
[USA]
arXiv:2603.18196v1 Announce Type: cross Abstract: Investigating cybersecurity incidents requires collecting and analyzing evidence from multiple log sources, including intrusion detection alerts, net...
Related: #Incident Response
- 🇺🇸 From Weak Cues to Real Identities: Evaluating Inference-Driven De-Anonymization in LLM Agents
[USA]
arXiv:2603.18382v1 Announce Type: new Abstract: Anonymization is widely treated as a practical safeguard because re-identifying anonymous records was historically costly, requiring domain expertise, ...
Related: #Privacy
- 🇺🇸 CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language Models
[USA]
arXiv:2603.18545v1 Announce Type: cross Abstract: Medical vision-language models (MVLMs) are increasingly used as perceptual backbones in radiology pipelines and as the visual front end of multimoda...
Related: #Medical AI
- 🇺🇸 Implicit Patterns in LLM-Based Binary Analysis
[USA]
arXiv:2603.19138v1 Announce Type: new Abstract: Binary vulnerability analysis is increasingly performed by LLM-based agents in an iterative, multi-pass manner, with the model as the core decision-mak...
Related: #Binary Analysis
- 🇺🇸 Access Controlled Website Interaction for Agentic AI with Delegated Critical Tasks
[USA]
arXiv:2603.18197v1 Announce Type: new Abstract: Recent studies reveal gaps in delegating critical tasks to agentic AI that accesses websites on the user's behalf, primarily due to limited access cont...
Related: #Web Automation
- 🇺🇸 NANOZK: Layerwise Zero-Knowledge Proofs for Verifiable Large Language Model Inference
[USA]
arXiv:2603.18046v1 Announce Type: cross Abstract: When users query proprietary LLM APIs, they receive outputs with no cryptographic assurance that the claimed model was actually used. Service provide...
Related: #Cryptography
- 🇺🇸 A rogue AI led to a serious security incident at Meta
[USA]
For almost two hours last week, Meta employees had unauthorized access to company and user data thanks to an AI agent that gave an employee inaccurate technical advice, as previously reported by The I...
Related: #Data Breach
- 🇺🇸 Signal's Creator Is Helping Encrypt Meta AI
[USA]
Moxie Marlinspike says the technology powering his encrypted AI chatbot, Confer, will be integrated into Meta AI. The move could help protect the AI conversations of millions of people.
Related: #Privacy
- 🇺🇸 Understanding and Defending VLM Jailbreaks via Jailbreak-Related Representation Shift
[USA]
arXiv:2603.17372v1 Announce Type: cross Abstract: Large vision-language models (VLMs) often exhibit weakened safety alignment with the integration of the visual modality. Even when text prompts conta...
Related: #Model Defense
- 🇺🇸 Rel-Zero: Harnessing Patch-Pair Invariance for Robust Zero-Watermarking Against AI Editing
[USA]
arXiv:2603.17531v1 Announce Type: cross Abstract: Recent advancements in diffusion-based image editing pose a significant threat to the authenticity of digital visual content. Traditional embedding-b...
Related: #Digital Watermarking
- 🇺🇸 Towards Unsupervised Adversarial Document Detection in Retrieval Augmented Generation Systems
[USA]
arXiv:2603.17176v1 Announce Type: cross Abstract: Retrieval augmented generation systems have become an integral part of everyday life. Whether in internet search engines, email systems, or service c...
Related: #Document Integrity
- 🇺🇸 Adversarial attacks against Modern Vision-Language Models
[USA]
arXiv:2603.16960v1 Announce Type: cross Abstract: We study adversarial robustness of open-source vision-language model (VLM) agents deployed in a self-contained e-commerce environment built to simula...
Related: #Adversarial Attacks
- 🇺🇸 Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning
[USA]
arXiv:2603.17174v1 Announce Type: cross Abstract: Code generation large language models (LLMs) are increasingly integrated into modern software development workflows. Recent work has shown that these...
Related: #Code Generation
- 🇺🇸 Caging the Agents: A Zero Trust Security Architecture for Autonomous AI in Healthcare
[USA]
arXiv:2603.17419v1 Announce Type: cross Abstract: Autonomous AI agents powered by large language models are being deployed in production with capabilities including shell execution, file system acces...
Related: #Healthcare Technology
- 🇺🇸 PAuth - Precise Task-Scoped Authorization For Agents
[USA]
arXiv:2603.17170v1 Announce Type: cross Abstract: The emerging agentic web envisions AI agents that reliably fulfill users' natural-language (NL)-based tasks by interacting with existing web services...
Related: #Authorization
- 🇺🇸 BadLLM-TG: A Backdoor Defender powered by LLM Trigger Generator
[USA]
arXiv:2603.15692v1 Announce Type: cross Abstract: Backdoor attacks compromise model reliability by using triggers to manipulate outputs. Trigger inversion can accurately locate these triggers via a g...
Related: #Backdoor Defense
- 🇺🇸 Detecting Sentiment Steering Attacks on RAG-enabled Large Language Models
[USA]
arXiv:2603.16342v1 Announce Type: cross Abstract: The proliferation of large-scale IoT networks has been both a blessing and a curse. Not only has it revolutionized the way organizations operate by i...
Related: #LLM Vulnerabilities
- 🇺🇸 How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition
[USA]
arXiv:2603.15714v1 Announce Type: cross Abstract: LLM based agents are increasingly deployed in high stakes settings where they process external data sources such as emails, documents, and code repos...
Related: #Vulnerability Research
- 🇺🇸 Sears Exposed AI Chatbot Phone Calls and Text Chats to Anyone on the Web
[USA]
Customer conversations with chatbots can include contact information and personal details that make it easier for scammers to launch phishing attacks and commit fraud.
Related: #Data Breach
Key Entities (13)
- Large language model (4 news)
- Meta (2 news)
- AI agent (2 news)
- Co-Dependents Anonymous (1 news)
- Information security (1 news)
- Artificial intelligence (1 news)
- Signal (1 news)
- AI safety (1 news)
- Chatbot (1 news)
- Sears (1 news)
- VLM (1 news)
- Artificial intelligence content detection (1 news)
- Watermark (disambiguation) (1 news)
About the topic: AI Security
The topic "AI Security" aggregates 30 news articles from various sources.