2/20/2026 | USA | technology | ✓ Verified - arxiv.org

AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks

#LLM agents #long‑horizon attacks #AgentLAB #intent hijacking #tool chaining #task injection #objective drifting #memory poisoning #AI security benchmark #multi‑turn interaction #vulnerability testing #defense mechanisms #arXiv cs.AI

📌 Key Takeaways

AgentLAB benchmarks susceptibility of LLM agents to adaptive, long‑horizon attacks.
Five novel attack categories are defined: intent hijacking, tool chaining, task injection, objective drifting, and memory poisoning.
The benchmark covers 28 realistic agentic environments and 644 security test cases.
Evaluation shows mainstream LLM agents remain highly vulnerable to these long‑horizon attacks.
Defenses designed for single‑turn interactions fail to mitigate long‑horizon threats.
AgentLAB is positioned as a continual measurement tool for advancing LLM security.

📖 Full Retelling

Who: A research team led by Tanqiu Jiang and colleagues, including Yuhui Wang, Jiacheng Liang, and Ting Wang. What: They present AgentLAB, the first comprehensive benchmark to assess large language model (LLM) agents against long‑horizon attacks that exploit multi‑turn interactions. Where: The work is published on arXiv in the Artificial Intelligence (cs.AI) category, with the benchmark publicly available online. When: The preprint was submitted on 18 February 2026 and is currently a v1 release. Why: To systematically uncover the vulnerabilities of LLM agents in extended, complex environments and to illustrate that existing single‑turn defenses are inadequate, thereby encouraging the development of robust security measures for practical deployments.

🏷️ Themes

Artificial Intelligence Security, Large Language Model Agents, Long‑Horizon Attack Vectors, Benchmark Development, Multi‑Turn Interaction Vulnerabilities, Defensive Strategy Evaluation

Entity Intersection Graph

No entity connections available yet for this article.

Deep Analysis

Why It Matters

AgentLAB provides the first systematic way to test LLM agents against multi-turn attacks, revealing that current defenses are inadequate. This benchmark helps developers prioritize security improvements for real-world deployments.

Context & Background

LLM agents are increasingly used in complex, long-horizon tasks
AgentLAB introduces five novel attack types: intent hijacking, tool chaining, task injection, objective drifting, and memory poisoning
The benchmark covers 28 realistic environments and 644 security test cases

What Happens Next

Researchers will use AgentLAB to evaluate new defense mechanisms and track progress in securing LLM agents. The community may adopt the benchmark as a standard for safety certification of AI agents.

Frequently Asked Questions

What is AgentLAB?

AgentLAB is a benchmark designed to evaluate LLM agents against long-horizon attacks.

How many attack types does it cover?

It covers five attack types: intent hijacking, tool chaining, task injection, objective drifting, and memory poisoning.

Is AgentLAB publicly available?

Yes, it is publicly available at the URL provided in the paper.

Can it be used for single-turn attacks?

AgentLAB focuses on multi-turn long-horizon attacks, but its framework can be adapted for single-turn scenarios.

}

Original Source

              --> Computer Science > Artificial Intelligence arXiv:2602.16901 [Submitted on 18 Feb 2026] Title: AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks Authors: Tanqiu Jiang , Yuhui Wang , Jiacheng Liang , Ting Wang View a PDF of the paper titled AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks, by Tanqiu Jiang and 3 other authors View PDF HTML Abstract: LLM agents are increasingly deployed in long-horizon, complex environments to solve challenging problems, but this expansion exposes them to long-horizon attacks that exploit multi-turn user-agent-environment interactions to achieve objectives infeasible in single-turn settings. To measure agent vulnerabilities to such risks, we present AgentLAB, the first benchmark dedicated to evaluating LLM agent susceptibility to adaptive, long-horizon attacks. Currently, AgentLAB supports five novel attack types including intent hijacking, tool chaining, task injection, objective drifting, and memory poisoning, spanning 28 realistic agentic environments, and 644 security test cases. Leveraging AgentLAB, we evaluate representative LLM agents and find that they remain highly susceptible to long-horizon attacks; moreover, defenses designed for single-turn interactions fail to reliably mitigate long-horizon threats. We anticipate that AgentLAB will serve as a valuable benchmark for tracking progress on securing LLM agents in practical settings. The benchmark is publicly available at this https URL . Subjects: Artificial Intelligence (cs.AI) Cite as: arXiv:2602.16901 [cs.AI] (or arXiv:2602.16901v1 [cs.AI] for this version) https://doi.org/10.48550/arXiv.2602.16901 Focus to learn more arXiv-issued DOI via DataCite (pending registration) Submission history From: Tanqiu Jiang [ view email ] [v1] Wed, 18 Feb 2026 21:30:20 UTC (1,114 KB) Full-text links: Access Paper: View a PDF of the paper titled AgentLAB: Benchmarking LLM Agents against Long-Horizon Attacks, by Tanqiu Jiang and 3 other authors View PDF HTML TeX ...
            

Read full article at source

Source

arxiv.org