SoK: Agentic Skills -- Beyond Tool Use in LLM Agents
#Agentic Skills #LLM Agents #AI Security #Procedural Knowledge #Skill Taxonomy #Autonomous Systems
📌 Key Takeaways
- Researchers mapped the complete lifecycle of agentic skills in LLM agents
- Two complementary taxonomies were introduced to classify agentic skills
- Security risks were examined through a case study of the ClawHavoc campaign, in which nearly 1,200 malicious skills infiltrated a major agent marketplace
- Curated skills can improve agent success rates while self-generated skills may degrade performance
📖 Full Retelling
Seven researchers (Yanna Jiang, Delong Li, Haiyu Deng, Baihe Ma, Xu Wang, Qin Wang, and Guangsheng Yu) submitted a systematization-of-knowledge (SoK) paper on agentic skills in LLM agents to arXiv on February 24, 2026. The paper addresses the reusable procedural capabilities that agentic systems increasingly rely on to execute long-horizon workflows reliably. It defines agentic skills as callable modules that package procedural knowledge with explicit applicability conditions, execution policies, termination criteria, and reusable interfaces. Unlike one-off plans or atomic tool calls, skills are meant to be reused across tasks. The paper maps the skill layer across its full lifecycle: discovery, practice, distillation, storage, composition, evaluation, and update.

The researchers introduce two complementary taxonomies. The first is a system-level set of seven design patterns that capture how skills are packaged and executed in practice, ranging from metadata-driven progressive disclosure and executable code skills to self-evolving libraries and marketplace distribution. The second, orthogonal to the first, classifies skills by representation (natural language, code, policy, or hybrid) and by the environment they operate in (web, OS, software engineering, or robotics).

The paper also analyzes the security and governance implications of skill-based agents, covering supply-chain risks, prompt injection via skill payloads, and trust-tiered execution. That analysis is grounded in a case study of the ClawHavoc campaign, in which nearly 1,200 malicious skills infiltrated a major agent marketplace and exfiltrated API keys, cryptocurrency wallets, and browser credentials at scale.
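To make the definition above more concrete, here is a minimal Python sketch of a skill as a callable module with an applicability condition, an execution policy, a termination criterion, and a reusable interface, tagged with the representation and scope categories from the second taxonomy. This is an illustration only, not the paper's implementation: the `Skill` class, its field names, the `max_steps` safeguard, and the toy `count_up` skill are all assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict

# Hypothetical state type: a plain dictionary describing the agent's situation.
State = Dict[str, Any]


class Representation(Enum):
    """Representation axis, using the four categories named in the abstract."""
    NATURAL_LANGUAGE = "natural language"
    CODE = "code"
    POLICY = "policy"
    HYBRID = "hybrid"


class Scope(Enum):
    """Scope axis, using the four environments named in the abstract."""
    WEB = "web"
    OS = "OS"
    SOFTWARE_ENGINEERING = "software engineering"
    ROBOTICS = "robotics"


@dataclass
class Skill:
    """Illustrative skill record; all field names are assumptions."""
    name: str
    representation: Representation
    scope: Scope
    applies: Callable[[State], bool]   # applicability condition
    step: Callable[[State], State]     # execution policy, one step at a time
    done: Callable[[State], bool]      # termination criterion
    max_steps: int = 100               # assumed guard against runaway loops

    def run(self, state: State) -> State:
        """Reusable interface: check applicability, then step until done."""
        if not self.applies(state):
            raise ValueError(f"skill {self.name!r} is not applicable")
        for _ in range(self.max_steps):
            if self.done(state):
                return state
            state = self.step(state)
        if self.done(state):
            return state
        raise RuntimeError(f"skill {self.name!r} did not terminate")


# Toy skill: increment a counter until it reaches 3.
count_up = Skill(
    name="count_up",
    representation=Representation.CODE,
    scope=Scope.OS,
    applies=lambda s: "count" in s,
    step=lambda s: {**s, "count": s["count"] + 1},
    done=lambda s: s["count"] >= 3,
)

result = count_up.run({"count": 0})
print(result)
```

Separating `applies`, `step`, and `done` keeps the three parts of the definition independently inspectable, so the same record could in principle be reused across tasks or audited before execution.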
🏷️ Themes
AI Capabilities, Security Implications, System Design
Original Source
Computer Science > Cryptography and Security
arXiv:2602.20867 [Submitted on 24 Feb 2026]
Title: SoK: Agentic Skills -- Beyond Tool Use in LLM Agents
Authors: Yanna Jiang, Delong Li, Haiyu Deng, Baihe Ma, Xu Wang, Qin Wang, Guangsheng Yu
Abstract: Agentic systems increasingly rely on reusable procedural capabilities, a.k.a. agentic skills, to execute long-horizon workflows reliably. These capabilities are callable modules that package procedural knowledge with explicit applicability conditions, execution policies, termination criteria, and reusable interfaces. Unlike one-off plans or atomic tool calls, skills operate (and often do well) across tasks. This paper maps the skill layer across the full lifecycle (discovery, practice, distillation, storage, composition, evaluation, and update) and introduces two complementary taxonomies. The first is a system-level set of seven design patterns capturing how skills are packaged and executed in practice, from metadata-driven progressive disclosure and executable code skills to self-evolving libraries and marketplace distribution. The second is an orthogonal representation × scope taxonomy describing what skills are (natural language, code, policy, hybrid) and what environments they operate over (web, OS, software engineering, robotics). We analyze the security and governance implications of skill-based agents, covering supply-chain risks, prompt injection via skill payloads, and trust-tiered execution, grounded by a case study of the ClawHavoc campaign in which nearly 1,200 malicious skills infiltrated a major agent marketplace, exfiltrating API keys, cryptocurrency wallets, and browser credentials at scale.
We further survey deterministic evaluation approaches, anchored by recent benchmark evidence that curated skills can substantially improve ...