Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents
#Contextual Memory Virtualization #Large Language Models #DAG-based state management #Structurally lossless trimming #Token reduction #Prompt caching #LLM agents
📌 Key Takeaways
- CMV treats accumulated LLM understanding as version-controlled state
- System models session history as a Directed Acyclic Graph
- Structurally lossless trimming reduces tokens by mean 20% and up to 86%
- Evaluated on 76 coding sessions showing economic viability under prompt caching
📖 Full Retelling
Computer science researcher Cosmo Santoni introduced Contextual Memory Virtualisation (CMV), a system for managing accumulated understanding in large language models, in a paper published on arXiv on February 25, 2026. The work addresses a critical problem: valuable contextual information is lost when LLM sessions reach their context limits and undergo lossy compaction.

CMV treats accumulated LLM understanding as version-controlled state, borrowing concepts from operating-system virtual memory. The system models session history as a Directed Acyclic Graph (DAG) with formally defined snapshot, branch, and trim primitives that enable context reuse across independent parallel sessions.

Santoni also introduces a three-pass structurally lossless trimming algorithm. It preserves every user message and assistant response verbatim while stripping mechanical bloat such as raw tool outputs, base64 images, and metadata, reducing token counts by a mean of 20% and by up to 86% for sessions with heavy overhead.

The system was evaluated in a single-user case study across 76 real-world coding sessions. The results show that trimming remains economically viable under prompt caching, with the strongest gains in mixed tool-use sessions, which average a 39% reduction and reach break-even within 10 turns.
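To make the DAG model concrete, here is a minimal sketch of session history as an append-only graph with snapshot, branch, and trim primitives. All names here (`ContextDAG`, `Node`, the message dictionaries) are illustrative assumptions for this retelling, not the paper's reference implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """An immutable point in session history."""
    messages: list                               # conversation turns at this point
    parents: list = field(default_factory=list)  # DAG edges to ancestor nodes

class ContextDAG:
    def __init__(self):
        self.nodes = []

    def snapshot(self, messages):
        """Record the current session state as a new root-level node."""
        node = Node(messages=list(messages))
        self.nodes.append(node)
        return node

    def branch(self, node):
        """Start an independent parallel session reusing `node`'s context."""
        child = Node(messages=list(node.messages), parents=[node])
        self.nodes.append(child)
        return child

    def trim(self, node, is_bloat):
        """Derive a node with mechanical bloat removed; user and
        assistant messages are always preserved verbatim."""
        kept = [m for m in node.messages
                if m["role"] in ("user", "assistant") or not is_bloat(m)]
        child = Node(messages=kept, parents=[node])
        self.nodes.append(child)
        return child
```

Because every primitive creates a new node pointing back at its parent, earlier states remain addressable, so a trimmed or branched context can be shared across sessions without mutating the original history:

```python
dag = ContextDAG()
root = dag.snapshot([{"role": "user", "content": "map the codebase"},
                     {"role": "tool", "content": "...raw tool output..."}])
session_a = dag.branch(root)  # parallel session reusing the same context
trimmed = dag.trim(root, is_bloat=lambda m: m["role"] == "tool")
```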
🏷️ Themes
Artificial Intelligence, Software Engineering, Computer Science
📚 Related People & Topics
Large language model
Type of machine learning model
A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) that provide the c...
Entity Intersection Graph
Connections for Large language model:
- Educational technology (4 shared)
- Reinforcement learning (3 shared)
- Machine learning (2 shared)
- Artificial intelligence (2 shared)
- Benchmark (2 shared)
Original Source
Computer Science > Software Engineering
arXiv:2602.22402 [Submitted on 25 Feb 2026]
Title: Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents
Authors: Cosmo Santoni
Abstract: As large language models engage in extended reasoning tasks, they accumulate significant state -- architectural mappings, trade-off decisions, codebase conventions -- within the context window. This understanding is lost when sessions reach context limits and undergo lossy compaction. We propose Contextual Memory Virtualisation (CMV), a system that treats accumulated LLM understanding as version-controlled state. Borrowing from operating system virtual memory, CMV models session history as a Directed Acyclic Graph with formally defined snapshot, branch, and trim primitives that enable context reuse across independent parallel sessions. We introduce a three-pass structurally lossless trimming algorithm that preserves every user message and assistant response verbatim while reducing token counts by a mean of 20% and up to 86% for sessions with significant overhead, by stripping mechanical bloat such as raw tool outputs, base64 images, and metadata. A single-user case-study evaluation across 76 real-world coding sessions demonstrates that trimming remains economically viable under prompt caching, with the strongest gains in mixed tool-use sessions, which average 39% reduction and reach break-even within 10 turns. A reference implementation is available at this https URL.
Comments: 11 pages, 6 figures. Introduces a DAG-based state management system for LLM agents. Evaluation on 76 coding sessions shows up to 86% token reduction (mean 20%) while remaining economically viable under prompt caching. Includes reference implementation for Claude Code.
Subjects: Software Engine...
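The abstract describes the trimmer as three passes over the transcript, each targeting one category of mechanical bloat. The sketch below follows that description; the pass boundaries, length threshold, role names, and stub strings are assumptions made for illustration, not details from the paper's code.

```python
import re

# Runs of base64-looking characters long enough to be an inline payload.
BASE64_RE = re.compile(r"^[A-Za-z0-9+/=\s]{256,}$")

def pass_tool_outputs(msgs):
    """Pass 1: stub out long raw tool outputs; short ones stay."""
    out = []
    for m in msgs:
        if m.get("role") == "tool" and len(m.get("content", "")) > 200:
            m = {**m, "content": "[tool output elided]"}
        out.append(m)
    return out

def pass_images(msgs):
    """Pass 2: elide base64 payloads. Assumes images arrive in
    non-user, non-assistant messages, so dialogue stays verbatim."""
    out = []
    for m in msgs:
        c = m.get("content", "")
        if (m.get("role") not in ("user", "assistant")
                and isinstance(c, str) and BASE64_RE.match(c)):
            m = {**m, "content": "[base64 payload elided]"}
        out.append(m)
    return out

def pass_metadata(msgs):
    """Pass 3: strip bookkeeping fields, keeping only role and content."""
    return [{"role": m["role"], "content": m["content"]} for m in msgs]

def trim(msgs):
    """Structurally lossless trim: dialogue preserved, bloat stripped."""
    return pass_metadata(pass_images(pass_tool_outputs(msgs)))
```

The key invariant, matching the paper's "structurally lossless" claim, is that user and assistant content passes through every stage untouched; only tool outputs, inline payloads, and metadata fields are reduced.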