# LLM Optimization
Latest news articles tagged with "LLM Optimization". Follow the timeline of events, related topics, and entities.
Articles (17)
- 🇺🇸 Breaking the Capability Ceiling of LLM Post-Training by Reintroducing Markov States [USA]
  arXiv:2603.19987v1 Announce Type: cross Abstract: Reinforcement learning (RL) has become a standard paradigm for post-training and aligning Large Language Models (LLMs), yet recent evidence suggests ...
  Related: #AI Research
- 🇺🇸 Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs [USA]
  arXiv:2603.20046v1 Announce Type: new Abstract: Reinforcement Learning (RL) with rubric-based rewards has recently shown remarkable progress in enhancing general reasoning capabilities of Large Langu...
  Related: #Reinforcement Learning
- 🇺🇸 PowerFlow: Unlocking the Dual Nature of LLMs via Principled Distribution Matching [USA]
  arXiv:2603.18363v1 Announce Type: cross Abstract: Unsupervised Reinforcement Learning from Internal Feedback (RLIF) has emerged as a promising paradigm for eliciting the latent capabilities of Large ...
  Related: #AI Research
- 🇺🇸 Automatic Configuration of LLM Post-Training Pipelines [USA]
  arXiv:2603.18773v1 Announce Type: cross Abstract: LLM post-training pipelines that combine supervised fine-tuning and reinforcement learning are difficult to configure under realistic compute budgets...
  Related: #AI Automation
- 🇺🇸 Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution [USA]
  arXiv:2603.18897v1 Announce Type: cross Abstract: LLM-powered agents are emerging as a dominant paradigm for autonomous task solving. Unlike standard inference workloads, agents operate in a strictly...
  Related: #AI Acceleration
- 🇺🇸 Efficient LLM Serving for Agentic Workflows: A Data Systems Perspective [USA]
  arXiv:2603.16104v1 Announce Type: cross Abstract: Agentic workflows are composed of sequences of interdependent Large Language Model (LLM) calls, and they have become a dominant workload in modern AI...
  Related: #Data Systems
- 🇺🇸 Not All Queries Need Rewriting: When Prompt-Only LLM Refinement Helps and Hurts Dense Retrieval [USA]
  arXiv:2603.13301v1 Announce Type: cross Abstract: Prompt-only, single-step LLM query rewriting, where a rewrite is generated from the query alone without retrieval feedback, is commonly used in produ...
  Related: #Information Retrieval
- 🇺🇸 Spend Less, Reason Better: Budget-Aware Value Tree Search for LLM Agents [USA]
  arXiv:2603.12634v1 Announce Type: cross Abstract: Test-time scaling has become a dominant paradigm for improving LLM agent reliability, yet current approaches treat compute as an abundant resource, a...
  Related: #AI Efficiency
- 🇺🇸 Improving LLM Performance Through Black-Box Online Tuning: A Case for Adding System Specs to Factsheets for Trusted AI [USA]
  arXiv:2603.11340v1 Announce Type: new Abstract: In this paper, we present a novel black-box online controller that uses only end-to-end measurements over short segments, without internal instrumentat...
  Related: #AI Transparency
- 🇺🇸 LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation [USA]
  arXiv:2603.10899v1 Announce Type: cross Abstract: Transformer-based large language models (LLMs) rely on key-value (KV) caching to avoid redundant computation during autoregressive inference. While t...
  Related: #AI Efficiency
- 🇺🇸 ARKV: Adaptive and Resource-Efficient KV Cache Management under Limited Memory Budget for Long-Context Inference in LLMs [USA]
  arXiv:2603.08727v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly deployed in scenarios demanding ultra-long context reasoning, such as agentic workflows and deep resear...
  Related: #Memory Management
- 🇺🇸 Zipage: Maintain High Request Concurrency for LLM Reasoning through Compressed PagedAttention [USA]
  arXiv:2603.08743v1 Announce Type: cross Abstract: With reasoning becoming the generative paradigm for large language models (LLMs), the memory bottleneck caused by KV cache during the decoding phase ...
  Related: #Memory Efficiency
- 🇺🇸 ZorBA: Zeroth-order Federated Fine-tuning of LLMs with Heterogeneous Block Activation [USA]
  arXiv:2603.04436v1 Announce Type: cross Abstract: Federated fine-tuning of large language models (LLMs) enables collaborative tuning across distributed clients. However, due to the large size of LLMs...
  Related: #Federated Learning
- 🇺🇸 Adaptive Memory Admission Control for LLM Agents [USA]
  arXiv:2603.04549v1 Announce Type: new Abstract: LLM-based agents increasingly rely on long-term memory to support multi-session reasoning and interaction, yet current systems provide little control o...
  Related: #AI Memory Management
- 🇺🇸 Beyond Linear LLM Invocation: An Efficient and Effective Semantic Filter Paradigm [USA]
  arXiv:2603.04799v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used for semantic query processing over large corpora. A set of semantic operators derived from relatio...
  Related: #AI Efficiency
- 🇺🇸 Towards Autonomous Memory Agents [USA]
  arXiv:2602.22406v1 Announce Type: new Abstract: Recent memory agents improve LLMs by extracting experiences and conversation history into an external storage. This enables low-overhead context assemb...
  Related: #Artificial Intelligence, #Memory Systems
- 🇺🇸 SCOPE: Selective Conformal Optimized Pairwise LLM Judging [USA]
  arXiv:2602.13110v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used as judges to replace costly human preference labels in pairwise evaluation. Despite their practica...
  Related: #AI Evaluation, #Statistical Guarantees
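Several entries above (LookaheadKV, ARKV, Zipage) build on KV caching, which the LookaheadKV abstract describes as how transformers "avoid redundant computation during autoregressive inference". A minimal, hypothetical Python sketch of that mechanism, not any of the listed papers' methods: each decoding step computes the key/value for the newest token only and appends it to a growing cache, so attention never recomputes the prefix. The toy hidden states and the single-head `attend` helper are illustrative assumptions.

```python
import math

def attend(q, keys, values):
    """Scaled dot-product attention for one query over cached keys/values."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d)]

# Autoregressive decoding with a KV cache: each step appends one new K/V
# pair instead of recomputing keys/values for the entire prefix.
k_cache, v_cache = [], []
for x in [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]:  # toy per-token hidden states
    k_cache.append(x)                # in a real model: x @ W_K
    v_cache.append(x)                # in a real model: x @ W_V
    out = attend(x, k_cache, v_cache)

print(len(k_cache))  # one cached K/V pair per decoded token -> 3
```

The eviction and compression papers above start from this picture: the cache grows linearly with sequence length, so long-context serving must decide which entries to keep.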
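The ZorBA entry concerns zeroth-order federated fine-tuning, where clients estimate gradients from forward passes alone. A hedged sketch of the generic two-point zeroth-order estimator that such methods build on (the toy quadratic `loss`, the step size, and the seeds are all illustrative assumptions, not ZorBA's algorithm):

```python
import random

def loss(w, target=(1.0, -2.0)):
    """Toy objective: squared distance to a target weight vector."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))

def zo_grad(f, w, eps=1e-3, seed=0):
    """Two-point zeroth-order gradient estimate along one random direction:
    only forward evaluations of f are needed, no backpropagation."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in w]
    f_plus = f([wi + eps * ui for wi, ui in zip(w, u)])
    f_minus = f([wi - eps * ui for wi, ui in zip(w, u)])
    scale = (f_plus - f_minus) / (2 * eps)
    return [scale * ui for ui in u]

w = [0.0, 0.0]
start = loss(w)
for step in range(200):
    g = zo_grad(loss, w, seed=step)
    w = [wi - 0.05 * gi for wi, gi in zip(w, g)]
print(round(loss(w), 4))
```

The appeal for LLM fine-tuning is memory: storing no activations or optimizer state for backpropagation, at the cost of noisier updates.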
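The SCOPE entry pairs LLM judging with conformal calibration (note the "Calibration (statistics)" and "Exchangeable random variables" entities below). As an illustrative sketch of the split-conformal ingredient only, not SCOPE's procedure: a threshold is taken from held-out nonconformity scores, and the judge abstains on any test pair whose score exceeds it. The calibration values and the "1 minus judge confidence" nonconformity are assumptions for the example.

```python
import math

def conformal_threshold(cal_scores, alpha=0.2):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest
    calibration nonconformity score (clipped to the largest score)."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(cal_scores)[min(k, n) - 1]

# Nonconformity here is 1 - judge confidence on held-out pairwise judgments;
# at test time the judge abstains whenever nonconformity exceeds the threshold.
calibration = [0.05, 0.10, 0.20, 0.30, 0.40, 0.45, 0.50, 0.60, 0.70]
q = conformal_threshold(calibration, alpha=0.2)
print(q)  # with n = 9 and alpha = 0.2: the 8th smallest score -> 0.6
```

Under exchangeability of calibration and test scores, this construction bounds the error rate of accepted verdicts at roughly alpha, which is the kind of statistical guarantee the entry's tags point to.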
Key Entities (4)
- Large language model (2 articles)
- Artificial intelligence (1 article)
- Calibration (statistics) (1 article)
- Exchangeable random variables (1 article)
About the topic: LLM Optimization
The topic "LLM Optimization" aggregates 17+ news articles from various countries.