# Model Optimization
Latest news articles tagged with "Model Optimization". Follow the timeline of events, related topics, and entities.
Articles (30)
- 🇺🇸 TalkLoRA: Communication-Aware Mixture of Low-Rank Adaptation for Large Language Models
  [USA]
  arXiv:2604.06291v1 Announce Type: cross Abstract: Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning of Large Language Models (LLMs), and recent Mixture-of-Experts (MoE) extensions fu...
  Related: #Artificial Intelligence, #Machine Learning
- 🇺🇸 An Empirical Study of SFT-DPO Interaction and Parameterization in Small Language Models
  [USA]
  arXiv:2603.20100v1 Announce Type: cross Abstract: Direct Preference Optimization (DPO) is widely used after supervised fine-tuning (SFT) to align language models, yet empirical behavior under small b...
  Related: #AI Training
- 🇺🇸 Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion
  [USA]
  arXiv:2603.19266v1 Announce Type: cross Abstract: Distilling robust reasoning capabilities from large language models (LLMs) into smaller, computationally efficient student models remains an unresolv...
  Related: #AI Research
- 🇺🇸 Speculating Experts Accelerates Inference for Mixture-of-Experts
  [USA]
  arXiv:2603.19289v1 Announce Type: cross Abstract: Mixture-of-Experts (MoE) models have gained popularity as a means of scaling the capacity of large language models (LLMs) while maintaining sparse ac...
  Related: #AI Inference
- 🇺🇸 LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models
  [USA]
  arXiv:2603.19255v1 Announce Type: cross Abstract: Despite the strong performance of Large Language Models (LLMs) on complex instruction-following tasks, precise control of output length remains a per...
  Related: #AI Research
- 🇺🇸 TARo: Token-level Adaptive Routing for LLM Test-time Alignment
  [USA]
  arXiv:2603.18411v1 Announce Type: cross Abstract: Large language models (LLMs) exhibit strong reasoning capabilities but typically require expensive post-training to reach high performance. Recent te...
  Related: #AI Alignment
- 🇺🇸 Empirical Recipes for Efficient and Compact Vision-Language Models
  [USA]
  arXiv:2603.16987v1 Announce Type: cross Abstract: Deploying vision-language models (VLMs) in resource-constrained settings demands low latency and high throughput, yet existing compact VLMs often fal...
  Related: #AI Efficiency
- 🇺🇸 Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models
  [USA]
  arXiv:2603.17044v1 Announce Type: cross Abstract: Unified multimodal models share a language model backbone for both understanding and generating images. Can DPO align both capabilities simultaneousl...
  Related: #Multimodal AI
- 🇺🇸 RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference
  [USA]
  arXiv:2603.17891v1 Announce Type: cross Abstract: Post training quantization is essential for deploying large language models (LLMs) on resource constrained hardware, yet state of the art methods enf...
  Related: #AI Efficiency
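For context, the post-training quantization the RAMP entry builds on reduces to a per-tensor rounding step whose bit-width sets the accuracy/memory trade-off. A minimal NumPy sketch of symmetric fake-quantization at two bit-widths; the choice of which layer gets which precision is a placeholder here, not the paper's reinforcement-learned policy:

```python
import numpy as np

def quantize(w, bits):
    """Symmetric per-tensor fake-quantization: round to a signed
    (bits)-bit grid, then dequantize back to float."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(2)
w = rng.standard_normal(1024)  # stand-in for one layer's weights

err8 = np.abs(quantize(w, 8) - w).mean()
err4 = np.abs(quantize(w, 4) - w).mean()
# Fewer bits means a coarser grid and larger reconstruction error,
# which is why a mixed-precision scheme spends bits unevenly per layer.
assert err4 > err8
```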
- 🇺🇸 Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models
  [USA]
  arXiv:2603.13985v1 Announce Type: new Abstract: Pre-trained Large Language Models (LLMs) exhibit broad capabilities, yet for specific tasks or domains their attainment of higher accuracy and more rel...
  Related: #AI Training
- 🇺🇸 CA-HFP: Curvature-Aware Heterogeneous Federated Pruning with Model Reconstruction
  [USA]
  arXiv:2603.12591v1 Announce Type: cross Abstract: Federated learning on heterogeneous edge devices requires personalized compression while preserving aggregation compatibility and stable convergence....
  Related: #Federated Learning
- 🇺🇸 Expert Threshold Routing for Autoregressive Language Modeling with Dynamic Computation Allocation and Load Balancing
  [USA]
  arXiv:2603.11535v1 Announce Type: new Abstract: Token-choice Mixture-of-Experts (TC-MoE) routes each token to a fixed number of experts, limiting dynamic computation allocation and requiring auxiliar...
  Related: #AI Efficiency
- 🇺🇸 Adaptive Activation Cancellation for Hallucination Mitigation in Large Language Models
  [USA]
  arXiv:2603.10195v1 Announce Type: cross Abstract: Large Language Models frequently generate fluent but factually incorrect text. We propose Adaptive Activation Cancellation (AAC), a real-time inferen...
  Related: #AI Safety
- 🇺🇸 ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping
  [USA]
  arXiv:2603.10088v1 Announce Type: cross Abstract: Diffusion large language models (dLLMs) are emerging as a promising alternative to autoregressive models (ARMs) due to their ability to capture bidir...
  Related: #AI Efficiency
- 🇺🇸 Mashup Learning: Faster Finetuning by Remixing Past Checkpoints
  [USA]
  arXiv:2603.10156v1 Announce Type: cross Abstract: Finetuning on domain-specific data is a well-established method for enhancing LLM performance on downstream tasks. Training on each dataset produces ...
  Related: #Machine Learning
- 🇺🇸 Correction of Transformer-Based Models with Smoothing Pseudo-Projector
  [USA]
  arXiv:2603.09815v1 Announce Type: cross Abstract: The pseudo-projector is a lightweight modification that can be integrated into existing language models and other neural networks without altering th...
  Related: #AI Correction
- 🇺🇸 Small Language Models for Efficient Agentic Tool Calling: Outperforming Large Models with Targeted Fine-tuning
  [USA]
  arXiv:2512.15943v2 Announce Type: replace Abstract: As organizations scale adoption of generative AI, model cost optimization and operational efficiency have emerged as critical factors determining s...
  Related: #AI Efficiency
- 🇺🇸 Uncovering a Winning Lottery Ticket with Continuously Relaxed Bernoulli Gates
  [USA]
  arXiv:2603.08914v1 Announce Type: cross Abstract: Over-parameterized neural networks incur prohibitive memory and computational costs for resource-constrained deployment. The Strong Lottery Ticket (S...
  Related: #Neural Networks
- 🇺🇸 Exploiting Label-Aware Channel Scoring for Adaptive Channel Pruning in Split Learning
  [USA]
  arXiv:2603.09792v1 Announce Type: cross Abstract: Split learning (SL) transfers most of the training workload to the server, which alleviates computational burden on client devices. However, the tran...
  Related: #Machine Learning
- 🇺🇸 Improving instruction hierarchy in frontier LLMs
  [USA]
  IH-Challenge trains models to prioritize trusted instructions, improving instruction hierarchy, safety steerability, and resistance to prompt injection attacks.
  Related: #AI Development
- 🇺🇸 Improved Constrained Generation by Bridging Pretrained Generative Models
  [USA]
  arXiv:2603.06742v1 Announce Type: cross Abstract: Constrained generative modeling is fundamental to applications such as robotic control and autonomous driving, where models must respect physical law...
  Related: #AI Generation
- 🇺🇸 Enhancing Instruction Following of LLMs via Activation Steering with Dynamic Rejection
  [USA]
  arXiv:2603.06745v1 Announce Type: cross Abstract: Large Language Models (LLMs), despite advances in instruction tuning, often fail to follow complex user instructions. Activation steering techniques ...
  Related: #AI Safety
- 🇺🇸 Restoring Linguistic Grounding in VLA Models via Train-Free Attention Recalibration
  [USA]
  arXiv:2603.06001v1 Announce Type: cross Abstract: Vision-Language-Action (VLA) models enable robots to perform manipulation tasks directly from natural language instructions and are increasingly view...
  Related: #AI Research
- 🇺🇸 Mitigating Content Effects on Reasoning in Language Models through Fine-Grained Activation Steering
  [USA]
  arXiv:2505.12189v2 Announce Type: replace Abstract: Large language models (LLMs) exhibit reasoning biases, often conflating content plausibility with formal logical validity. This can lead to wrong i...
  Related: #AI Bias
- 🇺🇸 Stable-LoRA: Stabilizing Feature Learning of Low-Rank Adaptation
  [USA]
  arXiv:2603.05204v1 Announce Type: cross Abstract: Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient method for fine-tuning Large Language Models. It updates the weight matrix as $W=W...
  Related: #Machine Learning
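The update rule the Stable-LoRA abstract refers to is the standard LoRA form $W = W_0 + \frac{\alpha}{r} BA$, where only the low-rank factors are trained. A minimal NumPy sketch of that rule (shapes and scaling follow the original LoRA convention; Stable-LoRA's specific stabilization is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 16, 4, 8

W0 = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # low-rank factor (trained)
B = np.zeros((d_out, r))                   # zero init: adapted W starts at W0

def lora_forward(x):
    # Equivalent to x @ (W0 + (alpha / r) * B @ A).T,
    # computed without materializing the merged weight.
    return x @ W0.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((2, d_in))
# With B = 0 the adapted model reproduces the frozen model exactly,
# and A, B together hold far fewer parameters than W0.
assert np.allclose(lora_forward(x), x @ W0.T)
```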
- 🇺🇸 Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection
  [USA]
  arXiv:2603.04427v1 Announce Type: cross Abstract: Standard transformer attention uses identical dimensionality for queries, keys, and values ($d_q = d_k = d_v = d_{\text{model}}$). Our insight is that these c...
  Related: #AI Efficiency
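The general idea behind this entry, projecting queries and keys into a smaller dimension ($d_k < d_{\text{model}}$) while values keep full width so the cached K tensor shrinks, can be sketched in a few lines of NumPy. The projection sizes below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_k, seq = 32, 8, 5  # illustrative sizes: thin keys, full values

Wq = rng.standard_normal((d_model, d_k))      # query projection (thin)
Wk = rng.standard_normal((d_model, d_k))      # key projection (thin)
Wv = rng.standard_normal((d_model, d_model))  # value projection (full width)

def attend(X):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over each row of scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

X = rng.standard_normal((seq, d_model))
out = attend(X)
# Output keeps full model width; only the cached K entries shrink
# (here by d_model / d_k = 4x per token).
assert out.shape == (seq, d_model)
```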
- 🇺🇸 Invariant Transformation and Resampling based Epistemic-Uncertainty Reduction
  [USA]
  arXiv:2602.23315v1 Announce Type: new Abstract: An artificial intelligence (AI) model can be viewed as a function that maps inputs to outputs in high-dimensional spaces. Once designed and well traine...
  Related: #Artificial Intelligence, #Uncertainty Reduction
- 🇺🇸 Elimination-compensation pruning for fully-connected neural networks
  [USA]
  arXiv:2602.20467v1 Announce Type: cross Abstract: The unmatched ability of Deep Neural Networks in capturing complex patterns in large and noisy datasets is often associated with their large hypothes...
  Related: #Machine Learning, #Neural Networks
- 🇺🇸 Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models
  [USA]
  arXiv:2602.15772v1 Announce Type: cross Abstract: Current research in multimodal models faces a key challenge where enhancing generative capabilities often comes at the expense of understanding, and ...
  Related: #Multimodal AI, #Generation vs. Understanding, #Reasoning and Reflection, #Trade-off Analysis
- 🇺🇸 Investigating Redundancy in Multimodal Large Language Models with Multiple Vision Encoders
  [USA]
  arXiv:2507.03262v4 Announce Type: replace-cross Abstract: Recent multimodal large language models (MLLMs) increasingly integrate multiple vision encoders to improve performance on various benchmarks,...
  Related: #AI Efficiency, #Multimodal Learning
Key Entities (13)
- Large language model (5 articles)
- DPO (2 articles)
- Generative engine optimization (2 articles)
- Artificial intelligence (2 articles)
- SFT (1 article)
- RAMP (1 article)
- Uncertainty quantification (1 article)
- Resampling (1 article)
- Machine learning (1 article)
- Reinforcement learning (1 article)
- Redundancy (1 article)
- Deep learning (1 article)
- AI safety (1 article)
About the topic: Model Optimization
The topic "Model Optimization" aggregates the 30 news articles listed above.