# Model Optimization
Latest news articles tagged with "Model Optimization". Follow the timeline of events, related topics, and entities.
Articles (30)
- 🇺🇸 TalkLoRA: Communication-Aware Mixture of Low-Rank Adaptation for Large Language Models
  [USA]
  arXiv:2604.06291v1 Announce Type: cross Abstract: Low-Rank Adaptation (LoRA) enables parameter-efficient fine-tuning of Large Language Models (LLMs), and recent Mixture-of-Experts (MoE) extensions fu...
  Related: #Artificial Intelligence, #Machine Learning
- 🇺🇸 An Empirical Study of SFT-DPO Interaction and Parameterization in Small Language Models
  [USA]
  arXiv:2603.20100v1 Announce Type: cross Abstract: Direct Preference Optimization (DPO) is widely used after supervised fine-tuning (SFT) to align language models, yet empirical behavior under small b...
  Related: #AI Training
- 🇺🇸 Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion
  [USA]
  arXiv:2603.19266v1 Announce Type: cross Abstract: Distilling robust reasoning capabilities from large language models (LLMs) into smaller, computationally efficient student models remains an unresolv...
  Related: #AI Research
- 🇺🇸 Speculating Experts Accelerates Inference for Mixture-of-Experts
  [USA]
  arXiv:2603.19289v1 Announce Type: cross Abstract: Mixture-of-Experts (MoE) models have gained popularity as a means of scaling the capacity of large language models (LLMs) while maintaining sparse ac...
  Related: #AI Inference
- 🇺🇸 LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models
  [USA]
  arXiv:2603.19255v1 Announce Type: cross Abstract: Despite the strong performance of Large Language Models (LLMs) on complex instruction-following tasks, precise control of output length remains a per...
  Related: #AI Research
- 🇺🇸 TARo: Token-level Adaptive Routing for LLM Test-time Alignment
  [USA]
  arXiv:2603.18411v1 Announce Type: cross Abstract: Large language models (LLMs) exhibit strong reasoning capabilities but typically require expensive post-training to reach high performance. Recent te...
  Related: #AI Alignment
- 🇺🇸 Empirical Recipes for Efficient and Compact Vision-Language Models
  [USA]
  arXiv:2603.16987v1 Announce Type: cross Abstract: Deploying vision-language models (VLMs) in resource-constrained settings demands low latency and high throughput, yet existing compact VLMs often fal...
  Related: #AI Efficiency
- 🇺🇸 Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models
  [USA]
  arXiv:2603.17044v1 Announce Type: cross Abstract: Unified multimodal models share a language model backbone for both understanding and generating images. Can DPO align both capabilities simultaneousl...
  Related: #Multimodal AI
- 🇺🇸 RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference
  [USA]
  arXiv:2603.17891v1 Announce Type: cross Abstract: Post training quantization is essential for deploying large language models (LLMs) on resource constrained hardware, yet state of the art methods enf...
  Related: #AI Efficiency
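For context, the post-training quantization the RAMP entry builds on reduces to a per-tensor rounding step whose bit-width sets the accuracy/memory trade-off. A minimal NumPy sketch of symmetric fake-quantization at two bit-widths; the choice of which layer gets which precision is a placeholder here, not the paper's reinforcement-learned policy:

```python
import numpy as np

def quantize(w, bits):
    """Symmetric per-tensor fake-quantization: round to a signed
    (bits)-bit grid, then dequantize back to float."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(2)
w = rng.standard_normal(1024)  # stand-in for one layer's weights

err8 = np.abs(quantize(w, 8) - w).mean()
err4 = np.abs(quantize(w, 4) - w).mean()
# Fewer bits means a coarser grid and larger reconstruction error,
# which is why a mixed-precision scheme spends bits unevenly per layer.
assert err4 > err8
```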
- 🇺🇸 Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models
  [USA]
  arXiv:2603.13985v1 Announce Type: new Abstract: Pre-trained Large Language Models (LLMs) exhibit broad capabilities, yet for specific tasks or domains their attainment of higher accuracy and more rel...
  Related: #AI Training
- 🇺🇸 CA-HFP: Curvature-Aware Heterogeneous Federated Pruning with Model Reconstruction
  [USA]
  arXiv:2603.12591v1 Announce Type: cross Abstract: Federated learning on heterogeneous edge devices requires personalized compression while preserving aggregation compatibility and stable convergence....
  Related: #Federated Learning
- 🇺🇸 Expert Threshold Routing for Autoregressive Language Modeling with Dynamic Computation Allocation and Load Balancing
  [USA]
  arXiv:2603.11535v1 Announce Type: new Abstract: Token-choice Mixture-of-Experts (TC-MoE) routes each token to a fixed number of experts, limiting dynamic computation allocation and requiring auxiliar...
  Related: #AI Efficiency
- 🇺🇸 Adaptive Activation Cancellation for Hallucination Mitigation in Large Language Models
  [USA]
  arXiv:2603.10195v1 Announce Type: cross Abstract: Large Language Models frequently generate fluent but factually incorrect text. We propose Adaptive Activation Cancellation (AAC), a real-time inferen...
  Related: #AI Safety
- 🇺🇸 ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping
  [USA]
  arXiv:2603.10088v1 Announce Type: cross Abstract: Diffusion large language models (dLLMs) are emerging as a promising alternative to autoregressive models (ARMs) due to their ability to capture bidir...
  Related: #AI Efficiency
- 🇺🇸 Mashup Learning: Faster Finetuning by Remixing Past Checkpoints
  [USA]
  arXiv:2603.10156v1 Announce Type: cross Abstract: Finetuning on domain-specific data is a well-established method for enhancing LLM performance on downstream tasks. Training on each dataset produces ...
  Related: #Machine Learning
- 🇺🇸 Correction of Transformer-Based Models with Smoothing Pseudo-Projector
  [USA]
  arXiv:2603.09815v1 Announce Type: cross Abstract: The pseudo-projector is a lightweight modification that can be integrated into existing language models and other neural networks without altering th...
  Related: #AI Correction
- 🇺🇸 Small Language Models for Efficient Agentic Tool Calling: Outperforming Large Models with Targeted Fine-tuning
  [USA]
  arXiv:2512.15943v2 Announce Type: replace Abstract: As organizations scale adoption of generative AI, model cost optimization and operational efficiency have emerged as critical factors determining s...
  Related: #AI Efficiency
- 🇺🇸 Uncovering a Winning Lottery Ticket with Continuously Relaxed Bernoulli Gates
  [USA]
  arXiv:2603.08914v1 Announce Type: cross Abstract: Over-parameterized neural networks incur prohibitive memory and computational costs for resource-constrained deployment. The Strong Lottery Ticket (S...
  Related: #Neural Networks
- 🇺🇸 Exploiting Label-Aware Channel Scoring for Adaptive Channel Pruning in Split Learning
  [USA]
  arXiv:2603.09792v1 Announce Type: cross Abstract: Split learning (SL) transfers most of the training workload to the server, which alleviates computational burden on client devices. However, the tran...
  Related: #Machine Learning
- 🇺🇸 Improving instruction hierarchy in frontier LLMs
  [USA]
  IH-Challenge trains models to prioritize trusted instructions, improving instruction hierarchy, safety steerability, and resistance to prompt injection attacks.
  Related: #AI Development
- 🇺🇸 Improved Constrained Generation by Bridging Pretrained Generative Models
  [USA]
  arXiv:2603.06742v1 Announce Type: cross Abstract: Constrained generative modeling is fundamental to applications such as robotic control and autonomous driving, where models must respect physical law...
  Related: #AI Generation
- 🇺🇸 Enhancing Instruction Following of LLMs via Activation Steering with Dynamic Rejection
  [USA]
  arXiv:2603.06745v1 Announce Type: cross Abstract: Large Language Models (LLMs), despite advances in instruction tuning, often fail to follow complex user instructions. Activation steering techniques ...
  Related: #AI Safety
- 🇺🇸 Restoring Linguistic Grounding in VLA Models via Train-Free Attention Recalibration
  [USA]
  arXiv:2603.06001v1 Announce Type: cross Abstract: Vision-Language-Action (VLA) models enable robots to perform manipulation tasks directly from natural language instructions and are increasingly view...
  Related: #AI Research
- 🇺🇸 Mitigating Content Effects on Reasoning in Language Models through Fine-Grained Activation Steering
  [USA]
  arXiv:2505.12189v2 Announce Type: replace Abstract: Large language models (LLMs) exhibit reasoning biases, often conflating content plausibility with formal logical validity. This can lead to wrong i...
  Related: #AI Bias
- 🇺🇸 Stable-LoRA: Stabilizing Feature Learning of Low-Rank Adaptation
  [USA]
  arXiv:2603.05204v1 Announce Type: cross Abstract: Low-Rank Adaptation (LoRA) is a widely adopted parameter-efficient method for fine-tuning Large Language Models. It updates the weight matrix as $W=W...
  Related: #Machine Learning
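The update rule the Stable-LoRA abstract refers to is the standard LoRA form $W = W_0 + \frac{\alpha}{r} BA$, where only the low-rank factors are trained. A minimal NumPy sketch of that rule (shapes and scaling follow the original LoRA convention; Stable-LoRA's specific stabilization is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 16, 16, 4, 8

W0 = rng.standard_normal((d_out, d_in))    # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # low-rank factor (trained)
B = np.zeros((d_out, r))                   # zero init: adapted W starts at W0

def lora_forward(x):
    # Equivalent to x @ (W0 + (alpha / r) * B @ A).T,
    # computed without materializing the merged weight.
    return x @ W0.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((2, d_in))
# With B = 0 the adapted model reproduces the frozen model exactly,
# and A, B together hold far fewer parameters than W0.
assert np.allclose(lora_forward(x), x @ W0.T)
```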
- 🇺🇸 Thin Keys, Full Values: Reducing KV Cache via Low-Dimensional Attention Selection
  [USA]
  arXiv:2603.04427v1 Announce Type: cross Abstract: Standard transformer attention uses identical dimensionality for queries, keys, and values ($d_q = d_k = d_v = d_{\text{model}}$). Our insight is that these c...
  Related: #AI Efficiency
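The general idea behind this entry, projecting queries and keys into a smaller dimension ($d_k < d_{\text{model}}$) while values keep full width so the cached K tensor shrinks, can be sketched in a few lines of NumPy. The projection sizes below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_k, seq = 32, 8, 5  # illustrative sizes: thin keys, full values

Wq = rng.standard_normal((d_model, d_k))      # query projection (thin)
Wk = rng.standard_normal((d_model, d_k))      # key projection (thin)
Wv = rng.standard_normal((d_model, d_model))  # value projection (full width)

def attend(X):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over each row of scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

X = rng.standard_normal((seq, d_model))
out = attend(X)
# Output keeps full model width; only the cached K entries shrink
# (here by d_model / d_k = 4x per token).
assert out.shape == (seq, d_model)
```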
- 🇺🇸 Invariant Transformation and Resampling based Epistemic-Uncertainty Reduction
  [USA]
  arXiv:2602.23315v1 Announce Type: new Abstract: An artificial intelligence (AI) model can be viewed as a function that maps inputs to outputs in high-dimensional spaces. Once designed and well traine...
  Related: #Artificial Intelligence, #Uncertainty Reduction
- 🇺🇸 Elimination-compensation pruning for fully-connected neural networks
  [USA]
  arXiv:2602.20467v1 Announce Type: cross Abstract: The unmatched ability of Deep Neural Networks in capturing complex patterns in large and noisy datasets is often associated with their large hypothes...
  Related: #Machine Learning, #Neural Networks
- 🇺🇸 Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models
  [USA]
  arXiv:2602.15772v1 Announce Type: cross Abstract: Current research in multimodal models faces a key challenge where enhancing generative capabilities often comes at the expense of understanding, and ...
  Related: #Multimodal AI, #Generation vs. Understanding, #Reasoning and Reflection, #Trade-off Analysis
- 🇺🇸 Investigating Redundancy in Multimodal Large Language Models with Multiple Vision Encoders
  [USA]
  arXiv:2507.03262v4 Announce Type: replace-cross Abstract: Recent multimodal large language models (MLLMs) increasingly integrate multiple vision encoders to improve performance on various benchmarks,...
  Related: #AI Efficiency, #Multimodal Learning
Key Entities (13)
- Large language model (5 articles)
- DPO (2 articles)
- Generative engine optimization (2 articles)
- Artificial intelligence (2 articles)
- SFT (1 article)
- RAMP (1 article)
- Uncertainty quantification (1 article)
- Resampling (1 article)
- Machine learning (1 article)
- Reinforcement learning (1 article)
- Redundancy (1 article)
- Deep learning (1 article)
- AI safety (1 article)
About the topic: Model Optimization
The topic "Model Optimization" aggregates the 30 news articles listed above.