#Machine Learning Optimization
Latest news articles tagged with "Machine Learning Optimization".
Articles (6)
- 🇺🇸 S2O: Early Stopping for Sparse Attention via Online Permutation
[USA]
arXiv:2602.22575v1 Announce Type: cross Abstract: Attention scales quadratically with sequence length, fundamentally limiting long-context inference. Existing block-granularity sparsification can red...
Related: #Attention Mechanisms, #Computational Efficiency
- 🇺🇸 Duel-Evolve: Reward-Free Test-Time Scaling via LLM Self-Preferences
[USA]
arXiv:2602.21585v1 Announce Type: cross Abstract: Many applications seek to optimize LLM outputs at test time by iteratively proposing, scoring, and refining candidates over a discrete output space. ...
Related: #Large Language Models, #Reward-Free AI Systems
- 🇺🇸 Semantic Parallelism: Redefining Efficient MoE Inference via Model-Data Co-Scheduling
[USA]
arXiv:2503.04398v4 Announce Type: replace-cross Abstract: Prevailing LLM serving engines employ expert parallelism (EP) to implement multi-device inference of massive MoE models. However, the efficie...
Related: #Distributed Computing, #LLM Efficiency
- 🇺🇸 PRECTR-V2: Unified Relevance-CTR Framework with Cross-User Preference Mining, Exposure Bias Correction, and LLM-Distilled Encoder Optimization
[USA]
arXiv:2602.20676v1 Announce Type: cross Abstract: In search systems, effectively coordinating the two core objectives of search relevance matching and click-through rate (CTR) prediction is crucial f...
Related: #Information Retrieval, #User Experience Enhancement
- 🇺🇸 Predictive Batch Scheduling: Accelerating Language Model Training Through Loss-Aware Sample Prioritization
[USA]
arXiv:2602.17066v1 Announce Type: new Abstract: We introduce Predictive Batch Scheduling (PBS), a novel training optimization technique that accelerates language model convergence by dynamically prio...
Related: #Curriculum Learning, #Transformer Training, #Compute-Efficiency, #Feature-Based Prediction
- 🇺🇸 QuEPT: Quantized Elastic Precision Transformers with One-Shot Calibration for Multi-Bit Switching
[USA]
arXiv:2602.12609v1 Announce Type: cross Abstract: Elastic precision quantization enables multi-bit deployment via a single optimization pass, fitting diverse quantization scenarios. Yet, the high stor...
Related: #Quantization Techniques, #Large Language Models