# Language Models
Latest news articles tagged with "Language Models". Follow the timeline of events, related topics, and entities.
## Articles (19)
- 🇺🇸 When the Pure Reasoner Meets the Impossible Object: Analytic vs. Synthetic Fine-Tuning and the Suppression of Genesis in Language Models
[USA]
arXiv:2603.19265v1 Announce Type: cross Abstract: This paper investigates the ontological consequences of fine-tuning Large Language Models (LLMs) on "impossible objects" -- entities defined by mutua...
Related: #AI Fine-Tuning
- 🇺🇸 Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models
[USA]
arXiv:2603.18002v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have made impressive progress in connecting vision and language, but they still struggle with spatial unders...
Related: #AI Research, #3D Vision
- 🇺🇸 Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context
[USA]
arXiv:2603.15653v1 Announce Type: cross Abstract: Long-context handling remains a core challenge for language models: even with extended context windows, models often fail to reliably extract, reason...
Related: #AI Research
- 🇺🇸 Think First, Diffuse Fast: Improving Diffusion Language Model Reasoning via Autoregressive Plan Conditioning
[USA]
arXiv:2603.13243v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) generate text via iterative denoising but consistently underperform on multi-step reasoning. We hypothesize thi...
Related: #AI Reasoning
- 🇺🇸 Information-Consistent Language Model Recommendations through Group Relative Policy Optimization
[USA]
arXiv:2512.12858v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are increasingly deployed in business-critical domains such as finance, education, healthcare, and customer supp...
Related: #AI Alignment
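The excerpt above is cut off before it describes the method, so the following is background only: Group Relative Policy Optimization (GRPO) is commonly characterized by sampling a group of responses per prompt and scoring each response against the group's own mean and spread instead of a learned value baseline. A minimal sketch of that group-relative advantage step, with illustrative names and rewards, assuming nothing about this particular paper:

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each sampled response's reward against the mean and
    standard deviation of its own group (the usual GRPO-style baseline)."""
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for four responses to the same prompt.
print(group_relative_advantages([0.2, 0.9, 0.4, 0.9]))
# Responses above the group mean get positive advantages and are reinforced;
# no separate critic network is needed to form the baseline.
```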
- 🇺🇸 Bielik-Minitron-7B: Compressing Large Language Models via Structured Pruning and Knowledge Distillation for the Polish Language
[USA]
arXiv:2603.11881v1 Announce Type: cross Abstract: This report details the creation of Bielik-Minitron-7B, a compressed 7.35B parameter version of the Bielik-11B-v3.0 model, specifically optimized for...
Related: #AI Compression
- 🇺🇸 CEI: A Benchmark for Evaluating Pragmatic Reasoning in Language Models
[USA]
arXiv:2603.09993v1 Announce Type: cross Abstract: Pragmatic reasoning, inferring intended meaning beyond literal semantics, underpins everyday communication yet remains difficult for large language m...
Related: #AI Evaluation
- 🇺🇸 Evo: Autoregressive-Diffusion Large Language Models with Evolving Balance
[USA]
arXiv:2603.06617v1 Announce Type: cross Abstract: We introduce Evo, a duality latent trajectory model that bridges autoregressive (AR) and diffusion-based language generation within a contin...
Related: #AI Research
- 🇺🇸 SpecFuse: Ensembling Large Language Models via Next-Segment Prediction
[USA]
arXiv:2412.07380v3 Announce Type: replace-cross Abstract: Ensembles of generative large language models (LLMs) are a promising way to compensate for individual model limitations, integrating the stre...
Related: #AI Ensembling
- 🇺🇸 Progressive Refinement Regulation for Accelerating Diffusion Language Model Decoding
[USA]
arXiv:2603.04514v1 Announce Type: new Abstract: Diffusion language models generate text through iterative denoising under a uniform refinement rule applied to all tokens. However, tokens stabilize at...
Related: #AI Acceleration
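For context on the decoding scheme this abstract refers to: masked-diffusion language models are usually described as starting from a fully masked sequence and, at each denoising step, predicting all positions and committing the most confident ones under a fixed per-step budget (the "uniform refinement rule" above). A toy sketch of that loop, with a placeholder standing in for a real dLLM forward pass and nothing taken from the paper itself:

```python
# Toy sketch of masked-diffusion decoding with a uniform refinement rule.
# `predict` is a placeholder for a real dLLM forward pass.
import random

MASK = "<mask>"
VOCAB = ["the", "cat", "sat", "on", "a", "mat"]

def predict(tokens):
    """Placeholder model: returns (token, confidence) for every position."""
    return [(random.choice(VOCAB), random.random()) for _ in tokens]

def diffusion_decode(length=6, steps=3):
    tokens = [MASK] * length
    per_step = length // steps          # uniform rule: same unmasking budget each step
    for _ in range(steps):
        proposals = predict(tokens)
        # rank the still-masked positions by model confidence
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        masked.sort(key=lambda i: proposals[i][1], reverse=True)
        for i in masked[:per_step]:     # commit the most confident predictions
            tokens[i] = proposals[i][0]
    return tokens

print(diffusion_decode())
```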
- 🇺🇸 Med-V1: Small Language Models for Zero-shot and Scalable Biomedical Evidence Attribution
[USA]
arXiv:2603.05308v1 Announce Type: cross Abstract: Assessing whether an article supports an assertion is essential for hallucination detection and claim verification. While large language models (LLMs...
Related: #Biomedical AI
- 🇺🇸 Balancing Coverage and Draft Latency in Vocabulary Trimming for Faster Speculative Decoding
[USA]
arXiv:2603.05210v1 Announce Type: cross Abstract: Speculative decoding accelerates inference for Large Language Models by using a lightweight draft model to propose candidate tokens that are verified...
Related: #AI Efficiency
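The visible part of the abstract states the standard speculative-decoding recipe: a lightweight draft model proposes candidate tokens and the large target model verifies them. The sketch below is a generic greedy-acceptance version of that loop with placeholder model callables; it does not reflect the paper's vocabulary-trimming technique.

```python
# Generic speculative decoding loop (greedy acceptance variant).
# `draft_next` and `target_next` are placeholder model calls that map a
# token context to the next greedy token.

def speculative_decode(prompt, draft_next, target_next, max_new=32, k=4):
    out = list(prompt)
    while len(out) - len(prompt) < max_new:   # max_new is an approximate budget
        # 1. Cheap draft model proposes k candidate tokens.
        draft = []
        for _ in range(k):
            draft.append(draft_next(out + draft))
        # 2. Expensive target model verifies them; keep the longest agreeing prefix.
        accepted = 0
        for i in range(k):
            if target_next(out + draft[:i]) == draft[i]:
                accepted += 1
            else:
                break
        out += draft[:accepted]
        # 3. On a mismatch, fall back to one target-model token, so every
        #    iteration makes progress and the output matches target-only greedy decoding.
        if accepted < k:
            out.append(target_next(out))
    return out

# Usage with trivial stand-in "models" that always emit the same token:
print(speculative_decode(["<s>"], lambda ctx: "a", lambda ctx: "a", max_new=5))
```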
- 🇺🇸 Free Lunch for Pass@$k$? Low Cost Diverse Sampling for Diffusion Language Models
[USA]
arXiv:2603.04893v1 Announce Type: cross Abstract: Diverse outputs in text generation are necessary for effective exploration in complex reasoning tasks, such as code generation and mathematical probl...
Related: #AI Sampling
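Pass@$k$ is not defined in the visible excerpt; as background, the widely used unbiased estimator takes $n$ sampled solutions with $c$ correct and computes the probability that at least one of $k$ drawn samples passes, which is why output diversity matters as $k$ grows. A small sketch of that standard formula only (nothing here is specific to this paper's sampling method):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n attempts with c correct, is correct."""
    if n - c < k:
        return 1.0  # too few failures to fill k draws, so one must pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 100 samples per problem, 10 of them correct.
print(pass_at_k(100, 10, 1))   # ~0.10
print(pass_at_k(100, 10, 10))  # ~0.67, diverse sampling pays off at larger k
```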
- 🇺🇸 UpSkill: Mutual Information Skill Learning for Structured Response Diversity in LLMs
[USA]
arXiv:2602.22296v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has improved the reasoning abilities of large language models (LLMs) on mathematics and program...
Related: #Machine Learning, #Artificial Intelligence
- 🇺🇸 A Survey on the Optimization of Large Language Model-based Agents
[USA]
arXiv:2503.12434v2 Announce Type: replace Abstract: With the rapid development of Large Language Models (LLMs), LLM-based agents have been widely adopted in various fields, becoming essential for aut...
Related: #Artificial Intelligence, #Machine Learning
- 🇺🇸 Buffer Matters: Unleashing the Power of Off-Policy Reinforcement Learning in Large Language Model Reasoning
[USA]
arXiv:2602.20722v1 Announce Type: new Abstract: Traditional on-policy Reinforcement Learning with Verifiable Rewards (RLVR) frameworks suffer from experience waste and reward homogeneity, which direc...
Related: #Artificial Intelligence, #Machine Learning, #Reinforcement Learning
- 🇺🇸 Beyond Normalization: Rethinking the Partition Function as a Difficulty Scheduler for RLVR
[USA]
arXiv:2602.12642v1 Announce Type: cross Abstract: Reward-maximizing RL methods enhance the reasoning performance of LLMs, but often reduce the diversity among outputs. Recent works address this issue...
Related: #Machine Learning, #Reinforcement Learning
- 🇺🇸 Designing RNAs with Language Models
[USA]
arXiv:2602.12470v1 Announce Type: cross Abstract: RNA design, the task of finding a sequence that folds into a target secondary structure, has broad biological and biomedical impact but remains compu...
Related: #Computational Biology, #RNA Design, #Biomedical Applications
- 🇺🇸 CtrlCoT: Dual-Granularity Chain-of-Thought Compression for Controllable Reasoning
[USA]
arXiv:2601.20467v1 Announce Type: new Abstract: Chain-of-thought (CoT) prompting improves LLM reasoning but incurs high latency and memory cost due to verbose traces, motivating CoT compression with ...
Related: #Artificial Intelligence, #Computational Efficiency
## Key Entities (17)
- Artificial intelligence (5 articles)
- Large language model (3 articles)
- Data efficiency (1 article)
- Policy gradient method (1 article)
- ACM Computing Surveys (1 article)
- AI agent (1 article)
- Computational biology (1 article)
- Bioinformatics (1 article)
- Polish language (1 article)
- Free lunch (1 article)
- Evo (1 article)
- Partition function (1 article)
- Reinforcement learning (1 article)
- Mutual information (1 article)
- Artificial intelligence in healthcare (1 article)
- Impossible Object (1 article)
- Machine learning (1 article)
## About the topic: Language Models
The topic "Language Models" aggregates 19+ news articles from various countries.