#Model optimization

Latest news articles tagged with "Model optimization". Follow the timeline of events, related topics, and entities.

Articles (2)

🇺🇸 KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem — 25/02/2026 [USA]
arXiv:2602.20217v1. Announce Type: cross. Abstract: Self-speculative decoding (SSD) accelerates LLM inference by skipping layers to create an efficient draft model, yet existing methods often rely on s...
Related: #AI acceleration, #Computational efficiency, #Hardware adaptation

🇺🇸 Vision Token Reduction via Attention-Driven Self-Compression for Efficient Multimodal Large Language Models — 16/02/2026 [USA]
arXiv:2602.12618v1. Announce Type: cross. Abstract: Multimodal Large Language Models (MLLMs) incur significant computational cost from processing numerous vision tokens through all LLM layers. Prior pr...
Related: #Computational efficiency, #Multimodal AI