#Model optimization
Latest news articles tagged with "Model optimization". Follow the timeline of events, related topics, and entities.
Articles (2)
- KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem
[USA]
arXiv:2602.20217v1 Announce Type: cross Abstract: Self-speculative decoding (SSD) accelerates LLM inference by skipping layers to create an efficient draft model, yet existing methods often rely on s...
Related: #AI acceleration, #Computational efficiency, #Hardware adaptation
- Vision Token Reduction via Attention-Driven Self-Compression for Efficient Multimodal Large Language Models
[USA]
arXiv:2602.12618v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) incur significant computational cost from processing numerous vision tokens through all LLM layers. Prior pr...
Related: #Computational efficiency, #Multimodal AI