Transformer (deep learning)
Deep learning architecture for modelling sequential data
📊 Rating
5 news mentions · 👍 0 likes · 👎 0 dislikes
📌 Topics
- Machine Learning (2)
- Attention Mechanisms (2)
- Computational Efficiency (2)
- AI Research (1)
- Computer Vision (1)
- Scheduling Optimization (1)
- Machine Learning Optimization (1)
- Sequence Modeling (1)
- Multimodal AI (1)
- Model Optimization (1)
🏷️ Keywords
FlashAttention (3) · Transformer architecture (2) · super-resolution (1) · transformer (1) · neural bias (1) · rank-factorized (1) · image processing (1) · scalability (1) · RESCHED (1) · flexible job shop scheduling (1) · simplified states (1) · manufacturing optimization (1) · machine learning (1) · production planning (1) · Sparse Attention (1) · Early Stopping (1) · Online Permutation (1) · Long-Context Inference (1) · Sequence Length (1) · Computational Efficiency (1)
📰 Related News (5)
- 🇺🇸 Rank-Factorized Implicit Neural Bias: Scaling Super-Resolution Transformer with FlashAttention
  arXiv:2603.06738v1 Announce Type: cross Abstract: Recent Super-Resolution (SR) methods mainly adopt Transformers for their strong long-range modeling...
- 🇺🇸 RESCHED: Rethinking Flexible Job Shop Scheduling from a Transformer-based Architecture with Simplified States
  arXiv:2603.07020v1 Announce Type: cross Abstract: Neural approaches to the Flexible Job Shop Scheduling Problem (FJSP), particularly those based on d...
- 🇺🇸 S2O: Early Stopping for Sparse Attention via Online Permutation
  arXiv:2602.22575v1 Announce Type: cross Abstract: Attention scales quadratically with sequence length, fundamentally limiting long-context inference....
- 🇺🇸 HyperMLP: An Integrated Perspective for Sequence Modeling
  arXiv:2602.12601v1 Announce Type: cross Abstract: Self-attention is often viewed as probabilistic query-key lookup, motivating designs that preserve ...
- 🇺🇸 Vision Token Reduction via Attention-Driven Self-Compression for Efficient Multimodal Large Language Models
  arXiv:2602.12618v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) incur significant computational cost from processing numer...
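Several of the items above (sparse attention, FlashAttention, vision-token reduction) target the same bottleneck: self-attention materializes an n × n score matrix for sequence length n. A minimal NumPy sketch of scaled dot-product attention, the core Transformer operation, makes that quadratic cost explicit (the function name and toy shapes here are illustrative, not from any of the cited papers):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attend every query position to every key position.

    The score matrix Q @ K.T has shape (n, n) for sequence length n,
    which is the quadratic memory/compute cost the sparse-attention
    and FlashAttention lines of work aim to reduce.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n, n): quadratic in n
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # (n, d_v)

rng = np.random.default_rng(0)
n, d = 8, 4                                          # toy sequence length and head dim
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)                                     # (8, 4)
```

Doubling n quadruples the size of `scores`; FlashAttention avoids materializing it in full, and sparse-attention methods compute only a subset of its entries.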
🔗 Entity Intersection Graph
People and organizations frequently mentioned alongside Transformer (deep learning):
- 🌐 Early stopping · 1 shared article
- 🌐 Computational resource · 1 shared article
- 🌐 MLP · 1 shared article