Transformer (deep learning)
🌐 Entity

Deep learning architecture for modelling sequential data

📊 Rating

5 news mentions

📌 Topics

  • Machine Learning (2)
  • Attention Mechanisms (2)
  • AI Research (1)
  • Computer Vision (1)
  • Scheduling Optimization (1)
  • Machine Learning Optimization (1)
  • Computational Efficiency (2)
  • Sequence Modeling (1)
  • Multimodal AI (1)
  • Model optimization (1)

🏷️ Keywords

FlashAttention (3) · Transformer architecture (2) · super-resolution (1) · transformer (1) · neural bias (1) · rank-factorized (1) · image processing (1) · scalability (1) · RESCHED (1) · flexible job shop scheduling (1) · simplified states (1) · manufacturing optimization (1) · machine learning (1) · production planning (1) · Sparse Attention (1) · Early Stopping (1) · Online Permutation (1) · Long-Context Inference (1) · Sequence Length (1) · Computational Efficiency (1)

📖 Key Information

In deep learning, the transformer is an artificial neural network architecture based on the multi-head attention mechanism. Text is split into units called tokens, and each token is converted into a vector via lookup from a word embedding table. At each layer, every token is then contextualized against the other (unmasked) tokens within the context window via a parallel multi-head attention mechanism, amplifying the signal from important tokens and diminishing that from less important ones. Because transformers have no recurrent units, they require less training time than earlier recurrent neural network (RNN) architectures such as long short-term memory (LSTM).
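
The attention step described above can be sketched in a few lines of NumPy. This is a minimal illustration of scaled dot-product attention, not the full multi-head mechanism: the projection matrices, sequence length, and model width below are arbitrary toy values chosen for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# Toy example: 4 tokens, model width 8 (values are illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # token vectors, e.g. from an embedding lookup
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, w = scaled_dot_product_attention(x @ Wq, x @ Wk, x @ Wv)
print(out.shape)  # (4, 8): one contextualized vector per token
```

Each output row is a weighted mixture of all value vectors, which is how the signal from key tokens is amplified and that from less relevant tokens diminished; a real transformer runs several such heads in parallel and concatenates their outputs.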

📰 Related News (5)

🔗 Entity Intersection Graph

Entities co-mentioned with Transformer (deep learning):

  • Early stopping (1)
  • Computational resource (1)
  • MLP (1)

People and organizations frequently mentioned alongside Transformer (deep learning):

🔗 External Links