
Transformers converge to invariant algorithmic cores

#Transformers #AlgorithmicCores #LargeLanguageModels #MachineLearning #NeuralNetworks #AIInterpretability #ComputationalInvariants

📌 Key Takeaways

  • Transformer models converge to shared algorithmic cores despite different training runs
  • Algorithmic cores are compact subspaces necessary and sufficient for task performance
  • Research reveals low-dimensional computational invariants across different model scales
  • Understanding these cores could advance mechanistic interpretability in AI systems

📖 Full Retelling

On February 26, 2026, researcher Joshua S. Schiffman published a study on arXiv showing that independently trained transformer models converge to shared algorithmic cores. The paper, titled 'Transformers converge to invariant algorithmic cores,' addresses a central problem in machine learning: large language models exhibit sophisticated capabilities, yet their internal mechanisms remain difficult to understand. A fundamental obstacle is that training selects for behavior rather than specific circuitry, so many different weight configurations can implement the same function, leaving it unclear which internal structures reflect the computation and which are accidents of a particular training run.

Schiffman's method extracts 'algorithmic cores' – compact subspaces that are both necessary and sufficient for task performance – and shows that these computational structures persist across different training runs. The study reports several findings:

  • Independently trained transformers learn different weights but converge to the same algorithmic cores.
  • Markov-chain transformers embed 3D cores in nearly orthogonal subspaces yet recover identical transition spectra.
  • Modular-addition transformers discover compact cyclic operators at grokking that later inflate, yielding a predictive model of the memorization-to-generalization transition.
  • GPT-2 language models govern subject-verb agreement through a single axis that, when flipped, inverts grammatical number throughout generation.

Together, these results reveal low-dimensional invariants that persist across training runs and model scales, suggesting that transformer computations are organized around compact, shared algorithmic structures rather than implementation-specific details.
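The idea of a subspace being "necessary and sufficient" can be pictured with projection-based interventions. The sketch below is an illustration of that general idea, not code from the paper: it assumes a hypothetical activation matrix 'hidden' (one row per example) and a hypothetical scoring function 'evaluate' that measures task accuracy from activations, and it uses PCA merely as a stand-in for whatever subspace-identification procedure the study actually employs.

import numpy as np

def top_k_subspace(hidden, k):
    """Orthonormal basis (d_model, k) for the top-k principal directions
    of the activation matrix 'hidden' (n_examples, d_model)."""
    centered = hidden - hidden.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def keep_only(hidden, basis):
    """Sufficiency intervention: retain only the component of each
    activation that lies inside the candidate subspace."""
    return hidden @ basis @ basis.T

def ablate(hidden, basis):
    """Necessity intervention: remove the subspace component."""
    return hidden - keep_only(hidden, basis)

def core_check(hidden, evaluate, k):
    """A genuine 'core' should keep task performance high under keep_only
    and collapse it under ablate."""
    basis = top_k_subspace(hidden, k)
    return {"kept_only_core": evaluate(keep_only(hidden, basis)),
            "core_removed": evaluate(ablate(hidden, basis))}

Comparing the bases recovered from independently trained models (for example via principal angles between subspaces) would then be one way to probe the convergence claim, though the paper's actual procedure may differ.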

🏷️ Themes

Machine Learning, Neural Networks, AI Interpretability

📚 Related People & Topics

Neural network

Structure in biology and artificial intelligence

A neural network is a group of interconnected units called neurons that send signals to one another. Neurons can be either biological cells or mathematical models. While individual neurons are simple, many of them together in a network can perform complex tasks.

Large language model

Type of machine learning model

A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) that provide the c...

Machine learning

Study of algorithms that improve automatically through experience

Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions. Within a subdiscipline in machine learning, advances i...

Original Source

Computer Science > Machine Learning
arXiv:2602.22600 [cs.LG] (Submitted on 26 Feb 2026)

Title: Transformers converge to invariant algorithmic cores
Authors: Joshua S. Schiffman

Abstract: Large language models exhibit sophisticated capabilities, yet understanding how they work internally remains a central challenge. A fundamental obstacle is that training selects for behavior, not circuitry, so many weight configurations can implement the same function. Which internal structures reflect the computation, and which are accidents of a particular training run? This work extracts algorithmic cores: compact subspaces necessary and sufficient for task performance. Independently trained transformers learn different weights but converge to the same cores. Markov-chain transformers embed 3D cores in nearly orthogonal subspaces yet recover identical transition spectra. Modular-addition transformers discover compact cyclic operators at grokking that later inflate, yielding a predictive model of the memorization-to-generalization transition. GPT-2 language models govern subject-verb agreement through a single axis that, when flipped, inverts grammatical number throughout generation across scales. These results reveal low-dimensional invariants that persist across training runs and scales, suggesting that transformer computations are organized around compact, shared algorithmic structures. Mechanistic interpretability could benefit from targeting such invariants -- the computational essence -- rather than implementation-specific details.

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2602.22600 [cs.LG] (arXiv:2602.22600v1 for this version)
DOI: https://doi.org/10.48550/arXiv.2602.22600
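The subject-verb agreement result describes intervening on a single residual-stream direction. As a hedged illustration only (the paper's actual intervention is not detailed in this summary), "flipping" such an axis can be pictured as negating a hidden state's coordinate along a hypothetical unit vector 'number_axis':

import numpy as np

def flip_along_axis(h, axis):
    """Negate the coordinate of hidden state h along a candidate axis,
    leaving all orthogonal components untouched."""
    unit = axis / np.linalg.norm(axis)
    coord = float(h @ unit)          # coordinate along the candidate axis
    return h - 2.0 * coord * unit    # reflection across the orthogonal hyperplane

Applying such a reflection at every generation step, at the layer where the axis is identified, is one plausible way an intervention of this kind could be wired into a GPT-2-style model (for example via a forward hook), if the single-axis account holds.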

Source

arxiv.org
