Texo: Formula Recognition within 20M Parameters
#Texo #formula recognition #20 million parameters #model size reduction #UniMERNet-T #PPFormulaNet-S #attentional architecture #knowledge distillation #vocabulary transfer #tokenizer transfer #real-time inference #consumer-grade hardware #in-browser deployment #web application demo
📌 Key Takeaways
- Texo contains only 20 million parameters, a drastic reduction compared to larger contemporaries.
- The model matches the performance of UniMERNet-T and PPFormulaNet-S, two state-of-the-art formula-recognition models.
- Its size shrinks by 80% relative to UniMERNet‑T and 65% relative to PPFormulaNet‑S.
- Model design incorporates attentional architecture, distillation, and vocabulary/tokenizer transfer.
- Texo supports real-time inference on standard consumer hardware and can run directly in web browsers via an accompanying demo application.
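The vocabulary/tokenizer transfer listed above can be sketched as follows. The paper's exact procedure is not described here, so the function name, the dictionary-based embedding table, and the small-random initialisation for unseen tokens are all illustrative assumptions, not Texo's actual implementation:

```python
import random

def transfer_embeddings(teacher_emb, student_vocab, dim):
    """Vocabulary/tokenizer transfer sketch (hypothetical helper):
    initialise the student's embedding table by copying rows for tokens
    shared with the teacher's vocabulary; tokens unknown to the teacher
    get small random vectors and are learned from scratch."""
    student_emb = {}
    for token in student_vocab:
        if token in teacher_emb:
            student_emb[token] = list(teacher_emb[token])  # reuse learned row
        else:
            student_emb[token] = [random.gauss(0.0, 0.02) for _ in range(dim)]
    return student_emb

# Toy example: "\\frac" is shared with the teacher, "<new>" is student-only.
teacher = {"\\frac": [1.0, 2.0], "x": [0.5, -0.5]}
student = transfer_embeddings(teacher, ["\\frac", "<new>"], dim=2)
```

Reusing the teacher's learned rows for shared tokens gives the small student a warm start, so only the genuinely new vocabulary entries must be trained from zero.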
📖 Full Retelling
🏷️ Themes
Artificial Intelligence, Computer Vision and Pattern Recognition, Model Compression, Real‑time Inference, Formula Recognition, In‑browser AI Deployment, Model Distillation & Transfer Learning
Deep Analysis
Why It Matters
Texo demonstrates that high-performance formula recognition can be achieved with a model as small as 20 million parameters, making real-time inference possible on consumer hardware and in browsers. This reduces deployment costs and expands accessibility for researchers and developers who previously required large models.
Context & Background
- Formula recognition is critical for digitizing scientific documents and educational materials.
- Existing state-of-the-art models like UniMERNet-T and PPFormulaNet-S have tens of millions of parameters, limiting deployment.
- Texo combines an attention-based architecture, knowledge distillation, and vocabulary/tokenizer transfer to cut model size by 80% (vs. UniMERNet-T) and 65% (vs. PPFormulaNet-S).
- The model supports real-time inference on consumer-grade hardware and includes an in-browser web application.
- The paper was submitted to arXiv on 19 Feb 2026 and is available under DOI 10.48550/arXiv.2602.17189.
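The knowledge-distillation step mentioned in these bullets typically trains the small student to match a larger teacher's softened output distribution. The paper's specific loss is not reproduced here; the snippet below is a minimal plain-Python sketch of the standard temperature-scaled KL formulation (Hinton-style), with all names illustrative:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax: higher T produces a softer distribution."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Distillation term: KL(teacher || student) over temperature-softened
    distributions, scaled by T^2 so gradients stay comparable across T."""
    p = softmax(teacher_logits, T)  # soft teacher targets
    q = softmax(student_logits, T)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return (T ** 2) * kl

# Toy example: a student close to the teacher incurs a small loss.
loss = kd_loss([1.8, 0.6, -0.9], [2.0, 0.5, -1.0])
```

In practice this term is combined with the ordinary cross-entropy on ground-truth LaTeX tokens, letting the 20M-parameter student absorb the teacher's behaviour without its size.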
What Happens Next
Future work may involve integrating Texo into larger document understanding pipelines and exploring further compression techniques. The authors plan to release pretrained checkpoints and a Hugging Face Space to encourage community contributions.
Frequently Asked Questions
How does Texo match larger models with only 20 million parameters?
Texo achieves comparable accuracy to larger models thanks to its attention-based design, knowledge distillation, and vocabulary transfer.

Can Texo run in real time on consumer hardware?
Yes, its small size allows real-time inference on consumer-grade hardware, including laptops, and even in-browser execution.

Is Texo publicly available?
The authors have released a web application and plan to provide pretrained checkpoints and a Hugging Face Space.