BravenNow
Texo: Formula Recognition within 20M Parameters
| USA | technology | ✓ Verified - arxiv.org

#Texo #formula recognition #20 million parameters #model size reduction #UniMERNet-T #PPFormulaNet-S #attentive design #knowledge distillation #vocabulary transfer #tokenizer transfer #real‑time inference #consumer‑grade hardware #in‑browser deployment #web application demo

📌 Key Takeaways

  • Texo contains only 20 million parameters, a drastic reduction compared to larger contemporaries.
  • The model achieves performance comparable to state‑of‑the‑art models such as UniMERNet‑T and PPFormulaNet‑S.
  • Texo is 80% smaller than UniMERNet‑T and 65% smaller than PPFormulaNet‑S.
  • The approach combines attentive (careful) design, distillation, and transfer of the vocabulary and tokenizer.
  • Texo enables real‑time inference on consumer‑grade hardware and even in‑browser deployment; a web application demonstrates the model for end users.
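The two percentages also imply rough sizes for the baseline models. The following is a minimal sketch of that arithmetic, assuming the quoted 80% and 65% are exact reductions in parameter count (the abstract says "model size" and gives rounded figures, so the results are estimates only). The function name is illustrative and does not come from the paper.

```python
def implied_baseline_params(small_params: float, reduction: float) -> float:
    """Return the baseline size implied by a fractional size reduction.

    If small = baseline * (1 - reduction), then
    baseline = small / (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return small_params / (1.0 - reduction)


TEXO_PARAMS = 20e6  # 20 million parameters, as stated in the abstract

# Reductions quoted in the abstract, treated as exact for illustration.
reductions = {"UniMERNet-T": 0.80, "PPFormulaNet-S": 0.65}

for name, reduction in reductions.items():
    implied = implied_baseline_params(TEXO_PARAMS, reduction)
    print(f"{name}: roughly {implied / 1e6:.0f}M parameters implied")
```

Under these assumptions the baselines come out at about 100M and 57M parameters, which is three to five times the size of Texo.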

📖 Full Retelling

Sicheng Mao submitted Texo, an ultra‑compact formula recognition model with only 20 million parameters, to arXiv (cs.AI, cross‑listed in cs.CV) on 19 February 2026. The work aims for performance comparable to state‑of‑the‑art models in a model small enough for real‑time inference on consumer‑grade hardware and even in‑browser deployment, which broadens access to formula recognition tools.

🏷️ Themes

Artificial Intelligence, Computer Vision and Pattern Recognition, Model Compression, Real‑time Inference, Formula Recognition, In‑browser AI Deployment, Model Distillation & Transfer Learning

Deep Analysis

Why It Matters

Texo demonstrates that high-performance formula recognition can be achieved with a model as small as 20 million parameters, making real-time inference possible on consumer hardware and in browsers. This reduces deployment costs and expands accessibility for researchers and developers who previously required large models.
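A back-of-envelope estimate shows why a 20-million-parameter model is plausible for browsers and laptops. The sketch below computes weight storage only, at three common numeric precisions. The precisions are assumptions for illustration: the source does not say which format Texo ships in, and the estimate ignores activations and runtime overhead.

```python
# Bytes needed to store one parameter at common precisions (assumed formats).
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_megabytes(num_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


TEXO_PARAMS = 20_000_000  # parameter count stated in the abstract

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_megabytes(TEXO_PARAMS, dtype):.0f} MB")
```

This gives 80 MB, 40 MB, and 20 MB respectively, sizes that a web page can realistically download and hold in memory.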

Context & Background

  • Formula recognition is critical for digitizing scientific documents and educational materials.
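  • Existing state-of-the-art models such as UniMERNet-T and PPFormulaNet-S are roughly three to five times larger than Texo (as implied by the reported size reductions), which makes them harder to deploy on consumer hardware or in browsers.
  • Texo relies on attentive design, distillation, and transfer of the vocabulary and tokenizer to cut model size by 80% relative to UniMERNet-T and 65% relative to PPFormulaNet-S.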
  • The model supports real-time inference on consumer-grade hardware and includes an in-browser web application.
  • The paper was submitted to arXiv on 19 Feb 2026 and is available under DOI 10.48550/arXiv.2602.17189.
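The abstract names distillation but does not describe how Texo applies it. For readers unfamiliar with the term, the sketch below shows one common generic formulation, soft-target distillation, in which a small student model is trained to match the temperature-softened output distribution of a larger teacher. This is not taken from the paper; the function names, the temperature value, and the example logits are all hypothetical.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_sum for x in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions.

    The temperature**2 factor is the conventional scaling that keeps
    gradient magnitudes comparable across temperatures.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


# Hypothetical logits over a three-token vocabulary.
teacher = [2.0, 0.5, -1.0]
print(distillation_loss(teacher, teacher))          # identical outputs
print(distillation_loss([0.1, 0.2, 0.0], teacher))  # mismatched outputs
```

The loss is zero when the student reproduces the teacher's distribution and positive otherwise, so minimizing it pulls the student toward the teacher.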

What Happens Next

Future work may involve integrating Texo into larger document understanding pipelines and exploring further compression techniques. The author plans to release pretrained checkpoints and a Hugging Face Space to encourage community contributions.

Frequently Asked Questions

What makes Texo different from other formula recognition models?

Texo achieves comparable accuracy to larger models while using only 20 million parameters, thanks to attentive design, distillation, and vocabulary transfer.

Can Texo run on a typical laptop?

Yes. Its small size allows real-time inference on consumer-grade hardware such as laptops, and it can even run in a web browser.

Is the model available for public use?

The author has developed a web application that demonstrates the model and plans to provide pretrained checkpoints and a Hugging Face Space.

Original Source
Computer Science > Artificial Intelligence
arXiv:2602.17189 [Submitted on 19 Feb 2026]
Title: Texo: Formula Recognition within 20M Parameters
Authors: Sicheng Mao

Abstract: In this paper we present Texo, a minimalist yet high-performance formula recognition model that contains only 20 million parameters. By attentive design, distillation and transfer of the vocabulary and the tokenizer, Texo achieves comparable performance to state-of-the-art models such as UniMERNet-T and PPFormulaNet-S, while reducing the model size by 80% and 65%, respectively. This enables real-time inference on consumer-grade hardware and even in-browser deployment. We also developed a web application to demonstrate the model capabilities and facilitate its usage for end users.

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2602.17189 [cs.AI] (or arXiv:2602.17189v1 [cs.AI] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.17189 (arXiv-issued DOI via DataCite, pending registration)
Submission history: From Sicheng Mao, [v1] Thu, 19 Feb 2026 09:14:32 UTC (327 KB)
Read full article at source

Source

arxiv.org
