Higress-RAG: A Holistic Optimization Framework for Enterprise Retrieval-Augmented Generation via Dual Hybrid Retrieval, Adaptive Routing, and CRAG
| USA | technology | ✓ Verified - arxiv.org

#Weixi Lin #Higress‑RAG #Retrieval‑Augmented Generation #Model Context Protocol #Hybrid Retrieval #Adaptive Routing #Semantic Caching #Reciprocal Rank Fusion #Corrective RAG #enterprise AI #hallucination reduction #latency optimization #content ingestion #dense‑sparse retrieval

📌 Key Takeaways

  • The paper identifies three persistent bottlenecks in production‑grade RAG: low retrieval precision for complex queries, high hallucination rates in the generation phase, and unacceptable latency for real‑time applications.
  • It introduces the Higress RAG MCP Server, an enterprise‑centric architecture built on the Model Context Protocol, orchestrating Adaptive Routing, Semantic Caching, Hybrid Retrieval, and Corrective RAG.
  • Key innovations include the Higress‑Native Splitter for structure‑aware data ingestion, Reciprocal Rank Fusion for merging dense and sparse retrieval signals, and a Semantic Caching mechanism with 50 ms latency and dynamic thresholding.
  • Experimental evaluations on domain‑specific Higress technical documentation and blogs verify the system's architectural robustness.
  • The work advances Retrieval‑Augmented Generation toward real‑time, enterprise‑ready deployment by optimizing the entire retrieval lifecycle.
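
To make the Reciprocal Rank Fusion step concrete, here is a minimal sketch of how a dense and a sparse ranking could be merged. It is an illustration only, not the paper's implementation: the function name, the document IDs, and the smoothing constant k = 60 (the value commonly used in the RRF literature) are assumptions, since the summary does not state the parameters Higress-RAG uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed default, not a value
    taken from the Higress-RAG paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (embedding) and a sparse (keyword) retriever.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense, sparse])
print(fused)
```

Because RRF uses only rank positions, it does not need the dense and sparse scores to share a scale, which is one reason it is a common choice for hybrid retrieval.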

📖 Full Retelling

Weixi Lin submitted the paper "Higress‑RAG: A Holistic Optimization Framework for Enterprise Retrieval‑Augmented Generation via Dual Hybrid Retrieval, Adaptive Routing, and CRAG" to arXiv (Computer Science > Information Retrieval) on 30 Dec 2025. The paper proposes a comprehensive solution to three persistent challenges in enterprise Retrieval‑Augmented Generation systems: low retrieval precision, hallucination, and latency.

🏷️ Themes

Retrieval‑Augmented Generation (RAG), Enterprise Knowledge Management, Hybrid Retrieval, Adaptive Routing, Semantic Caching, Corrective RAG, Large Language Model Integration, Latency Optimization

Original Source
Computer Science > Information Retrieval
arXiv:2602.23374 [Submitted on 30 Dec 2025]

Title: Higress-RAG: A Holistic Optimization Framework for Enterprise Retrieval-Augmented Generation via Dual Hybrid Retrieval, Adaptive Routing, and CRAG
Authors: Weixi Lin

Abstract: The integration of Large Language Models into enterprise knowledge management systems has been catalyzed by the Retrieval-Augmented Generation paradigm, which augments parametric memory with non-parametric external data. However, the transition from proof-of-concept to production-grade RAG systems is hindered by three persistent challenges: low retrieval precision for complex queries, high rates of hallucination in the generation phase, and unacceptable latency for real-time applications. This paper presents a comprehensive analysis of the Higress RAG MCP Server, a novel, enterprise-centric architecture designed to resolve these bottlenecks through a "Full-Link Optimization" strategy. Built upon the Model Context Protocol, the system introduces a layered architecture that orchestrates a sophisticated pipeline of Adaptive Routing, Semantic Caching, Hybrid Retrieval, and Corrective RAG. We detail the technical implementation of key innovations, including the Higress-Native Splitter for structure-aware data ingestion, the application of Reciprocal Rank Fusion for merging dense and sparse retrieval signals, and a 50ms-latency Semantic Caching mechanism with dynamic thresholding. Experimental evaluations on domain-specific Higress technical documentation and blogs verify the system's architectural robustness. The results demonstrate that by optimizing the entire retrieval lifecycle - from pre-retrieval query rewriting to post-retrieval corrective evaluation - the Higress RAG system offers a scalable,...
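
The abstract mentions Semantic Caching with dynamic thresholding but the excerpt does not say how the threshold is chosen. The sketch below shows one way such a cache might work, purely as an illustration: the `SemanticCache` class, the base threshold of 0.90, the rule that short queries need a closer match, and the toy two-dimensional embeddings are all assumptions, not details from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Toy semantic cache: reuse an answer when a new query is close enough."""

    def __init__(self, base_threshold=0.90):
        self.base_threshold = base_threshold  # assumed value, for illustration
        self.entries = []  # list of (embedding, answer) pairs

    def threshold_for(self, query):
        # Hypothetical dynamic rule: short queries tend to be ambiguous,
        # so require a closer match before serving a cached answer.
        bump = 0.05 if len(query.split()) < 4 else 0.0
        return min(0.99, self.base_threshold + bump)

    def put(self, embedding, answer):
        self.entries.append((list(embedding), answer))

    def get(self, query, embedding):
        best_score, best_answer = 0.0, None
        for cached_embedding, answer in self.entries:
            score = cosine(embedding, cached_embedding)
            if score > best_score:
                best_score, best_answer = score, answer
        if best_score >= self.threshold_for(query):
            return best_answer
        return None  # cache miss: fall through to retrieval and generation


cache = SemanticCache()
cache.put([1.0, 0.0], "cached answer")

# Same embedding, different query lengths: only the longer query hits.
long_hit = cache.get("how do I configure the gateway plugin", [0.9, 0.4])
short_miss = cache.get("gateway plugin", [0.9, 0.4])
```

In a real deployment the embeddings would come from an embedding model and the lookup would use a vector index rather than a linear scan; both are omitted here to keep the sketch self-contained.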

Source

arxiv.org
