
PRECTR-V2: Unified Relevance-CTR Framework with Cross-User Preference Mining, Exposure Bias Correction, and LLM-Distilled Encoder Optimization

#PRECTR-V2 #Search Relevance #Click-Through Rate #Knowledge Distillation #Exposure Bias #User Preference Mining #Information Retrieval #Transformer Models

📌 Key Takeaways

  • PRECTR-V2 is an enhanced unified framework for search relevance matching and CTR prediction
  • It addresses three main challenges: sparse user behavior data, exposure bias, and latency constraints
  • The framework mines global relevance preferences to handle low-activity users
  • It uses hard negative samples to correct distribution mismatch in training data
  • The new model incorporates a lightweight transformer encoder optimized through LLM distillation

📖 Full Retelling

Researchers Shuzhi Cao, Rong Chen, Ailong He, Shuguang Han, and Jufeng Chen introduced PRECTR-V2, an enhanced unified framework for search relevance matching and click-through rate (CTR) prediction, in a paper submitted to arXiv's Computer Science > Information Retrieval category on February 24, 2026. The work addresses three key limitations of their earlier PRECTR model: scarce user behavior data, a mismatch between training and candidate distributions, and latency constraints.

First, PRECTR-V2 tackles the limited behavioral data of low-activity and new users by mining global relevance preferences under a specific query, enabling effective personalized relevance modeling even in cold-start scenarios where a user has little or no interaction history.

Second, the framework corrects exposure bias by constructing hard negative samples through embedding noise injection and relevance label reconstruction. By optimizing the relative ranking of these samples against positive samples via a pairwise loss, it mitigates the distribution mismatch between training data drawn from high-relevance exposures and the broader candidate space in coarse ranking.

Finally, the researchers replaced the frozen BERT encoder of the original model with a lightweight transformer encoder, pretrained through knowledge distillation from large language models and fine-tuned on text relevance classification, enabling joint optimization with CTR fine-tuning and moving beyond the traditional Emb+MLP architecture. Illustrative sketches of the three components follow below.
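To make the first component concrete, here is a minimal Python sketch of cross-user preference mining under stated assumptions: per-query, relevance-weighted pooling of item embeddings across all users, blended with a user's own history vector according to their activity level. Every name here (build_global_query_preferences, the blending weight tau, the pooling scheme) is illustrative; the paper's summary does not specify the exact mechanism.

```python
# Hypothetical sketch of cross-user preference mining. Assumption: a global,
# per-query preference vector is pooled from all users' relevance feedback and
# stands in for (or augments) the sparse history of low-activity and new users.
from collections import defaultdict
import numpy as np

def build_global_query_preferences(interactions, item_embeddings, dim=64):
    """interactions: iterable of (user_id, query, item_id, relevance_label)."""
    sums = defaultdict(lambda: np.zeros(dim))
    weights = defaultdict(float)
    for _user, query, item_id, relevance in interactions:
        # Pool item embeddings across *all* users, weighted by relevance feedback.
        sums[query] += relevance * item_embeddings[item_id]
        weights[query] += relevance
    # Normalize each query's pooled vector by its total relevance weight.
    return {q: sums[q] / max(weights[q], 1e-8) for q in sums}

def blended_preference(user_vec, query, global_prefs, activity, tau=5.0):
    """Low-activity users lean on the global, cross-user preference vector."""
    alpha = activity / (activity + tau)  # 0 for brand-new users, -> 1 for heavy users
    global_vec = global_prefs.get(query, np.zeros_like(user_vec))
    return alpha * user_vec + (1 - alpha) * global_vec
```

The activity-based blend is one plausible way to realize "effective personalized relevance modeling for cold-start scenarios"; the paper may combine the signals differently.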
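The exposure-bias correction can likewise be sketched. The PyTorch snippet below synthesizes hard negatives by adding Gaussian noise to positive item embeddings, with their relevance labels implicitly reconstructed as negative, and applies a hinge-style pairwise loss so positives outrank the synthetic negatives. The noise scale, margin, and dot-product scorer are assumptions, not details from the paper.

```python
# Hedged sketch of hard-negative construction via embedding noise injection
# plus a pairwise ranking loss. Hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def make_hard_negatives(pos_emb: torch.Tensor, noise_std: float = 0.1) -> torch.Tensor:
    """Perturb positive embeddings to imitate near-boundary, unexposed candidates."""
    return pos_emb + noise_std * torch.randn_like(pos_emb)

def pairwise_exposure_loss(score_fn, query_emb, pos_emb, margin: float = 0.3):
    """Hinge-style pairwise loss: each positive should outrank its synthetic negative."""
    neg_emb = make_hard_negatives(pos_emb)
    pos_scores = score_fn(query_emb, pos_emb)  # shape: (batch,)
    neg_scores = score_fn(query_emb, neg_emb)
    return F.relu(margin - (pos_scores - neg_scores)).mean()

# Usage with a simple dot-product scorer on random embeddings:
score = lambda q, d: (q * d).sum(dim=-1)
q, d = torch.randn(8, 64), torch.randn(8, 64)
loss = pairwise_exposure_loss(score, q, d)  # differentiable; trains with the CTR model
```

A margin hinge is just one common pairwise objective; the source states only that relative ranking is optimized via a pairwise loss, so alternatives such as a BPR-style log-sigmoid would fit the same slot.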
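Finally, a hedged sketch of the encoder replacement: a small trainable transformer is pretrained to match pooled representations from a frozen LLM teacher, after which it can be fine-tuned end to end with the CTR objective. Layer counts, dimensions, and the MSE distillation loss are assumptions; the source states only that the encoder is LLM-distilled and fine-tuned on text relevance classification.

```python
# Illustrative distillation of a lightweight transformer encoder from a frozen
# LLM teacher. Sizes and the representation-matching loss are assumptions.
import torch
import torch.nn as nn

class LightweightEncoder(nn.Module):
    """Small trainable transformer meant to replace a frozen BERT encoder."""
    def __init__(self, vocab_size=30522, dim=128, heads=4, layers=2, teacher_dim=768):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=layers)
        self.project = nn.Linear(dim, teacher_dim)  # map into the teacher's space

    def forward(self, token_ids):  # token_ids: (batch, seq_len)
        hidden = self.encoder(self.embed(token_ids))
        return self.project(hidden.mean(dim=1))  # mean-pooled sentence vector

def distill_step(student, optimizer, token_ids, teacher_vecs):
    """One pretraining step: match the frozen teacher's pooled representation."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(student(token_ids), teacher_vecs)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because the student is small and trainable, it can later be updated jointly with the CTR head, which is precisely the misalignment the frozen-BERT Emb+MLP setup could not resolve.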

🏷️ Themes

Information Retrieval, Machine Learning Optimization, User Experience Enhancement

📚 Related People & Topics

Information retrieval

Finding information for an information need

Information retrieval (IR) in computing and information science is the task of identifying and retrieving information system resources that are relevant to an information need. The information need can be specified in the form of a search query. In the case of document retrieval, queries can be base...


Entity Intersection Graph

Connections for Information retrieval:

🌐 Large language model (3 shared)
🌐 Artificial intelligence (2 shared)
🌐 Recommender system (2 shared)
🌐 Efficiency (1 shared)
🌐 Transparency (1 shared)
Original Source
Computer Science > Information Retrieval
arXiv:2602.20676 [Submitted on 24 Feb 2026]

Title: PRECTR-V2: Unified Relevance-CTR Framework with Cross-User Preference Mining, Exposure Bias Correction, and LLM-Distilled Encoder Optimization
Authors: Shuzhi Cao, Rong Chen, Ailong He, Shuguang Han, Jufeng Chen

Abstract: In search systems, effectively coordinating the two core objectives of search relevance matching and click-through rate prediction is crucial for discovering users' interests and enhancing platform revenue. In our prior work PRECTR, we proposed a unified framework to integrate these two subtasks, thereby eliminating their inconsistency and leading to mutual promotion. However, our previous work still faces three main challenges. First, low-active users and new users have limited search behavioral data, making it difficult to achieve effective personalized relevance preference modeling. Second, training data for ranking models predominantly come from high-relevance exposures, creating a distribution mismatch with the broader candidate space in coarse-ranking, leading to generalization bias. Third, due to the latency constraint, the original model employs an Emb+MLP architecture with a frozen BERT encoder, which prevents joint optimization and creates misalignment between representation learning and CTR fine-tuning. To solve these issues, we further reinforce our method and propose PRECTR-V2. Specifically, we mitigate the low-activity users' sparse behavior problem by mining global relevance preferences under the specific query, which facilitates effective personalized relevance modeling for cold-start scenarios. Subsequently, we construct hard negative samples through embedding noise injection and relevance label reconstruction, and optimi...

Source

arxiv.org
