#Evaluation Metrics
Latest news articles tagged with "Evaluation Metrics". Follow the timeline of events, related topics, and entities.
Articles (4)
- 🇺🇸 Evaluating Repository-level Software Documentation via Question Answering and Feature-Driven Development
[USA]
arXiv:2604.06793v1 Announce Type: cross Abstract: Software documentation is crucial for repository comprehension. While Large Language Models (LLMs) advance documentation generation from code snippet...
Related: #AI Research, #Software Engineering
- 🇺🇸 Span-Level Machine Translation Meta-Evaluation
[USA]
arXiv:2603.19921v1 Announce Type: cross Abstract: Machine Translation (MT) and automatic MT evaluation have improved dramatically in recent years, enabling numerous novel applications. Automatic eval...
Related: #Machine Translation
- 🇺🇸 AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents
[USA]
arXiv:2603.12564v1 Announce Type: cross Abstract: Tool-augmented LLM agents increasingly serve as multi-turn advisors in high-stakes domains, yet their evaluation relies on ranking-quality metrics th...
Related: #AI Safety
- 🇺🇸 CRIMSON: A Clinically-Grounded LLM-Based Metric for Generative Radiology Report Evaluation
[USA]
arXiv:2603.06183v1 Announce Type: cross Abstract: We introduce CRIMSON, a clinically grounded evaluation framework for chest X-ray report generation that assesses reports based on diagnostic correctn...
Related: #Medical AI
Key Entities (3)
- Question answering (1 news)
- Software documentation (1 news)
- Large language model (1 news)
About the topic: Evaluation Metrics
The topic "Evaluation Metrics" aggregates 4+ news articles from various countries.