#Model Reliability
Latest news articles tagged with "Model Reliability". Follow the timeline of events, related topics, and entities.
Articles (9)
- 🇺🇸 To See or To Please: Uncovering Visual Sycophancy and Split Beliefs in VLMs
[USA]
arXiv:2603.18373v1 Announce Type: cross Abstract: When VLMs answer correctly, do they genuinely rely on visual information or exploit language shortcuts? We introduce the Tri-Layer Diagnostic Framewo...
Related: #AI Bias
- 🇺🇸 Evidence-based Distributional Alignment for Large Language Models
[USA]
arXiv:2603.13305v1 Announce Type: cross Abstract: Distributional alignment enables large language models (LLMs) to predict how a target population distributes its responses across answer options, rat...
Related: #AI Alignment
- 🇺🇸 The Dunning-Kruger Effect in Large Language Models: An Empirical Study of Confidence Calibration
[USA]
arXiv:2603.09985v1 Announce Type: cross Abstract: Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks, yet their ability to accurately assess their own confide...
Related: #AI Psychology
- 🇺🇸 Agentic retrieval-augmented reasoning reshapes collective reliability under model variability in radiology question answering
[USA]
arXiv:2603.06271v1 Announce Type: cross Abstract: Agentic retrieval-augmented reasoning pipelines are increasingly used to structure how large language models (LLMs) incorporate external evidence in ...
Related: #AI in Radiology
- 🇺🇸 The Fragility Of Moral Judgment In Large Language Models
[USA]
arXiv:2603.05651v1 Announce Type: cross Abstract: People increasingly use large language models (LLMs) for everyday moral and interpersonal guidance, yet these systems cannot interrogate missing cont...
Related: #AI Ethics
- 🇺🇸 Structure and Redundancy in Large Language Models: A Spectral Study via Random Matrix Theory
[USA]
arXiv:2602.22345v1 Announce Type: cross Abstract: This thesis addresses two persistent and closely related challenges in modern deep learning, reliability and efficiency, through a unified framework ...
Related: #Machine Learning, #AI Efficiency
- 🇺🇸 Optimization Instability in Autonomous Agentic Workflows for Clinical Symptom Detection
[USA]
arXiv:2602.16037v1 Announce Type: new Abstract: Autonomous agentic workflows that iteratively refine their own behavior hold considerable promise, yet their failure modes remain poorly characterized....
Related: #Artificial Intelligence, #Healthcare Automation, #Failure Mode Analysis
- 🇺🇸 When Should LLMs Be Less Specific? Selective Abstraction for Reliable Long-Form Text Generation
[USA]
arXiv:2602.11908v2 Announce Type: replace Abstract: LLMs are widely used, yet they remain prone to factual errors that erode user trust and limit adoption in high-risk settings. One approach to mitig...
Related: #Artificial Intelligence, #Text Generation
- 🇺🇸 Diverging Flows: Detecting Extrapolations in Conditional Generation
[USA]
arXiv:2602.13061v1 Announce Type: cross Abstract: The ability of Flow Matching (FM) to model complex conditional distributions has established it as the state-of-the-art for prediction tasks (e.g., r...
Related: #Machine Learning Safety, #Predictive Technology
Key Entities (4)
- Spectral analysis (1 news)
- Random matrix (1 news)
- Large language model (1 news)
- Fact (1 news)
About the topic: Model Reliability
The topic "Model Reliability" aggregates 9+ news articles from various countries.