
What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance

#Large Language Models #Hallucinations #Query Features #Linguistic Complexity #Computational Linguistics #Model Accuracy #Query Optimization

📌 Key Takeaways

  • LLM hallucinations are influenced by query structure, not just model defects
  • Deep clause nesting and underspecification increase hallucination risk (see the feature sketch after this list)
  • Clear intention grounding and answerability reduce hallucination rates
  • The research provides a framework for guided query rewriting to improve LLM performance
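
To make these feature dimensions concrete, here is a minimal sketch of a query-feature extractor in the paper's spirit. The specific features, dependency labels, and libraries (spaCy, wordfreq) are illustrative assumptions, not the authors' implementation of the 22-dimension vector.

```python
# Minimal query-feature extractor sketch. Assumptions: spaCy's English model
# as the parser and wordfreq's Zipf scale for lexical rarity; the paper's own
# 22 features are not reproduced here.
# Setup: pip install spacy wordfreq && python -m spacy download en_core_web_sm
import spacy
from wordfreq import zipf_frequency

nlp = spacy.load("en_core_web_sm")

def parse_depth(token) -> int:
    """Distance from a token to the root of its dependency tree."""
    depth = 0
    while token.head is not token:
        token = token.head
        depth += 1
    return depth

def query_features(text: str) -> dict:
    doc = nlp(text)
    words = [t for t in doc if t.is_alpha]
    return {
        # Clause-complexity proxy: maximum dependency-tree depth.
        "max_parse_depth": max((parse_depth(t) for t in doc), default=0),
        # Second nesting signal: subordinate- and relative-clause dependents.
        "n_subclauses": sum(t.dep_ in {"ccomp", "advcl", "acl", "relcl"} for t in doc),
        # Lexical rarity: mean Zipf frequency (lower = rarer vocabulary).
        "mean_zipf": sum(zipf_frequency(t.text.lower(), "en") for t in words) / len(words) if words else 0.0,
        # Anaphora proxy: pronouns that may lack antecedents in a standalone query.
        "n_pronouns": sum(t.pos_ == "PRON" for t in doc),
        # Negation markers, one of the comprehension-affecting phenomena listed.
        "n_negations": sum(t.dep_ == "neg" for t in doc),
    }

print(query_features("Why didn't it work when they reran the query that the model rewrote?"))
```

Each returned value would be one coordinate of a query's feature vector; stacking these over a corpus yields the matrix analyzed in the sketch after the retelling.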

📖 Full Retelling

On February 23, 2026, researchers William Watson, Nicole Cho, Sumitra Ganesh, and Manuela Veloso published a study analyzing how the linguistic features of a query affect Large Language Model performance and hallucination rates. Examining 369,837 real-world queries, they established a 22-dimension framework for identifying query characteristics that raise or lower the likelihood of model inaccuracies. The research challenges the conventional view that LLM hallucinations are solely defects of the model or its decoding strategy, demonstrating instead that query structure plays a crucial role in shaping model responses. Drawing on classical linguistics, the team operationalized this insight as a comprehensive feature vector covering clause complexity, lexical rarity, and other linguistic phenomena known to affect human comprehension, such as anaphora, negation, answerability, and intention grounding. Their large-scale analysis revealed a consistent 'risk landscape': features such as deep clause nesting and underspecification correlate strongly with higher hallucination propensity, while clear intention grounding and answerability show protective effects; others, such as domain specificity, show mixed, dataset- and model-dependent effects. The findings establish an empirically observable query-feature representation correlated with hallucination risk, paving the way for guided query rewriting and future intervention studies aimed at improving LLM reliability and accuracy.
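
A hedged sketch of the correlational step described above: fit a simple classifier from query-feature vectors to hallucination labels and read the standardized coefficients as the 'risk landscape'. The random data is a stand-in for the study's 369,837 labeled queries, and logistic regression is an assumed method, not necessarily the authors' exact analysis.

```python
# Risk-landscape sketch: which features push hallucination probability up or
# down? The data below is a random placeholder, purely to show the pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_queries, n_features = 1_000, 22       # paper: 369,837 queries, 22 features

X = rng.normal(size=(n_queries, n_features))   # stand-in feature vectors
y = rng.integers(0, 2, size=n_queries)         # 1 = response judged hallucinated

# Standardize so coefficient magnitudes are comparable across features.
model = LogisticRegression(max_iter=1_000)
model.fit(StandardScaler().fit_transform(X), y)

# Positive weights mark risk-raising features (e.g., deep clause nesting);
# negative weights mark protective ones (e.g., clear intention grounding).
ranked = sorted(zip(range(n_features), model.coef_[0]), key=lambda p: -abs(p[1]))
for idx, coef in ranked[:5]:
    print(f"feature {idx:2d}: {coef:+.3f}")
```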

🏷️ Themes

Artificial Intelligence, Linguistic Analysis, Model Performance

📚 Related People & Topics

Hallucination

Perception that only seems real

A hallucination is a perception in the absence of an external stimulus that has the compelling sense of reality. They are distinguishable from several related phenomena, such as dreaming (REM sleep), which does not involve wakefulness; pseudohallucination, which does not mimic real perceptio...


Computational linguistics

Use of computational tools for the study of linguistics

Computational linguistics is an interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics draws upon linguistics, computer science, artificial int...


Large language model

Type of machine learning model

A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) that provide the c...


Entity Intersection Graph

Connections for Hallucination:

  • Computer vision (2 shared)
  • Uncertainty quantification (1 shared)
  • Transformer (1 shared)
Original Source
Computer Science > Computation and Language

arXiv:2602.20300 [cs.CL] (submitted on 23 Feb 2026)

Title: What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance
Authors: William Watson, Nicole Cho, Sumitra Ganesh, Manuela Veloso

Abstract: Large Language Model hallucinations are usually treated as defects of the model or its decoding strategy. Drawing on classical linguistics, we argue that a query's form can also shape a listener's (and model's) response. We operationalize this insight by constructing a 22-dimension query feature vector covering clause complexity, lexical rarity, and anaphora, negation, answerability, and intention grounding, all known to affect human comprehension. Using 369,837 real-world queries, we ask: Are there certain types of queries that make hallucination more likely? A large-scale analysis reveals a consistent "risk landscape": certain features such as deep clause nesting and underspecification align with higher hallucination propensity. In contrast, clear intention grounding and answerability align with lower hallucination rates. Others, including domain specificity, show mixed, dataset- and model-dependent effects. Thus, these findings establish an empirically observable query-feature representation correlated with hallucination risk, paving the way for guided query rewriting and future intervention studies.

Comments: EACL 2026 Findings
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
DOI: https://doi.org/10.48550/arXiv.2602.20300 (arXiv-issued via DataCite, registration pending)
Submission history: [v1] Mon, 23 Feb 2026 19:30:08 UTC (5,963 KB), submitted by Nicole Cho
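
The abstract's closing point, guided query rewriting, suggests a simple intervention loop: flag risk-raising features, then instruct a rewriter to remove them. A speculative sketch; the feature names, thresholds, and hints below are invented for illustration and reuse the hypothetical extractor fields from earlier.

```python
# Speculative guided-rewriting loop: map flagged risk features to concrete
# rewriting instructions. Feature names and thresholds are illustrative only.
RISK_HINTS = {
    "max_parse_depth": "Split deeply nested clauses into separate sentences.",
    "n_pronouns": "Replace pronouns with the entities they refer to.",
    "n_negations": "Rephrase negated conditions as positive statements.",
}

def rewrite_instructions(features: dict, thresholds: dict) -> list[str]:
    """Return rewriting guidance for every feature exceeding its threshold."""
    return [hint for name, hint in RISK_HINTS.items()
            if features.get(name, 0) > thresholds.get(name, float("inf"))]

# Example: a query flagged for deep nesting and dangling pronouns.
hints = rewrite_instructions(
    {"max_parse_depth": 9, "n_pronouns": 3, "n_negations": 0},
    {"max_parse_depth": 6, "n_pronouns": 2, "n_negations": 1},
)
prompt = ("Rewrite the user's query, preserving its meaning, while applying:\n- "
          + "\n- ".join(hints))
print(prompt)
```

An intervention study would then compare hallucination rates before and after such rewrites, which is the future work the paper motivates.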

Source

arxiv.org
