GhostCite: A Large-Scale Analysis of Citation Validity in the Age of Large Language Models
#Large Language Models #GhostCite #CiteVerifier #citation validity #AI hallucinations #academic writing #scholarly research #arXiv
📌 Key Takeaways
- Researchers have identified 'ghost citations' as a systemic threat to scientific integrity in the age of AI.
- A new study on arXiv quantifies how Large Language Models frequently fabricate non-existent academic references.
- The open-source 'CiteVerifier' framework was developed to automatically detect these hallucinated references and inform mitigation.
- The proliferation of invalid citations risks collapsing the trust that underpins scientific claims and peer review.
📖 Full Retelling
Citations are the basis on which scientific claims are trusted; when they are invalid or fabricated, that trust collapses. The study argues this risk has intensified with Large Language Models, which are increasingly used for academic writing yet tend to fabricate non-existent references, so-called 'ghost citations'. To quantify the threat at scale and inform mitigation, the authors develop CiteVerifier, an open-source framework for automatically checking citation validity.
🏷️ Themes
Artificial Intelligence, Academic Integrity, Science & Technology
📚 Related People & Topics
Large language model
Type of machine learning model
A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs)…
Hallucination (artificial intelligence)
Erroneous AI-generated content
In the field of artificial intelligence (AI), a hallucination or artificial hallucination (also called bullshitting, confabulation, or delusion) is a response generated by AI that contains false or misleading information presented as fact. This term draws a loose analogy with human psychology…
🔗 Entity Intersection Graph
Connections for Large language model (co-occurrence counts; see the sketch after this list):
- 🌐 Reinforcement learning (7 shared articles)
- 🌐 Machine learning (5 shared articles)
- 🌐 Theory of mind (2 shared articles)
- 🌐 Generative artificial intelligence (2 shared articles)
- 🌐 Automation (2 shared articles)
- 🌐 Rag (2 shared articles)
- 🌐 Scientific method (2 shared articles)
- 🌐 Mafia (disambiguation) (1 shared article)
- 🌐 Robustness (1 shared article)
- 🌐 Capture the flag (1 shared article)
- 👤 Clinical Practice (1 shared article)
- 🌐 Wearable computer (1 shared article)
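The list above ranks entities by how many articles they co-occur in with 'Large language model'. As a minimal illustration of how such an intersection graph can be built from article-level entity annotations (the function name and toy data are assumptions, not the site's actual pipeline):

```python
from collections import Counter
from typing import Iterable

def intersection_counts(
    articles: Iterable[set[str]], focus: str
) -> list[tuple[str, int]]:
    """Count, for each entity, how many articles it shares with `focus`.
    Each element of `articles` is the set of entities tagged in one article."""
    counts: Counter[str] = Counter()
    for entities in articles:
        if focus in entities:
            counts.update(entities - {focus})
    return counts.most_common()

# Toy annotations: each set holds the entities tagged in one article.
corpus = [
    {"Large language model", "Reinforcement learning", "Machine learning"},
    {"Large language model", "Reinforcement learning"},
    {"Machine learning", "Automation"},
]
print(intersection_counts(corpus, "Large language model"))
# -> [('Reinforcement learning', 2), ('Machine learning', 1)]
```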
📄 Original Source Content
arXiv:2602.06718v1 Announce Type: cross
Abstract: Citations provide the basis for trusting scientific claims; when they are invalid or fabricated, this trust collapses. With the advent of Large Language Models (LLMs), this risk has intensified: LLMs are increasingly used for academic writing, yet their tendency to fabricate citations ("ghost citations") poses a systemic threat to citation validity. To quantify this threat and inform mitigation, we develop CiteVerifier, an open-source framework…
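The truncated abstract does not show CiteVerifier's internals. As a hedged sketch of the general idea behind automated citation checking (not the paper's method), the snippet below queries the public Crossref search API and fuzzy-matches returned titles; the function names, threshold, and matching heuristic are all assumptions:

```python
import difflib
import requests

CROSSREF_API = "https://api.crossref.org/works"  # public metadata search endpoint

def normalize(title: str) -> str:
    """Lowercase and collapse whitespace for fuzzy title comparison."""
    return " ".join(title.lower().split())

def citation_exists(title: str, threshold: float = 0.9) -> bool:
    """Heuristically check whether a cited title resolves to a real record.
    Returns False when no Crossref result matches closely, flagging the
    reference as a possible 'ghost citation'. (Illustrative heuristic only;
    not the CiteVerifier algorithm.)"""
    resp = requests.get(
        CROSSREF_API,
        params={"query.bibliographic": title, "rows": 5},
        timeout=10,
    )
    resp.raise_for_status()
    for item in resp.json()["message"]["items"]:
        for candidate in item.get("title", []):
            score = difflib.SequenceMatcher(
                None, normalize(title), normalize(candidate)
            ).ratio()
            if score >= threshold:
                return True
    return False

if __name__ == "__main__":
    print(citation_exists("Attention Is All You Need"))  # real paper: True
    print(citation_exists("A Grand Unified Theory of Citation Ghosts"))  # fabricated: likely False
```

A production verifier would additionally match authors, year, venue, and DOI rather than titles alone, and would distinguish 'not found' from 'found but miscited'.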