Graph Your Way to Inspiration: Integrating Co-Author Graphs with Retrieval-Augmented Generation for Large Language Model Based Scientific Idea Generation
USA | technology | arxiv.org


#Large Language Models #Scientific Idea Generation #Knowledge Graphs #Retrieval-Augmented Generation #GYWI #Author Collaboration #AI Research #arXiv

📌 Key Takeaways

  • Researchers developed GYWI, a system that combines author knowledge graphs with retrieval-augmented generation
  • The system pairs author-centered knowledge graph construction with a hybrid retrieval mechanism (a rough sketch follows this list)
  • Generated ideas were evaluated across five dimensions using multiple assessment methods
  • GYWI outperformed mainstream LLMs on novelty, reliability, and relevance metrics
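
The paper does not publish code, so the following is only a minimal sketch of what an author-centered co-author graph with inspiration source sampling could look like, assuming simple paper metadata with author lists; the function names and the hop-based sampling policy here are hypothetical, not GYWI's actual algorithm.

```python
# Minimal sketch: build a co-author graph from paper metadata and sample
# "inspiration source" papers from an author's neighborhood.
# All names and the sampling policy are illustrative assumptions.
import itertools
import random
import networkx as nx

def build_coauthor_graph(papers):
    """papers: list of dicts like {"id": "...", "title": "...", "authors": [...]}"""
    g = nx.Graph()
    for paper in papers:
        for a, b in itertools.combinations(paper["authors"], 2):
            # Edge weight counts how often two authors co-publish;
            # each edge also remembers which papers connect them.
            if g.has_edge(a, b):
                g[a][b]["weight"] += 1
                g[a][b]["papers"].append(paper["id"])
            else:
                g.add_edge(a, b, weight=1, papers=[paper["id"]])
    return g

def sample_inspiration_sources(g, seed_author, k=5, hops=2):
    """Sample candidate inspiration papers from the seed author's k-hop neighborhood."""
    neighborhood = nx.ego_graph(g, seed_author, radius=hops)
    candidate_papers = {p for _, _, d in neighborhood.edges(data=True) for p in d["papers"]}
    return random.sample(sorted(candidate_papers), min(k, len(candidate_papers)))
```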

📖 Full Retelling

Pengzhen Xie and Huizhi Liang introduced their scientific idea generation system, GYWI, in a paper submitted to arXiv on December 5, 2025. The work addresses a limitation of Large Language Models in scientific ideation: generated ideas often lack a controllable academic context and traceable inspiration pathways.

GYWI combines author knowledge graphs with retrieval-augmented generation (RAG). The researchers first developed an author-centered knowledge graph construction method and inspiration source sampling algorithms to build an external knowledge base. Second, a hybrid retrieval mechanism combines traditional RAG with GraphRAG, so the retrieved content offers both depth and breadth of knowledge. Third, a prompt optimization strategy incorporating reinforcement learning principles automatically guides the LLM to refine its outputs based on the hybrid context.

To validate the approach, the authors constructed an evaluation dataset from arXiv papers spanning 2018 to 2023. Their evaluation methodology included empirical automatic assessment through multiple-choice question tasks, LLM-based scoring, human evaluation, and semantic space visualization analysis. Generated ideas were assessed across five dimensions: novelty, feasibility, clarity, relevance, and significance. Tested against state-of-the-art models including GPT-4o, DeepSeek-V3, Qwen3-8B, and Gemini 2.5, GYWI showed significant improvements on multiple metrics, particularly the novelty, reliability, and relevance of the generated scientific ideas.
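
To make the hybrid retrieval idea concrete, here is an illustrative sketch in the spirit described above: a dense-vector RAG pass supplies depth, a pass over the co-author graph supplies breadth, and the two are concatenated into one context for the LLM. The embedding index, lookup tables, and merge policy are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of a hybrid retrieval step that combines plain RAG
# (vector similarity) with GraphRAG-style expansion over the co-author graph
# built in the earlier sketch. Names and merge policy are assumptions.
import numpy as np

def rag_retrieve(query_vec, doc_vecs, doc_texts, top_k=3):
    """Plain RAG: cosine similarity over a dense index of paper abstracts."""
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    best = np.argsort(-sims)[:top_k]
    return [doc_texts[i] for i in best]

def graph_retrieve(g, seed_author, paper_lookup, top_k=3):
    """GraphRAG-style breadth: pull papers attached to the seed author's co-author edges."""
    papers = []
    for _, _, data in g.edges(seed_author, data=True):
        papers.extend(data["papers"])
    return [paper_lookup[p] for p in papers[:top_k]]

def hybrid_context(query_vec, doc_vecs, doc_texts, g, seed_author, paper_lookup):
    """Concatenate depth (semantic matches) and breadth (graph neighbors) into one prompt context."""
    depth = rag_retrieve(query_vec, doc_vecs, doc_texts)
    breadth = graph_retrieve(g, seed_author, paper_lookup)
    return "\n\n".join(["[Semantic matches]"] + depth + ["[Co-author neighborhood]"] + breadth)
```

In GYWI, a prompt optimization step guided by reinforcement learning principles would then iterate on the LLM's output conditioned on a hybrid context of this kind.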

🏷️ Themes

Artificial Intelligence, Scientific Research, Knowledge Graphs, Retrieval-Augmented Generation

📚 Related People & Topics

Large language model

Type of machine learning model

A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) that provide the c...


Entity Intersection Graph

Connections for Large language model:

🌐 Educational technology (4 shared)
🌐 Reinforcement learning (3 shared)
🌐 Machine learning (2 shared)
🌐 Artificial intelligence (2 shared)
🌐 Benchmark (2 shared)
Original Source
Computer Science > Artificial Intelligence
arXiv:2602.22215 [Submitted on 5 Dec 2025]

Title: Graph Your Way to Inspiration: Integrating Co-Author Graphs with Retrieval-Augmented Generation for Large Language Model Based Scientific Idea Generation
Authors: Pengzhen Xie, Huizhi Liang

Abstract: Large Language Models demonstrate potential in the field of scientific idea generation. However, the generated results often lack controllable academic context and traceable inspiration pathways. To bridge this gap, this paper proposes a scientific idea generation system called GYWI, which combines author knowledge graphs with retrieval-augmented generation to form an external knowledge base that provides controllable context and a traceable inspiration path for LLMs to generate new scientific ideas. We first propose an author-centered knowledge graph construction method and inspiration source sampling algorithms to construct the external knowledge base. Then, we propose a hybrid retrieval mechanism composed of both RAG and GraphRAG to retrieve content with both depth and breadth of knowledge, forming a hybrid context. Thirdly, we propose a prompt optimization strategy incorporating reinforcement learning principles to automatically guide LLMs in optimizing the results based on the hybrid context. To evaluate the proposed approaches, we constructed an evaluation dataset based on arXiv (2018-2023). This paper also develops a comprehensive evaluation method including empirical automatic assessment in a multiple-choice question task, LLM-based scoring, human evaluation, and semantic space visualization analysis. The generated ideas are evaluated along the following five dimensions: novelty, feasibility, clarity, relevance, and significance. We conducted expe...

Source

arxiv.org
