Точка Синхронізації

AI Archive of Human History

Assessing Reproducibility in Evolutionary Computation: A Case Study using Human- and LLM-based Assessment

#Evolutionary Computation #Large Language Models #Experimental Protocols #Peer Review #Data Artifacts #arXiv #Open Science

📌 Key Takeaways

  • Researchers evaluated the current state of reproducibility in evolutionary computation papers.
  • The study utilized a hybrid assessment approach involving both human reviewers and Large Language Models.
  • Findings indicate that shared documentation and experimental protocols are often insufficient in existing literature.
  • The paper proposes using automated tools to enhance the transparency and verification of computational experiments.

📖 Full Retelling

A team of researchers released a study on the arXiv preprint server in February 2026 assessing the reproducibility of published work in evolutionary computation. The investigation, titled "Assessing Reproducibility in Evolutionary Computation: A Case Study using Human- and LLM-based Assessment," addresses the lack of empirical evidence on how well algorithms and experimental protocols are documented in the field's literature. By combining human evaluation with Large Language Models (LLMs), the authors sought to identify critical gaps in how artifacts are shared, a prerequisite for verifying the validity of computational experiments.

The core of the research emphasizes that because evolutionary computation relies heavily on stochastic processes and complex experimental configurations, clear documentation is vital. Yet despite growing awareness of the need for open science, many published papers still fall short of providing the necessary transparency. The researchers employed a methodology that compares traditional human peer-assessment against automated assessments generated by LLMs, potentially paving the way for more efficient systematic reviews of scientific integrity; a hypothetical sketch of such an automated check is given below.

Beyond identifying failures in documentation, the paper serves as a call to action for the research community to standardize its reporting protocols. The findings suggest that AI-based assessment tools could become a standard part of the peer-review process, helping ensure that experimental parameters, source code, and data artifacts are readily available. By highlighting these systemic issues, the study aims to foster a culture of transparency that protects the credibility of evolutionary computation as the field continues to evolve and integrate with broader artificial intelligence frameworks.
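The abstract excerpt below does not describe the authors' pipeline in detail, so the following is only a minimal sketch of what an LLM-based reproducibility check could look like, assuming a checklist-style protocol. The `query_llm` helper, the criteria list, and the scoring scheme are hypothetical illustrations, not the method from the paper.

```python
# Hypothetical checklist; the criteria actually used in the paper
# are not listed in the excerpt below.
CRITERIA = [
    "Source code for the algorithm is publicly available",
    "Random seeds or the seeding strategy are reported",
    "All hyperparameters and experimental settings are listed",
    "Benchmark problems or datasets are identified with versions",
    "Statistical methodology (number of runs, tests) is described",
]

def query_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (any chat-completion API would do).
    This stub always answers NO so the script runs without a provider."""
    return "NO"

def assess_paper(paper_text: str) -> dict:
    """Score one paper against the checklist, asking the LLM one
    yes/no question per criterion, as an automated reviewer might."""
    results = {}
    for criterion in CRITERIA:
        prompt = (
            "You are auditing a paper for reproducibility.\n"
            f"Criterion: {criterion}\n"
            "Answer strictly YES or NO based only on the text below.\n\n"
            f"{paper_text}"
        )
        answer = query_llm(prompt).strip().upper()
        results[criterion] = answer.startswith("YES")
    # Fraction of criteria the paper satisfies.
    results["score"] = sum(results[c] for c in CRITERIA) / len(CRITERIA)
    return results

if __name__ == "__main__":
    print(assess_paper("We release code and seeds at <repository URL>."))
```

Human reviewers answering the same checklist would give a criterion-by-criterion comparison point; agreement between the two assessor types could then be summarized with a standard inter-rater statistic such as Cohen's kappa.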

🏷️ Themes

Reproducibility, Artificial Intelligence, Research Ethics

📚 Related People & Topics

Large language model

Type of machine learning model

A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) that provide the c...

Wikipedia →

Peer review

Evaluation by peers with similar expertise

Peer review is the evaluation of work by one or more people with similar competencies as the producers of the work (peers). It functions as a form of self-regulation by qualified members of a profession within the relevant field. Peer review methods are used to maintain quality standards, improve p...

Wikipedia →

Evolutionary computation

Trial and error problem solvers with a metaheuristic or stochastic optimization character

Evolutionary computation from computer science is a family of algorithms for global optimization inspired by biological evolution, and the subfield of artificial intelligence and soft computing studying these algorithms. In technical terms, they are a family of population-based trial and error probl...

Wikipedia →
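To make the description above concrete, here is a minimal, self-contained evolutionary algorithm in the (mu + lambda) style: a population of candidate solutions is repeatedly mutated at random and only the best survivors are kept. The sphere objective and all parameter values are illustrative assumptions, not code from the paper; the explicit random seed is exactly the kind of experimental detail whose absence the study flags.

```python
import random

def evolve(objective, dim=5, pop_size=20, generations=100, seed=42):
    """Minimal (mu + lambda) evolutionary loop: mutate, evaluate, select."""
    rng = random.Random(seed)  # fixed seed makes the run exactly repeatable
    # Random initial population of real-valued vectors in [-5, 5]^dim.
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        # Each parent produces one Gaussian-mutated offspring.
        offspring = [[x + rng.gauss(0, 0.1) for x in ind] for ind in pop]
        # Keep the best pop_size individuals from parents plus offspring.
        pop = sorted(pop + offspring, key=objective)[:pop_size]
    return pop[0]

# Toy objective: the sphere function, minimized at the origin.
sphere = lambda v: sum(x * x for x in v)
best = evolve(sphere)
print(round(sphere(best), 6))  # deterministic given the fixed seed
```

Because every random draw goes through the seeded `rng`, reporting the seed and the parameters above suffices to reproduce the run exactly, which is precisely the documentation standard the paper argues for.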


📄 Original Source Content
arXiv:2602.07059v1 Announce Type: cross Abstract: Reproducibility is an important requirement in evolutionary computation, where results largely depend on computational experiments. In practice, reproducibility relies on how algorithms, experimental protocols, and artifacts are documented and shared. Despite growing awareness, there is still limited empirical evidence on the actual reproducibility levels of published work in the field. In this paper, we study the reproducibility practices in pa
