Sakana AI's AI Scientist Creates a First-Ever Fully AI-Generated, Peer-Reviewed Publication

March 10, 2025

The AI Scientist-v2, an improved version of the original open-source AI Scientist, recently achieved a major milestone: it generated the first fully AI-created scientific paper to pass peer review at a workshop of ICLR, one of the top machine learning conferences. This marks the first time a manuscript created entirely without human intervention has met the same review standards that human-authored papers undergo.

"This breakthrough shows that AI can independently generate research meeting the rigorous standards of peer review—a promising step toward a future where AI-driven discoveries enhance human innovation."
— Sakana AI, the team behind The AI Scientist

What Happened?

The AI Scientist-v2 took on the challenge of producing a scientific publication from scratch. Given only a broad research topic relevant to the workshop, the system independently handled every stage of the research workflow (a conceptual code sketch follows the list):

  • Formulated a scientific hypothesis,
  • Proposed experimental designs,
  • Developed and refined code,
  • Conducted experiments,
  • Analyzed and visualized data, and
  • Authored a complete manuscript from title to references.
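
To make the workflow above concrete, here is a minimal Python sketch of such an autonomous research loop. Everything in it is an assumption for illustration: the `llm` and `executor` objects, their methods (`generate`, `dry_run`, `run`, `plot`), and the bounded self-repair loop are hypothetical stand-ins, not the actual open-source AI Scientist-v2 implementation, which is far more elaborate.

```python
# Minimal conceptual sketch of an autonomous research pipeline.
# All object and method names here are hypothetical illustrations,
# not the real AI Scientist-v2 API.
from dataclasses import dataclass, field


@dataclass
class ResearchArtifacts:
    """Everything the pipeline produces on the way to a manuscript."""
    hypothesis: str = ""
    experiment_plan: str = ""
    code: str = ""
    results: dict = field(default_factory=dict)
    figures: list = field(default_factory=list)
    manuscript: str = ""


def run_autonomous_study(topic: str, llm, executor) -> ResearchArtifacts:
    """End-to-end loop: topic -> hypothesis -> experiments -> analysis -> paper.

    `llm` is assumed to be any text-generation backend with a `generate` method;
    `executor` is assumed to run generated code in a sandbox and return metrics.
    """
    art = ResearchArtifacts()

    # 1. Formulate a scientific hypothesis from the broad workshop topic.
    art.hypothesis = llm.generate(f"Propose a testable hypothesis about: {topic}")

    # 2. Propose an experimental design to test it.
    art.experiment_plan = llm.generate(
        f"Design experiments to test this hypothesis:\n{art.hypothesis}"
    )

    # 3. Develop code, then refine it in a bounded self-repair loop.
    art.code = llm.generate(f"Write runnable code for:\n{art.experiment_plan}")
    for _ in range(3):
        ok, feedback = executor.dry_run(art.code)
        if ok:
            break
        art.code = llm.generate(
            f"Fix this code.\nError:\n{feedback}\nCode:\n{art.code}"
        )

    # 4. Conduct the experiments and collect metrics.
    art.results = executor.run(art.code)

    # 5. Analyze and visualize the data.
    art.figures = executor.plot(art.results)

    # 6. Author a complete manuscript, from title to references.
    art.manuscript = llm.generate(
        "Write a full paper (title, abstract, methods, results, references) "
        f"for hypothesis {art.hypothesis!r} with results {art.results}."
    )
    return art
```

A production system would add automated reviewing, retries, and much more careful experiment management; the sketch only shows the overall shape of going from a topic to a manuscript with no human edits in between.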

This process resulted in a paper titled “Compositional Regularization: Unexpected Obstacles in Enhancing Neural Network Generalization.” The manuscript, which reported a negative result from an attempt to improve neural network generalization with a new regularization method, received an average reviewer score of 6.33, above the acceptance threshold for the workshop track.

The Evaluation Process

The experiment involved submitting three entirely AI-generated papers to an ICLR workshop that specializes in exploring the limitations and practical challenges of deep learning. Reviewers were informed about the possibility of AI-generated content but were not told which submissions were produced by The AI Scientist.

Out of the three, only one paper cleared the bar for acceptance. Although the paper was withdrawn after review, as agreed with the organizers in advance given unresolved questions about publishing AI-generated work, the fact that it scored competitively against human-written submissions is a promising sign.

Key Points from the Evaluation:

  • End-to-end generation: The AI Scientist-v2 generated every aspect of the paper—from hypothesis to final formatting—without human edits.
  • Review outcomes: The accepted paper received scores of 6, 7, and 6 from its reviewers, indicating quality on par with many human-authored papers.
  • Double-blind process: Reviewers engaged with the paper under standard conference guidelines, underscoring that the quality of AI-generated work can be rigorously evaluated.

Transparency, Ethics, and the Future of AI-Generated Research

The project was executed with full transparency and ethical oversight. With support from both ICLR leadership and the University of British Columbia's IRB, the experiment adhered to a strict ethical code. The decision to withdraw the paper post-review reflects a broader community debate: Should AI-generated manuscripts be published alongside human-written research, or should they be marked distinctly?

This experiment sparks important conversations about:

  • Transparency: How much information about the AI generation process should be disclosed?
  • Ethical considerations: When and how should AI contributions be acknowledged in scientific research?
  • Standards for peer review: Should the scientific merit of AI-generated work be judged on its own, independent of its origin?

Challenges and Limitations

Despite the promising results, the experiment highlighted several challenges:

  • The AI Scientist-v2 occasionally produced citation errors and formatting issues. Comprehensive code reviews were required to ensure the reproducibility of its experimental results.
  • The paper was accepted in a workshop track, which typically has higher acceptance rates than main conference tracks. Judged against the stricter criteria of a main conference track, the work would still need refinement before top-tier publication.
  • The performance of The AI Scientist is closely tied to the capabilities of underlying large language models. As these models evolve, the quality of AI-generated manuscripts is expected to improve significantly.

Collaborative Efforts and Future Directions

This achievement was the result of a collaborative effort among researchers from the University of British Columbia and the University of Oxford, among others. The team plans to present a detailed talk at the upcoming ICLR workshop to share insights, discuss challenges, and outline future improvements for The AI Scientist.

Conclusion

The successful peer review of a fully AI-generated scientific paper represents a transformative moment for both AI and the broader scientific community. While challenges remain in ensuring reliability, ethical transparency, and consistency, the breakthrough demonstrates that AI systems can generate research that meets established academic standards.

Looking ahead, the potential of AI-generated research is immense. Future iterations of The AI Scientist are expected to produce even higher-quality papers that could one day rival those authored by leading human researchers.

Valeriia Kuka

Valeriia Kuka, Head of Content at Learn Prompting, is passionate about making AI and ML accessible. Valeriia previously grew a 60K+ follower AI-focused social media account, earning reposts from Stanford NLP, Amazon Research, Hugging Face, and AI researchers. She has also worked with AI/ML newsletters and global communities with 100K+ members and authored clear and concise explainers and historical articles.

