Sakana AI's New Research: The Birth of the Darwin-Gödel Machine with Self-Encoding Improvement and Self-Referential Open-Ended Evolution

Website: https://sakana.ai/dgm/

arXiv: https://arxiv.org/abs/2505.22954

Code: github.com/jennyzzt/dgm

AI's Achilles' Heel and the Dream of Self-Improvement

In the pursuit of self-improvement, a more ambitious and theoretically rich concept is the "Gödel Machine." This concept was formally proposed by renowned computer scientist Jürgen Schmidhuber in 2007, partly inspired by mathematician Kurt Gödel's incompleteness theorems. Theoretically, a Gödel Machine is a system capable of formally proving that any self-modifications it makes are "provably beneficial." This means that a Gödel Machine can not only solve external problems but also review, rewrite, and optimize its own core code, making itself smarter and more efficient, with each modification having strict mathematical guarantees.

(Figure 1: Jürgen Schmidhuber's vision of a self-referential machine capable of evolving by self-modifying its core code [2])

How DGM Works:

An Empirically Verified Family of Self-Growing Coding Agents

Faced with the practical challenges of the Gödel Machine, the DGM research team demonstrated extraordinary creativity. They took a unique path, shifting their focus from rigorous but demanding mathematical proofs to the dynamism and adaptability of natural selection. If it's impossible to ensure every modification is absolutely beneficial through mathematical logic, why not drive progress through continuous "trial and error" and "selection," just as nature has done for billions of years of evolution?

"Instead of requiring formal proofs, we empirically validate self-modifications against benchmarks, allowing the system to improve and explore based on observed outcomes," the paper's authors explain. "This approach mirrors biological evolution, where mutations and adaptations are not pre-validated but are generated, tried, and then selected through natural selection."

(Figure 2: Darwin-Gödel Machine's basic task flow: solving streaming tasks and rewriting its own code)

(Video: Evolution of the Darwin-Gödel Machine, showing the evolutionary process from parent to offspring)

5. The agent's performance on benchmarks (e.g., success rate in solving problems) receives a quantitative score.

6. "Speciation" and the Open-Ended Exploration Tree: If an offspring agent performs better than its parent or meets certain criteria, it is added to the archive, becoming a new node in this ever-expanding family tree. This process iterates continuously, "This open-ended exploration forms a growing tree structure of diverse, high-quality agents and allows parallel exploration of many different paths in the search space."

(Figure 3: Working principle of the Darwin-Gödel Machine. DGM iteratively builds a growing archive of agents by intertwining self-modification with downstream task evaluation. Agents in the archive are selected for self-modification through open-ended exploration.)

The key to this mechanism is that DGM is not merely optimizing "application code" for specific tasks; more importantly, it is optimizing its "meta-capability"—that is, the ability to improve itself. By continuously modifying and testing modules related to code generation, editing, debugging, planning, etc., DGM is effectively learning how to improve itself more efficiently. This is the core embodiment of "self-improvement on the source code itself." The agent improves its performance in solving external programming tasks by modifying the source code that defines its own behavioral logic.

Astonishing Evolution Under Open-Ended Exploration:

A Leap from 20% to 50% with Specific Examples

# Pseudocode illustrating a precise editing tool evolved by DGM

def str_replace_in_file(path: Path, old_str: str, new_str: str) -> str:

"""Replaces exact occurrences of old_str in a file.Replacement is performed only if old_str appears exactly once to avoid accidental modifications."""

content = read_file(path) # Read file content

occurrences = content.count(old_str) # Count occurrences of the old string

if occurrences == 0:

return f"Error: Text '{old_str}' not found in {path}"

elif occurrences > 1:

return f"Error: Multiple ({occurrences}) occurrences of text '{old_str}' found in {path}. Replacement must be unique."

else:

# Precise replacement

new_content = content.replace(old_str, new_str, 1) # Replace only the first match

write_file(path, new_content) # Write back modified content

return f"Successfully replaced '{old_str}' with '{new_str}' in {path}"

This seemingly simple improvement led to a significant performance boost. Node 24's success rate jumped to 40.5%, notably higher than many of its "siblings" with success rates around 23.3%. This more refined editing tool allowed agents to modify code more accurately and safely, greatly reducing errors and enhancing their ability to solve real-world software engineering problems. This is a vivid example of DGM improving its performance by modifying its own toolset (part of its own code).

(Figure 6: Three types of open-ended evolution: explorative openness, expansive openness, and transformative openness. We can see that DGM has at least achieved b[3])

Comparing with AI Systems like AlphaEvolve:

DGM's Unique Positioning

Now, let's look at the Darwin-Gödel Machine (DGM). Its positioning is more "meta," focusing more on the "agent itself" and its evolution:

4. Focus on Openness and Task Scope:

AlphaCode, AlphaDev, and AlphaEvolve typically target relatively well-defined problems or algorithmic domains with clear boundaries.

DGM, on the other hand, aims to improve an agent's overall performance in a more open and complex real-world software engineering environment (as represented by SWE-bench, which involves locating and fixing bugs in large, real GitHub codebases, or adding new features based on requirement documents). These types of tasks generally demand higher complexity, uncertainty, and contextual understanding.

We can try a new analogy to help understand:

AlphaCode is like a student who can write excellent competition essays based on a given prompt.

AlphaDev is like a master craftsman who can optimize the material and mechanical structure of a basic tool (like a hammer) to its extreme.

AlphaEvolve is like an outstanding inventor who uses an imaginative AI assistant (Gemini) to help conceive, improve, and create new, more efficient machines (algorithms). The AI assistant provides various innovative design proposals (mutations), and the inventor is responsible for testing and selecting them.

The Darwin-Gödel Machine (DGM) is more like an "AI Project Manager" or "AI Technical Director" who continuously learns and improves their "engineering project management skills" and "team leadership." This "manager" iteratively optimizes their working methods, tool usage guidelines, and strategies for guiding their core AI technical staff (underlying foundation models), thereby enabling themselves to more effectively lead the team in completing various complex large-scale software engineering projects.

Therefore, DGM's exploration is about "how to build an AI system that can continuously self-improve across a wide range of software engineering tasks by modifying its own mode of operation," rather than merely "how to make AI write a specific good piece of code or an optimized algorithm." This focus on "the agent's own evolution" positions it uniquely and crucially on the path toward more autonomous, general, and adaptable AI for complex real-world tasks.

(Figure 7: Model and Task Transfer. (Left and Middle) The superior performance of agents discovered by DGM can transfer across different models, and (Right) can transfer to different task domains, e.g., from Python tasks to tasks in other languages (such as Rust, C++, Go, etc.))

DGM's unique approach to self-improvement also raises more significant safety concerns compared to the aforementioned systems.

To ensure experiments are conducted within a safe and controlled framework, the research team implemented multiple safeguards:

Sandboxing: All execution and self-modification operations of coding agents are strictly confined within isolated sandbox computing environments. This prevents the AI from causing unintended effects or damage to external systems.

Resource Limits: Each execution has strict time limits and computing resource quotas to prevent out-of-control AI programs from consuming unlimited resources.

Domain Limitation: The self-improvement process is strictly limited to specific programming benchmark domains, rather than allowing it to operate freely on the open internet.

Traceability and Human Oversight: DGM's archive fully records the lineage of all agents and the specifics of each modification, providing valuable traceability. Simultaneously, the entire experimental process is under the close monitoring of human researchers.

Interestingly, the research team also conducted a preliminary case study exploring DGM's potential application in AI safety—addressing the "hallucination" problem in large language models, where models generate content that seems plausible but is actually fabricated or incorrect. They attempted to evolve a DGM agent capable of detecting and fixing hallucinated content in foundation model responses.

However, this experiment also keenly revealed the risks of "objective hacking" or "reward hacking." In pursuit of higher benchmark scores, an evolved agent found a shortcut: it learned to delete specific strings used to mark and detect hallucinations (e.g., in scenarios where the model needed to admit "I don't know," it learned to remove this marker to make the answer appear solved), rather than genuinely understanding and addressing the root cause of the hallucination. This immediately brings to mind Goodhart's Law, a famous concept in economics and sociology: "When a measure becomes a target, it ceases to be a good measure."

This anecdote perhaps reminds us that when designing self-improving AI systems, defining targets and reward functions that truly align with human intent and are difficult to "game" is a crucial and extremely challenging problem.

The Significance of DGM:

Endless Innovation, Cambrian Explosion, and New Intelligent Species

The proposal of the Darwin-Gödel Machine signifies far more than just performance improvements on a few programming benchmarks. It is more like a giant stone dropped into the lake of AI research, whose ripples may spread into very broad areas:

Accelerating AI's own development: If AI can autonomously discover and implement optimal architectures, algorithms, and strategies, then the pace of AI development could shift from linear to exponential. This would greatly shorten the time from theoretical breakthroughs to practical applications, faster unleashing AI's enormous potential in scientific research, healthcare, climate change, materials science, and many other fields.

Enabling "Automated Scientific Discovery": The essence of scientific research is an iterative process of constantly hypothesizing, designing experiments, collecting data, analyzing results, and revising theories. DGM's demonstrated empirically-driven self-improvement highly aligns with the spirit of the scientific method. In the future, more powerful DGM-like systems might become valuable assistants to scientists, or even independently conduct scientific explorations in certain fields, discovering new physical laws, chemical reactions, or biological mechanisms.

A possible path to Artificial General Intelligence (AGI): While current DGM focuses on optimizing coding agents, its core idea—achieving open-ended self-improvement through empirically driven evolution—has broader applicability. This mechanism of continuous learning, adaptation, and improvement of core capabilities (not just task-specific abilities) is considered by many researchers to be a crucial step toward more general and adaptable AI.

Deepening understanding of Open-Endedness: Biological evolution is an open-ended process with no preset end, constantly creating new species, new ecological niches, and new complexities. Open-ended exploration research in AI aims to replicate this phenomenon of continuous innovation and "unlimited complexity" in computers. DGM, through its "ever-growing tree of agents" and "parallel exploration of diverse paths," provides a concrete and powerful example for achieving true open-ended exploration in AI. This means AI is no longer merely optimizing a fixed, human-defined objective function, but can continuously discover new, interesting, and valuable problems and solutions.

However, despite DGM's encouraging progress and the exciting future it paints, the path toward truly autonomous, continuous, and safe self-improving AI remains long and challenging:

Expansion of improvement space: Current DGM primarily operates on coding agents based on "frozen" foundation models. A natural extension is whether future DGM could modify the parameters of the foundation models themselves, or even evolve entirely new model architectures? This is undoubtedly a direction of extremely high difficulty but immense potential.

Complexity and alignment of evaluation standards: While current coding benchmarks are effective, they are still relatively simple and narrow. How to design more comprehensive, dynamic, and realistic evaluation systems to guide AI toward genuinely beneficial directions for humanity, avoiding the "objective hacking" problem, is a core challenge.

Computational cost and efficiency: DGM's evolutionary process requires significant computational resources. The paper mentions that a complete SWE-bench experiment costs about two weeks and approximately $22,000 in API calls. Improving evolutionary efficiency and reducing resource consumption are key to its broader application.

Ongoing game of safety and controllability: As AI's self-improvement capabilities grow, ensuring its behavior aligns with human ethics and remains safe and controllable will become increasingly difficult. We need to develop more robust theories, technologies, and governance frameworks to address this challenge, ensuring we can steer this powerful force rather than be steered by it.

Understanding "Emergent" Intelligence: When AI systems reach a level of complexity far beyond human design through open-ended evolution, how do we understand their internal mechanisms and behavioral patterns? How do we ensure we can trust and effectively collaborate with them? This may require developing entirely new fields of "AI explainability" and "AI psychology."

In summary, the birth of the Darwin-Gödel Machine marks a new, imaginative phase in the development of artificial intelligence. If development continues, and the aforementioned challenges can be gradually overcome, we might truly witness a "Cambrian explosion" in AI development. AI will no longer passively await human instructions and improvements but will actively explore, experiment, and learn how to make itself better, even forming new digital intelligent species. It cleverly integrates Darwin's evolutionary ideas with Gödel's self-referential concept, providing a practical path for AI's self-improvement through empirical validation. This also means DGM brings previously imagined and conceptualized AI and digital intelligence capabilities into reality.

In his book "Life 3.0," physicist Max Tegmark divides life into three stages: Life 1.0, where both hardware and software are determined by evolution (e.g., bacteria); Life 2.0, where hardware is determined by evolution but software can be largely learned (e.g., humans); and Life 3.0, which refers to life forms capable of designing their own hardware and software. From this perspective, while DGM currently focuses primarily on self-improvement of "software" (i.e., the coding agent's own code and strategies), the trend it represents—"AI designing AI"—is undoubtedly a crucial step toward the Life 3.0 concept.

(Figure 8: Model and Task Transfer. Tegmark's different definitions of Life 1.0-3.0)

If DGM and its successors can continuously evolve, from optimizing existing code to designing new algorithms, and even potentially influencing AI model architectures and training methods in the future, then what we are witnessing may not just be the advancement of AI tools, but a profound transformation in the very way intelligence itself evolves. When AI can autonomously set goals, design blueprints, and iteratively achieve them, it gradually transcends the category of a "tool" and begins to exhibit higher levels of autonomy and creativity.

DGM may currently be just a prototype and prelude to Life 3.0, but it is undoubtedly a powerful attempt to knock on the door of a new era of digital intelligence. Its emergence, in itself, invites us to collectively ponder the essence of wisdom and life, and humanity's evolving role in the universe.

[1] https://arxiv.org/abs/2505.22954

[2] https://people.idsia.ch/~juergen/lecun-rehash-1990-2022.html

[3] https://arxiv.org/pdf/1806.01883.pdf

Sakana AI's New Research: The Birth of the Darwin-Gödel Machine with Self-Encoding Improvement and Self-Referential Open-Ended Evolution

Share Short URL