No Priors: Artificial Intelligence | Technology | Startups - Meet AlphaEvolve: The Autonomous Agent That Discovers Algorithms Better Than Humans With Google DeepMind’s Pushmeet Kohli and Matej Balog
Episode Date: June 26, 2025
Much of the scientific process involves searching. But rather than continue to rely on the luck of discovery, Google DeepMind has engineered a more efficient AI agent that mines complex spaces to facilitate scientific breakthroughs. Sarah Guo speaks with Pushmeet Kohli, VP of Science and Strategic Initiatives, and research scientist Matej Balog at Google DeepMind about AlphaEvolve, an autonomous coding agent they developed that finds new algorithms through evolutionary search. Pushmeet and Matej talk about how AlphaEvolve tackles the problem of matrix multiplication efficiency, scaling and iteration in problem solving, and whether or not this means we are at self-improving AI. Together, they also explore the implications AlphaEvolve has for other sciences beyond mathematics and computer science.
Sign up for new podcasts every week. Email feedback to show@no-priors.com
Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @pushmeet | @matejbalog
Chapters:
00:00 Pushmeet Kohli and Matej Balog Introduction
00:48 Origin of AlphaEvolve
02:31 AlphaEvolve's Progression from AlphaGo and AlphaTensor
08:02 The Open Problem of Matrix Multiplication Efficiency
11:18 How AlphaEvolve Evolves Code
14:43 Scaling and Predicting Iterations
16:52 Implications for Coding Agents
19:42 Overcoming Limits of Automated Evaluators
25:21 Are We At Self-Improving AI?
28:10 Effects on Scientific Discovery and Mathematics
31:50 Role of Human Scientists with AlphaEvolve
38:30 Making AlphaEvolve Broadly Accessible
40:18 Applying AlphaEvolve Within Google
41:39 Conclusion
Transcript
Hi, listeners, and welcome back to No Priors.
Today, we're joined by two of the key folks behind one of the most compelling developments
in AI this year, AlphaEvolve.
Pushmeet Kohli and Matej Balog worked on this autonomous coding agent that uses Gemini models
and evolutionary search to discover new algorithms.
It marks a major leap in AI's ability to contribute to core computer science and math,
and perhaps sciences beyond that.
It's not just a stochastic parrot
or a boilerplate generator; it has shown what you might consider technical creativity
in the way that Move 37 did with AlphaGo, something humans hadn't done before, even in thousands
of years of play. It might even be a real step on the path to self-improving AI.
Pushmeet, Matej, thank you so much for being here.
Thank you for having us.
It's a pleasure.
Congratulations on the success and the launch of AlphaEvolve.
Can you give me a brief description of what it is broadly?
Yeah, so in maybe one sentence, AlphaEvolve is an
AI coding agent that is able to discover new algorithms and make new discoveries
on open scientific problems. And at the same time, those algorithms can be so practical that they
are already deployed in key parts of Google's own infrastructure.
What is the origin story of working on this particular form of coding agent or this problem
statement?
So we are not new to this space of algorithm discovery. As you might know, the mission of all
of DeepMind is to build AI responsibly to benefit humanity and the way our particular team
has been doing it for years now is to look for ways that AI can discover new algorithms. Algorithms
are everywhere around us. So this is a very important question and can have very high impact
when we can discover algorithms that solve important computational problems with higher efficiency
than what we have been able to do so far. And kind of the first breakthrough we had in this space
was in 2022, when we released a system called AlphaTensor.
And so that was an AI system using reinforcement learning
that, for a very specific but fundamental computational task,
multiplying matrices, for the first time
showed that AI agents can discover better algorithms than what humans had been able to do before them.
So this was the first system that gave weight to this idea that indeed, with AI,
we will be able to go into this superhuman region of algorithms that we as humans have not been able to discover ourselves.
How do you differentiate AlphaEvolve from AlphaTensor and FunSearch and some other projects in the lineage of this?
One way to also describe what we have done is to look back at the history of DeepMind
and see a number of projects that came even before we started working on computer science,
our earlier work. If we go back to the project on AlphaGo, the AlphaGo
agent was able to beat the world Go champion in the game of Go.
And the remarkable thing about that agent was that it was able to explore this amazingly large
search space of all possible Go positions in such an
efficient manner that it could come up with the optimal move at that time, right?
And it really surprised people, both Go professionals as well as scientists. Scientists believed that
that event would come much, much later, because it was a very hard problem. And so what that
gave evidence for is the ability of these large-scale neural-network-based
systems to reason and do very efficient exploration in these large search
spaces and come up with amazing new insights about the particular domain.
And in the game of Go, I mean, there is this move called Move 37, which is a very creative
new move that the agent discovered that was not in the Go literature, right?
that really surprised the Go professionals.
So in some sense, we asked ourselves the question: if you have an agent which can do
very efficient search in the domain of Go, why can't you use the same kind of philosophy
to search for algorithms in the space of algorithms?
And in fact, that was the underlying basis of our first attempt
at that problem, which culminated in AlphaTensor.
So how we structured the algorithmic discovery problem is that we first looked at a very important
problem, and that problem was matrix multiplication.
It is a problem that is ubiquitous in computer science.
It's one of the key fundamental operators that underlies not only computer science
but also neural networks and machine learning and AI.
We said, can we find a way to improve matrix multiplication algorithms?
So there's a history of matrix multiplication, which is very interesting for people who
might be interested in it.
Even though it's such a fundamental operator, people thought that the complexity, or the time
it takes to multiply two matrices, is order N cubed.
And around 50 years back, more than 50 years back now, a German mathematician, Strassen, came up with this very counterintuitive construction, which showed that, in fact, the complexity was not N to the power three, or in other words not cubic, where N is the dimensionality of the matrix; it's lower.
And that was a very counterintuitive result.
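To make the construction concrete, here is a minimal sketch of Strassen's 2×2 scheme in Python. It computes a 2×2 matrix product with seven scalar multiplications instead of the naive eight; applied recursively to block matrices, that yields roughly O(n^2.807) instead of O(n^3).

```python
# Strassen's 2x2 scheme: seven multiplications (m1..m7) instead of eight.
def strassen_2x2(A, B):
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Recombine the seven products into the four entries of A @ B.
    return ((m1 + m4 - m5 + m7, m3 + m5),
            (m2 + m4, m1 - m2 + m3 + m6))

# Sanity check against the naive eight-multiplication result.
assert strassen_2x2(((1, 2), (3, 4)), ((5, 6), (7, 8))) == ((19, 22), (43, 50))
```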
And it stood for more than 50 years,
until AlphaTensor came along
and we said, well, can we actually improve this result?
And remarkably, we were able to show that we could.
AlphaTensor, by having this amazing ability
to do search in this very large space,
even much larger than the space of possible Go moves,
was able to come up with an amazing new algorithm which improved things.
But then the question was, well, we have now proved the thesis that
you can have these superintelligent agents which can
go beyond what human computer scientists have been able to do.
But can we generalize them?
AlphaTensor was very smart, but it was purpose-built for the matrix multiplication problem.
Can we build an agent that is more general, both in the sense that it can handle more
general problems, and in that it can search more naturally in the space of programs
rather than in the space of the very specific operations that were required for matrix
multiplication? And that was the origin of our first attempt at that, FunSearch,
which was an LLM-based agent, which for the first time, by searching in the space of programs,
showed that you can come up with completely new solutions, and made the first scientific
discovery with an LLM. And AlphaEvolve is basically an extension of that.
I'm very inspired by the idea, I think as many people are, that AI will actually have
creativity, does actually have technical creativity, as you are describing, as one way to conceptualize
this, where it's, you know, outside of the patterns that we already know as engineers. I want to go
back to some of the mechanics here and the limits to generalization and how to think about
automated evaluators and a lot of different topics. But, you know, when you think about these
problems that are clearly, like, economically valuable and interesting, like matrix multiplication,
the potential efficiency of it. What is your intuition for why, you know, those solutions have
not been found before? Is it simply that the search space is too large, or that people in this field
were complacent in that they believed a certain solution was, like, the maximum efficiency?
Because clearly there's value to be had here.
My opinion on this is basically that if you look at the structure of the algorithm, what Strassen
produced was quite ingenious. It was not a natural thing that you would think of,
and that was for only two-by-two matrices. As you go to larger sizes, the space is so
huge. The constructions are not something very natural; these are very involved
and intricate constructions that would be very hard to discover by chance.
So it's quite interesting that it has this very special structure, but it's not something
that comes naturally to a human computer scientist.
Just to add to that, so I definitely agree the search space is just unbelievably vast.
The solutions are maybe non-intuitive.
And the third thing I want to emphasize is that I really believe the people who worked on this in the past were definitely not complacent.
And in fact, the problems we chose to apply AlphaEvolve to in the first instance, both on the scientific side and the practical side,
we deliberately chose problems which have been worked on for a very long time by the very best people.
So on the scientific side, since we're talking about matrix multiplication, this has been a known open problem for decades,
and many people have been working on it.
And similarly for the practical applications that we mentioned in our AlphaEvolve release
in key parts of Google's infrastructure, again, these are things that have been heavily
optimized inside Google because they are so important.
And so by having a system like AlphaEvolve, or any other, discover something new on these
problems, I think that's as strong a demonstration as I can imagine of the fact that this is indeed
something that is new because no one found it before. And also it is something that was not
easy to discover because those results stood for such a long time and have been worked on by
such strong people. Noted: this is not a comment on the broad efforts of the computer
science industry to date on matrix multiplication or data center optimization. I think this is a
good moment to try to demystify what's happening under the hood for a broader set of people. Can you walk us
through a concrete example of how AlphaEvolve actually evolves code? Say, let's take the
example of trying to optimize data center scheduling, right? What does the step-by-step process
look like from initial random code to final solution that saves millions of dollars of power?
I can walk you through that. So the user of a system like AlphaEvolve, they basically specify
what is the problem that they are trying to solve. So that's the most important thing. And you
specify it by providing what is called an evaluation function. What this function does is
whenever there is a proposed solution for solving the problem, you're able to tell how good this
solution is. So you basically define what makes a good solution. For discovering an algorithm for
scheduling jobs on a data center, this evaluation function could be something like a simulator of
jobs in a data center. Given an algorithm for doing the scheduling, it simulates how good this
algorithm is. So that's what the user provides. And this is a simulator you already had.
Yes. So that's a simulator that we already had. And I would say it's something that is
quite natural to have in many domains, because whenever you want to innovate on
something, you need to have a way of telling, okay, is the innovation actually good or not?
So it's a very natural object to have at least in principle. So you define the what by providing
the evaluation function. And then AlphaEvolve fills in the how.
That's the job of our system.
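To make that concrete, an evaluation function for the scheduling example might look something like the sketch below. The simulator, its interface, and all names here are illustrative assumptions for the sake of the example, not AlphaEvolve's actual API.

```python
# Hypothetical sketch: an evaluation function for a job-scheduling problem.
import random

def simulate_datacenter(schedule_fn, jobs, num_machines):
    """Toy simulator: assign each job to a machine and measure the makespan."""
    machines = [0.0] * num_machines          # time already committed per machine
    for job in jobs:
        m = schedule_fn(job, machines)       # candidate algorithm decides placement
        machines[m] += job                   # job duration accrues on that machine
    return max(machines)                     # makespan: lower is better

def evaluate(schedule_fn):
    """Score a candidate scheduling algorithm; higher is better."""
    random.seed(0)                           # fixed workload for fair comparison
    jobs = [random.uniform(0.1, 1.0) for _ in range(500)]
    makespan = simulate_datacenter(schedule_fn, jobs, num_machines=20)
    return -makespan                         # negate so "higher score = better"

# A trivial baseline candidate: always pick the least-loaded machine.
def greedy(job, machines):
    return min(range(len(machines)), key=lambda m: machines[m])

print(evaluate(greedy))
```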
And you can do it in two fairly different ways.
One is you tell AlphaEvolve, I have no idea how to solve this problem.
Let's start completely from scratch.
And let's try to be creative and come up with something completely new.
So that's one option you can take.
Another option you can take is, actually, we have already worked on this problem for a really
long time.
Here is a very strong initial solution that we can provide to the system.
And you can start from here.
And that's what we did for the application to discovering new algorithms
for scheduling jobs in a data center.
So AlphaEvolve takes this initial solution.
And then, on a high level,
it combines the creative power of large language models,
which propose creative new ways to improve that solution,
with the strictness of the evaluation function provided by the user,
which is able to actually filter out the things that work
from the ones that don't.
And then this is wrapped inside an evolutionary algorithm
that makes sure that we can explore the whole space
of algorithms in that region, so that we don't commit to a very specific type of solution
early on, but instead we maintain a diverse pool of potential solutions.
Over time, maybe we combine ideas from different solutions that are already strong,
until we actually have an algorithm that's so strong that we are happy to deploy it
to a critical part of Google's infrastructure, let's say.
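On a high level, the loop just described might be sketched like this, where `llm_propose` stands in for the model call that rewrites parent programs and `evaluate` is the user-supplied evaluation function. This is a simplified assumption about the real system, not its actual implementation; the propose-evaluate-select skeleton is the core idea described above.

```python
# Hypothetical sketch of the evolutionary loop described above.
import random

def evolve(initial_program, evaluate, llm_propose, generations=100, pool_size=20):
    # Maintain a pool of (score, program) candidates.
    pool = [(evaluate(initial_program), initial_program)]
    for _ in range(generations):
        # Sample parents from the pool, which preserves some diversity.
        parents = random.sample(pool, k=min(2, len(pool)))
        # Ask the LLM to combine/mutate the parents into a new candidate.
        child = llm_propose([program for _, program in parents])
        score = evaluate(child)              # strict filter: does it actually work?
        pool.append((score, child))
        pool.sort(key=lambda sp: sp[0], reverse=True)
        del pool[pool_size:]                 # keep only the strongest candidates
    return pool[0]                           # best (score, program) found
```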
And intuitively, not in the machine learning sense, but in the evolution sense,
you have different generations where you're getting closer to an
optimal solution.
Yeah, that's right. Like you would expect that in each iteration of evolution,
what you're doing is you are looking at the previous iteration, looking at maybe the strongest
solutions you have, and then trying to be creative about how can I combine ideas from those
solutions or maybe bring in completely new ideas to come up with something even better. And so,
yes, each generation gets stronger and stronger.
How much scaling are we talking about? Like,
is there a way to predict how many generations it takes, or how do you, you know, constrain the
number of iterations that the model can use?
So there are two parts to your question.
One is about, okay, how does scaling work and then how can you predict it?
So for the first part, this is actually a really nice feature of AlphaEvolve that it can adapt
to the difficulty of the problem.
If you ask AlphaEvolve to find a solution to a problem that's actually unexpectedly easy,
then it will just do it very, very quickly.
Like almost immediately you will have the solution.
But if you ask it a problem that's really, really difficult,
And by really, really difficult, I mean like really difficult, maybe an open question that has stood for decades in the sciences or you want the practical algorithm for a really high value application in Google.
Then you would of course expect this is not an easy problem.
You might need to spend longer time considering different solutions, exploring the space, combining ideas.
But what's really nice about AlphaEvolve is that it is able to sustain this scaling in a way that it keeps improving over time,
and it keeps improving for so long that you can make discoveries on this level of difficulty,
like breaking decades-old scientific challenges or discovering high-value algorithms.
Now, I know it maybe sounds trivial that if you wait longer, you get better results,
but in practice, it's actually a really difficult thing to build automated agents
that are able to sustain this continual improvement without plateauing quite early.
This is, I think, a nice feature.
There was a second part of the question about predicting how many iterations you will need.
So that is something that is actually not so easy because it's like asking a priori, do you know how difficult this question is going to be?
And especially in the sciences, that's something that often has a very surprising answer.
Very trivial questions can turn out to be extremely, extremely difficult and vice versa.
But the nice thing is that you have continual improvement if you run this system.
And as long as you can run it, you can expect to get better and better
results, and you just have to see where this gets you.
If you think about the coding agents that general developers have access to and are increasingly
using today, one frustration with them is that, on relatively trivial problems they are set out to
do autonomously, they will get lost and blow themselves up, or plateau, as you said, in frustrating
ways. Can you talk about whether you think there are implications from AlphaEvolve for these other
general coding agents?
While large language models and coding agents are getting much better in
their understanding of code, they're not perfect, right? So they do make mistakes. The other sort
of element is to think about what is the task that these agents have been assigned. Mostly,
if you are asking an agent to solve a particular task or write a particular program, you are
providing a specification. You are specifying the task either in natural language or you're
saying, well, I'm trying to do something like this, right? So it's not a complete characterization
of what you want. It's a partial specification of what you want, and the agent then tries to
solve the problem. It might get lucky and get the right result, or it might
hallucinate and get the wrong result. And the issue is, how do you know whether the result
is right or wrong? That depends on having a good evaluator. That's how AlphaEvolve solves the
problem. So in some sense, we are able to leverage the hallucinations for a beneficial purpose, right?
So the creativity and the wrong answers that AlphaEvolve can somehow come up with,
how do we know that they're wrong?
They might be very good.
We just don't see them in that way.
Which is why the role of the evaluator is really important.
And how we even do the evaluation is very important.
Because when you come up with a new idea, should you try to explore that idea much further?
How deep should you go
into stress-testing that idea?
Should you try that idea out on a few different instances or sort of a thousand different
instances or really stress test that the idea actually works for the whole thing?
This is one of the interesting parts of Alpha Evolve.
Getting that balance right is really important, so that you can look at where the creative
solutions are, how you can filter out the ones that are promising, and then use them later
to refine the search process to get the final solution.
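One way to picture that balance is a cascaded evaluation: screen each idea cheaply on a few instances and spend the expensive stress test only on survivors. The sketch below is an illustration of that general idea, with arbitrary thresholds and interfaces, not the system's actual mechanism.

```python
# Illustrative sketch of a cascaded evaluator: cheap screening first,
# expensive stress-testing only for promising candidates.
def cascaded_evaluate(candidate, run_instance, cheap_instances, hard_instances,
                      screen_threshold=0.5):
    # Stage 1: a handful of cheap instances filters out obvious failures.
    cheap_scores = [run_instance(candidate, x) for x in cheap_instances]
    if sum(cheap_scores) / len(cheap_scores) < screen_threshold:
        return None                          # rejected early, little compute spent
    # Stage 2: full stress test on many/harder instances for survivors.
    hard_scores = [run_instance(candidate, x) for x in hard_instances]
    return min(hard_scores)                  # worst case matters for deployment
```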
If evaluation functions, automated evaluators are really like such a limiting constraint here
in terms of what we can get agents to do, any intuition from this project or others on how to
overcome that, like can models get good at helping us create automated evaluators? Should we
imagine simulators that are better for lots of different domains?
If I'm, you know, a lame product manager putting an incomplete natural-language spec into a
coding agent, should I work with an assistant to, like, complete that spec?
Do I use traces?
How do you think that gets solved?
That's a really, really great question.
And I think you can view it from two perspectives that I think will happen at the same time.
So one is that, yes, currently the strict evaluation function plays a key role
in AlphaEvolve.
And one takeaway you can take from this thinking about the future is that it shows the really
high value of having these evaluators available.
Because in many cases, it might be that you have a really important problem, but you don't
actually have a very precise definition of what makes for a good solution.
And one takeaway you can have from a system like this is that if you actually do build
a very precise evaluation function, then this unlocks the possibility of having an agent
like AlphaEvolve discover something that's way beyond what, let's say, humans have been able
to discover, or your best developers have been able to discover.
So that's one takeaway.
But the other takeaway that I'm maybe even more excited about from the research perspective
is that we don't actually think this is a conceptual limitation.
So today, this was maybe the easiest way to get into this game of discovering new
things: looking at problems that already come with these very precise evaluation functions.
So that's just a natural first step to take.
But I do believe that this assumption can be relaxed in very significant ways.
And in particular, you already mentioned one example where maybe language models themselves
will be able to evaluate whether proposed solutions look promising or not or whether they fail
in some particular ways.
And indeed, there is parallel work from DeepMind as well, called AI co-scientist,
which demonstrates this very clearly: if you propose ideas in natural
language, then you can get language models to provide meaningful critiques and identify
the ones that work from the ones that don't. So I really do see a lot of hope on relaxing
this assumption. And then even in between these two extremes of strict evaluation that
exactly tells you how good a solution is on one end and then natural language evaluation
by a language model on the other end, there is a continuous spectrum of simulators and auxiliary
evaluation functions, which are maybe not perfect, but as long as they are correlated with the true
signal, then we can build the algorithmic scaffolding of the evolutionary algorithm around this
in such a way that we still make meaningful progress. And maybe it will take a few more iterations,
but we can still go really, really far.
So just to add to what Matej mentioned, I think one of the takeaways is basically that LLM-based agents like AlphaEvolve, especially
when we structure them in this way
with population-based search,
with these evolutionary approaches,
they are extremely effective in searching. They can search
very convincingly and very effectively in very large
spaces and come up with very counterintuitive new solutions for
important problems, problems that we have studied for many, many
years, and in some cases decades. So that's one.
The other element is the evaluator. As Matej mentioned, there is work on
using other sources for evaluation, so you don't need
the perfect evaluator. Even for AlphaEvolve, even if you have a simulator, that's not a perfect
evaluator, right? Because you are going to evaluate things on a specific distribution of
problem instances. You might want to prove certain properties of the solution,
right? You might want to say that the solution always has certain performance. So if you want
to prove certain properties of the solution, that might require other work, right? You
might have to have a proof agent which tries to prove certain properties of the solution,
while on the other hand, you have these LLM-based evaluators, which can look at a solution
for which nobody has built a simulator, and just make a guess at how good
that solution is.
And in fact, that approach also works very well.
And we have shown that this AI co-scientist, which we have used for hypothesis generation,
basically uses a multi-agent setup where LLMs themselves are able to figure
out that certain hypotheses are better in terms of novelty and significance and impact, and
should be propagated.
And that whole process ends up, and this might be surprising and counterintuitive to
some, producing much, much, much better results than the base large language model.
So you are really able to discover new information beyond what the large language model itself
alone was able to produce.
That begs the question, which I think is one of the biggest meta questions posed
by this sort of work, which is, do we get self-improving AI, right?
One of the things you demonstrated with AlphaEvolve is you can optimize the systems used
to train AlphaEvolve, right?
So you have this, you know, 23% speedup in part of the training infrastructure, if I recall
correctly. Are we now witnessing the early stages of recursive self-improvement in AI, and, you know,
what do you think the implications are, if that's true?
I think in some senses, sort of yes, but at the moment, what we have seen is basically
improvements in computation time.
So what AlphaEvolve has been able to do is basically make training more efficient.
But you can ask the question: can you improve the training
process such that the underlying model is not only
trained faster, but is actually fundamentally better at certain cognitive tasks? And that is something
that has to be validated still, right? But it is a direction that is definitely very appealing and
something that is being actively looked at by many people.
Do you have a reason to believe it won't work?
It should work. But as we mentioned, having good evaluators is an
important element, right? You need an evaluator which can say: this proposal that you
have just suggested for improving the training process will yield a good result. If you
have that kind of evaluator, then it will work. There is no reason why such an evaluator cannot
exist, but we need to work on building such evaluation functions. Maybe just
one thing to add to it is that I would also agree that we are maybe seeing the first sign of
self-improvement, but one also needs to be very specific about what we have shown so far.
As Pushmeet mentioned, it's speeding up the training of the next generation of the
Gemini model. So the feedback loop is fairly long, at least currently, maybe on the order of months.
But you can call it self-improvement, for sure. Maybe the big question that many people
are curious about is how this extrapolates into the future. And you can
have different types of self-improvement. One is where you get maybe just a one-off benefit.
Like the model improves itself once and that's it. Another one is okay, the model keeps improving
itself continuously, but maybe the improvements get marginally smaller and smaller and smaller
and you converge to some limit. Or maybe the improvements will keep accumulating up and up and up.
And that's a big open question that we don't have an answer to today.
Let's take that projection to other fields, and obviously these are all interrelated.
But one of the things you're really excited about is just how AI applies to these sciences.
When you think about new mathematical constructions, improved solutions to, you know, open problems or problems that looked solved to humanity 50 years ago, what do you think the implication is in different fields?
Like, is it a fundamental shift in how scientific discovery or mathematics
gets done?
First of all, yes, I'm super excited to be working in this area of using AI to accelerate
the sciences, because in a way, it's the most exciting application of AI that I can imagine. Like,
what could be more valuable or exciting than advancing the frontiers of human knowledge? So, yes,
that is definitely there. And then, of course, in different fields of science, the speed of progress
or the advance you get from AI might be slightly different. So in AlphaEvolve,
we've primarily focused on mathematics and computer science because these are the domains where
it's the easiest to get these automated evaluation functions.
Like you often get them basically for free.
That's not to say that you cannot get them in other branches of science, but in maths and computer
science, they're just most common.
If you think about biology or chemistry, say you want to design a molecule, then you can have an evaluation
function again in the form of a simulator or a predictive model that given a candidate molecule
will make a meaningful prediction about, okay, is this actually going to work in practice?
And then if you are in this regime, then again, Alpha Evolve would be applicable.
And we are only talking about the version of Alpha Evolve that we have built today.
And these are problems that we can address today.
But we don't think that the journey of AlphaEvolve finishes here.
We have many ideas about how to make this system more powerful and more broadly applicable.
And I'm fairly confident that we will see many applications across many branches of science.
And then this is only talking about AlphaEvolve.
There are many other agents, Pushmeet mentioned AI co-scientist, and many others that I'm sure will keep transforming how science is being done across the whole spectrum.
Yeah, so I think broadly, if you look at it,
a lot of science involves searching, right?
Searching for the right idea, searching for the right construction,
searching for the right solution, the right drug candidate, and so on.
And in some sense, what scientists have been trying to do is somehow
make that process repeatable, right?
At the moment, there is still an element of
serendipity to some of the discoveries, but as we
move towards rational material discovery or rational drug discovery, you are
seeing computational approaches and very systematic evaluations playing a much more important role in
many areas of science. And I think as that work propagates, you will have systems like
AlphaEvolve which will be able to search in those spaces and use these evaluations much more effectively.
So you can see this as a tool that will give scientists a superpower in their ability to search over very complex and sometimes counterintuitive solution spaces.
When I think about one logical extension to this approach, it is, let's say, like automated evaluation in the real world, right?
So lab assays, you know, a bunch of robotic arms doing experimentation if you're screening molecules or something.
What do you think the role of the human scientist or engineer is, let's just say very near term, if that vision is true?
Is it the problem framing, like determining the evaluation?
Is it constraining, like giving some intuition for a starting point or a search space?
Like, what should the human scientist be good at from here?
There are many sort of elements, right?
First of all, as we have been talking about a lot, the role of the evaluation function, right?
So that needs to be defined.
Like, what do we really, how do we want to assess these solutions?
But then there are many other sort of elements as well, right?
When we are trying to find a solution, it has to have certain properties.
What are those properties, right?
Giving hints, for example. If you're trying to discover a new drug, you want to make sure
that that drug treats the disease but does not kill the patient, right? Its side effects
are low. Or, what is the delivery mechanism for it? So there are so many different
requirements that a solution might need to satisfy.
And some of them are encoded in the evaluation function.
And some of them, you might want to hard-constrain in the solution, right?
And so can you specify those so that an agent like AlphaEvolve can take them into account
while it is thinking about how it explores the search space, or how it constructs the solutions
that it will generate?
These are all very interesting places where human input might be
required, especially as we look at many different types of domains. So yeah, I think we should
definitely see this as an amazing tool for scientists, for computer scientists, mathematicians.
And this has, in fact, been our experience as well: in the right hands,
it is a very powerful tool, right? So mathematicians who have tried to explore it,
and who have been able to specify
what types of solutions they're looking for,
can be much more productive and much more effective in finding the solutions.
I just wanted to highlight that even though we have been describing AlphaEvolve as this kind
of autonomous agent that does things on its own, actually in practice using this agent
often turns out to be surprisingly collaborative.
And we have seen this in particular with mathematicians that we have collaborated with.
And there are a few reasons for this.
But one is that AlphaEvolve is an agent that doesn't just give you the solution.
It searches for an algorithm that constructs that solution.
And so depending on how you set up your problem definition,
often it's actually the algorithm that's even more valuable than the solution itself.
Because the algorithm, it tells you how to construct the solution.
So that means you understand what are the ideas that go into building that solution.
And maybe especially or definitely it's true in mathematics,
that's what people really care about,
to understand the nature of our universe and build up the understanding of fundamental ideas.
And so it's actually often not interesting almost at all what the solution is,
but what you care about is how you build it.
And so we had first-hand experience collaborating with
multiple mathematicians, and it's been really fascinating. We would share with them
the output from AlphaEvolve, and they'd be really fascinated looking at the code that it found,
trying to understand, okay, what is it actually doing, and then understanding, oh, okay, this is doing
this, this is doing that, and now I can see why, if you put it together, it leads to a really good
solution.
Yeah, I can also confirm from my own personal experience that looking at the code or the
algorithms that the system finds. It's often a really interesting experience because it's code
that kind of looks human-like. It's something that you could have written, but would you have
thought of writing it in exactly this way? And then trying to understand, okay, what exactly
is it doing? That's a really interesting experience. But at the same time, it's one of the
key strengths of the system, not only for scientific applications where you can look at the code
and get some understanding out of it,
but also for many of the practical applications,
it's hugely valuable that the artifact you get out of AlphaEvolve
is a piece of code, and then you deploy that piece of code.
And so before you do that, experts, engineers who have worked on that system
can visually inspect that piece of code, understand it,
and make the final decision of whether it's going to be deployed.
So it's in a completely different league from, let's say,
considering using a neural network to make decisions in some production system where you kind of need to
trust that the neural network is going to always behave in the way that you hope it will. With the
code, you can look at it, understand it, and make the decision yourself. I might add that
basically not all code is interpretable by humans, right? But the solutions and the programs that
AlphaEvolve finds are interpretable by human programmers. So this
is going to be a very interesting area of work in the future as to when you find these
solutions, what can we learn from them? As Matej was
mentioning, this was a very interesting experience that we had working with Jordan Ellenberg
on the earlier version of AlphaEvolve, when we were working on the cap set problem.
The programs that it discovered had very interesting symmetries that mathematicians did not
know about. And so not only was the solution mathematically interesting, the actual
construction, but the algorithm for producing that construction, the structure of it,
was interesting in itself.
For listeners who are thinking about accessibility, or implications
for themselves when they're not professional mathematicians collaborating with AlphaEvolve,
what are the considerations in making some of these capabilities more broadly available?
We want to make these capabilities accessible to as many people as we can, to the wider community.
Now, we have started a trusted tester program where we have asked people to submit proposals.
And what we intend to do with that program is to figure out what are the right ways in which people
can really leverage AlphaEvolve.
So we have used it internally across Google,
but as you know, it requires certain things,
such as the need for an evaluation function.
As part of the trusted tester program,
we are going to be evaluating AlphaEvolve
on a bunch of different types of applications,
and that will inform our future release strategy
as to how we make it more broadly applicable.
The second element is that
not only do you need the evaluator, but you also need a significant amount of computational
resources, right? Because it's not just one single LLM call. It requires a significant amount
of function evaluation, depending on the difficulty of the problem. If it's an easy problem,
then you can do it very quickly. But if you really are going for some very hard problems with
a very large extended search space, and you want to spend a significant amount of time searching
over it, then how do you build the overall system so that people can use it effectively
and efficiently? That's the other thing that we'll be thinking about.
Last question for you both. Is there a practical application within Google that you think will
be interesting that you haven't tried AlphaEvolve on yet?
In this white paper, we tried to think
holistically: when we look at the computational infrastructure of Google, what are the
key parts in this infrastructure where we can demonstrate that AlphaEvolve can make discoveries across
the stack, not only in one part of it, and that it can make discoveries that are highly valuable.
And so we try to cover the entire spectrum. So we show that AlphaEvolve can improve the efficiency
of the data center, it can contribute to hardware design, and it can contribute to improving
the efficiency of the most important pieces of software that are being run inside Google. And one intention
here was to demonstrate that this is a really versatile tool that you can apply across the
spectrum. And as Pushmeet was saying, this is a tool that is already available inside Google,
and it is being used for many, many problems. There are quite a few exciting ones. I'm not ready
to share the particulars yet, but as you can imagine, there are so many exciting computational
problems in a place like Google, within AI and also outside, that, yeah, I'm sure
there will be many, many really cool results coming in the future.
I think that's a great note to end on. Pushmeet, Matej, anything we didn't cover?
No, I think that was great.
Thank you guys so much for being here. Congrats.
Okay, great.
Thank you very much.
Find us on Twitter at NoPriorsPod.
Subscribe to our YouTube channel if you want to see our faces, follow the show on Apple Podcasts, Spotify, or wherever you listen.
That way you get a new episode every week.
And sign up for emails or find transcripts for every episode at no-priors.com.