No Priors: Artificial Intelligence | Technology | Startups - Meet AlphaEvolve: The Autonomous Agent That Discovers Algorithms Better Than Humans With Google DeepMind’s Pushmeet Kohli and Matej Balog

Episode Date: June 26, 2025

Much of the scientific process involves searching. But rather than continue to rely on the luck of discovery, Google DeepMind has engineered a more efficient AI agent that mines complex spaces to facilitate scientific breakthroughs. Sarah Guo speaks with Pushmeet Kohli, VP of Science and Strategic Initiatives, and research scientist Matej Balog at Google DeepMind about AlphaEvolve, an autonomous coding agent they developed that finds new algorithms through evolutionary search. Pushmeet and Matej talk about how AlphaEvolve tackles the problem of matrix multiplication efficiency, scaling and iteration in problem solving, and whether or not this means we are at self-improving AI. Together, they also explore the implications AlphaEvolve has for sciences beyond mathematics and computer science. Sign up for new podcasts every week. Email feedback to show@no-priors.com. Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @pushmeet | @matejbalog

Chapters:
00:00 Pushmeet Kohli and Matej Balog Introduction
00:48 Origin of AlphaEvolve
02:31 AlphaEvolve's Progression from AlphaGo and AlphaTensor
08:02 The Open Problem of Matrix Multiplication Efficiency
11:18 How AlphaEvolve Evolves Code
14:43 Scaling and Predicting Iterations
16:52 Implications for Coding Agents
19:42 Overcoming Limits of Automated Evaluators
25:21 Are We At Self-Improving AI?
28:10 Effects on Scientific Discovery and Mathematics
31:50 Role of Human Scientists with AlphaEvolve
38:30 Making AlphaEvolve Broadly Accessible
40:18 Applying AlphaEvolve Within Google
41:39 Conclusion

Transcript
Starting point is 00:00:00 Hi, listeners, and welcome back to No Priors. Today, we're joined by two of the key folks behind one of the most compelling developments in AI this year, AlphaEvolve. Pushmeet Kohli and Matej Balog worked on this autonomous coding agent that uses Gemini models and evolutionary search to discover new algorithms. It marks a major leap in AI's ability to contribute to core computer science and math, and perhaps sciences beyond that. It's not just a stochastic parrot
Starting point is 00:00:30 or a boilerplate generator, it has shown what you might consider technical creativity in the way that Move 37 did with AlphaGo, something humans hadn't done before, even in thousands of years of play. It might even be a real step on the path to self-improving AI. Pushmeet, Matej, thank you so much for being here. Thank you for having us. It's a pleasure. Congratulations on the success and the launch of AlphaEvolve. Can you give me a brief description of what it is broadly?
Starting point is 00:00:56 Yeah, so in maybe one sentence, Alpha Evolve is an Aalphalve. A.I. Coding agent that is able to discover new algorithms that are able to make new discoveries on open scientific problems. And at the same time, those algorithms can be so practical that they are already deployed in key parts of Google's own infrastructure. What is the origin story of working on this particular form of coding agent or this problem statement? So we are not new to this space of algorithm discovery. As you might know, the mission of all of DeepMind is to build AI responsibly to benefit humanity and the way our particular team
Starting point is 00:01:33 has been doing it for years now is to look for ways how AI can discover new algorithms. New algorithms are everywhere around us. So this is a very important question and can have very high impact when we can discover algorithms that solve important computational problems with higher efficiency than what we have been able to do so far. And kind of the first breakthrough we had in this space was in 2022 when we released a system called Alpha Tensor. And so that was a system that was an AI system using reinforcement learning that for a very specific but fundamental computational tasks, so multiplying matrices, for the first time,
Starting point is 00:02:12 showed that AI agents can discover better algorithms than what humans had been able to do before them. So this was the first system that gave weight to this idea that indeed with AI, will be able to go into this superhuman region of algorithms that we as humans have not been able to discover ourselves. How do you differentiate Alpha Evolve from like Alpha Tensor and Fun Search and some other projects in the sort of lineage of this? One way to also describe what we have done is if you look back at the history of Deep Mind and see a number of sort of projects that have come even before we started working on computer science. our earlier sort of work, and if we go back to the project on AlphaGo, where the AlphaGo agent was able to beat the World Go champion in the Game of Go.
Starting point is 00:03:08 And the remarkable sort of thing in that agent was that it was able to explore this amazingly large search space of all possible sort of Go positions in a situation. efficient manner that it can sort of come up with what is the optimal move at that time right and really surprised people both go professionals as well as scientists scientists believe that that event would come much much later because it was a very hard problem and so what that gave evidence for is that is the ability of these large-scale neural networks based systems to be able to reason and do very efficient exploration in these large search spaces and come up with amazing new insights about the particular domain.
Starting point is 00:04:07 And in the game of Go, I mean, there is this move called Move 37, which is a very creative new move that the agent discovered that was not in the Go literature, right? that really surprised the Goh professionals. So in some sense, we asked ourselves the question that if you have an agent which can do very efficient search in the domain of Go, why can't you use the same kind of philosophy to search for algorithms in the space of algorithms? And in fact, that sort of was the underlying basis of the work on our first sort of attempt at that problem, which culminated in alpha-tensor.
Starting point is 00:04:53 So how we structured the algorithmic discovery problem is we looked at first a very important problem, and that problem was matrix multiplication. It is a problem that is ubiquitous in computer science. It's one of the key fundamental operators that underlies not only computer science, was also neural networks and machine learning and AI. We said, can we find a way to improve matrix multiplication algorithms? So there's a history of metric multiplication, which is very interesting for people who might be interested in it.
Starting point is 00:05:33 Even though it's such a fundamental operator, people thought that the complexity or the time it takes to multiply two matrices is order cube. And around 50 years back, more than 50 years back now, a German mathematician Strasson came up with this very counterintuitive construction, which showed that, in fact, the complexity was not N to the par three, or what's not cubic, where N is the sort of dimensionality of the matrix, it's lower. And so, and that was a very counterintuitive sort of result. and but it stayed for more than 50 years and until sort of alpha-tensor came up and we said, well, can we actually improve this result? And remarkably, we were able to show that.
Starting point is 00:06:24 That alpha-tensor, while by having this amazing ability to do search in this very large space, even much larger than the space of possible go-moves, was able to come up with this amazingly new, algorithm which improve things. But then the question was, well, we now have proved the thesis that
Starting point is 00:06:47 you have these super intelligent agent which can go beyond what human computer scientists have been able to do. But can we generalize them? This is sort of alpha tensor was very smart but was only
Starting point is 00:07:03 purposefully constructed for the matrix multiplication problem. Can we have build a agent that is more general, both more general in the sense of it can handle more general problems, but can also search in the space more naturally in the space of programs rather than in the space of very specific sort of operations that were required for matrix modification. And that was the origin of sort of the first attempt of us with Fun Search, which was an LLM-based agent, which for the first time, by searching in the space of programs,
Starting point is 00:07:47 showed that you can come up with completely new solutions, and made the first scientific discovery from NLM. And AlphaEvol is basically an extension. of that. I'm very inspired by the idea, I think as many people are, that AI will actually have creativity, does actually have technical creativity, as you are describing as one way to conceptualize this, where you're, you know, outside of the patterns that we already know as engineers. I want to go back to some of the mechanics here and the limits to generalization and how to think about automated evaluators and a lot of different topics. But, you know, when you think about these problems that are clearly, like, economically valuable and interesting, like matrix multiplication,
Starting point is 00:08:37 the potential efficiency of it. What is your intuition for why, you know, those solutions have not been found before? Is it simply like the search space is too large or people in this field were complacent in that they believed a certain solution was like the maximum efficiency or because clearly there's value to be had here? of opinion on this is basically that if you look at the structure of the algorithm what Strassen produced was quite sort of ingenious. It was not a natural thing that you would sort of think of and that is that was for only two by two matrices. As you sort of go to larger sizes, the space is so huge. The constructions are not sort of something which is very natural. These are very involved
Starting point is 00:09:29 and intricate sort of constructions that would be very hard to discover by chance. So it's quite interesting that it has this very special structure, but it's not something that comes naturally to a human computer scientist. Just to add to that, so I definitely agree the search space is just unbelievably vast. The solutions are maybe non-intuitive. And the third thing I want to emphasize is that I really believe the people who worked on this in the past were definitely not complacent. And in fact, the problems we chose to apply alpha evolved to in the first instance, both on the scientific side and the practical side, we deliberately chose problems which have been worked on for a very long time by the very best people.
Starting point is 00:10:18 So on the scientific side, since we're talking about matrix multiplication, this has been a known open problem for decades. and then many people have been working on it. And similarly for the practical applications that we mentioned in our Alpha-Evolve release in key parts of Google's infrastructure, again, these are things that have been heavily optimized inside Google because they are so important. And so by having a system like Alpha-Evolve or any other discover something new on these problems, I think that's as strong a demonstration as I can imagine of the fact that this is indeed something that is new because no one found it before. And also it is something that was not
Starting point is 00:11:00 easy to discover because those results stood for such a long time and have been worked on by such strong people. I noted that this is not a comment on the broad efforts of the computer science industry to date on matrix multiplication or data center optimization. I think this is a good moment to try to demystify what's happening under the hood for a broader set of people. Can you walk us through a concrete example of how Alpha-Evolve actually evolves code and say, let's take the example of trying to optimize data center scheduling, right? What does the step-by-step process look like from initial random code to final solution that saves millions of dollars of power? I can look you through that. So the user of a system like Alpha-Evolve, they basically specify
Starting point is 00:11:48 what is the problem that they are trying to solve. So that's the most important thing. And you specify it by providing what is called an evaluation function. What this function does is whenever there is a proposed solution for solving the problem, you're able to tell how good this solution is. So you basically define what makes a good solution. For discovering an algorithm for scheduling jobs on a data center, this evaluation function could be something like a simulator of jobs in a data center. Given an algorithm for doing the scheduling, it simulates how good this algorithm is. So that's what the user provides. And this is a simulator you already had. Yes. So that's a simulator that we already had. And I would say it's something that is
Starting point is 00:12:30 quite natural to have in many, in many domains, because whenever you want to innovate on something, you need to have a way of telling, okay, is the innovation actually good or not? So it's a very natural object to have at least in principle. So you define the what by providing the evaluation function. And then alpha evolve fills in the how. So that, that, that, That's the job of our system. And you can do it in two fairly different ways. One is you tell AlphaEval, I have no idea how to solve this problem. Let's start completely from scratch.
Starting point is 00:13:01 And let's try to be creative and come up with something completely new. So that's one option you can take. Another option you can take is, actually, we have already worked on this problem for a really long time. Here is a very strong initial solution that we can provide to the system. And you can start from here. And that's what we did for the application to discovering new algorithms. for scheduling jobs in a data center.
Starting point is 00:13:24 So AlphaEvolve takes this initial solution. And then on a high level, it combines the creative power of large language models to propose creative new ways how to improve that solution, the strictness of the evaluation function provided by the user that is able to actually filter out the things that work from the ones that don't. And then this is wrapped inside an evolutionary algorithm
Starting point is 00:13:47 that makes sure that we can discover the whole space, of algorithms in that region so that we don't commit to a very specific type of solution early on, but instead we maintain a diverse pool of potential solutions. Over time, maybe we combine ideas from different solutions that are already strong until we actually have an algorithm that's so strong that we are happy to deploy it to a critical part of Google's infrastructure, let's see. And intuitively, not in the machine learning sense, but in the evolution sense, you have different generations where you're getting closer to an
Starting point is 00:14:20 optimal solution. Yeah, that's right. Like you would expect that in each iteration of evolution, what you're doing is you are looking at the previous iteration, looking at maybe the strongest solutions you have, and then trying to be creative about how can I combine ideas from those solutions or maybe bring in completely new ideas to come up with something even better. And so, yes, each generation gets stronger and stronger. How much scaling are we talking about? Like, is there a way to predict how many generations it takes, or how do you, you know, constrain the, number of iterations that the model can use. So there are two parts to your question.
Starting point is 00:14:55 One is about, okay, how does scaling work and then how can you predict it? So for the first part, this is actually a really nice feature of AlphaEvolve that it can adapt to the difficulty of the problem. If you ask AlphaEvolve to find a solution to a problem that's actually unexpectedly easy, then it will just do it very, very quickly. Like almost immediately you will have the solution. But if you ask it a problem that's really, really difficult, And by really, really difficult, I mean like really difficult, maybe an open question that has stood for decades in the sciences or you want the practical algorithm for a really high value application in Google.
Starting point is 00:15:33 Then you would of course expect this is not an easy problem. You might need to spend longer time considering different solutions, exploring the space, combining ideas. But what's really nice about AlphaEvolve is that it is able to sustain this scaling in a way that it keeps improving over time. and it keeps improving for so long that you can make discoveries on this level of difficulty, like breaking decades-old scientific challenges or discovering high-value algorithms. Now, I know it maybe sounds trivial that if you wait longer, you get better results, but in practice, that's actually like a really difficult thing to build automated agents that are able to sustain this continual improvement without plateauing quite early.
Starting point is 00:16:17 This is, I think, a nice feature. There was a second part of the question about predicting how many iterations you will need. So that is something that is actually not so easy because it's like asking a priori, do you know how difficult this question is going to be? And especially in the sciences, that's something that often has a very surprising answer. Very trivial questions can turn out to be extremely, extremely difficult and vice versa. But the nice thing is that you have continual improvement if you run this system. And as long as you can run it, you can expect to get better and better. the results, and you just have to see where this gets you.
Starting point is 00:16:52 If you think about the coding agents that general developers have access to and are increasingly using today, one frustration with them is on relatively trivial problems. It is set out to do autonomously. It will get lost and blow itself up, or plateau, as you said, in frustrating ways. Can you talk about if you think there are implications from AlphaVov to these other general coding agents? While large segment models and coding agents are getting much better in their understanding of code, they're not perfect, right? So they do make mistakes. The other sort of element is to think about what is the task that these agents have been assigned. Mostly, if you are asking an agent to solve a particular task or write a particular program, you are
Starting point is 00:17:48 providing a specification. You are specifying the tasks either in natural language or you're saying, well, I'm trying to do something completed. Right. So it's not a complete characterization of what you want. It's a partial specification of what you want and the agent then try to solve the problem and might get lucky and might get the right result or they might hallucinate and get the wrong result and the issue is how do you know what so whether the result is right or wrong and that depends on having a good evaluator that's how alpha evolve solves the problem so in some sense we are able to leverage the hallucinations for a beneficial purpose right So the creativity and the wrong answers that Alpha Evolp can somehow come up with,
Starting point is 00:18:41 how do we know that they're wrong? They might be very good. We just don't see them in that way. And which is why the role of the evaluator is really important. And how do we even do the evaluation is very important. Because when you come up with a new idea, should you try to explore that idea much further? Or how deep should you go? into stress testing that idea.
Starting point is 00:19:08 Should you try that idea out on a few different instances or sort of a thousand different instances or really stress test that the idea actually works for the whole thing? This is one of the interesting parts of Alpha Evolve. Getting that balance right is really important so that you can look at where are the creative solutions, how can you sort of filter out the ones that are promising and then use them later to refine the search process to get the final solution? If evaluation functions, automated evaluators are really like such a limiting constraint here in terms of what we can get agents to do, any intuition from this project or others on how to
Starting point is 00:19:56 overcome that, like can models get good at helping us create automated evaluators? Should we imagine simulators that are better for lots of different domains. If I, you know, lame product manager putting in incomplete natural language spec to coding agent, should I work with an assistant to like complete that spec? Do I use traces? How do you think that gets solved? That's a really, really great question. And I think you can view it from two perspectives that I think will happen at the same time.
Starting point is 00:20:26 So one is that, yes, currently the strict evaluation function plays a key role. in AlphaEvolve. And one takeaway you can take from this thinking about the future is that it shows the really high value of having these evaluators available. Because in many cases, it might be that you have a really important problem, but you don't actually have a very precise definition of what makes for a good solution. And one takeaway you can have from a system like this is that if you actually do build a very precise evaluation function, then this unlocks the possibility of having an agent
Starting point is 00:21:00 like AlphaEvolve discover something that's way beyond what, let's say, humans have been able to discover, or your best developers have been able to discover. So that's one takeaway. But the other takeaway that I'm maybe even more excited about from the research perspective is that we don't actually think this is a conceptual limitation. So today we have, this was maybe the easiest way to get into this game of discovering new things by looking at problems that already come with these very precise evaluation functions. So that's just a natural first step to take.
Starting point is 00:21:32 But I do believe that this assumption can be relaxed in very significant ways. And in particular, you already mentioned one example where maybe language models themselves will be able to evaluate whether proposed solutions look promising or not or whether they fail in some particular ways. And indeed, there is a parallel work from deep mind as well called AI co-scientist, which demonstrates this very clearly, that if you propose ideas in natural, language, then you can get language models to provide meaningful critiques and identify the ones that work from the ones that don't. So I really do see a lot of hope on relaxing
Starting point is 00:22:10 this assumption. And then even in between these two extremes of strict evaluation that exactly tells you how good a solution is on one end and then natural language evaluation by a language model on the other end, there is a continual spectrum of simulators and auxiliary evaluation functions, which are maybe not perfect, but as long as they are correlated with the true signal, then we can build the algorithmic scaffolding of the evolutionary algorithm around this in such a way that we still make meaningful progress. And maybe it will take a few more iterations, but we can still go really, really far. So just to add what Monta is sort of mentioned, I think one of the takeaways is basically that LLM-based agents like AlphaEBalls, especially
Starting point is 00:22:58 when we structure them in this way with population-based sort of search with the evolutionary approaches they are extremely effective in searching. They can search very convincingly and very effectively in very large
Starting point is 00:23:16 spaces and come up with very counterintuitive new solutions for important problems. Problems that we have studied for many, many years and sometimes in in some cases decades so that's one the other sort of element of the evaluator like as much mentioned there is work on using other sources of for evaluation so you don't have the perfect evaluator even for alpha evolve even if you have a simulator that's not a perfect
Starting point is 00:23:49 evaluator right because you are sort of going to evaluate things on a specific distribution of problem instances. You might want to sort of approve certain properties of the solution, right? You might want to say that the solution always has certain performance. So if you want to prove certain properties of the solution, that might require sort of other work, right? You might have to have a proof agent which sort of tries to approve certain properties of the solution. while on the other hand, you have these LLM-based evaluators which can look at the solution and you don't have, nobody has built a simulator, but they can just have a guess on how good that solution is.
Starting point is 00:24:33 And in fact, that approach also works very well. And we have shown that this AI co-scientist, which we have used for hypothesis generation, it basically uses a multi-agent sort of set up and where LLMs themselves are able to sort of figure out that certain hypotheses are better in terms of novelty and significance and impact and should be propagated. And that whole process ends up, and this might be surprising and counterintuitive to some, producing much, much, much better results than the base large language model. So you are really able to discover new information beyond what the large language model itself
Starting point is 00:25:19 alone was able to produce. That begs the question, which I think is like one of the biggest meta questions proposed by this sort of work, which is like, do we get self-improving AI, right? One of the things you demonstrated with Alpha-Evolve is you can optimize the systems used to train Alpha-Bol, right? So you have this, you know, 23% speed up in part of the training infrastructure, if I recall correctly, are we now witnessing the early stages of recursive self-improvement in AI and, you what do you think the implications are, if that's true?
Starting point is 00:25:52 I think in some senses, sort of yes, but at the moment, what we have seen is basically improvements in computation time. So what AlphaEvolve has been able to do is basically make training more efficient. But you can ask the question, can you make the training, can you improve the training process such that the underlying model is not only sort of, uh, uh, uh, trained faster, but is actually fundamentally better in certain cognitive tasks. And that is something that has to be validated still, right? But it is a direction that is definitely very appealing and something that is being sort of actively sort of looked at by many people. Do you have a reason to
Starting point is 00:26:38 believe it won't work? It should work. But as we sort of mentioned, that having good evaluators is an important element, right? And so having a sort of evaluator which can say this proposal that you have just suggested for me to improve the training process will yield a good result. So if you have that kind of evaluator, then it will work. But there is no reason why such an evaluator does not exist, but we need to sort of work on building such evaluation functions. Maybe just one one thing to add to it is that I would also agree that we are maybe seeing the first sign of self-improvement, but one also needs to be very specific about what we have shown so far, like as Pushmitt mentioned, it's the speeding up the training of the next generation of the
Starting point is 00:27:28 Gemini model. So the feedback loop is fairly long, at least currently, maybe on the order of months. But there is, you can call it self-improvement for sure. Maybe the big question that many people are curious about is how does this extrapolate into the future? And you can, have different types of self-improvement. One is where you get maybe just a one-off benefit. Like the model improves itself once and that's it. Another one is okay, the model keeps improving itself continuously, but maybe the improvements get marginally smaller and smaller and smaller and you converge to some limit. Or maybe the improvements will keep accumulating up and up and up. And that's a big open question that we don't have an answer to today.
Starting point is 00:28:10 Let's take that projection to other fields, and obviously these are all interrelated. But one of the things you're really excited about is just how AI applies to these sciences. When you think about new mathematical constructions, improve solutions to, you know, open problems or problems that looked solved, you know, to humanity 50 years ago, what do you think the implication is in different fields? Like, is it a fundamental shift in how scientific discovery or mathematics? gets done? First of all, yes, I'm super excited working in this area of using AI to accelerate the sciences, because in a way, it's the most exciting application of AI that I can imagine. Like, what could be more valuable or exciting to advancing the frontiers of human knowledge? So, yes, that is definitely there. And then, of course, in different fields of science, the speed of progress
Starting point is 00:29:05 or the advance you get from AI might be slightly different. So in Alpha Evol, we've primarily focused on mathematics and computer science because these are the domains where it's the easiest to get these automated evaluation functions. Like you often get them basically for free. That's not to say that you cannot get them in other branches of science, but in maths and computer science, it's just, they're just most common. If you think about biology or chemistry, you want to design a molecule, then you can have an evaluation function again in the form of a simulator or a predictive model that given a candidate molecule
Starting point is 00:29:43 will make a meaningful prediction about, okay, is this actually going to work in practice? And then if you are in this regime, then again, Alpha Evolve would be applicable. And we are only talking about the version of Alpha Evolve that we have built today. And these are problems that we can address today. But we don't think that the journey of AlphaEvolve finishes here. We have many ideas about how to make this system more powerful and more broadly applicable. And I'm fairly confident that we will see many applications across many branches of science. And then this is only talking about AlphaEvolved.
Starting point is 00:30:21 There are many other agents, Bushmeet mentioned, AI co-scientist and many others that I'm sure we'll keep transforming how science is being done across the whole spectrum. Yeah, so I think broadly, if you look at it, right? science is, a lot of science involves searching, right? Searching for the right idea, searching for the right construction, searching for the right sort of solution, the right drug candidate, and so on. And in some sense, like what scientists have been trying to do is sort of somehow make that process repeatable, right? At the moment, there is still sort of an element of
Starting point is 00:30:59 serendipity to some of the discoveries, but we are, we move towards sort of rational material discovery or rational drug discovery, you are sort of seeing computational approaches and very systematic evaluations playing a much more important role in many areas of science. And I think as that work propagates, you will have systems like Alpha Eval which will be able to search in those spaces and use these evaluations much more effectively. So it's like you can sort of see this as a tool that will give scientists a superpower in their ability to search over very complex and sometimes counterintuitive sort of solutions based in. When I think about one logical extension to this approach, it is, let's say, like automated evaluation in the real world, right? So lab, assay, you know, a bunch of robotic arms doing experimentation if you're screening molecules or something.
Starting point is 00:32:07 What do you think the role, let's just say like very near term, if that vision is true of the human scientists or engineer is? Is it the problem framing, like determining the evaluation? Is it constraining the like giving some intuition for like a starting point or a search space? Like, what should the human scientists be good at from here? There are many sort of elements, right? First of all, as we have been talking about a lot, the role of the evaluation function, right? So that needs to be defined. Like, what do we really, how do we want to assess these solutions?
Starting point is 00:32:43 But then there are many other sort of elements as well, right? When we are trying to find a solution, it has to have certain properties. What are those properties, right? giving hints, giving sort of, for example, if you're trying to discover a new drug, you want to make sure that that drug
Starting point is 00:33:02 sort of treats the disease but does not kill the patient, right? It has sort of, its side effects are low, right? Or it can be, what is the delivery mechanism for it?
Starting point is 00:33:14 So there are so many different requirements that a solution might want, that might need to satisfy. And some of them are encoded in the evaluator, in function evaluator. And some of them, you might want to heart constrain them in the solution, right? And so can you specify those so that an agent like Alpha Evolve can take that into account while it is thinking about how it exposed the search space or how it constructs the solutions that it will sort of generate?
Starting point is 00:33:49 These are all sort of very interesting places where human input might be, required, but especially as we look at many different types of domains. So yeah, I think we should definitely see this as an amazing tool for scientists, for computer scientists, mathematicians, and this is, in fact, this has been sort of our experience as well, that in the right hands, it is a very powerful tool, right? So like mathematicians who have tried to explore it and and they have been able to specify what are the solutions that, what are the types of solutions that they're looking for? They can be much more productive and much more sort of effective in finding the solutions.
Starting point is 00:34:35 I just wanted to highlight that even though we have been describing AlphiBolb as this kind of autonomous agent that does things on its own, actually in practice using this agent often turns out to be surprisingly collaborative. And we have seen this in particular with mathematicians that we have collaborated with. And there are a few reasons for this. But one is that AlphaEvolve is an agent that doesn't just give you the solution. It searches for an algorithm that constructs that solution. And so depending on how you set up your problem definition,
Starting point is 00:35:11 often it's actually the algorithm that's even more valuable than the solution itself. Because the algorithm, it tells you how to construct the solution. So that means you understand what are the ideas that go into building that solution. And maybe especially or definitely it's true in mathematics, that's what people really care about, to understand the nature of our universe and build up the understanding of fundamental ideas. And so it's actually often not interesting almost at all what the solution is, but what you care about is how you build it.
Starting point is 00:35:46 And so we had a first-hand experience collaborating with, multiple mathematicians and it's been really fascinating to see where we would share with them the output from AlphaEvolve and they'll be like really fascinated looking at the code that it found and trying to understand okay what is it actually doing and then understanding oh okay this this is doing this is doing that and now I can see why if you put it together then it leads to a really good solution yeah I can also confirm from my own personal experience that looking at the at the code or the algorithms that the system finds. It's often a really interesting experience because it's code that kind of looks human-like. It's something that you could have written, but would you have
Starting point is 00:36:30 thought of writing it in exactly this way and then trying to understand of, okay, what exactly is it doing? That's a really interesting experience. But at the same time, it's one of the key strengths of the system, not only for scientific applications where you can look at the code and get some understanding out of it, but also for many of the practical applications, it's hugely valuable that the artifact you get out of Alpha Evolve is a piece of code, and then you deploy that piece of code. And so before you do that, experts, engineers who have worked on that system
Starting point is 00:37:04 can visually inspect that piece of code, understand it, and make the final decision of whether it's going to be deployed. So it's in a completely different league from, let's say, considering using a neural network to make decisions in some production system where you kind of need to trust that the neural network is going to always behave in the way that you hope it will. With the code, you can look at it, understand it, and make the decision yourself. I might add that basically not all code is interpretable by humans, right? The solutions and the programs that alpha-called finds are sort of interpretable by human programmers. So this
Starting point is 00:37:44 is going to be a very interesting area of work in the future as to when you find these solutions, what can we learn from them? This was very interesting, like as Mate was sort of mentioning, this was a very interesting experience that we had working with Jordan Ellenberg in the first, in the earlier version of Alpha Valve, when you're working on the capset problem. The programs that it discovered had very interesting symmetries that that mathematicians did not know about. And so not only the solution was mathematically interesting, but like the actual sort of construction, but the algorithm for producing that construction had the structure of it was interesting in itself. For listeners who are thinking about accessibility or implications
Starting point is 00:38:33 for themselves where they're not professional mathematicians in collaboration with Alpha evolve. What are the considerations in making some of these capabilities more broadly available? We want to make these capabilities accessible to as many people as we can to the wider community. Now, we have started a trust-retester program where we have asked people to submit proposals. And what we intend to do with that program is to figure out what are the right ways in which people can really leverage Alpha-Evolve. So we have internally used it across Google, but as you know, it requires certain things,
Starting point is 00:39:17 sort of the need for a function evaluator. As part of the Trusted TestR program, we are going to be evaluating Alpha-Evolve on a bunch of different types of applications, and that will inform our future release strategy as to how do we make it more broadly applicable. The second sort of element is that not only you need the evaluator, but you also need a significant amount of computational
Starting point is 00:39:44 resources, right? Because it's not just one single LLM call. It requires a significant amount of function evaluation depending on the difficulty of the problem. If it's a easy problem, then you can do it very quickly. But if you really are going for some very hard problems with a very large extended search space and you want to spend a significant amount of time searching over it, then how do you build the overall system that people can sort of can use effectively and efficiently? That's the other sort of thing that we'll be thinking about. Last question for you both. Is there practical application within Google that you think will be interesting that you haven't tried Alpha evolve on yet? In this white paper, we try to think
Starting point is 00:40:29 about holistically, when we look at the computational infrastructure of Google, what are the the key parts in this infrastructure to demonstrate that AlphaEvolve can make discoveries across the stack, not only in one part of it, and that it can make discoveries that are highly valuable. And so we try to cover the entire spectrum. So we show that AlphaEvolve can improve the efficiency of the data center, it can contribute to hardware design, and it can contribute to improving the efficiency of most important pieces of software that are being run inside Google. And one intention here was to demonstrate that this is a really versatile tool that you can apply across the spectrum. And as Pushmitt was saying, this is a tool that is already available inside Google
Starting point is 00:41:14 and it is being used for many, many problems. There are quite a few exciting ones. I'm not ready to share about the particulars yet, but as you can imagine, there is so many exciting computational problems in a place like Google within AI and also outside. That, yeah, I'm sure that there will be many, many really cool results coming in the future. I think that's a great note to end on Pushmeet, Mate, anything we didn't cover? No, I think that will great. Thank you guys so much for being here. Congrats. Okay, great.
Starting point is 00:41:47 Thank you very much. Find us on Twitter at NoPriarsPod. Subscribe to our YouTube channel if you want to see our faces, follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week. And sign up for emails or find transcripts for every episode at Node dash priors.com.
