No Priors: Artificial Intelligence | Technology | Startups - AI and the Future of Math, with DeepMind’s AlphaProof Team
Episode Date: November 14, 2024. In this week’s episode of No Priors, Sarah and Elad sit down with the Google DeepMind team behind AlphaProof: Laurent Sartran, Rishi Mehta, and Thomas Hubert. AlphaProof is a new reinforcement learning-based system for formal math reasoning that recently reached a silver-medal standard in solving International Mathematical Olympiad problems. They dive deep into AI and its role in solving complex mathematical problems, featuring insights into AlphaProof and its capabilities. They cover its functionality, unique strengths in reasoning, and the challenges it faces as it scales. The conversation also explores the motivations behind AI in math, practical applications, and how verifiability and human input come into play within a reinforcement learning approach. The DeepMind team shares advice and future perspectives on where math and AI are headed. Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Rishicomplex | @LaurentSartran | @ThomasHubert Show Notes: 0:00 Personal introductions 2:19 Achieving silver medal in IMO competition 3:52 How AlphaProof works 5:56 AlphaProof’s strengths within mathematical reasoning 8:56 Challenges in scaling AlphaProof 13:40 Why solve math? 17:50 Pursuing knowledge versus practical applications 21:30 Insights on verifying correctness within reinforcement learning 28:27 How AI could foster more collaboration among mathematicians 30:28 Surprising insights from AI proof generation 34:17 Future of math and AI: advice for math enthusiasts and researchers
Transcript
Hi, listeners, and welcome to No Priors.
Today we have Thomas Hubert, Rishi Mehta, and Laurent Sartran from DeepMind's AlphaProof team.
AlphaProof is a new AI system that can find and verify mathematical proofs, building on DeepMind's earlier successes in chess and Go to tackle one of AI's greatest challenges: mathematical reasoning.
In today's episode, we'll explore how AlphaProof works, its implications for math and AI,
more about test-time RL, and what this reveals about machine learning's capability to reason rigorously.
Really happy to have you guys. Welcome.
Thank you for having us, yeah.
Maybe you can start by just talking a little bit about your backgrounds and how you came to be working on Alpha Proof together.
I'm Rishi. I was one of the tech leads on AlphaProof. I've been working in computer science and machine learning for a while.
I'm a chess player, and I came across the AlphaZero paper and saw some of the chess games that that agent produced,
and I found it really inspiring.
And I thought, like, this is the kind of thing I need to work on.
Like, coming up with something beautiful and superhuman and almost alien felt magical.
I came over to DeepMind, and the AlphaZero team, which Thomas was leading, was working on math.
And that's how I got into math.
My background: so, yeah, I started working in industry in my early career.
I worked on anomaly detection in computer networks.
I worked on ad targeting and switched to AI research.
And there, a constant interest of mine has been systems that
can spend more compute to either tackle harder problems
or to think more.
And math seemed to be a perfect domain for that.
Yeah, on my side, I was actually a Go player.
So instead of doing programming since the age of 10,
I was actually playing Go.
And I played a lot of Go during my youth.
And then at some point, it was also my dad's dream
to build a computer Go program.
And so I was kind of figuring out,
like, what do I need to know to be able to build
the computer Go program?
And then I realized that maybe it was being built at that time.
And so that's how I discovered DeepMind and how
I discovered AI.
And that's how I joined the company.
And I've been kind of involved with AlphaGo and AlphaZero,
MuZero, this line of work.
And recently we had worked on AlphaCode and AlphaTensor.
So, you know, that's way before
ChatGPT, but we already knew that transformers were kind of changing a little bit how things
were done. And so, you know, I found that in math, yes, you could get this perfect verifiability.
And with AlphaCode, we realized we could generate a lot of good code. And so it was very natural at
that time to think about the potential there was for mathematics.
Can you contrast? So maybe just as context, for any listener who wasn't a super cool
mathlete like me: the IMO is the sort of oldest and most prestigious math competition for young people.
There's a set of six problems. They feel impossibly hard. And AlphaProof had this really amazing
result of solving four of the six problems this year. Can you talk a little bit about math and the
IMO in particular as a problem relative to gameplay and other search problems like chess?
I think, you know, first, a big difference is that in board games, you play against someone, and that's a lot of fun in board games.
And for instance, when we did these AlphaGo or AlphaZero algorithms, we could really have this thing about self-play.
You could always play against someone who is exactly at your strength, and that proved to be a powerful idea.
When you're kind of trying to learn to do math, in some sense, you know, you don't really have an opponent.
You just have to think about it.
And math is a bit special in the sense that it's almost like a purely cognitive kind of thing
where, you know, the only thing you can do is to think more.
You know, you can't really go into the real world and run an experiment.
I guess mathematicians say that, you know, sometimes it's a good thing to take a nap
and that your unconscious self kind of thinks about the problem, and that's a good way to come up with new ideas.
But it's a lot about thinking.
And so, you know, like when you're confronted with a really hard problem,
there is a whole question about how do you go and try to solve it?
Maybe this is a good time to ask you to describe for our listeners,
most of whom are technical or somewhere in the tech field,
but also a broader business audience.
Like, you know, how does AlphaProof work overall, architecturally?
Yes, sure.
So, AlphaProof is based on this thing called AlphaZero.
So maybe let me start there.
AlphaZero is this program we developed to, you know,
basically solve perfect-information board games.
And the way it works is it's based on a reinforcement learning algorithm.
So you can think of AlphaZero as maybe three components: one, a neural network;
two, kind of large-scale reinforcement learning, basically learning from trial and error;
and three, it also has this planning and kind of search component to it,
to try to, given the current situation, search for kind of the best answer.
And it turned out that, you know, maybe we weren't aware of that
when we were doing chess, but if you can handle kind of infinite action spaces,
then instead of looking for a chess move, you can look for, for instance, a line of a proof.
And so that's what we tried to do when we started AlphaProof.
It's basically: can we use our action space to generate lines of proofs?
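As a rough illustration of what searching over an open-ended action space looks like, here is a minimal, hypothetical sketch. It is not AlphaProof's actual algorithm: the real system uses a neural-network prior and a formal verifier, while this toy replaces proof steps with two arithmetic moves and replaces the learned prior with a simple distance heuristic.

```python
import heapq

# Toy stand-in for search over "proof steps" rather than chess moves.
# States are integers, the available steps are "+1" and "*2", and the
# "theorem" is reaching a target value. A learned prior would replace
# the abs(target - state) heuristic used to rank candidates here.

def prove(start, target, budget=10_000):
    """Best-first search for a sequence of steps from start to target."""
    frontier = [(abs(target - start), start, [])]   # (score, state, steps)
    seen = set()
    while frontier and budget > 0:
        _, state, steps = heapq.heappop(frontier)
        if state == target:
            return steps                 # a complete, checkable "proof"
        if state in seen or state > 2 * target:
            continue                     # prune revisits and overshoots
        seen.add(state)
        budget -= 1
        for step, nxt in (("+1", state + 1), ("*2", state * 2)):
            heapq.heappush(frontier, (abs(target - nxt), nxt, steps + [step]))
    return None                          # search budget exhausted
```

The point of the toy is that each candidate "proof" can be replayed and checked mechanically, which is what makes the search signal trustworthy.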
And maybe very importantly, we used formal language to do that.
So basically, another way to say that is we use kind of code to write math,
and that's become quite popular recently.
And the advantage of that is that once the proof is complete,
the machine will give you a signal back to say,
yes, your proof is correct, or not.
And so we could search for kind of correct proofs.
Once we find a correct proof, we can learn from it,
get a better neural network, and kind of go into this self-improving loop
of tackling harder and harder problems and learning more and more about math.
So at a very high-level picture, that's kind of how it works and how kind of the ideas behind AlphaZero are adapted to math.
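To make the "code that writes math" idea concrete, here is a small illustrative example in Lean 4, the formal language AlphaProof works with (mentioned later in the episode). If the file compiles, the kernel has verified the proofs; any gap is rejected, which is exactly the binary correct/incorrect signal described above. These are toy theorems, not AlphaProof output.

```lean
-- n + 0 = n holds by definition of Nat addition, so `rfl` closes it.
theorem add_zero_toy (n : Nat) : n + 0 = n := rfl

-- 0 + n = n is not definitional; a short induction is needed,
-- and finding such steps is the "proof search" problem.
theorem zero_add_toy (n : Nat) : 0 + n = n := by
  induction n with
  | zero => rfl
  | succ k ih => rw [Nat.add_succ, ih]
```
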
One of the things that was an interesting tidbit from your announcement was, you know, given the series of problems, it solved some problems within minutes and took three days to solve other problems.
Is there a way you can characterize the search space for math overall, or what AlphaProof is better at in terms of domains than others, like types of reasoning within math?
The search space in math is quite large compared to something like chess or other board games.
So, you know, some people might think of writing math proofs as like picking from a bag of known tricks at each step.
But in fact, there are many proofs where you've got to come up with some non-trivial constructions, like you've got to invent a function out of thin air, or you've got to come up with some way to manipulate an expression.
Even if you're just rewriting an expression, you can rewrite it in infinite ways,
and there are only a few ways you could rewrite it to actually make progress on the proof.
Sometimes thinking about some novel problem requires like decades of theory building
and approaching it from an entirely new perspective to arrive at the angle that helps you solve it.
And so in that sense, you know, the reason why there are like a lot of math problems
that are like very simple to state but have stood the test of time in that they've been unsolved for centuries even,
is that the search space is not sort of easy to navigate in most cases.
As for what AlphaProof is good at amongst the IMO categories:
the IMO problems come in four categories.
There's algebra, number theory, combinatorics, and geometry.
The two that it's strongest at are algebra and number theory.
It's relatively weaker at combinatorics, although it can do quite well at some IMO combinatorics problems.
And we didn't apply it to geometry at this year's contest.
And one of the ways in which it navigates this massive search space
is via an idea that we came up with which we call test time RL.
So this is an idea where like, let's say you're confronted with the problem
that you don't know how to solve and you know you can do some search
with what you know right now and you're not able to find a proof to it.
What the agent then does is it constructs many variations of the problem
in the vicinity of that problem and attempts to solve all of them.
And if it manages to solve any of them,
it learns from that experience
and sort of comes closer and closer
to solving the original problem.
And you can think of this as like
when confronted with the new scientific problem,
many of your priors that you've developed
from other problems may not directly apply to it.
Instead, you've got to do very bespoke experiments
on this problem itself to learn something about it
and then hill-climb your way towards the solution of this problem.
And so this process kind of mimics that.
And so when you talk about, you know,
these problems that were solved out to three days:
these are the problems that had this loop of, we can't solve it
with the base network, but we propose many, many variations.
We try all of them.
We get slightly better, slightly better, slightly better, and then we hill-climb our way to
the final solution after, like, days of inference-time compute.
What are the limitations or approaches to scaling, or, you know, increasing the set
of problems that AlphaProof can solve?
And I guess there's two dimensions to that.
To your point, there's a set of areas where it already does quite
well, algebra and other areas.
And then there's a set of areas where you may need new forms of training data or other things
like certain aspects of geometry or, you know, potentially other areas of mathematics.
I'm a little bit curious how you think about what is necessary to increase performance
dramatically from here.
And obviously what you've done is incredibly impressive.
I'm just sort of curious, like, how you think about what are the obstacles to future growth?
Maybe the main thing that AlphaProof doesn't do is theory
building; it doesn't invent its own theories.
For example, it wouldn't be able to come up
with the complex analysis necessary to derive results
in number theory.
One of the main things that we do is
we generate variants and get better by solving these,
and that enables us to tackle the original problems of interest.
Theory building is a component
that would be required to go further.
Maybe another dimension:
some problems, like combinatorics problems, can be stated in a somewhat
obfuscated manner.
That doesn't mean that all combinatorics problems are like this,
but at their core, many combinatorics problems are about counting the number of things.
For instance, if you have 40 socks in the drawer,
how many pairs do you have?
They can be stated in some obfuscated way, and how to
translate them into Lean is a major difficulty, and then how to solve them
is a bit unwieldy.
How to best tackle this is still an open question.
I think there is a link a little bit to your theory-building point because, for instance,
you know, in some sense, part of the difficulty comes from the fact that
the tools you need to express those things don't actually exist currently in
the library that we're using.
And so, for instance, if you had some kind of things that expressed, you know,
what is an agent, what is
an environment, what is a strategy, then, for instance, a lot of
the combinatorics problems would be much easier to state. But because that doesn't exist at the
moment, in some sense, kind of the autoformalization would need to come up with all these
things beforehand before actually kind of formalizing the problem. And so that's kind of,
maybe that's kind of linked to the fact that, you know, at some point you will need to be able
to develop new definitions, new mathematical objects, kind of come up with their properties and
their proofs, et cetera, et cetera.
So I guess I think it was 1900 when David Hilbert posed his famous 23 problems that
kind of defined a lot of the big areas that at the time he felt were important to mathematics
and the sort of unsolved things that were important or topical.
Is there an equivalent set of defining problems for AI and math? Or is there some Turing test equivalent, I guess, as a generic question?
It's an excellent question.
That totally defined, kind of, you know, the mathematics of the 20th century.
And maybe, you know, the Millennium Problems,
the seven Millennium Prize Problems, of which one has been solved so far, is another attempt at defining
kind of what could be one trajectory of mathematics for the 21st century. What are our
chances of solving those? That's like a really hard question, because, you know,
our brain kind of thinks linearly, but we know from our experience in working on AI that
we should actually try to think exponentially; things might change pretty quickly.
At the same time, you know, I think we have no idea how hard maybe something like the Riemann hypothesis is, is it, you know, is it two orders of magnitude, three orders?
Maybe it's like 10 orders of magnitude away from what we can do, right?
Because we don't have an existence kind of proof, it's much harder to kind of have an estimate of how hard this actually is.
But I imagine that, you know, solving that problem will involve like kind of creating brand new mass, brand new theories and things like this.
So if we want to take AlphaProof all the way there,
I think that's a capability we need.
Either it emerges, which is possible, because to prove harder problems
you start to need to be able to introduce these new kinds of mathematical objects,
to decompose the problem into subproblems.
And we see that already happening for the IMO at a small scale.
So it's very possible that it could emerge from just trying to solve problems.
Or potentially, you know, we have to
specifically think more about what it means to build theories and how we can encourage
it to do these kinds of things.
Yeah, just for our listener friends who don't spend as much time on unsolved math
problems, to contextualize a little bit: the Riemann hypothesis essentially
predicts that prime numbers follow a specific pattern, like a distribution, and it has important
implications.
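For the curious, the statement being referenced can be sketched in a couple of lines of standard notation (this is textbook material, not from the episode):

```latex
% The Riemann zeta function, with its Euler product linking it to the primes:
\zeta(s) \;=\; \sum_{n=1}^{\infty} \frac{1}{n^{s}}
        \;=\; \prod_{p\ \mathrm{prime}} \frac{1}{1 - p^{-s}},
        \qquad \operatorname{Re}(s) > 1.
% Extended to the rest of the complex plane, the Riemann hypothesis asserts
% that every non-trivial zero lies on the critical line:
\zeta(\rho) = 0 \ \text{(non-trivial)} \;\implies\; \operatorname{Re}(\rho) = \tfrac{1}{2}.
```

The connection to prime numbers comes through the Euler product: the locations of the zeros control the error term in the prime number theorem, which is why the hypothesis pins down how primes are distributed.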
Maybe we could go from there to thinking a little bit about, when you were
beginning to work on math as a search space: there are lots of interesting
hypotheses for why go work on math at all. One is the field itself, right? And so I'd
be curious if any of you have problems you want to work on. Another is the premise that
like this sort of advancement in reasoning will allow us, it will transfer to other domains,
be they, you know, science or what we think of as more like language base, like less
verifiable non-code, non-rules-based domains as well.
Where do you all want to take this?
I'm not really a mathematician, so I didn't have like a problem I absolutely wanted to
solve, but it's been my favorite subject when I was a student.
And I think, yeah, you're absolutely right.
There are, you know, at least two kind of main reasons why you would want to
potentially spend a lot of time on math.
One is that, I guess, it's been described as the language of the universe, and, you know,
it's been extremely powerful to both describe and predict the natural world.
And of course to shape it, and you see that, you know, basically we see math being at the core
of all the technology we're using now.
So kind of having a good understanding of math is probably very important to understand
our current world. And then of course you could, as you alluded to, make an argument that,
you know, solving math, which requires, you know, reasoning,
generalization, abstraction, all these things that we think about when we're talking about
something like a cognitive AGI, is a path to, you know, go towards AGI, and that could
really help in the development of AGI. So I think at least for me, you know, these are maybe
the two main reasons as to why math is a particularly interesting
topic to try to solve in a general way. Even though, you know, math, okay, let's say it's still
kind of a constrained domain, it has this kind of unbounded complexity, and actually it's representing
a lot of stuff. So it might be worth making a really focused effort on it
because of all the potential implications it may have. Again, that's something about my own
motivation. I've always been interested in the question of making systems that get better by
thinking more. I looked at this in machine translation and text diffusion,
and in navigating agents that discover the environment, where they do better by thinking more.
Math, and in particular proving statements, which is the core of AlphaProof,
seemed to be one of the last big challenge domains where AI still had a long way to go.
And one of the key ingredients would be figuring out a way of thinking more.
That can be searching more.
That can be test-time RL, and maybe the next step of thinking more
will be thinking so much that the agent comes up with its own theories.
So math is a test bed for thinking more.
What was my motivation?
One of the big reasons to pursue AI at all is to, like, uncover the secrets of the universe,
you know, the deep questions of, like, why do things work the way they do,
and why are we here, and, you know, why am I conscious, and what's going on?
And I guess like math, like solving these pure math problems,
it feels like one domain,
where we're already, like, there's, like, a large number of people who are already just doing this.
They're seeing math as a search for truth.
Like, you know, people are pursuing these problems, not because they have any applications often,
but more that it's like, you know, what actually is the answer?
Why does this thing work this way?
Something like, you know, solving the Riemann hypothesis feels like it has the flavor of, like, you know,
a sort of pure search for truth, which is just quite appealing.
I guess, like, related to that, you know, there's a long history in mathematics of people,
or in science in general or science and mathematics of doing things for their own sake
or for the pursuit of knowledge for its own sake.
And, you know, Sarah mentioned as an example number theory and its applications in
cryptography; zero-knowledge proofs are cropping up in different ways in cryptocurrencies;
you know, group theory and algebra cropped up in quantum mechanics over time and were sort
of developed beforehand and then applied and developed further there.
Are there specific areas that you're most excited about from an applications perspective
for some of the work that you're doing?
Or is it mainly doing it for the love of the theoretical part of it?
It's a good question.
I think we can, maybe, you know, maybe the answer depends on each one of us.
I think one thing that I'd be motivated about is to learn from mathematicians, you know,
what they are interested in and what they find interesting in the current, you know,
world of mathematics.
And so, for instance, you know, I've been reading about this thing called the Langlands program
that is trying to connect different areas of math.
And it was described to me a little bit as kind of, you know,
just like in physics, where you're trying to look for this unified theory of physics,
it's like trying to find this unified theory of mathematics, where maybe, you know,
we have number theory here and we have like geometry there.
But maybe there is kind of something behind that, you know,
that is more unifying.
So I personally like those abstract ideas, even though maybe I don't understand them very well.
But yeah, I'd be very motivated to just see, you know, like, you know,
and what mathematicians care about, and they might disagree between themselves as well.
I think they would disagree, right?
If you look at the historical examples, like when Bombelli starts working on imaginary numbers,
he gets completely ridiculed by, I don't know, the more important mathematicians of the time.
And many years later, we get alternating current and the ability to describe electricity.
And so I think it's really interesting as a question.
Like, if you have the machine, and the machine is able to develop its own theorems, where do you point it,
for interest or for usefulness, eventually?
It's a really big question.
To that, one domain I'm particularly excited about is code verification.
That is, at the moment, when we write software, we write the code and we write tests.
And when the tests pass, we are reasonably happy.
Every once in a while, bugs that pass the tests slip through, or even security issues.
It would be much better if we could express in code the properties that the algorithm is supposed to satisfy,
and to prove in code that the algorithm does satisfy these properties.
It's already done for very critical domains like avionics or cryptography,
where it's very important, but it has to be done by humans.
I think the software industry would be in a much better place
if verification was much more common,
and if we removed the bottleneck, which is humans writing those proofs.
So if humans could be enabled by specialized tools
that could handle the minutiae of these proofs,
I think we'd have made a major step forward.
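As a toy illustration of the idea (state a property of a program in code, then have the machine check that the program satisfies it), here is a hedged sketch in Lean 4 using only the core library. Real verified software, as in the avionics and cryptography examples, is of course far more involved.

```lean
-- A program: naive list reversal.
def rev : List α → List α
  | []      => []
  | x :: xs => rev xs ++ [x]

-- A specification, stated in code: reversal preserves length.
-- If this compiles, the property is proved for *all* inputs,
-- which no finite test suite can guarantee.
theorem rev_preserves_length (xs : List α) :
    (rev xs).length = xs.length := by
  induction xs with
  | nil => rfl
  | cons x xs ih => simp [rev, List.length_append, ih]
```
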
I think another application area I'm excited about
is the transfer of this technology to many other domains.
There are two kinds of transfer at play.
One is just the transfer of the mathematical reasoning skill
that this agent acquires via this kind of training
to many other domains.
And as we see, math is critical for engineering and science and you name it.
But also, like, you know, some of the
sort of tech we developed here, like scaling RL and figuring out
how to spend a lot of inference-time compute, stuff like this feels like it's quite generally
applicable to many other problems. I'm going to ask a question that is more conjecture, right?
We've been talking about math and code, domains which are formalizable
and verifiable. How much do you think this applies in the language domain?
Can an AI make something funnier? It's easier to tell if something is funny than to, like,
write a good stand-up set, right? And so I think it's kind of an instructive example.
But do you have any intuition for how to take some of these learnings around the ability
to verify correctness or quality, you know, within the RL approach?
I guess, you know, so as you said, there are domains where there is a ground
truth, and there are domains where there isn't a ground truth. And when there
isn't, you know, like, for instance, funny is probably, you know,
it's a fuzzy human concept, and maybe if there were kind of aliens out
there, they would have a different sense of humor. And so for those kinds of
problems, I think that, you know, the only way you can kind of get your grounding is
through humans; the humans are your grounding. For other domains, maybe, you know,
the real world is some sort of grounding.
And then, you know, you have to go to the real world to get your grounding.
But basically, the kind of RL framing would allow you to say, well, you know, where does
my reward come from?
I guess sometimes it comes from the real world.
Sometimes you can perfectly ground it.
And sometimes, you know, the reward comes from the human.
And I think that's fine.
And I think that's fine.
There are some techniques, basically, that we developed. In a sense, you know,
of course, when you can machine-check things, you can operate
at a scale that is quite different from, you know,
if you have to ask humans in general,
even though there are quite a lot of humans.
So there's still probably quite a lot of scaling that is possible,
especially for things like humor,
where, you know, everyone should have a say about what is funny or not.
So definitely, you know, for instance, the RL framework,
I think, is still a great framework to think about those questions.
And part of, you know, what
makes RL work should kind of also transfer.
And maybe the things where we rely on perfect verification,
maybe they wouldn't transfer to that particular question.
Where is there a room for, I suppose, like human rating or input in terms of the explicit
description or labeling of their reasoning in domains that are more verifiable, right?
Like if you have unlimited access to Terry Tao to do labeling for you, is that useful?
Or, you know, how should we think of that versus, like, just do more search and work on better, like, formalization?
The way AlphaProof currently operates is that it discovers its own proofs, and when they are valid, it learns from them and develops its own style, which has been commented upon as looking, yeah, quite alien.
So with unlimited access to Terry Tao, it would create proofs that are correct, with some niceness as perceived by this human.
And we could start optimizing, amongst the set of valid proofs, towards proofs that look nice, for purposes of interpretability, for purposes of teaching.
So there is definitely still room in this space of valid proofs for proofs
that humans might prefer.
That's actually like a pretty damning statement to some degree, right?
Because it's just a, it's just like preference versus perhaps like, you know,
capability advancement.
Like what do I care if the proof looks alien if we have new knowledge?
Interpretability seems useful in this case, but.
Yes, I'm not saying that's the only angle for sure.
With more supervised data, we can avoid the exploration problem.
And we could translate all the pros that are unknown to man.
And that certainly would make the agent much better.
That would save us years of compute for, for sure.
I'm not saying that interpretability is the only way we could use.
We could use the mathematician, for sure.
But it could provide us a signal that we couldn't get otherwise, presumably.
With time, compute, theory building, we might be able to,
should maybe have rediscovered proofs that are already know.
I think it's an interesting question because it like,
it highlights the sort of complementarity of specialist human data and like RL data.
And I think it's especially prominent with like LLMs,
where LLMs have
these, like, very strong human priors because they're pre-trained on a lot of human data.
But then when we do RL with them,
they have an opportunity to take these human priors and build on them
and develop their own styles of doing things, like AlphaProof has done.
And I think one thing we've seen in this project is that often, like, small amounts of precious human data can be really useful to seed the agent's behavior and sort of, you know, get it from a complete zero state to somewhere much higher, where it can play with the environment in a much more efficient way.
And then beyond that, it can take it from there and, you know, perhaps match or exceed the human, but with its own style.
So, I guess, that feels to me like that's probably going to be the way forward with RL for LLMs in general: the specialist humans are going to serve to get the LLM from just a bunch of weights that knows how to do nothing to something that's surprisingly strong.
And then the RL is going to take you from there to something that's superhuman.
I think it was supposed to be Poincaré who was the last mathematician who knew all of mathematics, or at least, you know, understood big portions of it.
And suddenly you have an AI or a system that could potentially encapsulate all of mathematics in a single program.
And it could really be used as something that could help check a proof.
It could help sort of push a mathematician forward in their own research.
You know, often when somebody proves something now, if they're in a more side field,
there aren't that many people who can actually verify or check the proof that they've done.
How many mathematicians have access right now to alpha proof?
How do you think about engaging with the mathematics community about day-to-day usage of
this pretty amazing set of advancements?
So at the moment, you know,
mathematicians don't have access to AlphaProof.
And to be honest with you, you know,
at the moment, we can't rival at all
with someone like Terry Tao.
What we've demonstrated is that we can learn
general mathematics, almost from scratch,
and arrive at kind of an impressive high-school level.
And then we need to grow our knowledge base
so that we become useful.
But I don't know
if you've seen kind of
Terry Tao's kind of recent
kind of interviews over the last year.
He's been saying that, you know, one
thing that is interesting
in math is that collaborations
have been relatively small. A lot of papers
are one- or two-author papers, and
max five authors. And it's because
it's been very hard to collaborate with
more mathematicians, because you need to check
what they are doing. And so
it's very time consuming. But if you
instead rely on a formal system
to check everyone else's work, then
you could do a little bit like in astronomy, where you could have an amateur kind of living in the
middle of maybe nowhere whom you've never met.
And then you wouldn't have to trust them, like, you could trust in some sense the machine to check
the work.
And if the machine says it's a correct proof, then it's a correct proof.
And then you can start to work with many, many more people.
That's cool, because you can start to think about making collaborations
between many humans, but also potentially AIs, right?
And so that's one way we are thinking about it as well:
could we potentially collaborate on those projects?
And at the moment, the way we think about it is:
it would be great if we could be like a graduate student,
and the mathematicians could give us the questions.
I think that's maybe related to theorem proving versus theory building.
At the moment, we're not really good at asking the right questions.
So we'd definitely like to rely on the whole mathematical community
to ask the questions that they care about, and see if we can help,
even a little bit, in answering those questions.
I was also a Go player growing up.
And so seeing the Lee Sedol match and, as you described, the very alien moves
in Go was incredibly interesting to me too.
Is there something that has felt alien or surprising in AlphaProof
so far that you can talk about?
Yeah, sure.
So let me show you guys part of the proof that AlphaProof came up with
for Problem 6 at the IMO.
I think it'll be much easier to understand if I flash it on the screen.
So this is Problem 6.
So the IMO typically has six problems,
and problems three and six are the hardest ones.
And so this is supposed to be one of the hardest problems of this year's IMO.
But it actually turned out to be one of the hardest problems ever
because only five out of the 500-odd contestants managed to solve it.
So, you know, it is really hard.
So we can just go over very quickly what this question is saying.
And I won't go through the whole proof because it's probably incomprehensible
in the time that we have.
But I'll point out one cool construction that AlphaProof
came up with. So the question talks about
the definition of a function
being aquaesulian. Aquaesulian
is not really a mathematical term; it's a
tongue-in-cheek reference to the fact that
Bath, where this contest was being held,
used to be called Aquae Sulis by the
Romans. But anyway, they've given
us two equations that
characterize an aquaesulian function.
And we've got to show that
there exists some integer C
such that for any aquaesulian function
F, there are at most C different
rational numbers of this form.
So this is a bit complicated to understand, but essentially this form, f(r) + f(-r), is the even part of the function.
What it's trying to say is that you could take any aquaesulian function, plug in many rational numbers r, and evaluate this expression; you might get infinitely many values, or you might get five values, or three values.
So it's asking you to prove that there is some bound C beyond which you cannot have more unique values, and also to find what this bound C is.
So that's the question: what is C?
Often when you're trying to answer a question like this,
one strategy you can use is to show an upper bound for C
and a lower bound for C, and then show that those two meet,
and then you've found C.
So I'm going to pick up in the proof from the point
where AlphaProof had already proved that C is less than or equal to two.
That's itself quite an interesting proof,
and you can look at it on our blog post if you're interested.
But at this point in the proof, AlphaProof has to decide:
is C 1 or 2?
And the way it's going to decide that
is by finding an example.
It's trivial to show that C can be 1.
What's hard to show is: is there any
aquaesulian function F such that C is 2?
And, you know, you can pause and try to find functions
F that satisfy these properties
and see if you can find one
for which you get two different values
of this f(r) + f(-r) expression.
But one thing that's interesting is that Tim Gowers, who was one of our judges and is also a
Fields Medalist, tried this question for a couple of hours, and he couldn't find the construction
for a function with this property. That's, of course, not to say that AlphaProof is better
than Tim Gowers at math; it just highlights how hard this part of the problem
is. And AlphaProof came up with this construction, which is the function f(x) =
-x + 2⌈x⌉, using the ceiling of x, which is this funky-looking function that I've plotted over here.
And interestingly, with this function, if you plug in r equal to minus one and r equal to one half,
you get two different values.
And so you can tell that C must be equal to two.
So in my view, this is maybe the hardest problem that AlphaProof ever solved.
And it was a really cool, funky construction that relies on this ceiling function.
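As a quick numerical sanity check of this construction (our own sketch, not AlphaProof's code; the function is from DeepMind's blog post, and the property we test is our recollection of the IMO 2024 Problem 6 statement), we can verify over many sample rationals that f(x) = -x + 2⌈x⌉ is aquaesulian and that f(r) + f(-r) takes exactly the two values 0 and 2:

```python
from fractions import Fraction
from math import ceil

def f(x: Fraction) -> Fraction:
    """AlphaProof's construction for IMO 2024 Problem 6: f(x) = -x + 2*ceil(x)."""
    return -x + 2 * ceil(x)

# Spot-check the aquaesulian property over a grid of rationals:
# for all rationals x, y:  f(x + f(y)) == f(x) + y  or  f(f(x) + y) == x + f(y).
rationals = [Fraction(p, q) for p in range(-6, 7) for q in range(1, 5)]
assert all(
    f(x + f(y)) == f(x) + y or f(f(x) + y) == x + f(y)
    for x in rationals
    for y in rationals
)

# The quantity the problem bounds: f(r) + f(-r).  Over all these samples it
# takes exactly two distinct values, 0 and 2 -- witnessed, for example, by
# r = -1 (giving 0) and r = 1/2 (giving 2), so C = 2.
values = {f(r) + f(-r) for r in rationals}
print(sorted(values))  # [Fraction(0, 1), Fraction(2, 1)]
```

This is only a finite spot check, of course; the actual proof has to establish the bound for every aquaesulian function over all rationals.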
Very cool.
Can we start with advice for people who are working on math today and investing time in it,
given AlphaProof and related models?
Formal math is going to be an increasingly important thing going forward,
not just because AI is going to be a very, very strong tool within formal math,
but also because it's emerging as a way for mathematicians to collaborate,
and many prominent mathematicians are adopting it.
And, you know, it's still a small minority of the mathematical community that operates in Lean,
but it's a growing minority.
And AI will only accelerate that.
And so I guess I'm not a mathematician, so who am I to advise mathematicians?
But I guess my advice from the outside would be: learn Lean as early as you can.
I realize as well it's a great tool for education, because most proofs are written at the level of abstraction of the person who writes the proof.
So, for instance, if that person thinks a step is trivial, then you won't get any explanation of it.
But if you have that proof in a formal language, then you can zoom in at the level of detail that's right for you.
And so that's one thing:
you can learn much more at your own level.
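As a small illustration of that zooming in (our own toy example, not from AlphaProof), the same fact can be proved in Lean either tersely, with the steps hidden inside a tactic, or expanded so that a learner can follow exactly how the goal changes:

```lean
-- Terse version: one tactic call, no visible intermediate steps.
example (a b c : Nat) (h1 : a = b) (h2 : b = c) : a = c := by
  simp [h1, h2]

-- Expanded version: each rewrite is explicit, so you can inspect
-- the goal at every step.
example (a b c : Nat) (h1 : a = b) (h2 : b = c) : a = c := by
  rw [h1]    -- goal is now b = c
  exact h2
```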
And the second thing is you can actually do self-play with yourself, in the sense that maybe you are trying to learn about a very abstract part of math.
And the usual problem is that math is difficult to explain because it's quite abstract,
and maybe you can't tell whether what you're doing is correct or not.
If you are doing it on paper, then maybe the next best thing you can do is to try to find someone, or your teacher,
and that might take you a lot of time to get some feedback.
But if you are using a formal language, then you would know exactly what's left to be proved,
whether you've done the thing correctly or whether you've fooled yourself a little bit.
So even for educational purposes, I think it might be very interesting.
It's not really advice; it's more speculation.
I said that there were two types of mathematicians: the ones who build theories and the ones who prove theorems.
I understand that in order to prove a theorem, sometimes you have to build a whole theory, as was done for Fermat's Last Theorem.
But it seems that there is more room for human creativity, human taste, human skill in building theories than in the proving part.
One thing that I struggle with here is that it's increasingly true, across a bunch of domains where the hill is being climbed with AI, that taste seems to matter more and more.
It could be Terry Tao's taste in problems: if you're working on arithmetic progressions in primes, it actually leads to advancements in methods and tools.
But one, it's challenging, because telling somebody "go develop
good taste" is a more nuanced directive than "make sure you learn Lean," right?
But two, it's also pretty interesting and parallel to the work you guys have been doing,
in that to some degree, if you look at the behavior of where you're applying computation,
it's like: why do you work abstractly on some problems around this area?
Because it will improve your ability to solve the problem, right?
And so I think maybe, as you said, we'll all do a little bit more test-time RL.
I don't know if we'll all develop taste, even if that looks useful.
I saw this somewhere: as machines get better at finding the answers,
we're going to have to get better at finding the questions.
These systems don't have their own notion of which questions are interesting,
or, given a large question, how to break it down into smaller questions that might be interesting.
And that's probably where, not just in math but in many domains... even as a software engineer, I find that more and more of my job is not figuring out small, low-level details, but making good high-level decisions that tools can help you with. And that's probably going to be true of more and more domains.
Thanks so much for doing this. It's great to meet all of you, and I appreciate you doing this as well.
Find us on Twitter at @NoPriorsPod. Subscribe to our YouTube channel if you want to see our faces. Follow the show on
Apple Podcasts, Spotify, or wherever you listen.
That way you get a new episode every week.
And sign up for emails or find transcripts for every episode at no-priors.com.