The Infra Pod - Building a model that can prove theorems (with Shubo from Axiom Math)

Episode Date: June 4, 2026

What happens when you combine world-class mathematicians with cutting-edge AI systems? In this episode, Ian Livingston (CEO of Keycard) and Timothy Chen (GP at Essence VC) sits down with Shubo, CTO of... Axiom Math, to explore the emerging world of AI-driven mathematical reasoning and formal verification. From proving theorems in Lean to scaling software verification for the agentic coding era, Shubo lays out a compelling vision for why mathematical infrastructure matters more than ever. The conversation also ventures into homomorphic encryption, multi-party computation, Navier-Stokes, and why the next frontier of computing might just be running on math we haven't discovered yet. [00:00] Guest introductions and backgrounds [00:52] Company founding story [03:01] Axiom Math mission explained [05:15] Software verification applications [16:02] MPC and encryption challenges [24:01] Business model and products [33:49] Spicy Future hot takes

Transcript
Discussion (0)
Starting point is 00:00:03 Welcome back to InfraPod. This is a 10 from Essence and Ian let's go. Hey, it was Ian Livingston, CEO and co-founder of Keycard, making it possible for you adopt agents and don't losing your shirt or your database. I couldn't be more excited today than to talk to Shubo, CTO of Axiomath, building foundational mathematical infrastructure for the agentic world. Shubo, tell us about yourself, tell us about how this company got started and why in the world are you doing what you're doing? So as you said, I'm Shubo. I have a long history in both AI and GPUs. I started working on GPUs, I don't know, 21 years back before people had any inclination of what GPUs were and then started working on AI in 2014. So that's around 12 years now. The company sort of started through an accidental meeting at a cafe. I met Karina, who is the founder, three years back. kept in touch over the years.
Starting point is 00:01:03 One of the projects I was working on at Facebook slash meta, where I spent eight plus years, was can we build AI systems that produce code that don't need any human to sort of review it? This actually started working very well in the domain of testing, writing testing code. And then I was quite amped about the impact of software engineering, even in 2023, which was Stone Age by AI era.
Starting point is 00:01:34 And then we were also seeing that mathematicians are starting to use programming languages. So both of us were thinking in 2025, like this may be the right time to start a company that sort of brings computer science and math together to build a sort of mathematical superintelligence essentially. Yeah, and that's kind of how we got started, quite a number of, I would say, lucky accidents along the way.
Starting point is 00:02:02 And what made you want to do this? That's the real question. Because this whole, like, building a company, building an infrastructure, this whole thing is kind of a wild thing to embark upon. So was there like a moment? Was there an insight? What was it? You know what this is going to do next?
Starting point is 00:02:17 Yeah. I mean, I've been in quote-unquote big tech for a long time, almost 27 years. I've been sort of keeping – I've been kind of – of going to school, but also having, you know, lots of, like doing a lot of internships. So I had to put in both camps. And I had never actually done a startup before and felt that well. I got nothing to lose. I have a lot of experience in big tech and big tech and big research labs.
Starting point is 00:02:45 I've been in like seven different research labs in the industry. So I thought this would be a fun thing to try out. And then had a really good sort of partnership with Karino over the years. like we geek out on various nerdy topics and thought like this would be a fun thing to try out. Cool. And so what is it exactly that Axiomath does? Like what are you trying to do for the world and why? So we believe that more math is good for the world
Starting point is 00:03:12 because almost everything, at least in computer science, that's some sort of like not just computer science, but other domains of science as well, has some sort of mathematical foundation to it. And we want to accelerate the base of creation of mathematical knowledge. And math is a unique field where the creation of knowledge and the verification of knowledge
Starting point is 00:03:35 is actually all abstract logical things. So you can do that on a computer, which is unlike a lot of other domains of science where in physics, for example, you can come up with a theory, but then you actually have to validate with experiments, which kind of slows the cycle down. So in math, you can have this self-improvement loop. And obviously, AI is empowering everything.
Starting point is 00:03:57 And we thought if AI can both create knowledge and verify knowledge, this is a perfect domain to do it. And the world can benefit with a lot more math. And along the way, we'll figure out things to solve with all of the knowledge we built. I think for all the engineers, we all have to learn math somewhat, right? I feel like maybe from this point on, the math understanding is getting less and less required, probably for coding. but I assume everybody would know why math is important, at least for coding. But when you are jumping into
Starting point is 00:04:29 building a language model that can do research and solving, I think there's probably the premise that this is such an important, just even like a value or asset that has wide implications, right, to do this. And we never seen a model this at all. I think pre-JGBT, we don't even know how good a transfer model is. So I feel like that's probably where we are. So maybe can we give us a glimpse of like,
Starting point is 00:04:52 I know he's just mentioned you're still figuring out what the possible things this can be applied to. But what are some sample of possible things that you guys are excited about? Like if we can get to a certain state of power that we can start to solve these type of mathematical medical problems, what are some of the possible implications that you think most people don't even know they exist? You guys are excited about. Yeah, like one of the things that we have been looking at is verifying and properties of software. and fundamentally proving some properties of software. And various other companies are in this as well, not just us.
Starting point is 00:05:28 And there are some benchmarks around that, you know, test the formal verification has been an old field in software that has never kind of really taken off because it's actually a very, very hard problem. What it really does is it takes a piece of software, it takes some desire on the person who has written the software about what invariance that person thinks that the software should have and then sort of formally proving it.
Starting point is 00:05:51 it. So it turns out that the systems that we have built for proving math theorems also do very well in the verification benchmarks. So now we are thinking a step ahead of like, so today, the effort to produce new software is falling because, you know, previously humans had to write code, now humans resisted by very powerful agents. And you are now thinking, well, if the world has way more software, maybe if there's a way to scale out formally proving some properties of the software, that would be quite helpful. And a lot of things are actually just software. Like, if you think of like a design of a hardware, for example, that is essentially written in some kind of language. So we can verify properties there as well. And domains where the cost of mistake is high is kind of
Starting point is 00:06:40 where we start. And then we can see how far we can expand from there. And my interest in it starts from testing and a lot of work on L-LM-driven testing of software and then formally proving a property is like the more rigorous form of testing in some sense. So that's super interesting, because to me that's maybe a little counterintuitive. Because I feel like there is already functional proof languages and verifiable languages, like the Lien and others, the TLA Plus
Starting point is 00:07:08 and all the variances, right? And so I know these exist already. It's just, I feel like it's so hard to use. It's so hard to create the functional. specs and proofs and to represent all the possible states these programs can be at
Starting point is 00:07:23 and it's such a hard thing to do. Most people, even at the hand coding days, only really core, hardcore systems have the people that motivate to write them and maintain them, right? And so I will assume for all the vibe coding outputs we're getting now,
Starting point is 00:07:39 to get more verifiable software is improving the ability to write the specs, not creating more mathematical proofs or models. And so maybe help us make the connection. If you have the LMs, like you just mentioned, can do the mathematical proofs, how does it actually get into the verification of software? Have you guys thought about the possible glue here?
Starting point is 00:08:03 Or there's still more things like to learn? Yeah. I mean, a lot of glue has to be written. And hopefully this work will get easier because models are very, very powerful. The first clue has to be, how can I go from a spec to a formal invariant? So that is one part that one has to do. The other part would be taking a piece of software and then producing theorems that he would need to prove or even inputs to SMT solvers that he would need to do to actually prove the property.
Starting point is 00:08:34 So that is another clue that needs to be written, which is essentially semantically or syntactically embedding a language like, say, it's C or something or Rust. another formal language in Lean. But once these glues are done, what would happen is you would produce a number of theorems that the AI-based troopers would prove. So AI, in some sense, has a role to play in both writing the glue, and then once the glue is written,
Starting point is 00:08:59 then you have the theorems that you need to prove, AI has a role there as well. So I think every part, given keynote talks at conferences before in the software engineering domain, And what I tell people is look at everything that we have done in Software Engineering for the last 50 years and then rethink through the lens of like, can we do it again a lot easier because we have AI. Some projects you see already, I don't know if you've seen that the Lean FRO along with Benevolent AI,
Starting point is 00:09:30 they've started a project to formalize the Signal Protocol, which is part of the Signal app. And they're also trying to build these kind of full cycle things. And you'll see more and more of this, I think, because of barriers, as you correctly said, people would spend years trying to formalize a protocol by hand. Those barriers are actually rapidly coming down, both on the glue side of things and also like the proof side of things, once the glue produces the proofs that you have to prove.
Starting point is 00:09:59 And so the broad scope of this is like how do we have formal verifiers of scale, right? Like that's what we're trying to create. So there's like an LOM system combined with like formal verification with mathematical proofs. Is that the outcome we're trying to, to get here? Yeah. I mean, ultimately, the thing that was Tim was pointing out is you would have some code written in language, let's say C or something. And then you would need to have some, and you have to have a spec or an invariant.
Starting point is 00:10:24 And you have to make sure that it's easy to write the spec on an invariant. And then you have to, that glue will produce a lot of mathematical proofs. And then you have to have another system that will actually prove them. And LM has a role to play in all three of these. So right now, where we have made and others have made the most progress. We have very strong provers now. Obviously, the power will go up. Like, we can't, there's a lot of things that these proveers cannot do.
Starting point is 00:10:48 But if you look at the rate of progress, that has happened. Like, even last year, late last year, people were thinking, oh, Patnam is a challenge. Now, nobody talks about Patna. Patnam is done. As we have seen in our own company that we are proving things at Ken, who is our, you know, chief mathematician, is a very well-known number theorist, basically. I don't think I could have done these stuff. And obviously, the progress is jagged.
Starting point is 00:11:12 In some branches of mathematics, we have made more progress than others. But the progress is, I don't think anybody can forecast what's going to happen at the end of this year. I mean, that's incredible. And then the primary driver here is like, underneath the whole, the primary driver of the expansion is the fact that we have these large-lange models that are good at guessing, that then allows to drive the equation of home-approves and discovery. Is that, that's the essence of what's going on here? Yeah, I mean, the models have gotten really good at reasoning. the models have gotten really good at code,
Starting point is 00:11:40 and then formal languages essentially bridge between math and computer science. If models get very good at code, then he can sort of express more math in them. Agents have gotten a lot better. You have much better tools. So it's a combination of, like, Axel, the kind of tools that we have released is something that we use in our own systems for both RL and our agentic systems as well. Other people use it, too. If you go to X, people will say, oh, I used Axel with Claude,
Starting point is 00:12:08 and I proved this theorem. So if we see progress in all of these things and this sort of form a feedback loop in some sense, I mean, as people use agents more, they basically generate more reasoning data, which gets fed into the models. So it becomes like a nice sort of a loop. In some sense, a data flywheel. As the capability increases, more people use it, that creates a data flywheel, stuff gets better. Now, obviously, it doesn't get better uniformly in math, like some places like number theory, we have way more progress, not so much in differential equations,
Starting point is 00:12:41 but I think it's a matter of time. And so I have no intuition about how has one even start to even think about training a model like this. I'm sure there's a bunch of secret sauce and stuff, but maybe conceptually can explain to us, instead of all the traditional ways of how one thought of training a transformer model and how we got all the crazy language models we have, is mathematical models fundamentally similar, but maybe a little different, or totally different here?
Starting point is 00:13:10 They're not that fundamentally different, honestly. One of the key insights in 2025 is if Mathematicians is starting to write Lean, that means that we can use techniques that we have seen in Transformers working in other coding languages and trained models here. Math is also quite interesting that Matt does reasoning in two different representations. So one is the traditional English way that we study. in like middle school, high school, college, etc, where we write English language proofs. But math also has a representation as a program.
Starting point is 00:13:40 So what you do is some of the training happens in this reasoning space, where it's just a lot of proofs written in English and other reasoning happens in the domain of lien. And when you actually prove, what you do is the agentic system that sort of binds them together, like all these models and tools, they will do some amount of reasoning using reasoning models in English or natural language and some amount of reasoning in the formal language. Math is interesting that way that we express reasoning in math both in a natural language and as a program,
Starting point is 00:14:14 which is quite interesting. And this is my gut field that, so math also has this challenge called auto-formalization. So if you have a lot of body of math written in English, can you create a program out of it? This is actually quite hard because the abstraction level of natural language is actually much higher than the abstraction.
Starting point is 00:14:33 instruction level of, say, lean. But if you can do this, and this is something that us and a bunch of other people are actually working on, that may be one way to think about if I have a spec, can I generate a program for it? Because it is somewhat of a similar problem. Obviously, a math theorem is a very formal spec, but that may be one step.
Starting point is 00:14:55 That is really interesting. So conceptually, it's still similar training of language models, but it looks like the datasets and how you combine. the instructions set for this is very particular, is the national language, the lean program, and the ability to kind of make that connection. And the agents have to do a lot of search and backtrack as well, because if you think of a proof, a proof is basically a very large tree. On our blog, we have kind of shown diagrams of this proof trees from, say, Putnam.
Starting point is 00:15:24 So what you're doing when you're essentially proving, you're taking a goal and you're decomposing down into subgoals and decomposing down into subgoals. So there's a lot of planning going on, which used to be like the traditional AI, right, like 20 years back when you asked what is AI, it'll be a lot of like backtracking, search and all that. So these proof systems, which are a combination of agents and models, like that does a lot of search and backtrack as well, which is quite required for my math. Essentially what you're doing is you're building these very large proof trees. So I have a very abstract question because this is such an interesting area.
Starting point is 00:16:02 So one of the things I did in 2018 is I worked on multi-party computation for secure encrypting machine learning. And at the time, and I think this is still true, and I believe it's still true of home warfare encryption as well. It's like you basically have this very interesting idea, which is that we can keep something private as long as we can trade, like basically manage who is the total information, right? So if we ensure that no one has all of the parts at any one time or enough of the parts to guess the rest parts, then we can. can have this weird protocol. It kind of allows us to compute on data without any one person ever being able to know what the data is that's being computed upon.
Starting point is 00:16:40 It's a fascinating idea. But the limitation at the time was like every time you tried to introduce a new operation, you basically hit up the fundamentals of like, oh, we just haven't invented the math to support that concept yet. And I'm curious if you think that the primary outcome of work like this, whether it's homework encryption, multi-party, there's lots of different areas of mathematics.
Starting point is 00:17:02 where we could have fundamentally better security, things would be faster, more performant, whatever, if we just had a better solution in the math and solution to the problem space. My startup and time kit privacy was just we just hit the wall, where it's like, well, you can't really do exponentials. They're just too complicated. And so if you're doing certain types of activation functions, the machine learning model, many have to do a simulation,
Starting point is 00:17:28 then you had an air explosion, then you had all this drift, and it just became impractical. So I'm quite curious, like, how you think about what you're working on and how it fits into the larger picture of, like, what solving some of these fundamental issues in mathematics do, like how it drives the broader technology ecosystem. I laughed because I worked on MPC and Home Arbic encryption for two and a half years. So I don't know if you've heard of this thing called Krypton that I helped build from Facebook because at Facebook, we were – and this was right after Cambridge Analytica. So one of the research efforts that I and a couple of others were starting to build was like, what if all of Facebook data was private? Can we train on it?
Starting point is 00:18:09 And we built frameworks. And I worked on this field for two and a half years. And then I commiserate with your pain. You know, we can do exponentials, but, you know, they have to be represented in extremely large integers. And when people say that, you know, neural networks are very compute hungry, I basically say, well, wait till you hear about homomorphic encryption. I think I did the math at that point. They're like a 10 to the part six times more computer and communication hungry
Starting point is 00:18:41 than what we have today. The first thing I should say is I'm not an information theory, so I don't know if there is some upper bound or lower bound of like, well, if you need this privacy guarantee, there is no way your keys can be smaller than this. And if there are some fundamental theorems like that, then obviously what we do will not make any difference. But if there is no limit, and I hear that in cryptography,
Starting point is 00:19:06 there are a lot of things that are sort of taken for granted, but not really proven. Then there is likely hope. Maybe there are better ways to approximate exponentials. Maybe we can use much smaller. Like, really, the problem becomes that 32-bit integers are just not enough for MPC. You have to have 256-bit. And 256-bit integer math is extremely slow,
Starting point is 00:19:28 because it's very hard to build a 256-bit multiply unit. So maybe there's a fundamental mathematical thing to be unlocked where we say, well, noise won't blow up as it does today with 32-bit keys, something like that. Like that would be magical, right? And this is, in some sense, you know, I talked about verification as something that we are looking at in your term, but if you look at the blue sky or where the company wants to be,
Starting point is 00:19:56 is like if it can come up with these new theorems or theories or that will actually help solve a practical problem. There is still a jump that you have to make. And a lot of the times you see that sometimes it takes 300 years for a map theory being able to make into practice because there was a lot of dots that need to be connected. Maybe that will shorten. I mean, that is still a jump that has to be made. Just you look at cryptography, right?
Starting point is 00:20:24 I mean, Hardy, who did all of the work on the hardness of prime number factorization, that work really came to the masses in terms of SSL and e-commerce. So it had to wait, I don't know, in 90 years to actually find a practical problem to solve. That said, I think, like, we do have some practical problems we know of, right? If you go into a survey of cryptography and a survey of machine learning, They're just upper bounds to like of what we understand and can do. And it's not like finding it'll in the haystack with these things. We know, oh, this problem is this limitation?
Starting point is 00:21:03 Because we've, you know, enough PhDs and enough, you know, research money has gone into trying to like actually do something with it for so long that if we were to solve, you know, this fundamental problem here, here's what it's all unlocks. And unlike with quantum, which is like, well, we're cool. We can go to quantum computer, but what can we actually do? Well, maybe we'll build, you know, solve, uh, elect cryptography or something. These have like real practical
Starting point is 00:21:26 understood use case, if true. And I think I spent about two years as well, by the way, an MPC and Homeworker. And then I said, I cannot do this work on this problem set anymore. This is insane. This is going nowhere and I walked from it. But it's cool.
Starting point is 00:21:43 But like not a practical problem to solve. But those are the types of problems that can be unlocked. I'm quite curious, you know, I think a lot, I read what you were doing as a company. I said, okay, this is cool, but what's the product. Is the idea that this is like, you're going to build basically a brain that can find and solve new mathematical problems, provide verification and proofs of that, and then basically you're like next gen RSA. Like, is that the idea? Is like, what's the business? What's the
Starting point is 00:22:11 productization of the work that you're doing? And ultimately, at the end of the day, drive sort of like the research to business forward. Yeah, I think there's both a short term and a long term side of it. The short term would be, there's a lot of places where we are writing more code and we don't know what the code is doing. And is there a way to actually guarantee what this code is doing? And this problem will get, I think, worse and worse. So one of my friends at this very insightful take that if you look at hardware, for example, if you take the total hardware cycle where, like, from designing something to actually shipping something to TSM to build a chip, I think that takes about, I don't know, three to four years, something like that, maybe five years.
Starting point is 00:22:53 A big chunk of that time is spent in actually testing, but not so much in design, because if something goes wrong, things are going to be bad. And one of my friends has a stake that software is going to look something similar because we are producing so much code and we don't really know what that. Nobody can even review, review code. Like even in our own company, the bottleneck sometimes becomes review. So maybe the way software development would be well, you just write code and use it for like, I don't know, two months and you test for 10 months, just like we do in hardware. And if we can bring that down through verification, that's a big market in the short term at least over the next, I don't know, two to five years, something like that.
Starting point is 00:23:36 And beyond that would be going after problems that, you know, the pain that we both had. Well, we know that we need smaller keys or looking at errors and see if we can make a lot of intractable problems that we want to make tractable. And we know the value that will unlock if we solve an intractable problem like that. So I'm very curious because I think you guys have, haven't raised as much like Open AI does to be truly general, trained open crawl type of style model. But you also have enough resources. you no longer have to just focus on one very small set of problem as well. And I worked on ZK Proof before, which is not an encryption side of things, but it is also such a mathematical heavy limitation of trying to change your code into mathematical equations
Starting point is 00:24:27 so that can actually provide a proven, right? So I feel like there's many, many problems you can solve around code, actually, if you can able to get a model that has the ability to translate either the natural language or to code into mathematical ways. I guess I'm curious. The question is more like, kind of how Anthropic learned that code is accidentally a good use case for them. They actually didn't know.
Starting point is 00:24:49 And I think in your case, I'm not sure is the plan is to just train as much as possible with a certain working corpus of data or however you guys are training that. And let's learn about the use cases even within code afterwards. Or there's more targeted like problem sets or domains. You guys are more focused. even within code specific. It's harder to tell. Are you betting on emerging capabilities in some sorts
Starting point is 00:25:16 or actually have a much more narrow set of things you try to focus on? I think we do a little bit of both. By the way, I think I'm dropping new. It's funny that when I started working on testing in Facebook and we were looking at various models, even then it was very clear that the first Sonnet model, I don't know if you remember, the CloudSsonnet model, was actually really good at coding.
Starting point is 00:25:39 it wasn't really good conversation where it was very, very good at coding. So the way we operate here is the DNA of the company is math. So one thing that Ken does is really sort of pick a few matte problems that we think are not tractable now, but we'll be tractable using these systems in like two to three months. So we don't sort of train explicitly for that, but they use the systems that we build. So we get like some amount of feedback, not as like a gradient, like gradient that we explicitly back brought through, but we pick up a couple of math challenges and we try to push our systems to do better or then on those math challenges.
Starting point is 00:26:21 And from Ken's experience and Ken's team's experience, we get a sense of like, should we get more of this type of data or should we get more of that type of data? So that in some sense is like going after a few problems. and we feel that going after those few problems is kind of a rising tide that lifts all boats. The fact that these systems are also good at software verification is something that we found out by accident in February. So, yeah, so that may be the quote-unquote emergent capability
Starting point is 00:26:50 that you are speaking of. So we kind of do both because I think it is important. As I said, our company's DNA is math. Like, that's not going to go away. And we really need to push on math problems. And after we did well in Putnam every month, we had been, I think, proving four conjectors or so. Obviously, these are not as complicated as Fermat's last theorem. But then the bar is going up.
Starting point is 00:27:13 So now Ken and his team are working on something more substantial because that's how you sort of lift the bar. Yeah. It's actually really interesting talent pool you can pull from, right? Because I think, you know, besides just joining hedge funds as a very lucrative possible place to be. Yeah. There's actually a very unique talent pool that. I don't think most LLM model companies are all just looking for CS researchers and LLN researchers primarily. This is by design.
Starting point is 00:27:40 And again, this is a bet. Our bet is bringing together computer science and math positions in one place will help us unlock something valuable. We hire math additions. And I would say in terms of hiring bar or hiring bar for math predictions are actually hired than those of computer scientists in some sense. I mean, the math positions that work for us are truly world-class. They're famous. And we are very blessed to obviously have them. And even in the lean community, I think the talent pool of like mathematicians who write lean that sort of set, I think we have, you know, very high bar and core math-lib contributors work for us.
Starting point is 00:28:23 To get into sort of like the hard challenges you have to solve to even train a model like this. What are maybe like hard technical things you guys need to solve that's related to building a mathematical model? It could be math related, it could be lean related, whatever that could be. And I'm also very curious, what some hard problems these mathematicians have to come into solve. Because obviously not here to just do math research to do more cryptography. It's not a research lab for public goods or even for particular domain. Now you're into a very different type of business. what are the type of challenges they have to solve? You can talk about me both from a technical and mathematical sense.
Starting point is 00:29:03 Yeah, so for the first one, I think the problem that I am most excited about and that has a lot of implications, and we are trying to solve it. Again, it's not easy to solve and it is a challenge. It's the thing I told you that math has two different representations. So we have done math and English and all kinds of natural language from the birth of history, basically, birth of civilization. civilization. And there are perhaps tens of millions of proofs of like that's the entire body of math knowledge humans are produced. And then there's a formalized part of it. And each one of those
Starting point is 00:29:36 theorems has a counterparted program. But on the program side, we only about 300,000. So how do you go between a somewhat abstract representation of formal logic to a much more like something that a computer can run? How do you scale that up? This is, I think, one of the biggest challenges for the company. And if you can do that, you unlock as you have a reasoning trace that's in a natural language and a reasoning trace in a programming language that is one-to-one. And no other field has it, like not even computer science. Because in computer science, we don't have these kind of specs in many places.
Starting point is 00:30:15 And if you have that, I think that is the key to training the next kind of systems that will be really, really powerful. This is something that we are working on, something other people are also working on. But it is not an easy problem for various reasons. One of the reasons being there are many ways to define a mathematical object. There's one way of, you know, and what is the right way to do it for a computer? There's a lot of different ways to do it. The way you reason in a program is not really the way you reason when you write proofs. How do you marry them?
Starting point is 00:30:49 I feel that if you can solve it, you can also solve this sort of glue that you're talking about. I have a spec. How do I get code? Similar-ish problem. For the challenges that math positions are working on, I wish I could say. I think some of the new challenges are working on are I think quite interesting.
Starting point is 00:31:06 They're kind of like long-standing conjectors, but I don't want to give away like Ken's, you know, secret sauce, not secret sauce, but if we make progress, then I think it'll be a nice result. But I think on the math side, what is challenging is, as I kind of told you about the proof trees, right?
Starting point is 00:31:25 So if you look at Putnam, the proof tree is around like 100 nodes or 200 nodes or something. Some of the stuff that we have done after that are 5,000 nodes. And if you want to go into, say, you know, formalizing Fermaz Las Theorem or even proving Fermat's Law theorem, I don't even know how many nodes that ProofTree has. It's hard for an agentic system and with a bunch of models and everything to manage that much planning and execution. And say you have a system that does it. can humans interpret it?
Starting point is 00:31:57 That's another sort of a problem. How do humans interface with a system where, you know, proofs get this long is another interesting problem. And sort of a meta problem is like, how do you choose the right? Because the systems are increasing in like very different ways in different domains of math. How do you even choose a problem? Because if you choose a problem that is too hard,
Starting point is 00:32:21 then, you know, you want to build something useful. And if you choose a problem that is very easy, then you don't input the system. How do you choose the right problem is hard? And the one thing that I haven't talked about that is kind of beyond just proving is conjecturing, which is something that, you know, even humans, I mean, if you ask somebody, how do you come up with this conjecture? I don't think they are able to tell you. And like, AI hasn't actually made a lot of progress in coming up with good conjectures.
Starting point is 00:32:48 And I think that's another thing that we really need to push on and the community hasn't. Because if you cannot come up with new conjectures that is useful, then you don't close the loop. And then how do you encode things like useful and taste? These are always hard to kind of encode in some sort of like principled, objective way. But you can come up with lots of conjectors. But if the conjectors are like they don't expand the field, then they're not a super useful conjecture. Like you remember, you'll be conjectors like Formas last theorem, and after they're proved, they're useful because It was like, once you prove it, people come up with like, oh, you know, all this new insight I got,
Starting point is 00:33:27 I connected this field with that field, with the third field, and the process of doing so, the, you know, math expanded because all these new connections were formed. But how do you encode that? That is also very, very challenging. That is super cool. Yeah, that's actually really, really cool. So we're going to jump in our favorites. We'll be jumping to our favorite section of our podcast called a Spice.
Starting point is 00:33:53 future. So you know what's coming. You got to give us a spicy hot take. And it could be about anything, but probably ideally something, a NAR, L, or math, whatever. What is your spicy hot take things you believe in that most people don't believe in yet? That is very, very, very hard. I don't know. I think a lot of people think that there won't be any software engineers. I think that, I mean, I've come from, you know, when I started working on, AI-driven software engineering in 2023. There was a lot of skepticism at large companies. They were like, okay, well, these things are toys.
Starting point is 00:34:36 Software engineers will write better code. And I think that view has shifted quite a bit now, and maybe shifted all the way to the extreme, that we won't need any software engineer. I think software engineers will always be there. Like, what we do will change. I don't know if it's a hot take or not, but I think everybody is now crying,
Starting point is 00:34:53 bloom and doom is like the entire industry will go bust. and I hear this question from a lot of people and like, should my kid or nephew or niece or something should even study computer science? But I think these are just new tools. We will adapt to the new tools. We'll figure out ways to do things with these new tools. That is kind of maybe hot take, number one.
Starting point is 00:35:14 Number two is, I don't think anybody has figured out the economics of the models. It isn't clear to me how the profit gets divided among the different players. Like now what is happening is all the profits are in the hardware space, like not just GPUs, but DRAM, SSD. You know, if you're constructing a portfolio, looking back, your best holdings would be all kind of component manufacturers,
Starting point is 00:35:42 which is, you know, a little bit strange, right? Even all the way down to like Catterpillar is doing really well because we need more excavators. And nobody knows how the profit will get shared among the different, the other layers of a stack that will evolve. I don't know. I mean, for anything to be sustainable, it has to be divided somewhat sustainably.
Starting point is 00:36:05 Now, one thing that I wish, kind of like Ian, MPC is actually something that I hope will work. I love the idea that nobody owns compute and data. Like, it is sort of this fabric that is like an ether, like spread all over, right? And then everybody gets to contribute compute and data, and then in some secure way, you get utility out of it, out of this ether.
Starting point is 00:36:32 Yeah, I mean, I think that is the ideal future. I mean, I really like the idea, but again, like many of these ideas, like maybe how AI used to be in the 50s or 60s, its future hasn't come yet, but I hope that comes. Like, yeah. I think untruth is like generation. Like what we are missing, if we were to step back, it's the cost, not only it's the cost to experiment, but it's the rate of experimentation, right,
Starting point is 00:37:01 or the limiting factors, even in mathematics, like the cost to ideate on a potential solution, right? And then the cost of to develop a potential theorem, and then the cost of actually, like, testing it, for a very long time has been very, very high, right? Most of mathematics has been invented on a whiteboard. Yeah. We're now entering a phase where we can use computers to help,
Starting point is 00:37:25 help us. This has always been true to a certain extent as long as we're entering a new phase of computing, where we can use computers to help us reason in a way that we couldn't before. And I'm actually very, very bullish on the future. There's so many problems that were limited by human's linear cognitive ability and not linear in the sense of like, you know, there's a lot of different ways of thinking, but linear in the sense of like, well, we do this and they have to figure this out, they do this. And it's a very, it's a very single track. It was very expensive, very high risk. very low reward. And now what we're effectively doing is we're changing, we're turning that cost upside down, what's the electron cost? How many electrons are to generate? And how many disks have to
Starting point is 00:38:06 spin and how much silicon has to exist such that we can, you know, ideate on problems and simulate problems and find solutions in a space that wasn't possible. And mathematics is just one example. So I'm very bullish. If we do solve MPC, not to be confused, MCP, that's the most native encryption method for a federated internet, right? It's the most native way. It's most native to the concept of the internet and a free and open internet, one where like you actually control your data, you control the keys all the time in a way that, you know, undoes many of the problems that the current internet we have actually has
Starting point is 00:38:42 and has led from the result of the rise of advertisers, the effect that ads from the internet. So I'm very bullish. But the one last question I have for you before we go, because Tim's giving me the look. other than MPC multi-party computation, is there an area of mathematics you're most excited about solving the next five years? You think is tangibly capable of being solved, and you think will have the biggest impact? I think a lot of people, and myself included,
Starting point is 00:39:08 I think the Navy Stokes problem, which is part of the Clay Price. I think if you solve that, will completely change simulation, and simulation drives so many different things in our world. that and the other thing would be being able to do molecular simulation on a much more higher fidelity. Like, the closer we can bring how biological systems work to their simulation, the better off we are. Like today, in drug discovery, for example, the gates are like actual doing the tests.
Starting point is 00:39:41 Like, you have to do a single-cell test and multi-cell tests and obviously test on a human and all of that. If we can simulate with super high fidelity on how these systems actually work, I think that would be like a game changer. This is super fascinating. I think there's so much we can actually keep adding on to this discussion, but just enter the time.
Starting point is 00:40:01 Where do folks even learn more about you or XMath? Do you have social channels or places we can find you? Yeah, we are on Twitter, and obviously we are on our website. We put out blogs very frequently, our social channels are quite active on all the cool sort of results and research that we put out. Yeah, I mean, find us on our website at axiomat.ai where I think we post a new math results and we try to make the results accessible because that is kind of one of the things that we really
Starting point is 00:40:32 care about, that the math proofs that we put out are accessible somewhat to the general public. We also put out some of the software we do, Explorer, which is open source and axles on the services that we all use and math traditions love to use. Amazing. Well, Shula, it was such a pleasure having you on the podcast. Thank you for coming on. And I'm sure everyone else was listening was informed as I was today. So this was great. Thank you, Tim and Yen. And it was really nice finding that connection with MPC. I was not expecting that.

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.