Hard Fork - Where Is All the A.I.-Driven Scientific Progress?

Episode Date: December 26, 2025

The leaders of the biggest A.I. labs argue that artificial intelligence will usher in a new era of scientific discovery, which will help us cure diseases and accelerate our ability to address the climate crisis. But what has A.I. actually done for science so far?

To understand, we asked Sam Rodriques, a scientist turned technologist who is developing A.I. tools for scientific research through his nonprofit FutureHouse and a for-profit spinoff, Edison Scientific. Edison recently released Kosmos — an A.I. agent, or A.I. scientist to use the company’s language, that it says can accomplish six months of doctoral or postdoctoral-level research in a single 12-hour run.

Sam walks us through how Kosmos works, and why tools like it could dramatically speed up data analysis. But he also discusses why some of the most audacious claims about A.I. curing disease are unrealistic, as well as what bottlenecks still stand in the way of a true A.I.-accelerated future.

Guest: Sam Rodriques, founder and chief executive of FutureHouse and Edison Scientific

Additional Reading:
The Quest for A.I. ‘Scientific Superintelligence’
Top A.I. Researchers Leave OpenAI, Google and Meta for New Start-Up

We want to hear from you. Email us at hardfork@nytimes.com. Find “Hard Fork” on YouTube and TikTok. Subscribe today at nytimes.com/podcasts or on Apple Podcasts and Spotify. You can also subscribe via your favorite podcast app here https://www.nytimes.com/activate-access/audio?source=podcatcher. For more podcasts and narrated articles, download The New York Times app at nytimes.com/app.

Transcript
Starting point is 00:00:00 Cynthia Erivo is the best singer in the world She's incredible I don't know what it is about her voice But it like brings me to tears Like every single fucking time I hear her She's the most incredibly like emotional voice I don't know I was like trying to figure out what it was
Starting point is 00:00:15 But it's just like she's just the best Because she obviously she has the power But there's all these like textures in there Did you see the designed-to-go-viral clip Of her visiting her old school? I obviously lost my shit I was The absolute best
Starting point is 00:00:32 is that the students start singing and they just sound like shit It was like my nightmare Just imagine you're one of these kids You're not that It's just like after school class You know it's a little club
Starting point is 00:00:46 You're just doing it for a little bit of enrichment And like you're just kind of plodding along Trying to get through the day And then fucking Cynthia Erivo shows up And they're like all right kid You're up next What do you got? No thanks
Starting point is 00:00:56 No it was so sweet I would throw up. It was so sweet. I'm Kevin Roose, a tech columnist at the New York Times. I'm Casey Newton from Platformer. And this is Hard Fork! This week, Future House CEO Sam Rodriques joins us in the studio
Starting point is 00:01:13 to separate the hype from the reality of AI science. Well, Casey, it's time for some science. Yeah, give me a second, Kevin. I'm just going to put on my lab coat here, get up my butts and burner, and see what you got cooking for us today. So I have been obsessed with this question
Starting point is 00:01:42 of what AI is and isn't doing for science and scientific discovery. Obviously, this is something we hear a lot about from the leaders of the big AI companies, people like Dario Amadeh, Sam Altman, Demasasas. They have all been saying things in recent months, about how close they believe we are to solving new scientific problems and curing diseases and fixing the climate with all of these new AI tools that they're building. And some of that is
Starting point is 00:02:10 obviously hype, or at least has the sort of markings of hype. But there's actually a lot of real stuff going on in AI and science that I just do not feel personally qualified to evaluate. Yeah. And I would also say that science has become one of the main way is that the leaders of these tech companies want us to evaluate them. Because whenever, one of their models does something horrible, the message we basically get back in response is, don't worry, we're about to cure cancer. Just hang on tight. I know that this chatbot might be driving you to madness, but if you can just give us a few more releases, we're going to do some really good stuff. Yes, and this is something that we're also hearing now from the U.S.
Starting point is 00:02:45 government. The Genesis mission was announced by the White House just before Thanksgiving. That is what they're calling a dedicated, coordinated national effort to unleash a new age of AI, Accelerating Innovation and Discovery that can solve the most challenging problem. of this century. I thought the Genesis mission was just them trying to get Phil Collins to play the White House Christmas party.
Starting point is 00:03:04 I guess not. And so today we have brought in a bona fide scientist to help us understand which of the sort of scientific discoveries and possibilities out there are real and which are not. We need an expert with a broad focus,
Starting point is 00:03:20 someone tracking the impact of AI, not just on biotech or drug discovery, but across the different sciences. And Casey, we have found the perfect person. Let's hear about him. Sam Rodriguez is the co-founder and CEO of Future House and Edison Scientific, which is a San Francisco-based, I guess it's both a non-profit and a for-profit. Where have I heard that before? Yes, come back when he has his board coup.
Starting point is 00:03:45 Future House is the nonprofit. Edison Scientific is the for-profit that spun out of it. I've been to their office in Dog Patch. It's really fun. It sort of feels like a kind of wacky, mad scientist lab. They've got all these, like, you know, sort of lab machines. that I don't understand, you know, people running around in lab coats, and they're all talking about AI, and it just feels like kind of a cool place to be. And they are building what Sam calls
Starting point is 00:04:06 an AI scientist, which is an AI agent that can do sort of parts of the process of scientific research. And Sam was also himself a scientist. He has a PhD in physics from MIT, and before he launched Future House, he spent several years running an applied biotech lab. So he has sort of seen this stuff happening from a couple different angles. Yeah, and today we want to talk to him about what he is up to, but also kind of get his vision of the entire landscape. Tell us what is working, what isn't, where's the hype, where's the real stuff? Sam has a lot to say about it. Yes, and I think it's fair to say that Sam is on the more optimistic end of the spectrum of
Starting point is 00:04:41 beliefs about what AI will do for science. But as you'll hear in our conversation, he's more skeptical than some of the most optimistic people who are claiming that will cure all disease in five or ten years. Yeah, if you've been craving a little bit of cold water for the wildest projections, he has some of that to offer you. So let's bring him in. When we come back, we'll be joined by Sam Rodriguez. Sam Rodriguez, welcome to Hard Fork.
Starting point is 00:05:33 Hello, thank you. So we have brought you here today to be our science expert, our guide to the biggest recent AI-powered breakthroughs that are happening in science. This is an area that I sort of understand in an ambient way is important and there are big things happening, but neither of us are scientists, although I did make a killer baking soda volcano in elementary school. So we have so much to talk about today. But before we get into some of the particulars,
Starting point is 00:05:59 I want to ask you about your project that you've been working on. Last month, the commercial arm of your nonprofit, which is called Edison Scientific, launched a new AI scientist called Cosmos, that you say can accomplish work equivalent to six months of a PhD or postdoctoral scientist in a single run of this model. Tell us about how Cosmos works and where that six-month number comes from.
Starting point is 00:06:21 Yeah, yeah, exactly. And actually, I will just start out by saying that When I got that six-month number, my reaction originally was like, there is no way that this is true, right? And we've now measured in a bunch of different ways. I can walk you guys through that. But basically, just to take a step back. So we've been working for two years on figuring out how to build an AI scientist. And the concept here is there's so much more science that we can do than we have scientists, right?
Starting point is 00:06:45 And so how do we scale up science? And the thing that is that happened with Cosmos, that is pretty cool, is cosmos is like the first. thing that I think that we've made that actually really feels like an AI scientist when you're working with it, right, which is to say that you go in, you give it a research objective, it goes away and it comes back with insights that are actually like really deep and interesting and sometimes wrong, but about 80% of the time right, which is like kind of similar to like if you ask a human to go away and do something, comes back like similar percentage of the time is right. And it's like, it's a kind of new experience working with it. So that's all,
Starting point is 00:07:23 That's very exciting. The six-month number, specifically, the way that we measured this was we had a bunch of academic collaborators, you know, scientists who had done a bunch of science previously, that they had not published yet. And we basically gave the same research directive and the same data set to the AI, to Cosmos, and we ask it, you know, to go away and just make new discoveries. And it would come back and it had found the same things that the researchers had found overnight. And then you go and you ask the researchers, you know, how long did it take you to find
Starting point is 00:07:56 this in the first place? And they would say like three months, five months, like six months, whatever. And so that's where it comes from. And it's like that's the amount of time that it took them to come up with the finding. So let me just ask you a couple of questions so I can ground myself here. Is this tool kind of a box you type into like the other chatbots? And if so, what is powering it? Did you guys sort of build your own model from scratch? Did you sort of, you know, make fine tunings to another company's model? Yeah, yeah. So it is indeed a box that you basically type into. You ask a research objective. It's not a chatbot, right? Like, it runs for 12 hours or so before eventually coming back to you with its findings. In terms of how it's built, we build on top of a bunch of different language models from Open AI, from Google, from Anthropic. Like in any given run, we use models from all the different providers. We also have like our own models for specific tasks that we've trained internally, where those models are like much better, for the specific tasks that we train them on,
Starting point is 00:08:53 then the models that the frontier providers make. Got it. And then the key insight in Cosmos is basically this use of what we call, like a structured world model. So one of the main limitations with AI systems today is that they're just limited in the length of the task and the sophistication of the task that they can carry out before they kind of go off the rails.
Starting point is 00:09:12 So they, like, you know, forget what they're doing. They no longer are on task. And what we figured out was a way to have them contributing to this world model that gets built up over. over time that basically describes like the full state of knowledge about the task that they're working on, which then means that we can orchestrate hundreds of like different agents running in parallel, running in series, and have them all working towards a coherent goal. And that was like the real unlock. Right. Another thing that I found interesting about Cosmos is the cost.
Starting point is 00:09:41 This model costs $200 per prompt. Yeah. So every time you give it a task, you're paying $200. Why is it so expensive? I mean, it uses a lot of computers. I mean, that's like the fundamental answer is it uses a lot of compute, right? Give us a sense of how much? Well, so an individual run from Cosmos will write 42,000 lines of code and read 1,500 research papers
Starting point is 00:10:03 on average. Like, if you run Claude, it might write like a few hundred lines of code, right? So that gives you some sense. It's like there's a lot of compute that is going into this. Have you ever had like a scientist whose cat walks across the keyboard
Starting point is 00:10:19 and accidentally hits entered and all of a sudden, spent like $600. This is a problem. This is a problem. And we are like, right, so the thing that you have to understand, right, is that if you are a scientist and you go and do an experiment, you get some data back. You're going to spend $5,000 or $10,000 gathering that data.
Starting point is 00:10:35 And so what scientists want is they want the absolute best performance that they can get. And like scientists who have used cosmos generally come back to me and are like, they can't believe we're only charging $200 for them. Right? And, you know, I will say like, you know, $200 right now is a promotional price. we actually have to eventually charge more. Oh, it's going up. So get those prompts in before Christmas thing.
Starting point is 00:10:55 Get those props in. Exactly. But like, but really, you know, it's like if you have to spend thousands of dollars gathering the data, like the cost at the end of the day is not the limitation. We do have to be very generous with refunds because people have, you know, make mistakes over time. I made a typo. Yeah, exactly.
Starting point is 00:11:10 Yeah, yeah, yeah, yeah. So what you just mentioned about the sort of the test that you all ran to figure out how long this thing could run for, how much time it was saving scientists. that's about like sort of replicating existing research that's out there. But a lot of what we hear from the people who are running these big AI labs is the possibility that pretty soon AI will start making novel scientific discoveries. We'll start doing things that existing scientific methods and processes can't do. How close are we to that? That's already happening, actually.
Starting point is 00:11:42 So if you go and you read the paper that we put out about Cosmos, we put out seven conclusions. that it had come to, three of which were replications of existing findings, four of which are net new contributions to the scientific literature, like new discovery. And of those, what's the most impressive? So, like, one of the ones that we really like, the human genome contains millions of genetic variants, right? These are differences between different people's DNA that are associated with disease. And for the most part, we know that a variant is associated with a disease, but we have no idea
Starting point is 00:12:16 why, right? And so we asked Cosmos, we gave it a bunch of raw data about a huge number of different genetic factors. So, like, what the variants are, what proteins bind near the variants, right? Like all these kinds of things and just asked it for type 2 diabetes to go and, you know, identify a mechanism associated with one of these variants. And it came back and it identified this was a variant that was not in a gene. And Cosmos identified that this is actually somewhere where a different protein binds. It was able to identify what protein binds and what gene is being expressed and connected that to the actual mechanism of that gene, SSR-1, which is involved in the pancreas in
Starting point is 00:12:58 secreting insulin, right? Okay, so in this case, is what I'm hearing that your model was able to do some very fancy reasoning over some existing data and identify something that sort of no other human scientist had gotten around to and might not have for a really long time. Yeah, that's right. Okay. And I think science generally consists of deciding what data to gather, gathering that data, and then drawing conclusions.
Starting point is 00:13:23 And so at this point, basically, it's like step number three that Cosmos is aimed at, you know, and there's more work on. You left out step zero, which was getting the Trump administration to unfreeze your funding, but everything else was right. So what happens when you get a discovery like this from Cosnos? Do you have to then go validate it? Do you hand it to, like, a team of researchers who then, and have to, like, make sure it works?
Starting point is 00:13:43 Or, like, what happens next? Yeah, absolutely. You have to go and validate it. And so that's actually one of the things also, you know, in the paper, actually we described how we went and validated that particular variant. In general, when people are using it, yeah, you go in. I mean, actually, literally, when you run a customs run, the first thing you have to do is you have to understand what's telling you.
Starting point is 00:13:59 Because it has just done something that scientists think is, like, six months worth of work, and you're going to sit there for a long time just, like, reading and understanding it. Once you've read it and understood it, then, yes, indeed, you're going to go and you're going to run, you know, various experiments, do your own analysis, cross-reference to try to, like, convince yourself that this is true. And then based on what your research directive is, you'll decide next steps, right? You know, in this case, I think it's probably low likelihood. There's a new drug target, like, from this particular finding, right? But you could go and you could run this on other findings, and then eventually maybe you find new drug target, you start a drug program, that's, you know.
Starting point is 00:14:35 So one concern that I've heard people express about models like Cosmos is that that you that this is just, like, sort of not where the roadblocks are, that the sort of reason that we don't have more AI discover drugs and design drugs out there curing diseases is not actually because, like, we don't have the research methods to discover those. It's because there's, like, you got to go to trials and you got to recruit human subjects and you got to get FDA approval. Like, all that stuff just takes a lot longer than the actual discovery of the drug. So what problems are models like these helping to solve in our scientific process? right now. So, so absolutely. I actually like, you know, I really agree that like the bottleneck at the
Starting point is 00:15:17 end of the day in solving medicine is basically, you know, clinical trials. I mean, and the easiest way to see this is if you look at the number of diseases that we like know how to cure in mice, right? It's like astronomical because obviously you can just like run experiments. And in humans, things are just slow. That said, if you think that every experiment that is being run right now, by pharma companies, like every clinical trial that's being run is like optimally planned and optimally, you know, conceived given the full state of knowledge, you are off your rocker, right? There's like no way. And those experiments cost hundreds of millions of dollars.
Starting point is 00:15:55 And so the question is like, we do at the end of the day have to run clinical trials. How do we make sure that those experiments are the best experiments we could possibly be running, given all the knowledge that we have, given all the data we have? There's so much data that we have that has insights in it that are. waiting to be found where we just like do not have people to go and find them and that's ultimately going to feed into better experiments, better trials, right? Well, so then I'm curious how you see your tool fitting into the workflow of today's scientist. Is it the sort of thing where like I have completed my experiments and now I want some help doing some analysis? Is it I have all these
Starting point is 00:16:33 old experiments that I only did a little bit of analysis on and I'm curious I can like sort of squeeze any more juice out of them or like what other ways are you? seeing the AI being like really good right now for a working scientist? Yeah, yeah, great question. So going back to me in 2019, which is when I was wrapping up my PhD, right, I had this gigantic data set and I wanted to graduate because I was a PhD student, which meant that I was making like, you know, $40,000 a year or something on, and like there were trying of great opportunities to go out and like, don't be a PhD student anymore.
Starting point is 00:17:03 Okay, so I spent six months literally just like sitting at my desk, like trying to analyze the data and drawing conclusions, reading papers, right? For right now, that's what, that's where Cosmos fits in. It's like, you know, you would just take that day as that you give it to Cosmos. It comes up with a lot of findings. Right now, you need to go and do a bunch of manual work to validate those findings and so on. Pretty soon, it's going to come with findings and you're going to be like, great. Sam, I'm curious if you could help sort of give us, and our listeners, a state of the world of AI science right now. Recently, the White House announced what it's calling the Genesis mission, which is a federal effort to kind of
Starting point is 00:17:41 of corral and harness all of these data sets that the federal government is sitting on and use them to do new scientific exploring. We also have lots of efforts, including yours, but lots of things going on in and around the tech industry, the biotech industry, people doing AI for materials, science. Give us a sense of like the lay of the land of like what's hot right now in AI science. Where is the effort and money going? Right. In order to understand the landscape of AI and science, the first thing, like, fundamentally that you have to understand is that AI is about building models, right? So, for example, right, like a language model, like, what is a language model? A language model is fundamentally a model of human language. It just so
Starting point is 00:18:26 happens that when you build a model of human language, it like learns how to think like a human in some sense because humans like encode their thoughts in language. This is like one of the greatest discoveries, right? Certainly the 21st century, maybe of all time. So similarly, when we talk about AI and science, what you have to think about is that you are modeling things. That is what AI does. And there are kind of two fundamental categories. There's modeling the natural world, right? And there's modeling the process of doing science.
Starting point is 00:18:58 These things are fundamentally different. And the reason to make this distinction is because, you know, what we are doing, right, we are modeling the process of doing science. The other side of the AI for science world is building models that can. can, for example, predict the structure of proteins, that can generate a new antibody, that can create a new organism from scratch, which are all things that have kind of like happened in 2025 where there's just a huge amount of momentum. Yeah, that makes sense. I mean, of the things that are happening in the part of the sort of process of modeling
Starting point is 00:19:34 the natural world, you mentioned protein folding, novel organisms, like, what has most excited you as a scientist that you've seen? So it's absolutely what's most exciting right now, I think, without a doubt, is this trend towards what we call generative models. So these are things where these are models that can produce examples of, you know, proteins or antibodies or whatever that have desired characteristics, basically from scratch. This is a new capability that we have never had before, and it's huge. I'm curious about the reliability piece as you're running all of these experiments.
Starting point is 00:20:08 You know, I saw this going around on social media this week. I reproduced it myself. If you asked Google is 2026 next year. It said, no, 2026 is not next year. It is the year after next. So in such a world, Sam, some people might get concerned at the idea that we're now entrusting the AI with all of our data analysis. So how much time are scientists having to spend go back and essentially rechecking the work
Starting point is 00:20:33 of the AIs and what kind of tax does that place on their work? Yeah, this is very funny. I mean, look, you have to spend a lot of time going back and checking. Yeah. But, like, to be clear, this is true regardless of whether or not an AI does it or whether you ask a friend to do it. If you're going to publish a paper, you damn well better go back and check it and, like, be sure that you are confident. And it's never going to be 100%, right? The best you're going to do is you're going to get to a place where it is similarly good to if you were doing it yourself, which is not 100% because you're not, in fact,
Starting point is 00:21:08 Right? And checking the work is like always going to be faster than producing it in the first place. Got it. Got it. Right. By a lot. A lot of our biggest scientific breakthroughs in history have come from these kind of strange accidents. These moments of serendipity, you know, penicillin starts growing in a petri dish. Oh, my God. This is great. Does AI preserve that kind of serendipity, those kinds of accidents? Or do they sort of optimize it away. Yeah, this is a great question. And the fact of that is we just like really don't know yet. This is going to be a like really important core question that a lot of people are asking. What's your intuition on? I mean, I think they probably will because they probably will preserve it preserve it. Because penicillin, my understanding is that basically like the window was left open on some agar with like no antibiotic and it obviously didn't have antibiotics because this
Starting point is 00:22:03 was the discovery of the first one. Right. So the window was left open with some agar and like, you know, Some spores flew onto it and began growing and they observed that the bacteria was inhibited, right? That's a mistake. Someone screwed up, right? And that mistake led to something fantastic. And you will have mistakes, I think. That will be preserved. But in the meantime, scientists should always leave their windows open.
Starting point is 00:22:23 You never know what's going to happen. You have no, you know, seriously, though, like there's so much, when you get a graduate student in academia, right? When you get graduate students, first year graduates, they have no idea what to do. They have no idea what to do. And that is a huge source of scientific progress because they just do the most. random, kooky stuff that no one who knew anything, who knows anything would ever think to do. And it's actually, it's actually really important. You almost want your, like, AI scientist model to hallucinate a little bit.
Starting point is 00:22:48 Totally. So that it doesn't lose that quality of, like. We talk about this. It's just, like, adding noise in order to, this is actually important for, like, biological evolution also, right? Like, the genome has a lot of noise, and that's how the, the evolution randomly comes up with, like, new stuff. Mm-hmm. There's, like, there's a protein that, like, it's just totally random, doesn't do anything. then one day all of a sudden, oops, it does something, and that's great, right?
Starting point is 00:23:10 So what do you make of the leaders of the big AI labs, people like Demis and Dario and Sam Altman who are saying, you know, AI is going to allow us to cure all diseases or most diseases within the next decade or two? A decade is crazy. Oh, and I'm happy to take a very strong stance on this, because if I'm wrong, it's a great thing, right? But if I'm wrong, everyone wins. But, like, a decade is crazy. Why is it crazy?
Starting point is 00:23:36 Because for the reason that we were talking about before, you have to run clinical trials, right? If we had a drug right now that prevented aging, completely halted aging in humans, you know, between the ages of like 25 and 65 or something, you would not know for 10 years because you can't detect in humans in that age range whether or not they're aging for like at least like, you know, five or 10 years. You don't detect from one year to the next that you're aging. So you won't know if the thing is working. I don't know. Some people at my 10-year high school reunion were already looking pretty well. This is fair. I hate to say it. I did say 25. Yeah, okay. Fair enough. Fair enough. But right. I mean, you know, so we have to conduct experiments. Those experiments will take time. Now, will we, like 30 years, I think is very plausible. We don't know what is going to be possible. We don't know if it's possible to halt aging. We don't know if it's possible to cure all diseases or whatever.
Starting point is 00:24:35 But between now and 30 years from now, I think you should expect to see a humongous leap forward in terms of medical biology. Let me drill in on that a bit, because I think some people might hear that in saying that this is essentially a regulatory issue, that like we just don't have, you know, the FDA set up to measure this. I'm curious about the experimental side of it, though, right? Because my understanding is like we don't really have enough biologists to run all the experiments that we might not have like the funding to fund the experiments. And you did raise the point that some of these experiments. just actually take a long time to run, right? So, like, what are all of the factors that, in your mind, are just going to make it so hard to... My gosh, you have to go and you have to, like, you know, even supposing you have a molecule
Starting point is 00:25:15 that you want to test in a human and you know which humans you want to test it in, you have to go and make it, right? Humans are big. They require, like, a lot of it. You have to make sure it's, like, high enough grade that you can actually put it into a human. You have to find the patients, which means forming relationships with the doctors, right? Actually, you know, waiting until you have enough patients who are willing to do it for many diseases, is like there just aren't that many patients.
Starting point is 00:25:37 And so finding the patients is hard, right? And it just, and then you have to actually dose them. You have to wait and see what happens, right? Even with no regulation, it would be slow. There's no AI shortcut for almost any of that, at least not right now. No. Like, what AI will allow us to do is it will allow us to discover a lot of things where we already have the information to discover it.
Starting point is 00:25:58 We just haven't figured that out yet. You should not expect that you're one day, going to, like, get GPD 7 and just, like, ask it how to cure Alzheimer's, and it will just tell you, my expectation is that there is not enough knowledge, where we do not have enough knowledge to solve it in principle, even with infinite intelligence, right? Like, with infinite intelligence, there would still be some things that are just not known about the world where we have to conduct the experiments to see. You'll be able to plan the best possible experiment, given everything it's known, but you
Starting point is 00:26:30 will not just be able to, like, you know, de novo kind of. figured out, right. Casey, I took Latin. That means from new. Oh, thank you. Thank you. That's saved me a step of Googling. When we come back, we'll play a game of overhyped or underhyped with our guest, Sam Rodriguez. this isn't quite science per se but i'm curious what you make of this sam all of the big AI labs are obsessed with math yeah with winning the international math olympiad with putting up
Starting point is 00:27:28 a gold medal score, with solving these unproven math theorems. And I have a take about this, which is that I believe that this is because these labs are filled with people who were themselves competitive math elites in high school and took part in the IMO and did pretty well. And a lot of those people think that, like, AGI will just sort of be a slightly smarter version of them. But I'm curious, like, why are these places so obsessed with math as being one of the sort of first places that they want to make a lot of progress? There are two reasons. I think that one of the reasons is exactly what you just said. It's just familiar, right? But the other reason is that you can measure progress, right? So ultimately,
Starting point is 00:28:08 like, a big part of what drives progress in machine learning is benchmarks. With math, you can tell whether or not your proof is right. And there's kind of, like, an infinite number of things to go and prove. So it's just, like, really easy to tell whether or not you're getting better. And things like the IMO just present, like, great opportunities. By contrast, if you look at, like, some of the biggest breakthroughs recently, you know, biggest breakthroughs this year in AI for biology, right? Things like, you know, Chai Discovery,
Starting point is 00:28:36 Nabla coming up with these extremely good models for producing antibodies de novo, right? Huge breakthrough. But, like, ultimately,
Starting point is 00:28:46 the win for them is going to be, like, when it's approved in a human, and that might be another five years or something. The Arc Institute,
Starting point is 00:28:55 putting out, like, the first time anyone has designed an organism from scratch, they designed a bacteriophage. It's a kind of virus that infects bacteria.
Starting point is 00:29:01 Incredible, right? But, like, just harder to evaluate, like, how good is it? Like, you're not going to release it into the wild. And so it's harder to evaluate, whereas, like, the IMO is just, like, super clean. And so one thing that we think about a lot is just, like, you know, how do we get really clear benchmarks that we can pursue to measure whether or not we're doing a good job at science? I have an answer here. International Cancer Curing Olympiad. I like that. Should we start this? I think that would be great. We can give people a medal if they win. Let's get on it, labs. So when the CEOs or the leaders of these companies make these statements about how we're going to cure all disease using AI in the next 10 years or 15 years or whatever timeline
Starting point is 00:29:46 they give, are they doing that because they don't understand the bottlenecks? I mean, these are very smart people. So what are they not seeing? Or are they just doing this as sort of a marketing exercise? Is this an attempt to get people excited about AI who might otherwise be freaked out about it? And why are they giving these projections? No, look, I mean, I think reasonable people could disagree. There are lots of reasons why you could argue that, like, actually the models will get super smart and they will figure out ways to measure whether or not we're making progress before you run a clinical trial. And that will increase the iteration cycle, right?
Starting point is 00:30:20 Like, there are reasonable arguments to be made about that, right? Like, you know, that we are just going to not do full clinical trials anymore. We'll just, like, use biomarkers. Like, that's not crazy. And that's one way that I could be wrong. And maybe in 10 years we do have cures for all diseases. So that's part of it. Like, obviously there's part of it, which is that they want to hype the thing.
Starting point is 00:30:41 Part of it is that, you know, does Sam Altman, like, really intimately understand, like, what it takes to go and manufacture, like, scale up manufacturing for a small molecule to put into the clinic? Like, probably not, right? So there's a mixture. I don't think any of it's in bad faith. It's just people are very excited. There will be a little bit of a collision with reality at some point. We're going to see exactly where that is. But regardless, the future is going to be awesome.
Starting point is 00:31:08 At this moment in 2025, how much do you think AI tools have changed the life of a working scientist? And how different do you expect that will be a year from now? I think you'd be shocked at the extent to which they have not yet. Scientists in general are extremely conservative people, because if you're running an experiment, in biology at least, you usually do not fully understand why the experiment works or why not. There are some things that you've inherited
Starting point is 00:31:37 from protocols that you've run in the past, where it's like, we do it this way. You could go and test it, but there are way too many things to test. So you're just kind of, like, locked into your methods, and it's what works, and you just want to do what works. And so, for that reason,
Starting point is 00:31:49 like, biologists just adopt new methods slowly. I think most labs around the world are still probably doing science the way they've done it before, and probably will continue to do so for a while. And that's okay. You know, one place, I think, with coding, a lot of people are already adopting it, because in biology, historically, coding has been a big bottleneck. It's a huge unlock now that biologists who didn't know how to code can, like, do a lot of coding
Starting point is 00:32:12 using Claude Code, using OpenAI's models, Gemini, et cetera. So that's a huge unlock. I think that's going to see a lot of adoption quickly. Literature search, right? Like, being able to parse the immensity of the scientific literature, that's a huge unlock. That's going to get adopted very quickly, right? The tools like what we're building are, like, a little bit more frontier. Ultimately, people adopt them when they see other people using them and getting great results. Sam, can we play a little lightning round game here with you? We're calling this one overhyped, underhyped. So we'll tell you something, and you tell us whether, in your scientific opinion, it is overhyped or underhyped. You ready? Yeah. Vibe proving. This is when AI systems go out and, like, write
Starting point is 00:32:48 math proofs. Probably, if I have a forced choice, probably overhyped. I mean, it's great as, like, a progress driver in AI, and being good at it will probably have implications elsewhere. But is it itself that useful? I'm not sure. Robotics for AI lab automation. Robotics for automated AI labs, or? Yes. Or for automating scientific labs. Robotics for automating scientific labs. I think appropriately hyped. It is going to be
Starting point is 00:33:25 totally transformative. The technology is not at all there yet. There's a lot that we need to do, but, like, yeah, probably appropriately hyped. AlphaFold 3? Um, that's an interesting one. I mean, I think I would say probably, like, underhyped, in that, I think, like, all of the protein structure models, there's a lot of hype around them, but they're still probably, like, going to be extremely transformative. So maybe I would say probably underhyped. There's a lot of hype around it, though, so it's a hard decision to make.
Starting point is 00:34:00 Virtual cells. We heard from Patrick Collison this summer about what the Arc Institute has done with making a virtual cell. This is overhyped, but for a specific reason, right? Like, the models that they're building at Arc are awesome. And they're doing similar things at, like, NewLimit, Chan Zuckerberg, right? Like, many of these great companies and great organizations are doing it.
Starting point is 00:34:23 I think that, like, calling it a virtual cell is, like, a little bit overhyped, right? Like, ultimately, that kind of model is something, like, very specific. Like, actually building a true virtual cell, like, being able to simulate a cell in a computer, is an amazing goal. We are very far away from that. Quantum computing. Overhyped. Brain-computer interfaces.
Starting point is 00:34:53 Oh, man, this one's really hard. I'm going to say overhyped. I'm a huge believer in BCIs. I think, like, effective BCIs, the way that we imagine them in sci-fi, are further out than people imagine. Even, like, Neuralink is making amazing progress. Yeah, Casey's got one in his head right now. It's on the fritz.
Starting point is 00:35:14 Yeah. There are a lot of great people who are making progress there, but it's further out, I think, than people think. So we're nearing the end of the year. If we can put you in a bit of a reflective mode, what do you think were the top three AI-driven scientific advancements this year? Yeah, I think that honestly,
Starting point is 00:35:34 like, this year has been the year of agents. This was the year when people discovered agents. And so I do, like, you know, in good faith, have to put myself, have to put us on that list. Also, with Google's co-scientist, I mean, we're not the only people who are working on this. You know, Google has been doing a great job. There are a bunch of other people.
Starting point is 00:35:51 So AI agents for science, definitely. And then, like, generative design is just having a huge moment, right? So the other ones would probably be the work that Chai has been doing, the work that Nabla has been doing, and many others on de novo antibody design. I'm really glad you defined de novo earlier in the broadcast, by the way. It's come up a lot. Yes. Sorry. When I say de novo, I just mean, like, literally it generates it from scratch.
Starting point is 00:36:15 You don't give it anything, right? Or you give it a target that you want to bind to, and it generates it from scratch. This is huge because, like, basically, the promise that companies like Chai, Nabla, and so on are going after is a world in which you can say, like, we know to cure this disease, we have to target that protein. You click a button, and you have an antibody that you can go and put in humans tomorrow. It's huge. It cuts out an enormous amount of what people had to do previously.
Starting point is 00:36:40 So that's a huge one. And the third one, I just think, like, what Brian Hie, Patrick Hsu, and so on at the Arc Institute have done with, like, generating organisms de novo, sorry, generating organisms from scratch. We can say, we know what it means now. That's the important thing. This is our, like, Pee-wee's Playhouse word of the week this week. The de novo design of organisms. Is it useful?
Starting point is 00:37:01 I don't know. Is it awesome? Like, absolutely. It's so, it's such a big breakthrough. And Sam, what should we be watching for next year? What are you excited about that may be coming down the pipe for 2026? Honestly, it is, again, going to be the agents that see an explosion. We are right now at, like, the beginning of that S curve, and that is going to continue.
Starting point is 00:37:22 Maybe a year ago, I would tell people that I thought in 2026, or maybe 2027, that, like, the majority of the high-quality hypotheses generated by the scientific community would be generated, like, by us, or by, like, agents like the ones that we're building. And when I said it in 2024, I thought I was overhyping, right? I mean, I was just like, I need some hype. At this point, it may be real. I mean, I think 2026 would be ambitious for that.
Starting point is 00:37:47 I mean, that's huge, right? For the majority of the good hypotheses that come out to be made by agents, that's a huge leap. But, like, 2027? Yeah, man. I mean, 2026 is going to be the year when we just see these agents start to, like, infiltrate everything, right? Infiltrate labs, infiltrate people's normal life. I mean, it's already happening. Cool. Yeah.
Starting point is 00:38:06 Well, I look forward to it. Sam, thank you so much for giving us the science education that we clearly didn't get in school. Yeah, you've really given us some de novo things to think about. I appreciate that. Good. Thank you, guys. Thank you. Hard Fork is produced by Rachel Cohn and Whitney Jones.
Starting point is 00:38:45 We're edited by Jen Poyant. Today's show was fact-checked by Will Peischel and engineered by Chris Wood. Original music by Diane Wong, Rowan Niemisto, Alyssa Moxley, and Dan Powell. Video production
Starting point is 00:38:58 by Sawyer Roque, Pat Gunther, Jake Nicol, and Chris Schott. You can watch this whole episode on YouTube at youtube.com/hardfork. Special thanks to Paula Szuchman, Pui-Wing Tam,
Starting point is 00:39:10 and Dalia Haddad. You can email us at hardfork@nytimes.com. Thank you.
