No Priors: Artificial Intelligence | Technology | Startups - Biohub: The Future of Biology is Open-Source with Co-Founders Mark Zuckerberg, Priscilla Chan, and Head of Science Alex Rives
Episode Date: June 10, 2026
Biohub started with an ambitious goal of curing, preventing, and managing all disease by the end of the century. A decade later, thanks to the convergence of frontier AI and biological data, that goal... may have been too conservative. In this episode, Elad Gil and Sarah Guo sit down with Biohub co-founders Mark Zuckerberg and Priscilla Chan, alongside Biohub Head of Science Alex Rives. Together, they discuss Biohub’s $500 million virtual biology initiative, which integrates frontier AI with wet-lab work to build predictive world models of cells, proteins, and systems. They also talk about their newly announced open-source engine for digital protein and antibody design, ESMFold2; why Biohub is a nonprofit rather than a venture-backed startup; and how hierarchical simulations will soon allow doctors to treat patients at an individual, mechanistic level.
Sign up for new podcasts every week. Email feedback to show@no-priors.com
Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Biohub | @finkd | @alexrives | @ChanZuckerberg
Chapters:
00:00 – Cold Open
01:02 – Mark Zuckerberg, Priscilla Chan, and Alex Rives Introduction
01:26 – Why Biohub and Their Mission
08:27 – Integrating Frontier AI and Frontier Biology
09:45 – Micro to Macro Biological Modeling
14:22 – Mechanistic Interpretability
16:58 – Why Biohub is a Non-Profit
21:41 – Understanding How Biology Works
24:23 – Timeline for Curing All Diseases
26:25 – Translating Research to Patient Impact
28:04 – Launch of ESMFold2
32:13 – Tackling Off-Target Effects and Edge Cases
38:39 – Putting the Tech in Individual Hands
41:06 – Talent at Biohub
44:25 – What’s Next After ESMFold2
46:10 – Connecting ESMFold2 to Agentic Systems
46:51 – The Virtual Cell
49:33 – Defining Success for Biohub
51:52 – Biohub Strategy Update
56:20 – Conclusion
Transcript
We just want to give tools to the whole scientific community.
We want to understand how biology works.
I want to understand the genetics of this person.
I want to understand the risks they have to different illnesses.
My goal is to be able to treat the individual as an individual,
understand the mechanisms, and be able to intervene.
We'll have a bigger impact by getting this in more scientists'
hands quicker, by doing it as open source projects instead.
It's not just like there's some factory somewhere that you can pay to produce the data.
You actually need to invent new, novel
scientific approaches. The theory isn't that we're going to cure the diseases. We're not.
It's that we want to help accelerate the pace of progress for the whole scientific field.
We folded over 1.1 billion proteins and predicted their structures. And we didn't design a model
for antibodies. We didn't design a model to be able to bind one particular target. We just
designed a model that could understand proteins. If we could design a protein to actually change
the physiology, then we can actually cure someone.
Today on No Priors, we're joined by Mark Zuckerberg, Priscilla Chan, and Alex Rives.
We'll be talking about Biohub and all their various efforts to now start applying AI at scale to build world models of cells and different levels of interactions across biology.
Mark, Priscilla, thank you for doing this.
Yeah, thanks for having us.
This is fun.
Alex, congratulations on new missions.
Thank you.
You guys made Biohub your primary philanthropic effort and then committed $500 million to this virtual biology initiative.
Can you tell us a little bit about, you know, why do that, and how did you go from we should fund this to this is like who we are?
So Biohub in its current form, we're super excited about.
We feel like it's a really good fit for who we are and what we bring to the table and what we can achieve together.
But this work started 10 years ago when we were thinking about how can we give back.
And Mark had, Mark wanted to build an organization that could cure, prevent, and manage all disease by the end of the century.
And we had a series of hilarious meetings with scientists that, like, famous Nobel Prize winning scientists were just laughing at us.
Was that your starting line? We're just going to cure all disease.
No, no. And to be clear, we don't think that we're going to be the ones curing the diseases.
Our goal was always to build tools that could accelerate the whole scientific field. That way, the scientific field collectively could cure all the diseases.
But still, I think people thought that by the end of the century was a stretch.
Now I think it's like too conservative.
And so we kept being like, okay, well, we had these series of funny, awkward educational conversations where we were like, okay, but like why?
Like, why do you think it's impossible?
And like, you know, just being the person in the room is just like, well, I don't know why.
You tell me.
Finally we got people to like, they're like, fine, if you really must know.
And we're like, you know, we do.
It seems important.
You know, they were like, well, we work in silos,
and when you publish, information doesn't get shared, it gets locked up for long periods of time,
and we don't have tooling.
They gave the example of, like, a great tool gets built by one postdoc in a lab, and it lives
on their computer, and when they graduate, the tool is gone.
What we heard was that it's very hard to build shared tools to move science
faster, or to build a shared knowledge base to quickly move science faster.
And that's sort of where we began, thinking about, okay, if those are the problems, what can we contribute?
Yeah. I mean, so the original Biohub model was basically to focus on long-term tool development by bringing together engineers and scientists across multiple universities.
And basically it worked. And we started off with CZI doing a number of different things.
And I think over time we just felt like, okay, the science piece is really working.
And we just kept on investing more and more and more in it until now it is basically the primary and main thing that we're doing.
And we've expanded from the original San Francisco Biohub to a handful now at this point.
There's New York.
There's Chicago.
The real focus and the unifying theme at this point is the virtual biology initiative, around taking the unique data sets that
are able to be generated in order to model effectively,
starting with the smallest pieces of proteins,
but then eventually cells and whole biological systems.
But that's kind of how we've evolved is,
you know, this idea that we talk about around
that some of this is an AI problem
and you want to build a frontier AI lab,
but you need to couple that with a frontier biology effort
that can do the work
of basically being able to understand and get the data that you need to actually be able to build these models.
Because with language models, there's just a lot of data out there on the internet, and that's not really the case with biology.
I mean, there are obviously a bunch of different data sets that exist that academia and scientists have generated over the decades.
But a lot of the stuff that I think we want to put into this, it doesn't exist, right?
It's like you want to be able to visualize things that people haven't been able to see before, which is why we're doing the imaging work.
You want to be able to record things that are going on inside the body, which is why we're doing the kind of cellular engineering work.
You want to be able to measure things like inflammation in ways that haven't been possible, which is why the Chicago Biohub is focused on building those kinds of devices and being able to do that.
And that will fundamentally create new types of data sets that will allow new types of models.
And I think it's just a very exciting thing. Going back to what you were saying, if what the scientific field primarily needs is the kind of tooling
that is going to empower scientists across the field to do their work faster,
that's what we think we can provide through this kind of long-term focus on tool development.
But I think there's a fun through line from where we started, you know,
to the work that Alex is driving now, which is that our very first request for
applications (RFA) here was around single-cell sequencing.
And we wanted to look at sort of like the RNA that is transcribed in individual cells.
And that was possible, but it was still pretty early on in understanding how different cells were expressing their DNA.
To the point where at the beginning, we were just funding methods, like getting people to describe how to do it so that others could share that methodology.
And then that became us funding the Human Cell Atlas, which is now one of the largest databases of
single-cell transcriptomes. It was getting hard for scientists to annotate the data. So we built
CELLxGENE, which was like a very simple annotation tool that scientists could use to make use of that
data. Then a community built up around CELLxGENE and started
contributing more and more data that we had nothing to do with sort of creating or funding or
making happen in the world. And now CELLxGENE is a corpus of knowledge that a lot of the
transcriptomic-based models are based off of and is used regularly by the scientific community.
But still, there were always critiques. Like, this is just stamp collecting. Like, you're just gathering
bits of knowledge, sorry, bits of data, and you're not going to be able to pull scientific
knowledge and wisdom and insights out of it. And we didn't have an answer for a while.
And then imagine our delight when large language models, which could make sense of large amounts of data, became a huge topic of conversation.
And for me, it's like, what if we could actually understand how biology worked?
Move it from a discovery-based science to an engineering-based science where we could systematically understand how living beings, living cells worked and be able to understand why things go
wrong. And so when we saw that moment, we're like, this is it. Something really big could happen here.
Alex, you started at Meta FAIR, but you were on the path to, you know, you'd assembled the
team at EvolutionaryScale and you'd raised venture and you were making progress in your models.
What was the pitch from Mark and Priscilla where you said, like, that's actually the right way to
go after the mission?
Well, I think for me, it was really kind of the moment when I understood that,
you know, they really saw this as an integration
of frontier AI and frontier biology. And I think I had developed conviction that, you know,
this is really a new era of science that's just beginning kind of what's going to be possible
with artificial intelligence. And, you know, we're in the age of information theory at scale.
And we have these systems that can basically kind of predict the next token. And they can,
you know, learn world models from that. They can learn biology from the data. And so, you know,
I think that it was really clear
that, you know, to build kind of that next kind of institution for the next era,
you would really need to have frontier artificial intelligence. You would have to have frontier biology.
You would need to start to put those things in feedback and really have models that are learning from the
biology. And I think, you know, it's just, and you need the right scale and the right people.
And so this just really felt, I think, like the way to do that.
There's a variety of different models that you all have been working on. And I think it's kind of interesting because
some of the earliest breakthroughs in biology were things like AlphaFold, where, you know,
there was a Google model that showed that you could do protein folding at scale in a really
interesting way that people didn't realize was very tractable, and this was pre sort of the really
big transformer waves that came later. And then you're working on a variety of different things
at different scale, right? You're doing incremental molecular modeling and protein folding.
You're doing cell-based stuff. You're thinking about interrogating larger-scale systems in biology.
How well do you think that extends from sort of the micro to the macro?
You mentioned almost starting with building blocks and building up, but
modeling cellular behavior is very different from modeling protein folding.
The data is very different. The modeling is different. I'm just curious,
like, do you think it's all similar in terms of just data and you train stuff?
Or do you think it's actually, there's some differences in terms of how you actually have to deal with these systems?
I mean, there are probably some differences. I mean, you can probably talk more to the specifics around this.
But, like, I mean, I think each layer is going to end up being somewhat qualitatively different, right?
But you need to be able to understand the protein interactions in order to be able to understand
how cells work. So you can't just go straight to cells in a way without understanding the
protein modeling. And then if you're trying to understand something like the, you know, the way
the immune system works or a bunch of cells interact together, then, you know, it's tough to
do that without first understanding cells. I mean, you might be able to, like, a very high level
of abstraction simulate a system. But if you really want to like understand how it's going
to work, you kind of want to build the simulations at each level hierarchically. So that's basically
the approach that we're going through, starting with the building blocks, so
the proteins. But yeah, I mean, I think that there's going to be different types of data that you
want to collect for each. The modeling techniques, I think we'll see. I mean, that'll all keep on
advancing across the board. But I do think that a big part of the strategy is this view that you need
to build it up hierarchically. And, you know, one of the things that's unique about us in the
space is we are very intentional that the AI efforts and the wet lab efforts were a single effort.
And we've done a lot of work to bring them together. And the really neat thing
that we can do is really try to pull and gather data that helps us connect across sort of the
hierarchy. You know, you can look at transcriptomics with space within a cell and look at where it's
localizing. We can look at translucent zebrafish and look at the development across
different cells and when the brain develops. We have sensors that allow us to look at cell-cell
communication and different molecules. And so we can be strategic about the types of experiments and
data we want to collect that helps us bridge across these that makes it so that there's some
connective tissue that helps drive the modeling that, you know, the modeling magic that happens.
Yeah, the reason I asked a question, by the way, is I used to be a biologist. I have a PhD in biology,
and I worked in wet labs for almost a decade and everything else.
Are you looking for a job?
We can talk about that later.
It's not a no.
At this point in my career, you know.
I'm like Danny Glover, you know, in Lethal Weapon. I'm almost at retirement.
But I think, you know, one of the things that was always lacking was this integrative nature across the different layers of biology and the developmental biologists would work on their own.
The molecular biologists would be doing different experiments.
And so that's what I was curious about.
Yeah.
Typically, there's a reductionist view of biology.
and there's a systems view, and those people didn't really work together deeply.
And so one of the exciting things about what you're doing actually is how you're bridging that.
And so that was kind of the basis for the question as well.
Yeah, and if I could add something there, you know, I think that, you know, we're in the age of this kind of information theory in biology.
And so, you know, there are levels of complexity and hierarchy in biology.
And kind of each level is made up of and, you know, constituted by the lower levels.
And so as you want to have that kind of more complete description,
and you want to have systems that can really generalize
and begin to actually answer experimental questions digitally
that you could ask in the lab, you need to have kind of the right basis
for modeling at every level.
And so I think what's really unique about what we can do
is to, as Priscilla and Mark were saying,
you know, really build information at each of these different layers,
collect them, collect kind of those connection points,
but then also we will kind of do it at the scale that will reveal that underlying information
architecture. And that's going to be really critical to actually be able to build digital
representations that can answer new experimental questions.
One of the things that inspires me most about this effort is really what Priscilla said,
which is like, well, there's so much we actually don't understand about biology and what if we could,
which I think is actually very different from lots of other incredibly interesting and useful AI problems
we attack, where we're like trying to replicate human behavior. And a lot of that data is,
you know, on the internet or captured. And without pretending to understand all human behavior,
you can predict a lot of it. I thought one of the most interesting things in your release was actually,
you know, the like mechanistic interpretability stuff you alluded to, which is can we actually
extract new knowledge from, you know, what the model believes is happening, right? Can you talk
a little bit about that? Yeah, I'm really excited about that. So I think, you know, in mechanistic
interpretability kind of traditionally it's been applied to large language models with the goal
of understanding, you know, kind of what is the representation space of a large language model?
How does it compute things? And does that really connect to, you know, what we understand about
our intuitive understanding of the world? And so there's, I think, this really rich toolkit
that has been developed to start to be able to ask those questions. So kind of what does that
mean for biology? One of the classes of models that we train are these
protein language models. So they're really, you know, trained on the codes of proteins. And so
anything they learn about biology is kind of emergent. And we've seen that they can learn things
like biological structure and biological function. And that's just kind of emergent from this,
you know, token prediction training task. So, you know, as we think about like mechanistic
interpretability in those models, you know, we're really seeing the unknown because the models have
been trained on billions of protein sequences. They've been trained on, you know, both known
and unknown biology. And yet they're developing these representations that start to kind of capture
things that we can really see correspond to that reductive picture of biology that's been built up
over the centuries. So kind of you can start to connect the dots between proteins where we kind of
really don't know anything about them with proteins where we do know something because there's
that kind of underlying structure, grammar that's linking them in the representation space of the
model. And at the extreme, it could be, you know, we're going to understand systems in the body
that we didn't before or the mechanism of action for a new treatment because we can ask the model,
right, interrogate that representation. That's right. The hope is that you kind of really learn
the underlying basis for how it's making the predictions. And so you open up the black box and
you can actually understand kind of the biology that the model is representing.
So asking for a friend,
you know, you guys all believe in venture-backed companies as a way to have impact on the world.
What was it like collecting data on zebrafish or the span of the data or the wet lab work or just the scale?
Like what makes this a better fit for this big nonprofit, you know, ecosystem effort versus a venture-backed company?
Well, I think we just want to give tools to the whole scientific community.
And I mean, like so I think in order to have the
biggest impact, I mean, part of it is just we're, I mean, it's not actually clear that we
couldn't run it as a business if we wanted to. I just think that we'll have a bigger impact by
getting this in more scientists' hands quicker by doing it as open source projects instead.
So, yeah, I mean, I think that that's, that's kind of the approach. But I don't know,
it's an interesting question. I'm not sure that, I mean, obviously you were doing it as a,
as a for-profit company, a bunch of the modeling before,
then you run into certain issues.
I mean, you have to raise a large amount of money in order to build compute
clusters.
You know, I mean, it's, I think in a lot of ways, the data is actually even more of a
constraint.
And because if you look at like the scale of these models compared to language models,
they're smaller, but they're smaller because the amount of data is less.
In order to get the data, it's not just like there's some factory somewhere that you can
pay to produce the data. Like, you actually need to invent new novel scientific approaches to be able
to do the, you know, for example, the type of cellular engineering we're doing in New York or the
types of devices in Chicago, which is why, you know, when we're talking about this concept of
frontier biology and frontier AI, the frontier biology is you need to do real science to advance
different biological methods in order to be able to observe the things that create the data
that go into the model. So it's not just like an off-the-shelf thing that you can create. Now,
that's a pretty big effort. I don't know that there are like that many things like that
that are done as biotechs. I think it's just the scale of the ambition of what we're doing,
the time horizon over which we're committed to doing it. I think part of the theory is like
if you're building tools that are this complicated, you kind of want to have a 10 to 15 year time
horizon on building out these efforts. And then the scale of capital required. I mean,
I guess there's no rule that said that you couldn't do it as like an incredibly well-funded
startup, but I think that this just made more sense. And then it also is simplifying strategically
to not have to think about how you're going to make money with the different things. And we just,
we want to get the models in people's hands. We release them as open source. I think that that's a
very valuable thing to do. And again, I mean, the theory isn't that we're going to cure the diseases.
We're not. It's that we want to help accelerate the pace of progress for the whole scientific
field.
As the person least experienced with making money here, I would say that
the sort of neutral nonprofit nature of our work actually helps harness more people to enter this
effort. And to actually achieve the mission of like understanding the totality of human biology
and to cure, prevent, manage all diseases, you actually do need the entire academic biotech
industry to come together and to work on this in a sort of unified way, in part because
there's a lot of talent out there, and it's not helpful to leave any talent, exclude any talent from
the effort. And there's a super long tail of diseases. There are the common ones, and even the common
ones, I think if you unbundle heart disease, cancer, neurodegenerative diseases, even if you
unbundle like dementia or depression, there are many, many, many subcategories that become more
and more niche, and that's not even looking at the long, long tail of rare diseases. Those often get
orphaned and don't get brought along when we're sort of looking at the most efficient way to impact
the lives of many. But if you sort of decentralize the effort and put the tools in many people's
hands, you start getting people who are like, you know what, I am super interested in spinal
muscular atrophy. And that's something I care deeply about. And if you put the tools in that person's
hands, they're going to be able to make progress in a way if you had to focus your efforts and
make big bets, you probably wouldn't because it's just a niche individual small group disease
that actually will in turn, if we can understand that disease process helps us unlock knowledge
about a lot more of how the human body works.
Do you have any thoughts or predictions in
terms of what disease areas this work will impact first? I know it's very hard to be predictive
about these things. But just given the nature of the work and the nature of the models, are there areas
you're most optimistic about in the short to medium term?
That's actually not how I think about it, at least.
The way I think about it is like we want to understand how biology works. In the ideal world, you would say,
I understand the genetics of this person. So I want to think about people at the
individual level. I want to understand the genetics of this person. I want to understand the risks they have
to different illnesses. I want to
understand the mechanistic connection between, say, a gene variant, a protein, and a disease process.
Because if you understand that through chain, then you can design a protein, design a drug,
bespoke to them, and actually make an intervention.
And right now, I'm sure we've all had experiences being sick.
And if you have something that's even remotely non-standard, you go into PubMed, you look up a paper,
you look up the supplement, and then you start going through the methods, and you're like,
am I represented in this paper? And we're just making guesses. We really have no mechanistic
understanding. We're saying, like, okay, you're kind of like these people that we studied.
And this drug kind of impacts the pathway that we think is implicated. Let's try and see if anything
happens. And time passes, and sometimes it works and sometimes it doesn't. So my
goal is to be able to treat the individual as an individual, understand the mechanisms,
and be able to intervene. And there are different diseases that are at different stages of
filling out that whole through line. And so for some diseases, you just want to understand
which gene variants actually cause disease and which don't. And that in itself can be
super empowering to patients. And beyond that, there are some diseases where we
understand the chain, we just can't intervene and change a specific protein function. That's super
exciting too. Like if we could design a protein to actually change the physiology, then we can actually
cure someone. But to me, like, that is just as exciting as understanding, contributing to our
understanding of like how someone gets sick in the first place.
Yeah. And so it's a very exciting
vision because you're basically saying you can bring generalizable tools to provide very personalized
things for each individual person. Yes. And that's the power of the approach is you have these
big models that you build that can then apply anywhere. I know that you mentioned earlier that you were
going to try and cure or prevent all diseases within 100 years. And you mentioned, hey, it could actually be
sooner now, given all the advances in AI. Do you have some thought of when you think we'll be
closer to that goal?
I mean, I'm optimistic it'll be sooner. I mean, I think the thing that's
complicated is that it's a dynamic system, right? So if you fix
something, there will obviously be future things that you need to work on. So I don't think that the
current set of things that we're aware of are going to be the only things that need to get
worked out. But I don't know. I think that the progress with AI is really, is obviously, you know,
very exciting on this. The other thing that I'd say, just adding to what you were saying a second
ago, is we really look at more kind of systems than specific diseases.
So, for example, one area that seems really important to understand is inflammation.
We talked about this a bunch.
This is a big focus of the Chicago Bio Hub.
There's a lot of data on that.
It seems quite clear that it's connected to a bunch of different diseases, but we don't,
rather than studying the specific diseases, we think that by trying to understand inflammation
more broadly, that will make it so that other companies that can then use these tools can
work on specific therapies.
Another example is, I think that the immune system, I think, is a very good case to study for some of the work that we're doing in cellular engineering.
And when we kind of ladder up from proteins to cells to like whole dynamic systems within the body, I think that that one makes sense.
I mean, it's sort of privileged.
It can, you know, the cells can travel around through the body, all that.
You know, so obviously that has a big part in addressing different diseases.
How do you make the immune system function better?
but exactly how do you connect that last mile, I think is going to be more something that biotech
or other academics individually studying things will be better suited to do.
So this is like kind of how we think about building out the tool set that just helps accelerate
all these other folks.
Whether the timeline is 10 years, or, hopefully, you know, less than 100 now, I think it's useful
for maybe your average doctor or patient, human being, everybody's a patient,
to think about like what's externally visible in the progress here. You worked with patients for a
long time at UCSF. Like what should doctors look out for? What should people look out for if you're
actually accelerating progress? This is the part, you know, I'm super excited about the progress,
especially with this launch that Alex and his team have put forward. And I think it's very clear
that science is going to start moving pretty quickly. And I think the thing that's less
clear to me is exactly how we translate to the clinic and what that looks like. And I think
what has to change is actually the way we do clinical research. And my hope is that we're really
shortening the distance between bench research and patient impact. But there's a lot of steps there
that we need people who actually take care of patients to think creatively and think about how to
deploy safely. And that's a gap that we have some work in. We partner with Jennifer
Doudna on a CRISPR program at UCSF. So we're dipping our toe into understanding how the deployment
of research needs to change, given how quickly research will be progressing. But that one is still,
I think, is still shaping up.
Maybe I could say something about our most recent launch,
because I think it also kind of...
You should actually talk explicitly about it.
Yeah.
Yeah. So, you know, I guess it was just a week ago about now. So we announced the new ESMFold. And so this is basically an open system for scientific discovery and protein biology. It's a world model of protein biology. It's language model based, so it's been trained on billions of protein sequences, and it kind of learns these emergent
representations of protein biology. And then we can use it to make predictions of atomic resolution
protein structure, and it's really fast. So it's blazing fast. It's kind of
illustrating this Pareto optimal frontier of kind of speed and accuracy in structure prediction.
And so this allows us to kind of characterize, you know, really vast kind of stretches of the
protein universe. So we folded over 1.1 billion proteins and predicted their structures and
identified kind of features connecting all of them through mechanistic interpretability.
But I think the thing that I thought was most exciting about this model is it's this really
general model of kind of protein biology. And so you can use it as a world model. You can actually
really start to search the space of the world model to design new proteins. And
it's really hitting state-of-the-art across pretty much every structure prediction benchmark,
and especially on protein-protein interactions and protein-antibody interactions,
which is really critical for therapeutic design.
And so what we found is you can actually now use the model to design proteins
and to design actually single-chain antibodies.
And so you can do all of this digitally and then, you know, really in a small number of experimental trials,
basically like a 96-well plate, you know, select from hundreds of thousands of trajectories digitally,
actually synthesize, you know, 96 proteins, test them in the lab, in a really kind of short, easy experimental cycle.
And we found nanomolar binders there. And so, you know, that's really the level for therapeutic activity.
So it's really, I think, showing that you can have these kind of general purpose models.
We didn't design a model for antibodies.
We didn't design a model to be able to bind one particular target.
We just designed a model that could understand proteins, and you kind of get protein design as an emergent property.
And then I also think it illustrates this kind of the power of open science and open source, because we release this as basically an open discovery engine.
And so really anyone can build on it.
And so it takes what are these really intensive laboratory experiments where, you know, you have to screen through hundreds of thousands or millions of antibodies in high-throughput screens in the lab.
And, you know, you can really just kind of spin up an instance of compute and now, you know, be able to generate antibodies.
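[Editor's sketch, not from the episode: the "select from hundreds of thousands of trajectories digitally, then synthesize 96" step described above can be pictured as a simple top-k selection. The scores below are random placeholders standing in for whatever in-silico score a design pipeline might produce. The function name, the candidate names, and the scoring are all hypothetical and are not Biohub's method or the ESMFold API.]

```python
import heapq
import random


def select_for_plate(candidates, plate_size=96):
    """Return the plate_size highest-scoring (name, score) pairs, best first.

    `candidates` is any iterable of (name, score) tuples. The score is a
    placeholder for an in-silico quality estimate (an assumption here).
    """
    return heapq.nlargest(plate_size, candidates, key=lambda c: c[1])


# Hypothetical stand-in data: 200,000 digital design trajectories with
# random scores. Real scores would come from a model, not from random().
rng = random.Random(0)
candidates = [(f"design_{i}", rng.random()) for i in range(200_000)]

# Keep only as many designs as fit on one 96-well plate.
plate = select_for_plate(candidates)
print(len(plate))
```

[The point of the sketch is only the shape of the workflow: a very large digital search is narrowed to a small set that a single short lab cycle can test.]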
You should say more about sort of like we took that data when we did an antibody screen and then we validated it.
We looked at PDL in cells.
And then we looked at it under the cryo-EM and sort of how all that complemented, validated what you were seeing in the models.
That's right. Yeah. So, I mean, I think it's really critical, you know, to actually go and characterize these molecules in the lab.
And it's, you know, we have a structural biology center here. We have incredibly powerful cryo-EM microscopes.
And so we're really able to kind of look at these proteins biophysically
and functionally.
And so, you know, we designed proteins for several therapeutically relevant targets.
And we're able to confirm their function.
It's delightful when it works the way it's supposed to.
Yeah, that's very amazing.
We're able to look at the structure also.
So you can see atomic resolution, kind of at the binding interface.
It's correct.
I know a lot of your work is really focused on basic research and kind of building out the
fundamentals.
If I look at actual translation into drugs or drug development, often a clinical trial will be
15 years, it'll cost $1.5 billion. About 50 million of that often is the molecule in preclinical
work, and it's a few years of work. And the other 1.45 billion and decade plus is actually the
drug development side of it. A lot of that seems to be gated on some regulatory issues, some of it's
recruitment, it's a variety of things, but a lot of it also has to do with the failure of drugs
and trials around things like absorption or toxicity or things like that. Have you considered
at all tackling that other chain of sort of molecular design and thinking, or is the primary focus more on
the basic biology and sort of the initial sort of molecules.
I mean, at least my hope in building this like comprehensive model of how, you know, cells work
is actually also being able to predict off-target effects.
I think you can do some of that actually with biological models.
Because right now some of the off-target effects are we just didn't know, you know, your kidney
cell also expressed this receptor.
And then when we tested in human, like we see it happening and we see renal toxicity.
And so being, and if you have a single cell atlas that looks at all the different cell types, some of which actually were not predicted before we modeled them, you can start looking at which cells actually do have receptors for the target you thought you were exclusively targeting and be able to predict some of these downstream effects before we get into the human trials.
And I think that that's actually one of the more exciting applications of the like a transcriptomic model to understand actually how the different cells will react when you intervene and do something.
And, you know, but I think when you think about delivery mechanisms and patient care, that's where you start having to be creative about, when you ask, like, what disease do you want to cure first,
there are certain diseases that will be easier to, like, deliver a therapeutic to, or the risk-reward makes more sense. And, you know, I think we were all inspired by Baby KJ, I think last year now, when the team at CHOP was able to deliver a CRISPR therapeutic to edit a mutation that he had that would have inevitably led him to significant neurotoxicity
and altered his life.
But we were able to, that disease was very carefully chosen
because we needed to target his liver cells.
And we could easily deliver a product that would work in his liver.
And I think that's when the creativity, the wherewithal to choose the right applications
can help us unlock the first applications.
Maybe something just to add to that also.
You know, because, I mean, kind of you describe the conventional, you know, drug development process, right?
And I think, you know, these kind of tools have the potential to have a lot of impact on that process.
But, you know, what's interesting is to really start to think about kind of the new paradigms that can open up.
And, you know, what does it mean if you can, you know, the barrier to develop a drug, to design a molecule, you know, to kind of get through all of those stages is so much lower.
And so you have programmable biology.
and you can, you know, really start to, you know, create a medicine for every individual patient.
I think that has enormous implications for how we, you know, how we do drug development and what the future of medicine looks like.
It'll be an exciting day when the FDA accepts like a virtual clinical trial for the phase one or something or, you know, it's based on some person to view of that person.
Yeah.
Yeah.
But even short of that, like thinking about the specific, like, mechanisms where you see this acceleration, like, I imagine if people feel like they can
predict impact in kidney cells or have a stronger perspective on tox because they have this
broader understanding, they'll be willing to try many more programs, right?
Yeah, the recruitment could also change.
And we have this program, Rare As One.
And the basic idea is that a lot of people focus on the most common diseases, but there's
this long tail.
And the economics don't quite work out for companies to focus on those diseases.
but if you can make it so that the groups of patients can kind of come together and organize and say,
hey, we would take an experimental drug on this, then it actually, because of the cost that you're talking about
and how that's a huge amount of the overall cost, if you can flip that, then it actually makes it,
so that the economics make a lot more sense to then if you can generate something more easily
and you can pair it with a group of people.
I think one of the interesting things from science and engineering is that often, you know, you can hit your head against the wall on the common
problems and in this case diseases. But a lot of times you learn a lot more about a system from
finding some kind of rare or like weird side thing that's happening. Edge case. So I don't know,
I think that that's like always been kind of an interesting part of this that actually connects
pretty well to this because now you're going to be able to enable a long tail of new kind of
ideas to get tried and enable them to potentially get tested more easily. Yeah. That's a really good point
on rare. In our rare disease cohorts, first of all, they're incredibly inspiring and powerful.
But patient groups are self-organizing patient registries, natural history registries,
biobanks. They're organizing their own clinical trials. There's gene therapy that one disease
group has moved forward over the course of like, I want to say, like, three to five years
rather than decades. And the speed is so fast because the patients themselves
have organized the resources that a scientist or a clinician might need. And it's
incredible. But I think to some degree you're going to need something like this because there are
going to be many more new things that can get created. But that doesn't mean that for like the general
population that you're not going to want the same level of vetting that we've had historically.
But making it so that people who want to be on more of the frontier have the ability to do that
is, I think, also going to be pretty helpful. Yeah. Letting people opt in to be part of
trials, I think, is one of the big shifts that is starting to happen but could really help
accelerate biology in general. All three of you have mentioned at different points, like the
power of open ecosystems in such a large space. Like, I think some of that logic around open
source and the breadth or diversity of data collection, as you were describing, it should also
apply in the like language model world and the multimodal AI world. Like, do you think that's right?
Does any of the work you're doing here change
how you think about AI and Meta?
I mean, I think it's sort of a similar philosophy overall.
And, you know, Priscilla was talking about this, that, you know, a lot of our focus is building
tools that empower individuals to do things.
And that's sort of a common theme across a lot of the things that I work on is just kind
of putting the technology in individuals' hands.
We don't believe in this, like, very centralized future where there should be a small
number of institutions that basically are advancing all of the stuff.
Our vision is not that there's going to be like some central superintelligence that solves all of science.
I think like people are really important and I think we'll be more important in the future.
And giving people more tools to be more productive is going to be like a critical part of any kind of positive future.
And that's how progress has always been made historically, right?
It's not through centralization.
It's through empowering individuals to try things that are somewhat out of the mainstream that other people didn't think were good ideas because they thought they were good ideas that already have been done.
So I think that that's very central to the whole ethos of, I mean, to some degree, it's like
why you create something like social media, to give people a voice.
It's, you know, I think a lot of the stuff that I care about in terms of empowering people
with individual AI.
Open source is one instantiation of it.
It's not the only way to do it.
It certainly is one way that you basically are saying we're going to take this technology
and put it in everyone's hands.
In terms of science, I think it really makes sense.
and we're deeply committed to open source.
There are obviously interesting considerations on this that are important too
because there's a lot of considerations around biosafety and things like that
that we're going to need to balance and think through how to handle.
But I think overall this is like very deep in the ethos of the work that we're doing
both at Biohub and, like, probably a theme for a lot of the stuff that I do is just, like,
we believe that a positive future is one where you build a technology as a tool,
you put it in individuals' hands, and that's kind of how society makes progress.
You have this, like, I think, an incredibly ambitious mission at Biohub, and yet, you know,
the AI scientists that work here could also go work in commercial enterprises.
How do you think about the talent and, like, how to bring people to Biohub?
I mean, when you want to start?
You know, yeah, I mean, it's a very hot market.
for AI researchers. But I think that part of what that means is that there's a lot of
demand, and they're very in demand and can work on the things that they want to work on.
And I think this gets back to this point again about frontier AI and frontier biology. Right.
So if, so yeah, I mean, I think, like, the AI researchers who work here could go work on language
models or things at any of the main labs. But those labs don't have the
frontier biology part attached to it. So I think that there's also a just very large mission component
of this, which is, like, there's an ability to do this unique work here that you just can't really
do at the other places. So if that's what your focus is, then, you know, I don't actually
think that there's any other organization in the world that's doing both the frontier biology
and the frontier AI. Yeah. Why are you here, Alex? I mean, I think it's really simple.
Yeah. Our mission is to take care of disease. And I think, you know, there's, it's just such a powerful...
And you say it with a straight face and a less than 100-year timeline. It's very serious now. There's no more.
Yeah. Yeah. It's a really powerful mission. And I think, you know, you, yeah. I mean, it's just, you know, scientists, I think, are very motivated by that. It's something people are deeply motivated by. And I think, you know, we're at this moment in
time where that actually seems like something that can be achieved. And I think, you know, we're
building a really unique place where we're tackling that problem. And, you know, we have the resources.
I think kind of the right things to actually really, really go after that and do that.
Yeah. I mean, that resonates with me as somebody who, you know, talks to and hires a lot of research
scientists. They want to know if you have the data, if you have the tools, if you have the compute,
if you have the talent and then what the mission is.
And so I actually think that's super competitive.
The other thing is that you don't need a very large team.
Right.
So I think it's like an interesting thing about the world is that people care about different missions.
And that's good.
I think that's like part of the whole, I mean, part of why building these tools and
giving people the ability to explore what they care about, whether it's like across science
or just across everything, is like such a powerful way to make progress in society is that people
care about different things. And in order to make progress in AI, you don't need like many, many hundreds of
AI researchers or thousands or anything like that. I think you can really make progress with, you know,
a very strong group of a dozen or a couple dozen people. And yeah, I mean, finding people who like
care about this mission is not a particularly hard thing. I mean, this is like a super important
thing in the world. So I think that that's, yeah, it's just kind of a cool thing about the world is that
people obviously are drawn to different missions.
So I think the simplest mental models that folks have, even if they're paying attention
to the space, are essentially like, okay, you know, structure prediction models for proteins
and protein-protein interaction models. And then so there's this one piece, which is fundamental
understanding. And then there's this, like, theory of someday we're just going to be able to, like, zero-shot
things into either the clinic or the clinic with a much,
much better hit rate. What needs to happen for us to go from ESMFold2 to this other piece?
Yeah. Is that feasible? I think that's a great question. I mean, I would say that I'm really
optimistic on that. So I think, you know, on the one hand, you know, these are problems that historically,
you know, people could spend kind of an entire career working on. Like, how do you figure out how to
effectively optimize a drug? How do you get it, you know, get it through preclinical? How do you do the
early safety. I think that, you know, when you have a new scientific paradigm, kind of, you know,
questions that were once hard kind of become simplified through the new paradigm. And so I'm very
optimistic that kind of many of these core problems will be solved kind of in an emergent way
through these models. And I think one great example of that is toxicity, where if you can
kind of really digitally, digitally kind of simulate everything and
be able to predict, you know, where a drug is going to distribute and bind across the human body.
You know, like, you kind of have the beginning of a solution to that kind of problem.
So I think that once you have these kind of accurate representations at the molecular level,
you know, we're going to start to see really rapid progress on a lot of these core problems.
What is the most exciting use or experimentation with the models you've seen in the last week since release?
Yeah, I mean, it's just been great to kind of see it get integrated in all kinds of things. I think one of the really interesting things that we've been seeing is people kind of connecting it with agentic systems to just kind of do automated design and kind of just automate that whole process. So it's really, I think, another example of how you can kind of see bringing together agentic and frontier AI with the ability to have a world model for biology and actually reason about biology, to
really kind of start to automate the entire design process.
Are you taking, you know, how do you decide what the next step in the research agenda is?
It's like world model for biology and then I could, I'm just going to be very coarse here,
like I could scale it up, I could add more data, I could add, like adding data is a non-trivial
thing in terms of new methods and domains.
Like what is, do you take input from the larger ecosystem about, you know, how people are using it,
and what would make it more useful?
Or is it really like we understand, like, the next step of structures or coverage that we're looking for?
I mean, I think there's two things.
So, like, we have a view on kind of the next big challenge, which I think is, you know, the virtual cell.
And, you know, really being able to kind of ladder up the hierarchy of biological complexity to the cell.
And...
Sorry, very basic question.
Yes.
Virtual cell model.
Like, what is the input and output, I should expect?
Yeah.
I mean, I think there are different views on that.
But I think kind of what you ultimately
want is a system that can really model each of the levels of complexity. So, you know, the
proteomic layer, the genetic layer, the transcriptomic layer, and connect that to the phenotype. And you
need enough generality so that you can ask the model questions about a new intervention in a
context that it hasn't been trained on and kind of get an answer from it. And, you know, the gap that
we need to close as a field is being able to
really make those predictions that can generalize.
So that's going to require an enormous effort to generate data.
Yeah, and then, I mean, in terms of what you decide to do next,
I think this is like, you know, a pretty normal process of constraint management, right?
I mean, it's like, I think every lab and every field across the world probably feels compute
constrained.
I think that that's probably true here too, right?
it's like, so, I mean, I know, like, you know, there's always questions. It's like, okay,
should we double down more on advancing the protein piece? Should we do more of the cellular stuff?
I think those are kind of ongoing debates in terms of how you sequence that. And then, yeah,
within that, there's kind of being at the Pareto frontier about how much you want to train the different
models in order to, like, and the size of the models is also dependent on the scale of the data that
you have because, you know, for obvious reasons. So, yeah, I mean, I think it's, there's some of that is just
where you want to be on the curves and the normal constraints. But I think that this is like
probably the same process that like any research organization goes through of like you want to go
in all these different directions and you're just trying to constraint-optimize and
make enough progress to do world class work at one thing at a time while planting some seeds that
can blossom over the next couple years as well. Yeah. This has been the most dynamic period of
technology at least I've seen over my career. I mean it's so exciting in terms of everything that's
happening with AI. And every week there's something new that's changed. Are you tired or invigorated?
I'm both. I feel like everyone feels. I feel like everybody's in this manic phase. Yes, it's a combination
of invigorated and exhausted. Yeah, it's wonderful. And so I guess, you know, things are very
unpredictable right now. It's really hard to know what's coming. We have this almost like early signs of
experimentation on the model side with agentic flows that we're starting to see in really interesting ways.
Models are starting to help more and more with models. It's still very, very early days for that.
If you're thinking back five years from now and you were to define what success was relative to your efforts,
and I know things are very dynamic, things change a lot, but you have this common thread of tooling for the Biohub,
you have a common thread of empowering scientists at scale.
You're looking back five years from now.
Is there a specific thing that you really want to make sure that you've accomplished or achieved or a primary goal?
Well, but I think we have a pretty clear view of this, like, hierarchical
set of world models that we want to build around biology. And the other part of that is that
we want to do the highest quality work in the world. Right. I mean, I think we're basically set up to do
that between having a world-class AI research team and this collection of biohubs, which are world-class
life sciences research organizations. I think that that's like fundamentally a setup that no other
organization in the world has. But, you know, you can have a lot of great ingredients and that doesn't
guarantee that you succeed. So, I mean, to me, like five years from now looking back, I think,
you know, it's other, I'm sure other labs or efforts will try to produce, like, things that
approximate what we're trying to do. And I just think that we should be able to do something
that is meaningfully better and a unique intellectual contribution to the world, right? I think that
that's kind of what you, whenever you do any kind of research, that's what you're trying to do.
Right? So, yeah, so if we do that, I think we'll all feel very good. I would also expect that at some
point we'll just start seeing a lot more idea generation from the people using the models. But
I have enough faith that that part will materialize, that for me it's more just about like making
sure that we do world-class work. And I think if we do, like the rest almost will take care of
itself. Very last question for you. Snapshot: it's mid-2026. What was the biggest update in your own
thinking about Biohub or the domain from the last year? Well, from the last year, I mean, you joined in
the last year. I mean, I think the biggest thing that we basically rotated, and I think in the last
year we basically kind of formalized that Biohub is the main focus of our philanthropy. So I think this has
been a very big shift. But Alex and the team coming in, I think, has been interesting,
not only because it's a world-class group, right? I mean, you guys have worked together for a while.
I think also, I mean, you talked about how stuff is changing so much in the field. I think one thing
that's underrated is, like, this is an extremely talented group of people who also, like,
know each other and work well together and, like, are stable and good, and, like, I think that that also
is underestimated in terms of the compounding benefit of like people being able to like work well in a
stable environment over time. So I think that that's a really important piece. But part of what
we wanted to do was prior to Alex leading the effort, the previous leaders of the biohub
were basically primarily biologists who were interested in technology. Right. And now I think
this is the point where we really flipped that, right? Where, I mean, obviously you have a
background in biology as well, but like you are primarily an AI researcher who has a background
in AI and in biology. I think that that's like a deep reflection on kind of the
way that we expect that this is going to kind of drive more value in the future. So those are
probably the biggest updates in the last year in terms of the work that we're doing. I mean,
it's a new leader, not just the leader, but a team that I think has been, is like a really good.
And then yeah, I mean, I think on the rest of the industry, it's like, it's on track. I mean,
I think like it's kind of this crazy thing because like when you have an exponentially growing
curve, I think the way that an exponential curve feels is it's growing so quickly that the
kind of emotional feeling is it can't possibly keep going, right? Because like it's,
because it's just like, but I mean, the nature of an exponential curve is it doesn't just keep going.
It keeps accelerating, right? Exponential growth is accelerating. So I think that that has all
of these like emotions and psychology attached to it. But I think fundamentally when you look at the
curve in the industry, the kind of fundamental thing is it is on track. It has remained on that
curve, which I think has all these very profound implications for all of these domains. But
certainly it validates and makes one feel very good about making a very big investment in the
things that will play out if you stay on that track. And it seems like we are. So that I think is
very good news. I think the most important aspect of what you're doing there is you're actually
closing the loop with the actual biology. Because with code and research, it's closed loop systems.
And so they're very fast to iterate. This is an open loop system. So you're closing a loop.
And that's really crucial to progress. Yeah. For me, one of the biggest changes with the strategy
we're driving now and Alex at the helm is, you know, before we had amazing teams moving generally
in the same direction and understanding like the potential collaborations and interconnectedness of our work.
But now we are arms linked moving together with the single goal.
It's very directed.
And it's very exciting.
It's a little bit scary.
But it's like truly a team playing off each other and trying to make progress towards this goal.
And that has taken a lot of work, but also the maturity, our teams being able to have their work at a level of maturation where it actually does make sense to interlock.
Amazing.
Well, to teams being on the curve,
thank you guys for doing this.
Thank you for joining us.
Thank you.
Find us on Twitter @NoPriorsPod.
Subscribe to our YouTube channel
if you want to see our faces.
Follow the show on Apple Podcasts, Spotify,
or wherever you listen.
That way you get a new episode every week.
And sign up for emails or find transcripts
for every episode at no-priors.com.
