a16z Podcast - Mark Zuckerberg & Priscilla Chan: How AI Will Cure All Disease
Episode Date: November 6, 2025

Priscilla Chan and Mark Zuckerberg join a16z’s Ben Horowitz, Erik Torenberg, and Vineeta Agarwala to share how the Chan Zuckerberg Initiative is building the computational tools that will accelerate the cure, prevention, and management of all disease by century's end. They explain why basic science needs $100 million-scale projects that traditional NIH grants can't fund, how their Cell Atlas became biology's missing periodic table with millions of cells catalogued in open-source format, and why their new virtual cell models will let scientists test high-risk hypotheses in silico before investing in expensive wet lab work. Plus: the organizational shift unifying the Biohub under AI leadership, what happens when biologists and engineers sit side-by-side, and why modern biology labs are expanding compute instead of square footage.

Timestamps:
4:17 - Building tools to accelerate scientific discovery
5:47 - The credible path to funding basic science
7:21 - Biohub = Frontier Biology + Frontier AI
9:05 - Challenges building on a 10-15 year timeline
9:43 - How CZI chooses what to work on
11:15 - Making sense of science with LLMs
11:31 - Measuring success in the therapeutic realm
13:32 - “Most diseases should be thought of as rare diseases”
15:39 - Inspiration: building a periodic table for biology
19:27 - Why virtual cells?
21:17 - The Biohub Master Plan
21:51 - How virtual cell models allow more risk taking
28:15 - Bringing CZI & Biohub together
30:32 - Why Biohub matters
33:36 - The importance of interface design in democratizing scientific discovery
35:34 - How Biohub encourages cross-functional collaboration
40:38 - Looking ahead: the broader impact of AI on biotech

Stay Updated:
If you enjoyed this episode, be sure to like, subscribe, and share with your friends!
Find a16z on X: https://x.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Listen to the a16z Podcast on Spotify: https://open.spotify.com/show/5bC65RDvs3oxnLyqqvkUYX
Listen to the a16z Podcast on Apple Podcasts: https://podcasts.apple.com/us/podcast/a16z-podcast/id842818711
Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Transcript
This is a space that, I mean, that there's just going to be a huge amount of leverage with AI.
It still seems like there could be a lot more effort in this space around building tools.
And it's kind of this crazy thing that we're, you know, here in, you know, 2025
and there's not the kind of periodic table of elements equivalent for biology.
We think that this is, like, probably one of the most important sets of tools that you need to build.
When we first set out the goal to cure and prevent disease by the end of the century,
people, like, honestly, most scientists couldn't look at us with a straight face.
And they're like, you're crazy. Yes. And it was true, because if you just decided to spend the money
funding the next best grant for every single lab in the country, like, there's no pathway
to that being true. The biology folks, I think, looked at it as if it were crazy ambitious, and then
the AI folks are like, well, that's kind of boring. That's just automatically going to happen.
I know, it's like, okay, there's something in between there that needs to be bridged.
The scientific community needs fundamentally new tools to cure disease, not just more funding.
For decades, biological research has been constrained by the same limitations.
Small grants that fund incremental progress, isolated labs working on narrow questions,
and the lack of shared infrastructure to tackle the biggest challenges in medicine.
But what if we could change that?
Today, you'll hear from Priscilla Chan and Mark Zuckerberg on their 10-year journey
building the infrastructure for modern biological research.
We discuss how they accidentally created the standard for biology data
with their cell atlas project,
cataloging millions of cells in an open source format.
We explore why they're betting on virtual cells
that let scientists test high-risk hypotheses in silico
before investing in expensive wet lab work.
And we dive into BioHub,
their play to accelerate discovery
by pairing frontier biology with frontier AI.
Hope you enjoy.
Mark, Priscilla.
Hello, welcome to the a16z podcast.
Thanks for having us.
Yeah, great to be here, excited.
All right, excited to have you.
You're doing exciting stuff.
Yeah.
To that end, almost a decade ago, you guys started the Chan Zuckerberg Initiative
with the mission and intent to cure, prevent, manage all disease by the end of this century.
There's a lot of missions that you guys could have poured your time and resources into.
Take us behind the conversations of why you guys picked this one.
Maybe Priscilla, why don't we start with you and your side of the story?
It always surprises people when I talk about how we work in basic science research.
I trained as a pediatrician and people always think, oh, it must be about medicine. And for me,
I went into medicine because I wanted to improve people's lives. I wanted to make a difference.
I wanted to be able to help others. And I think training as a pediatrician at UCSF, I met a lot of
patients and frankly, like little kids and families, for which we just had no idea what the problem was.
They might have a specific gene that they could name, if they were lucky,
or they could be grouped into a bunch of other diseases,
and there'd be a general sort of PDF they'd print out, like, this is what we know.
And then it was my job as an intern or resident to try to translate,
like, a few lines of information into how we were supposed to take care of the patient.
And for me, that's when I really realized the power of basic science
and how we'd need to work on basic science to advance
the forefront of what's possible.
I think of it as the pipeline of hope.
Yeah.
And why did you think you could cure all disease?
Because that's like a very, like, aggressive goal.
Do you want to answer that one?
Yeah, well, I mean, we're not going to cure all diseases, to be clear.
I mean, the strategy is to help scientists
and the scientific community cure all diseases.
So the strategy is really one of accelerating the pace of basic science.
And the theory that we had was, if you look at the
history of science, most major breakthroughs are basically preceded by the invention of a new
tool to observe phenomena in a new way. Right. So I think about things like the microscope,
right? Being able to observe bacteria or other fields, the telescope or, you know, but it's,
just to use an engineering example, without those kind of tools, it's kind of like you're coding
without being able to step through the code. And you both things, right? That's like the old days,
isn't it? Yeah, yeah. So our whole approach on this is,
basically, let's help build tools that will accelerate the pace of the whole field. And I think
that there's a niche that I think fits that, because if you look at how funding works in science,
the vast majority of funding comes from the government and NIH grants. It's parceled out into these
relatively small grants that allow individual investigators to investigate usually pretty near-term
things. And the development of these kind of new types of tools, whether it's imaging or
building now a lot of AI things like virtual cell models are longer term, oftentimes more
expensive to develop. So think about like on the order of maybe $100 million to a billion
over a 10 to 15 year period. And then you try to unlock those tools and give them to the
scientific community to accelerate the pace. So that's kind of the theory. Right. And it seems like
there's also something that is you don't really get credit for the tools in a lot of ways.
I mean, we have companies that use your tools and they're very happy about it. But
I didn't even know that that was the case.
That's why it's philanthropy.
Yeah, well, it is, but most people do philanthropy to get credit, too.
I mean, that's kind of a part of it.
So I guess did you think about that, or were you just like, no, like, this is going
to work, and if it works, that's all we need?
We're super focused on, like, actually making every scientist better and beyond science,
like startups, startup founders, because the point is we can't do this alone.
And when we first set out the goal
to cure and prevent disease
by the end of the century,
people, like, honestly,
most scientists couldn't look at us
with a straight face.
And because...
You're crazy.
Yes.
And it was true
because if you just decided
to spend the money
funding the next best grant
for every single lab
in the country,
like, there's no pathway
to that being true.
But if you forced people
to really think about this
and, like, okay,
what is the most credible
pathway to doing this, and what are the barriers to that credible pathway? Then we sort of got
somewhere, right? They were like, well, like, there's no shared tools or we're not working on
big projects and building the right data sets. And we're like, okay, well, then we can start doing
something about that. And so that's where the idea of building shared tools, because no one
right now in the site. Well, that's so interesting. So basically, you're like, we're going to cure
all disease, and they're like, can't be done. Why can't it be done? Well, because we don't have the
tools, okay? That's a pretty cool sequence. Yeah, I mean, there's also this funny thing where
the biology folks, I think, looked at it as if it were crazy ambitious, and then the AI folks are
like, well, that's kind of boring. That's just automatically going to happen. I know that's like,
there's something in between there that needs to be bridged. And if you can like kind of use the kind
of modern AI tools in order to build the types of tools that biologists need. So that's a big part
of how we think about our work is...
AI has got to be the most overestimated
and underestimated technology ever,
like simultaneously. I mean,
yeah, well, probably like the internet early on.
But we kind of think about ourselves
and the work that we're doing at the BioHub
as Frontier Biology
paired with Frontier AI.
So there are labs that do Frontier AI
that basically, you know,
are building the most advanced models.
And then there are
lots of biological research organizations
that effectively do
very leading-edge research
to either discover new data sets
or look into certain challenges.
But so far, there hasn't been anyone
who's tried to do both of those at once.
And when you look at, I mean, even something like AlphaFold,
which is amazing, right?
It was built off of this data set
that was a public data set
that had been produced decades ago, right?
And what I think you have the opportunity to do
if you do both of those together
is produce specific data sets
for the purpose of training AI
models to build virtual cells that can do specific things.
So I think that that's like a pretty interesting zone to be in.
And of all the things that we've worked on,
actually when we started CZI,
we kind of actually focused on a number of areas.
And what we found is just that the science research has had by far the biggest
returns, so we've just doubled down on it over and over and over until now we're at the point
that we're 10 years in.
And BioHub is really the main focus of our philanthropy at this point.
But yeah, I mean, that's kind of, that's basically the focus.
Maybe you're not giving yourselves enough credit because you're sort of saying, well, there's bite-sized science. We don't want to do that. There's century-scale science, and that seemed like a long time horizon, but achievable, ambitious. But you've actually identified, which I think is really fantastic, grand scientific challenges that are right in between. They're 10-to-15-year horizons, at least per kind of the way you communicate about them and the way you energize the scientific community about them. Ten to 15 is kind of an interesting time horizon, sort of similar to the time horizon
of a venture-backed company, similar to the time horizon on which a team can work together
for that period of time. How did you get to that number? And then how are you thinking about
the challenges that you take on in each 10-to-15-year wave? Because that's concrete, achievable.
You build a lot of credibility around it the way that you've announced those challenges.
Well, I'm curious how you guys think about it. But for us, when we looked at the grand challenges
on the 10 to 15 year time horizon,
it needs to be, like,
when you look at it, you're like, I see a path.
Right.
Not everything needs to be solved
for us to take it on.
In fact, if everything's solved,
then that feels like that should just go.
And it wasn't ambitious enough.
Yeah, like we have some risk appetite, right?
So we want things where we're like,
there's a credible pathway,
someone who is at the helm who can do this,
and there's enough ambiguity
where we feel like we could take on that risk
And if we do it, like, the returns could be higher than even expected.
And the way we modeled that in the biohubs is we have three biohubs.
We have one in San Francisco, one in Chicago, one in New York.
The one in New York works on cell engineering.
Can we engineer cells to go in and detect signals, read it out, or to take certain actions?
In Chicago, we're building tissues and looking at cell communications within tissues.
And then in San Francisco, we're looking at deep imaging and transcriptomics.
And that work, the locations are not by accident. We also look at the partner universities
because we have folks who come to the biohubs to do this work, collaborative, interdisciplinary,
and sort of unconstrained by the traditional lab, but we also build off of the labs at these
academic institutes that support the work. And so that's how we sort of choose the grand challenge
and the locations.
And then the sort of layering in
of the large language models
and AI coming into the picture
has been so interesting
because we were already building tools
to measure interesting data,
building the data sets.
But we didn't really know
what to do with them yet.
And large language models
coming onto the scene,
we're like, wow, we can make sense
of all of this now.
I'm curious what you view
success as in the therapeutic realm. So, you know, we think a lot about understanding biology, and
sometimes we bet on startups that want to unlock completely new biological areas, diseases, where we
don't know what's going wrong. And then there's another group of folks who kind of say,
hey, okay, now that we understand what's going wrong, let's fix it. Let's come in with a drug.
Let's come in with a new type of chemistry, a new type of antibody. What do you think
success for the CZ Biohub looks like 10, 20,
50 years from now, in terms of the new medicines that you've enabled?
We want there to be like an explosion of a community who are building these,
just the new wave of what it means to be deploying precision medicine.
Like I think for rare diseases and common diseases alike,
you're really talking about individual biology that we sort of lump together.
And we often don't know how it happens, right?
We know that you have this mutation or, the worst nightmare, you have a variant of unknown significance.
What does that even mean?
The horrible VUS.
Yes, horrible.
And you're like, you tell someone you kind of know something, but we don't know what it means.
But if you look at the way we've been able to look at variants and look at single cell transcriptomics,
we're starting to be able to say, okay, this variant actually impacts this set of downstream cells.
And then we start looking at the proteins that get expressed and how it looks similar or different
to what a healthy cell would look like.
Then you can start targeting, okay, like, let's look at that as a target,
and you both know the specificity of the target you want to build
based on the ability to connect mutation to protein expression,
as well as to be able to predict off target effects.
What are the side effects?
Because you also know where else that drug will be able to interact with the body.
And so those are rare, like, but I really think most diseases should be thought of as rare diseases, because each one of our biology is different.
And right now we just get lumped, right?
We get lumped based on age, demographics, ancestry, if we're lucky to have that level of understanding.
But truly, each one of our biology is different and say, like, if you look at hypertension or depression, like we kind of just go by trial and error and saying, like, let's just try that drug.
see what happens. But what should really happen is being able to precisely and accurately and quickly
treat people by looking at individuals' biology. We want to enable the basic science, and we would
be thrilled if people picked up the models that we build to be able to build the diagnostics,
the therapeutics that need to come. You've built amazing data sets. I have to say, like, I mean,
you may not hear the feedback from the startup community and the pharma community and the R&D community,
but it's there because you've committed to open source.
And so people may not be, they may not all be writing papers,
but they are using those tools.
There's a startup in our portfolio working on idiopathic pulmonary fibrosis.
The name tells you how vexing the disease is.
It's idiopathic.
We don't know why it happens.
The IPF is named that way.
And so, you know, he was telling me that he used your Cell by Gene atlases
to look at millions of single cells in patients,
with disease, without disease, try to pinpoint the fibroblasts, double-click on the fibroblasts
and their gene expression.
It's incredible.
And try to, you know, use that to inform, hey, where could I go after a new drug target
in this disease that's fundamentally a strange clump of idiopathic origin.
So I think there's a huge group of innovators who love the tools, the visualizations,
the query systems, and really the software approach that you built to making that data
incredibly accessible. So Cell by Gene is like almost an accident though. Tell us more. So do you want to
share a little bit about Cell by Gene or do you want me to start? Well, I mean, I don't know which part you
want to get into. But I mean, but the Cell Atlas work overall. And it's kind of this crazy thing that
we're, you know, here in 2025 and there's not the kind of periodic table of elements equivalent
for biology. Right. So that was sort of a lot of the inspiration of it was, all right, how do we
both through work that we're going to do in the biohub and through other grants,
be able to pull together and standardize a format where you can have all this data.
And when we were starting off, we didn't even necessarily have in mind
that we were going to use that to build virtual cell models.
I think that that's sort of just come into focus as the AI work has advanced.
But that's a very exciting thing.
We should definitely spend a bunch of time on the virtual cell models.
But I'm not sure what you wanted to get into on the Cell Atlas.
Well, the single-cell work was one of our first RFAs 10 years ago when we started, and we were like, okay, we think this is possible.
We actually funded the methodology for it to standardize how it was going to be done.
So that was 10 years ago.
And then we seeded a few labs to start building out that data set.
But we're like, there are like millions or billions of different cell types and different permutations.
Like, how are we going to do this?
and especially with like a burgeoning technique.
And so we ended up seeding a few groups
and they started doing work.
And then they told us they had a problem.
There was a bottleneck in their workflow
because they couldn't annotate the data fast enough.
And so we built, Cell by Gene was an annotation tool.
That's the original source of this.
So we built the annotation tool
to make it easy for people who are doing single cell science
to be able to annotate the data.
And then we put the data that we collected publicly
so people could share.
But because everyone started using the same annotation tool,
everyone was standardized then on the same data formats.
And then there started being a community around the tool,
and they wanted to share back and build the Atlas.
So now after 10 years, there are millions of cells
that have been built into this shared resource
for the entire scientific community.
we only funded about 75% of it.
Sorry, that's wrong.
We've only funded 25% of it.
75% came from the broader community saying this is useful
and there's an easy way for us to standardize and build this together.
You have the same metadata.
Yeah.
That's right.
It's like what you'd call a network effect.
I was going to say it sounds like the internet.
Come for the annotations, stay for the virtual cell model.
Well, it was very important when we were getting started with the work to
have everyone who is doing it have a consistent format. So that way it could be used and portable.
And then once that kind of took off as the way that it would get done, then other people
just found it valid. Yeah, and even relative to prior databases like GEO and whatnot,
they're just simply not as standardized or QC'd. Yeah. Yeah. Let's get into virtual cells.
Sure. Sure. It's one of the grand challenges that you're focused on. Maybe talk about what
the promise or the hope is, and maybe some of the challenges and where we're at with it.
Yeah. I mean, we think that this is going to be one of the most important
tools at this point, basically building up the kind of hierarchy from proteins
to just different structures, from the cell to, like, a whole virtual immune system
or different levels of hierarchy and we think this is going to end up being like a very important
set of tools for people to effectively generate hypotheses for for different science work
you know even before you get to the point where you're really running full experiments in it
you can come up with some estimate of how that might run. It will be useful for some of the
precision medicine type examples that Priscilla was talking about a few minutes ago. But we think
that this is probably one of the most important sets of tools that you need to build. And it's not
a single thing, right? So there's different angles to come at this from. The cell Atlas data
is helpful for understanding things on a cellular level.
One of the kind of most important things
that we're doing right now,
there's this great company, EvolutionaryScale, that
actually had a bunch of researchers
who'd formerly worked at Meta on protein folding models,
and it is joining the Biohub,
and Alex Rives, the leader of it,
is actually going to be the kind of head
of the whole science program,
which is actually kind of interesting
when you think about it,
where it's like you have AI and biology
coming together and really it's like an AI person
who understands biology is running it rather than
a biologist who has some understanding of AI
I think just kind of speaks a little bit
to where we think the relative
weight of these things is
but I mean
we basically view you know like Priscilla
was saying with the different biohubs
and then New York doing cellular engineering
will basically make it so that
you can have cells that can record
different things that are going on around the body
and share that data and then you can build
that into models. The Chicago
Biohub being able to record inflammation and basically study that in order to kind of help
understand that's a different data set. We have the Imaging Institute, which is we just trained
our first set of models around that, which are the first like spatial models around understanding
like the way that that kind of cells look in different states. And eventually, just like you have this
analogy on the
kind of the industry side
or on language models where you have different capabilities
and then over time you train them into models and it gets
more and more general. That's kind of
the idea here. So we'll
build the biohubs around grand
biological challenges. The biohubs
will build tools that will generate novel
data sets. We will build
models based on those and then
eventually combine the models into an increasingly
general view of a virtual
cell that will be useful
both for scientists and
hopefully startups and companies that are working on finding drugs, which is not our part of the
whole thing, but I think is obviously a really important part of what needs to happen.
Yeah. And, you know, you guys think about risk all the time in terms of when you make investments.
Like, I think the promise of being able to do virtual biology using a virtual cell model is you can
actually take on riskier ideas. Right now, like, grant funding can be hard to come by,
and the wet lab work is expensive and slow,
and it's not just money, it's also time.
And so you have to choose something
that you think is going to have some likelihood of success
to keep your lab career going.
And so it naturally lends people to take on some risk,
but not a lot of risk,
because they need to make sure that they are hitting
a certain percentage of the time
to make tenure or publish or whatever they need to do.
But if you had a virtual cell model where you could simulate really high-quality biology,
you could actually then start testing and tinkering on the computational side
and, like, ask riskier questions, things that would have been expensive and costly
in terms of time and resources to do in the lab, and actually see if there is promise
doing the experiments in silico before you make the time and money investment in the wet lab.
Do you think of it kind of like a model organism?
Yeah.
Like it's the new fruit fly.
Yeah.
I was just going to ask, given the complexity of a cell, like, how close, like how accurate do you think you'll get the model to?
I mean, just assuming, I mean, maybe you get it to like a perfectly accurate representation of a cell, but like how accurate to be useful would the virtual cell have to be?
I think it will obviously iterate and get better and better because right now we, like right now we're still just talking about transcriptomics.
We're expanding into different ways of looking at the cell.
but you get more and more accuracy.
But I don't think it needs to be 100% accurate to be useful
because you just want to be able to de-risk the idea on the front end a little bit.
And the more and more you de-risk it, the more efficient it gets, obviously.
But it will be useful if you even get directional signal.
And yes, we do think about it as a model organism.
But in a way that's like has fidelity to the human body.
Like, you know, like, I don't want to...
All models are wrong, some are useful.
Yeah.
Yes.
Yes.
It has utility on certain axes.
Exactly.
And just like the language models, you build in specific capabilities.
So it's not...
So, for example, you know, one of the models that we're publishing is VariantFormer, right?
It basically, you know, makes it so that, you know, it's trained on a bunch of, effectively, pairs.
If you have a cell, you apply CRISPR to it in a place, you see what comes out at the other side.
So it basically is able to make that kind of a prediction.
Like, okay, if you have this edit that you're doing to a cell,
what is likely going to happen?
Another one of the models is it's this diffusion model.
Basically, you can describe a type of cell
that you would like it to simulate,
then it will just produce a kind of synthetic model of the cell.
Again, I mean, it's kind of interesting
because to Priscilla's point before
about how everyone is different
and different cells have kind of,
you want to be able to simulate these kind of rare configurations,
having at least a synthetic version of what that could look like
is interesting, and then you can test against that.
The cryo model, I think, is interesting because it's spatial.
So it kind of gives you a sense of there are all these different models
that you can have that allow you to basically look at different kinds of things,
and then you just train them in to be increasingly general over time.
Wow.
Very interesting.
And is the modeling technology basically
LLMs, or, like, is there a reasoning model?
Is it like a just...
Oh, that's actually, yeah, no, that's a fascinating one too.
Because one of the new models, I think this one is very early, but it's basically the first
reasoning model over biology.
So the idea is that, yeah, you effectively have these models that kind of simulate world
models in different ways, and then you want it to be able to not just be able to spit out
correlations, right, in terms of like what it's found.
but actually be able to kind of reason through how things would evolve
and why things would happen.
I know it's quite early, but it's – but it is interesting, conceptually,
as what I think is clearly going to be an important direction
in terms of how these models evolve.
Yeah, because that's what I was thinking, you know,
that if it doesn't work, the next question you have is why.
Yeah, you know, like –
But I think what you find in reasoning –
Because if you're married to your hypothesis.
Sure, sure.
Yeah, I mean, the, yeah, I think you're saying
if the reasoning model doesn't work, why.
I mean, I think the language model analogy for that would be
you need better kind of world models or better pre-trained models
in order to get the reasoning to be good.
But it's, yeah, you build more capabilities into it.
And I think that there's probably an order to.
So the work that Alex and the EvolutionaryScale folks worked on, a lot of it is protein, which is interesting because that's at a kind of smaller resolution, obviously, than the cellular data, the Cell Atlas.
But part of the hypothesis is that you can look at all these different cells and you can kind of simulate how they might behave, but you're going to have a somewhat shallow understanding unless you actually have this hierarchical understanding of how the
subcomponents of the cells are going to interact.
So our view is that you basically want to build up a state-of-the-art protein model
and then have that be a part of the state-of-the-art cellular model.
And then once you have that, you build things like the virtual immune system,
which allows you to simulate much more complicated systems.
But it's sort of this hierarchical approach to building up these virtual models.
That makes a lot of sense.
Because also as you get into personalization, you've got, like, common proteins
combining into a unique cell,
so, like, from a system standpoint,
that makes it, like, much more manageable.
That makes a lot of sense. Interesting.
Yeah. Wow. Yeah, no, it's very
fascinating stuff. Yeah. So you guys are announcing
some big news this week. Do you want to give us a sneak preview?
Well, the big news is
thinking about how we are going to be coming together
as one team.
And you know, in the past we've run biohubs, we've built software, we've done some AI research.
But all of it has been a little bit decentralized.
But now under Alex's leadership, we are going to come together as the biohub, an operating philanthropy
where we are doing the science in service of a singular goal together, and how do we actually advance the state of biology and research at the
intersection of AI and biology. Amazing. Alex's amazing. Yeah, no, he's great. And then the other thing is
the piece that I mentioned earlier, which is just, yeah, I mean, CZI has focused on a number of
different things. We've really just found over time that we feel like we've been able to make
the biggest difference in science. So we've just kept on doubling down on it. And we're going to
continue doing work in education. We're going to continue supporting local communities and in those
different pieces. But going forward, the biohub is really going to be the main thrust of our philanthropy.
And we're very excited about that because I think that this is, there has been, you know, when we started the mission to see if we could help the scientific community cure and prevent diseases by the end of the century, I do think with the advances in AI, that should be possible to do significantly sooner.
And that is a very worthy and important and very exciting goal that we think we kind of have a unique place in the ecosystem that we can help empower others to make fast progress on that.
So there's obviously, like, plenty of advantages to
decentralization from a management communication overhead and so forth.
And so, like, what are you trying to add by adding this kind of new layer slash unification
on top?
Like, what are the outputs?
And then I guess what are the complexities to that?
Because that's, I'm sorry, to ask a CEO question.
No, no.
I mean, I'm like, I'm just like, obsessed with this stuff.
We think about this.
You want to go first?
Then I can jump in.
Yeah.
So there are obviously amazing groups doing frontier AI and a lot of groups doing
great frontier biology. And what we think we can do uniquely is actually tie these two together.
And we've funded data sets, we've built data sets. We're, like, building the instrumentation now
to be able to look at the cell, whether it's, you know, at the tissue level, cell-cell communication,
or cryo-EM, where we can look at the cell at nearly atomic level. So we have the ability to
not only build the data sets, but actually shape and form them the way we want based on what
we see as necessary to complement the existing body of knowledge. And so we have amazing teams
doing that work, and we're building these AI models. And so the reason to do it together is then
we can actually complete the flywheel. Like, you know, the model is looking like it has some gaps
and blind spots in this area.
Okay, who do we talk to?
How do we build the next data set?
And, you know, we're seeing this in the lab.
Like, the metadata is going to be so rich
that we can feed back into the way that we do this modeling.
And so if we can close that loop,
which is our goal in bringing everyone together,
I think it's going to be incredibly powerful.
And it's more than just, like, you know,
writing down a spec and saying, like,
please deliver this.
Like, these people need to be sort of working
shoulder to shoulder and shaping each other's work for this to actually be a more and more accurate
model of how the human cell works.
Well, you know, it's so interesting because that is exactly, like, that has been the biggest
surprise in the industry for us in AI world, like forget biology for one second, is that
the domain-specific models have been, like, super interesting.
The original thesis was, well, like, some AIs are going to be so smart,
they're going to be smarter than everybody at everything.
But like on video models, like every video model is best at something, but not everything.
And so knowing what problem you're solving actually turns out to be sort of ironically very important in AI
because you can actually get to a way better result if you put the two together.
Like, yeah, we're seeing that over and over again in a way that is, I would say, very, very
counterintuitive to the whole narrative kind of going into it.
In biology, it used to be the, or at least, you know, one assumption was all the data sets
aren't on the Internet.
So part of the reason you need a domain-specific model is that the data sets are not public.
You guys are kind of bucking that trend, too, by creating a lot of open-source access
to the data.
And then even then, it sounds like you're betting, you know, on the trend that we're seeing
in other industries.
But still, there will be nuance in how you annotate that data, curate that data,
as well as how you talk to a scientist, right?
Right, because you have to not only know the data and the model and so forth,
but like the conversation is what we keep finding out ends up being very, very important.
So rich and so important in how you actually.
A scientist isn't going to talk to it like, you know, I talk to ChatGPT or whatever.
Well, this is the fly you can talk to.
Yeah, yeah, yeah, that's really, that's super exciting.
And the user interface is actually really important.
You talked about how you guys have a founder who's using CELLxGENE.
That user interface was intentionally designed to not need to have a computational or really a very deep biological background to be able to use because you want people coming from different fields to look at the problem.
It's like, look here, help us solve problems here.
And so building the user interface in a way where it's not a very high barrier to entry to be able to poke around and learn something and bring knowledge back to your work, that's intentional.
And we're really hoping when we build these virtual models that we get to a place where we can allow a lower and lower barrier to entry for people to say, like, you know, I have some knowledge about this.
Maybe I can contribute.
A very pertinent example is, turns out, I think immunology has a ton to do with neurodegeneration, right?
Seems like immunology is behind all this.
Everything.
So might be part of your century vision.
So you need to be able to allow the immunologists
to come in and understand neurodegeneration and understand how their world fits in.
And so lowering the barrier to entry allows people to actually think
in a sort of truly collaborative and interdisciplinary way.
So will the biohub grow as a team?
Like will you employ more people at the biohub proper or are you moving towards more of a network
model with more sites, more labs, more community-driven data sets, like which is the thrust
or maybe it's both.
Probably a little of both.
And we've added new biohubs over time.
And then we're also building up more of this like central AI team.
Cool.
So, but, I don't know, I think that these organizational questions of how do you set this up are fascinating.
And a lot of our approach is sort of informed by what the rest of the field is doing.
Because you kind of think about science as it's this portfolio, right?
Society has a portfolio of stuff that it's trying to do.
And in terms of philanthropy,
you want to be the most additive that you can be
by trying to figure out what else is underrepresented.
So science by default is very decentralized, right?
It's like kind of the way that granting has worked,
the way that I think scientists by default want to work.
So I think a lot of what we've found
is that figuring out ways to encourage collaboration
in ways that otherwise seem very simple
but weren't happening before
can unlock a lot of value.
So the very first biohub, what we did,
there were two kind of interesting things.
One was, it was this collaboration
between UCSF, Stanford, and Berkeley.
And there were all these really smart people
at all these different places
who previously, I guess in theory,
they could have figured out a way to work together,
but there was not really a formal construct
for them to do that.
And this just allowed a lot more collaboration.
The other one is cross-discipline,
basically having biologists sit next to engineers, and this view that, like,
these two disciplines are things that need to, um, and I don't know, I mean,
I'm sure you've seen this in a lot of the companies, but, like, there's so many interesting –
And the companies, they always set them apart.
Well, it's interesting. No, it's interesting how many organizational questions
or problems you can fix just by having two teams sit together, right?
It's like, it doesn't matter what the org chart is or, like, whatever.
It's like, you guys need to sit next to each other until you get this thing to work.
And that's something I really believe in.
And you have time.
You have 10 to 15 years.
Well, no, it's, like, communication is such an underrated problem in general in all kinds
of things, in building anything or solving anything.
So that's pretty neat.
Yeah, and it's just really kind of simple stuff.
But I think it's sort of novel as a model.
And one of the things that's, so we've now copied this from the first
Biohub to the Biohub network and expanded it to other models. But it's also just been neat to see
other folks who are working in the field also adopt similar models because it's a pretty
intuitive thing. But at some point, you'll reach the point where, you know, actually it's
really good to have decentralized work too, right? So it shouldn't be that like, we're not saying
that this is like the way that all science should work. We're just saying that there's a space for this
that can unlock a lot of value because it, for whatever reason, hasn't been the default.
Yeah, and we still rely on, like...
Yeah, there are famous, like, stories in the MIT lab about that.
That's how they invented lasers and so forth.
They put a bunch of people from different departments in the same space.
Well, actually, physics is where we got a lot of the inspiration.
Like, physics has just historically been, like, labs have just rallied around big projects and big shared resources.
And we will, you know, we are relatively centralized, but we still depend on a lot of labs
who are doing sort of exact frontier work
or complementary work to come together to support those.
There's that.
But one more thought on your expansion question
is like, and maybe this is like the modern AI lab,
we are not expanding like a lot of square footage per se,
but we're expanding our compute.
The researchers, they don't want employees working for them.
They don't want space.
Yeah, they just want GPUs.
Yeah.
So it's just like, in a sense, that's new lab space.
It's much more expensive than wet lab space.
And you guys have always been creative on that
even in the last few years.
You've created ways to share access to compute,
you've enabled academic labs to, you know,
I forgot the name of your program.
Scientists in residence, or something like that.
Yeah.
But you're a complete rental kind of pro-telling of clusters.
You know, if you look at individual labs,
they'll have like a large lab would have tens of GPUs.
And we were the first to really build a large-scale compute cluster.
A thousand GPUs, and now we have plans to move to the 10,000 range.
And that, one, requires a different type of project, obviously.
You're able to ask different types of questions.
And it's a resource that we use, but also we've invited scientists to apply and say,
like, what question do you have that could use this amount of resource?
and be able to sort of seed collaborations that way.
And so if a scientist is out there listening,
like, who's not employed by the biohub
or working at the biohub but wants to collaborate with the biohub,
that you're going to create really interesting, interesting doors
to utilize their resources.
That's awesome.
Yeah, I mean, the GPUs are somewhat zero-sum, right?
So the data is, so, yeah.
Yeah, fair enough.
Yeah.
So you're about to celebrate 10 years
doing this. As you look out in the years to come, what else can you tell us about either things
that you're thinking about for the future or maybe even principles or a North Star that's going
to guide how you guys grow and evolve going forward? You know, it's been really interesting
in the past 10 years because I actually spent the first few years completely envious of people
working for for-profit companies because there's so much clarity. Like the market will tell
you, whether or not it's private or public, will tell you if you're doing a good job.
If they think you're doing it.
If they think you're doing it.
They're not always right.
They're not always right.
But I was still envious because that was, I was like, I craved that feedback, like, am I doing a
good job?
And, you know, 10 years in, you know, the reason why we're doubling down on biology is like,
not only did we achieve what we said we were going to do when we set out
on these projects, it actually delivered
more than we thought it was going to. And I was like, okay, that's a signal I can latch
onto. And like, that's a signal. We can really continue doubling down and doing more of that.
And so I think it's continuing to tolerate the early ambiguity when you're like, okay, I'm
going to do more of this. And being patient, but being willing to have a long time horizon,
but be impatient at the same time. Because it's all those iterations along the way that have sort
of allowed us to get to this place where, you know, we got lucky, ready, having built
data sets to take advantage of AI and large language models, that's because of all the work
that we have been doing.
And so being able to continue moving forward in this ambiguity and sometimes lack of signal
on a big goal, like I think we sort of set the DNA for that.
Amazing.
Oh, no pun intended.
Yeah.
But we get to see how many people use the tools and the feedback.
Yeah, yeah.
Yeah, you have customers for that.
which is pretty cool.
Yeah.
Yeah.
For philanthropy.
Like, that's awesome.
Yeah.
No, it's one of the fun things about building tools.
It's like, you kind of get to see, how valuable do people find the tools?
Do people use the tools in order to publish important work?
Right.
Right, right, right.
Yeah.
Well, I mean, our feedback is they're awesome.
Our feedback is great.
And completely unique, by the way.
So, like, the other thing is, like, what would you use if you didn't have this?
It's like, there's nothing.
No, yeah. It's a real kind of void. I mean, there's this whole pipeline that needs to exist from accelerating basic science to funding a lot of people to use it to then you can get into the biotechs that basically can start to work on basically coming up with novel therapies and then you get the pharma companies that do them at scale. And then there's a space for philanthropy on the other side of public health of basically taking the therapies and kind of bringing them out to everyone in the world. But this is a space
that there's just going to be a huge amount of leverage with AI.
And it is, yeah, it still seems like there could be a lot more effort in this space around building tools
and just accelerate the whole thing a lot better.
Yeah, and I do think it is the place where you are completely unique, right?
The other things, there are other people who can do that, but there's nobody doing what you're doing.
Yeah, it's got good, good founder market.
Yeah, it's founder market.
If we didn't exist, would it be a problem?
Yes, like those questions really land.
It's like one of us is an engineer, the other one is a scientist, doctor.
Yeah, very happy in this direction.
We thank you very much, not only for our companies, but for us as humans, for working on this.
It's amazing work.
Oh, thank you.
Thank you.
Thank you, guys.
Thank you so much.
Thanks for listening to this episode of the a16z Podcast.
If you like this episode, be sure to like, comment, subscribe,
leave us a rating or review and share it with your friends and family.
For more episodes, go to YouTube, Apple Podcasts, and Spotify.
Follow us on X at a16z and subscribe to our Substack at a16z.com.
Thanks again for listening, and I'll see you in the next episode.
As a reminder, the content here is for informational purposes only.
It should not be taken as legal, business, tax, or investment advice,
or be used to evaluate any investment or security,
and is not directed at any investors or potential investors in any
a16z fund.
Please note that a16z and its affiliates may also maintain investments in the companies discussed
in this podcast.
For more details, including a link to our investments, please see a16z.com/disclosures.
