No Priors: Artificial Intelligence | Technology | Startups - How AI can make drug discovery fail less, with Daphne Koller from Insitro
Episode Date: March 2, 2023
Life-saving therapeutics continue to grow more costly to discover. At the same time, recent advances in using machine learning for the life sciences and medicine are extraordinary. Are we on the verge of a paradigm shift in biotech? This week on the podcast, a pioneer in AI, Daphne Koller, joins Sarah Guo and Elad Gil to help us explore that question. Daphne is the CEO and founder of Insitro, a company that applies machine learning to pharma discovery and development, specifically by leveraging “induced pluripotent stem cells.” We explain Insitro’s approach, why they’re focused on generating their own data, why you can’t cure schizophrenia in mice, and how to design a culture that supports both research and engineering. Daphne was previously a computer science professor at Stanford, and co-founder and co-CEO of edutech company Coursera. Show Links: Insitro - About Video: AWS re:Invent 2019 – Daphne Koller of insitro Talks About Using AWS to Transform Drug Development Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @DaphneKoller Show Notes: [1:49] - How Daphne combined her biology and tech interests and ran a bifurcated lab at Stanford [4:34] - Why Daphne resigned an endowed chair at Stanford to build Coursera [14:14] - How insitro approaches target identification problems and training data [18:33] - What are pluripotent stem cells and how insitro identifies individual neurons [24:08] - How insitro operates as an engine for drug discovery and partners to create the drugs themselves [26:48] - Role of regulations, clinical trials and disease progression in drug delivery [33:19] - Building a team and workplace culture that can bridge both bio and computer sciences [39:50] - What Daphne is paying attention to in the so-called golden age of machine learning [43:12] - Advice for leading a startup in edtech and healthtech
Transcript
In some sense, the wealth of opportunities here is one of the biggest challenges
because everywhere you look there is a big opportunity for machine learning to be deployed
in a potentially quite significant way.
It's like computers.
You're going to use it everywhere and it's going to be transformative everywhere.
It's not going to be the silver bullet unless you figure out how to use it most effectively,
but the opportunities are pretty much endless.
This is the No Priors podcast. I'm Sarah Guo.
I'm Elad Gil.
We invest in, advise, and help start technology companies.
In this podcast, we're talking with the leading founders and researchers in AI about the biggest questions.
We've talked about computational biology for decades, but drugs keep getting more expensive to
discover, and at the same time, recent advances in using machine learning for the life sciences
and medicine are extraordinary. Are we on the verge of a paradigm shift in biotech? We're thrilled
to have a pioneer in AI, Daphne Koller, on the show to help us explore that question. She's CEO and
founder of Insitro, a company that applies machine learning to pharmaceutical discovery and
development, specifically by leveraging induced pluripotent stem cells, which we'll get into explaining.
Daphne was a computer science professor at Stanford, co-founder and CEO of Coursera, is a MacArthur
fellow, and was named by Time as one of the world's 100 most influential people. We could go through
all her other work, but we'd run out of time. Daphne, welcome to the podcast. Thank you, Sarah. It's a
pleasure to be here. As we were saying, we won't ask you to walk through every part of your
amazing life story, but you came to biology as a computer science application years into your career.
What sparked you going down that route? My initial interest in biology came from the technical side
in the sense that the data sets, this is way back when in the mid-90s, the data sets that were available to
machine learning research at the time were kind of boring and not very inspiring. So things like
classifying text into 20 different news groups. And I found that there were more interesting
data sets technically to be had on the biology side back then as we were starting, for example,
to measure the activity of genes across the entire genome in multiple samples. So initially it was
really more from a technological perspective. But then I ended up actually having an interest in
biology in its own right and ultimately ended up having a bifurcated lab at Stanford where half my lab
did core machine learning work published in traditional computer science venues. And the other half
did core biology work that was published in nature and cell and science. And what was really
interesting is that most of my computer science colleagues had no idea that I did biology.
Most of my life science colleagues had no idea I was in a computer science department. So it was a bit
of a bifurcated existence, but it was a lot of fun. One more historical question for you. You wrote
the book on probabilistic graphical models. When I asked a mutual friend what I should ask you,
he suggested, you know, asking what motivated that work and how that field has changed. Just like in most
fields, there is a swing of a pendulum. A lot of the early work in probabilistic graphical models
was hugely influential in bringing artificial intelligence more into the world of machine learning
and working with numerical data rather than just symbolic AI. And then I think the advent of
deep learning pushed that to the side a little bit, because there was so much power that could
be gained from basically that kind of pattern recognition from raw inputs, raw images,
text, and so on, without having to worry very much about interpretable representations.
What I think we're starting to see right now is a pendulum starting to swing back in the sense
that there is a greater understanding that you really need a bit of both.
You need that hugely powerful pattern recognition that we get from deep learning.
But you also need the ability to reason about things like causality, and you also need some interpretability of your deep learning models so that you can potentially convey to a clinician why you made the decision that you did.
And so what we're ending up with as a really powerful paradigm is some kind of synthesis of the ideas from both of these disciplines coming together.
You went from Stanford, I believe, to co-founding Coursera with Andrew Ng.
And then you went to Calico a few years after that.
I'm sort of curious, what made you decide to go into Calico?
You mentioned your career was split between life sciences and computer sciences, and so you went
down the computer science online learning route, and then you went back into biology. So I'm a little bit
curious what drove you back in. So actually, I'm going to go back and answer the earlier part
of that, which is what took me to Coursera in the first place, because I think it feeds into what
took me away. So throughout much of my career at Stanford, I had an increasing sense of urgency
that I needed to make an impact in the world, a real impact on real people, not something that
was at one step or two steps removed by training great students and having them
go and do amazing things, but by something that I get to experience myself.
And so when the work that I was doing at Stanford on technology-assisted education gave rise
to the launch of those first Stanford massive open online courses, and we saw just how much
impact those were having, I felt like it was too amazing of an opportunity to pass up and
just assume that if I didn't do this, then somehow other people would take up the flag and
carry it forward. I felt like there was an incredible need to go and actually have that impact
myself and make sure that it was done right. And so that led to my departure from Stanford on
what was supposed to be a two-year leave of absence to go and found Coursera, and I had the full
intention to go back to Stanford at some later point and resume my faculty life. That didn't
happen. Stanford has a very strict leave of absence policy. And when they came two years later and
said, so are you coming back? And I responded that it wasn't really the right time I needed to
see the project through for another year or so. And they said that that was not an option. I ended up
doing this completely crazy thing, which is resigning an endowed chair from Stanford and staying
at industry. My mother thought I was nuts. I think she still thinks I'm nuts. But I ended up staying at
Coursera for a total of about five years. And so five years was kind of a reasonable point to take a
step back and reflect. And when I did that, this was in early 2016. I realized that while I'd been
deep in the trenches building Coursera, the machine learning world had totally transformed. Because as a
reminder, I left Stanford for Coursera in late 2011, just before the machine learning revolution
really took off in 2012. And so I suddenly lifted my head, looked around me and said, wow,
machine learning really is transforming the world, but not really having much of an impact in the life
sciences. And so I left Coursera in good hands. Coursera is a wonderful company, but it's not
really a deep technology company and certainly not a science company and decided that where I could
have a really disproportionate impact was in bringing these two disciplines together, because there
are just not a lot of people who had the benefit, as I did, of spending
basically 20 years doing machine learning and maybe a decade doing biology and could really speak
both languages and figure out how to synthesize them. But since I've been in industry for five years
and away from science and even away from machine learning, I didn't quite know where I wanted to go
and what I wanted to do. And so I turned for advice actually more than anything else to Art Levinson,
who is the former CEO of Genentech, the former chairman of Google and Apple.
And I figured that if there was anyone who would know how to bring those two fields together,
he was probably uniquely qualified to do that.
And so I asked him for advice, and he was very, I think, admittedly self-serving in his advice.
He said, you should come to Calico.
And honestly, I didn't know much about what Calico did other than it worked on aging,
which seemed like a really important problem to think about.
But I did know that it's not many times that one has the opportunity to work with a luminary like Art Levinson and I'd also by that point met Hal Barron, who's another person I have tremendous respect for.
And I figured this was, you know, a really interesting way to spend some time and learn from these wonderful people.
And I learned a ton during my time at Calico.
It was only 18 months, because ultimately I realized that I didn't want to be at a company that focused on a particular biology, but rather one that really built a platform for doing drug
discovery differently, addressing some of the points that you, Sarah, made in your introduction
about how drug discovery is this incredibly fraught, largely unsuccessful and very expensive endeavor,
and so how could I make that happen differently? And it didn't seem like Calico was necessarily
the right place to take on what was a platform company to be built. So that's why I left and founded
Insitro. Were there any specific insights from Calico that drove the founding of Insitro?
Was it just more of the exposure to biopharmaceuticals and how things are developed that really
drove your thinking that maybe ML and AI would have a real application area there?
I think that it was really the exposure for the first time to how biopharmaceuticals were developed,
as you said. At Stanford, I'd worked a lot at the intersection of machine learning, data science,
and biology and realized just how much power these machine learning technologies can have when applied
even to small datasets, and certainly as the technology had evolved tremendously since then,
datasets were becoming considerably larger and richer, there was an even larger opportunity
to make a huge difference. And so that's what led my move back into that intersection and
then, therefore, to Calico. But I think the realization was really twofold.
One is that the way in which you turned insights into therapeutic interventions was so
old-fashioned and so unaccommodating of the use of data that I felt there had to be a better way
to do this, which I think that the industry has since started to demonstrate across the board
in many different companies. And I think the other thing that made me make that shift is that
whereas data in the life sciences is growing tremendously, data on aging, and specifically human aging,
is really hard to get, because human aging is a very long process. And in
order to get data on the longitudinal trajectory of human aging today, we would have needed to start collecting
data, you know, 20, 30 years ago, and the cohorts are rather small. And so I felt like
there was a huge opportunity in this intersection, but maybe aging wasn't the first place
where one could most beneficially apply it, from at least my perspective. Yeah, when you look
across drug development, I guess right now it costs a billion to a billion and a half
dollars to develop a drug successfully, and it takes a decade plus to actually get there.
When I look at the potential areas that are challenging in the industry, there's sort of the
initial small molecule selection and design or alternatively the pathway or cell type that you're
using. Separate from that, there's the clinical trial itself and how do you figure out who to
enroll and how to deal with the data and the patients and everything else. There's all the calibration
around diagnostics and endpoints and clinical endpoints and how you think. And all those places
seem like there could be real uses of AI. How did you choose what Insitro is actually going to do,
given how much room there actually is to innovate in this area relative to data, to your point?
I mean, it's just, it's shocking how little is done, right? It's like awful.
I completely agree. And yeah, in some sense, the wealth of opportunities here is one of the
biggest challenges because everywhere you look, there is a big opportunity for machine learning
to be deployed in a potentially quite significant way. Sometimes,
I have these discussions with the increasingly few people within biopharma who think that, yeah, this machine learning thing is a fad that will go away, or maybe that machine learning is going to be this thing that helps you in a particular point area like X-ray crystallography. It can improve this narrow little vertical, but that's pretty much all it's going to do. And my analogy is that it's not like X-ray crystallography. It's like computers. You're going to use it everywhere, and it's going to be transformative everywhere.
It's not going to be the silver bullet unless you figure out how to use it most effectively,
but the opportunities are pretty much endless across the entire process from beginning to end.
So with that, how did we pick what we ended up working on?
You know, I thought about this, and you could divide the process as many do into three large chunks.
One is the original biology discovery: which targets do we employ, in what indications, and maybe in what
patient population. That is kind of the first chunk. Then there's turning those targets into
therapeutic matter, which is a molecular design process. And then at the end, there is the
enablement of the clinical trials in terms of actually actualizing patient selection or
biomarkers for efficacy and things like that. And all of those are important and all of those
are valuable. But if you look at the actual numbers of what makes drug discovery so expensive,
it is the fact that 95% of drug programs fail.
They just do not succeed.
And the biggest reason why they don't succeed is not because the clinical trial was poorly designed.
That still happens, but it's not the biggest reason, nor is it because the molecule doesn't
hit its target and modulate it in the right way.
That too happens.
But again, it's an increasingly smaller number of situations because pharma companies have gotten
better and better at making therapeutic matter.
The place where most programs fail is because we're just not modulating the right thing.
It's the wrong target in the wrong indication or the wrong patient population.
So if you really want to bring down that $2.5 billion number, what you have to do is to bring
down this completely mind-blowing statistic of 95% of drug programs fail into something that is much
more manageable so that a successful program doesn't have to carry on its back all of the many failures,
expensive failures of all the things that didn't quite make it. And so I figured that it was maybe
the hardest thing to do, but also the thing that was going to be the most impactful.
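The "carrying on its back" arithmetic is easy to sketch. The per-program cost below is an illustrative assumption, not a figure from the episode; only the 95% failure rate comes from the conversation:

```python
# Back-of-envelope: how a 95% program failure rate inflates the cost
# per approved drug. The per-program cost is an illustrative assumption.
cost_per_program = 100e6   # assumed average spend per program attempt, USD
success_rate = 0.05        # ~5% of drug programs succeed (from the episode)

# Each approved drug must carry the spend on all the failed programs.
cost_per_approved_drug = cost_per_program / success_rate   # ~2.0e9 USD
failures_carried = 1 / success_rate - 1                    # ~19 failed programs

# Doubling the success rate to 10% halves the carried cost.
cost_if_doubled = cost_per_program / 0.10                  # ~1.0e9 USD
```

The point of the sketch is that the cost per approval scales as 1 / success rate, so improving target selection attacks the dominant term directly.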
So how do you approach that problem as a computer science and now computer science and biology
person, like the target identification problem? Yeah, you know, it's really hard, right? Because
when you think about it, it's the one area where you really don't have the right type of training
data, at least not obviously, because the question you're asking yourself is, if I make this
therapeutic intervention in this patient, what is it going to do clinically? And that is the thing about
which you don't have data until the very end of the process, which is called a clinical trial.
And so how do you train a machine learning model that doesn't have training data to train it,
right? And so the direction that we've chosen to take is actually a two-pronged approach
and it's the synthesis of the two that we think is particularly powerful. We bring in
data from two quite different sources. One is data from human individuals where we don't get to do
experiments, but we have experiments of nature. Each of us is an experiment of nature where nature
has modulated our genetics into, you know, different activity levels of individual
genes, where some of them behave this way and others behave that way. And we can look at that
mapping from genotype to phenotype as a surrogate of what a therapeutic intervention would do in
those humans. So that's great, but it limits you to those experiments of nature and the experiments
of nature are not necessarily the same as what a therapeutic intervention would do. And so what we've
done in parallel is to create our own data in our own wet lab where we make interventions in
cellular systems and measure the phenotypic consequences there, again, using very large-scale
data with very high content modalities. And so the machine learning is actually used, I would say,
in three different ways.
One is to interrogate the phenotypic consequences of genetic variation in a human,
looking at very high content data like imaging where we know machine learning works really well,
like different types of omic modalities, transcriptomics, proteomics, and so on,
to really understand that mapping between genetics and phenotype.
We similarly look at the mapping between genetic interventions,
which in this case we get to actually direct ourselves by doing genome editing of cells,
and ask, what are the phenotypic consequences of modulating this gene in this cell background,
reading out large, high-content data to really understand how cell state responds to these interventions.
And so the machine learning is used on each of those two separately and then also to bring them together
so that you can kind of think about building cellular models that are predictive of human clinical outcomes,
which is ultimately what we're looking to do: to replace the sort of untranslatable animal
models with something that is much more driven by human biology.
When you think about, again, the focusing of Insitro, what domains did you decide to
work in first? Because this approach should be quite horizontal, but of course, then you have
complexity of what that cellular model can be. It for sure is. And again, focusing has always
been a challenge in the sense that there's so many opportunities and how do we say no to some
of them. So what we've done is tried to go in areas where we think there is both a large
unmet need in the sense that the current tools that we're deploying are just not very effective
and at the same time where we think that the technologies that we are developing internally
provide us with a unique, differentiated advantage. So one of those areas has been neuroscience,
because as we know, the unmet need there is humongous. There are very few effective
therapeutic interventions in neuroscience, and that's partly because of the model systems that we've
been using, specifically animal models. While one can quibble about in which other therapeutic areas
they are more or less relevant, in neuroscience it is very clear that they're probably not. And that's
one of the reasons why things work so well in, whatever, curing mice of schizophrenia, whatever the heck
that means, while not having much of an impact in human schizophrenia, because it's not really even
the same disease, right? So that's on the unmet need side.
And on the opportunity side, we know that induced pluripotent stem cells are actually relatively
easily differentiated into neurons.
We have a mostly computer science, not biology, audience.
Can you just explain how you get a Daphne neuron at all?
Okay.
So in order to get a Daphne neuron in the lab, you take either a white blood cell from me or a skin cell
from me, and you go through a process called reprogramming, a technology
which received the Nobel Prize a number of years ago,
which allows you to turn it into what is basically a stem cell,
which means a cell that can then take on any lineage.
It doesn't have to form a skin cell, which is where it came from.
It can form a liver cell or a heart cell or a brain cell.
That's why it's called an induced pluripotent stem cell, or iPSC:
induced because you force it to be pluripotent,
which means it can go in any different direction.
And depending on what you do to that stem cell,
it can now be transformed, as I said,
into a neuron or a cardiomyocyte, which is a heart cell, and so on and so forth.
And so you can effectively get the effect of our genetics in these cellular systems.
And similarly, you can make an even more pointed change by editing those cells and say,
if there is a genetic variant that we know causes a particular disease or significantly increases
the chances of such a disease, we can introduce that into different genetic backgrounds and
then do a sort of almost in vitro case-control, which is the same cell with and without the
genetic variant: what are the differences? And, very carefully positioned for tech people, that's
like an A/B test. This in vitro A/B test is something that allows us to really get at those
differences that are specifically associated with this disease-causing variant. So that is one aspect
of the capability that drove us towards our therapeutic areas. The other is, as I said, that we have a two-pronged
strategy. One is the data that we produce in the lab, and one is data that we collect from
humans. So we also looked for areas in which data from humans is relatively readily available.
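The in vitro case-control ("A/B test") described above can be sketched as a simple two-sample comparison of a phenotypic readout between isogenic lines. Everything here (the feature name, effect size, and sample counts) is simulated and illustrative, not Insitro's actual pipeline:

```python
import math
import random
import statistics

random.seed(0)

# Simulated image-derived phenotype (a hypothetical "neurite length" feature)
# measured across wells of two isogenic iPSC-derived lines:
# one unedited, one with a disease-associated variant knocked in.
control = [random.gauss(10.0, 1.5) for _ in range(200)]  # unedited line
edited = [random.gauss(11.2, 1.5) for _ in range(200)]   # variant knocked in

def welch_t(a, b):
    """Welch's t-statistic for the difference in means of two samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(b) - statistics.mean(a)) / math.sqrt(
        va / len(a) + vb / len(b)
    )

t = welch_t(control, edited)
# A large |t| flags a phenotypic difference attributable to the variant,
# since the two lines share an otherwise identical genetic background.
```

In practice the readout would be a high-dimensional feature vector from imaging or omics rather than a single number, but the isogenic with/without comparison is the same idea.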
And in neuroscience, we have an increasing number of brain MRIs. I think there will be even more
now with the approval of some of the earliest Alzheimer's drugs, because it's going to be part of the
process by which people are either selected to receive the drug or not, depending on whether their
brain MRI shows certain aspects of disease. The other areas that we've gone into are metabolism
and oncology, because again, those are areas where disease-relevant data that is high
content, that is unbiased and truly informative about the disease state, is collected quite abundantly
as part of the standard of care. And so those are, again, we tried to look for areas where there is
large unmet need and where the two types of capabilities that we bring to bear can be deployed.
That makes sense. If you think about something like, you know, neurodegenerative diseases, Alzheimer's, et cetera: like, you know, is it single cell? Who can say, but it feels unlikely. What's beyond single cell? And do you guys do organoid research? Like, is that within the scope of Insitro?
Yeah, no, that's a great question. So a lot of complex diseases are not encompassed within a single cell lineage.
However, I think even there one can study, in many cases, not always, the disease state by looking at a cell type that is clearly relevant to disease and perhaps pushing it out of its comfort zone.
So, for example, in some of the work that we've done in metabolic disease, I mean, it's clear that hepatocytes are not the be-all and end-all of what it takes to make a diseased liver, but you can push the hepatocyte out of its comfort zone by putting
in the right combination of, you know, fatty acids and maybe various immune system factors
or whatever to create a disease state that is much more similar to what you see in its natural
environment. That having been said, it's clearly the case that we're not going to be able
to recapitulate the entire complexity of a disease state for a lot of those diseases. And so
one of the things that we do, and this is in the spirit of being pragmatic and prioritizing,
there's plenty of things that we can do today
where the disease does manifest sufficiently
in a single cell lineage.
And so we go after those first
and we defer some of the other ones to a later stage
because there are technologies such as organoids, for example,
that encompass multiple cell types in a single little micro-brain
or micro-liver or whatever,
and sometimes these things called organs-on-chips,
which allow you to create things
that are more than even a single organ;
they start to create sort of the flow between different organ
systems. Those are technologies that other people are currently developing. They're getting better by
the day. And so we feel like there's a lot of value that we can bring with the capabilities that are
out there, even if we know they're reductionist, even if we know they don't fully capture the disease,
because they capture enough of the disease that we can bring medicines to patients. And maybe in three
years, we'll have another tranche of diseases that are unlocked by the technological tidal wave
that we're all riding.
You mentioned there were sort of two areas of exploration for Insitro right now.
One was metabolic disease and cancer
(I guess that's really three areas),
and the second is neurological areas.
I was just sort of curious how far you want to take these
in terms of the actual development of drugs in-house versus partnering out.
And then I noticed you had things like relationships with BMS and others for ALS and
dementia and a few other areas.
So a little bit curious about how far you actually want to take the development of drugs
yourself versus partnering with others and how you think about that in the context of building a
company and culture? That's a great question. And the answer is that we are going to be
relatively pragmatic about this as well and what makes sense in terms of maximizing the impact
that we have on patients. So one of the things that we have going for us, I think, over a lot of
other companies is that what we've built is an engine for generating novel insights, novel
targets. So it's not the situation that a lot of companies are in, which is you have one program,
two programs. And if you kind of sell those off, then you're left with an empty cupboard. And then
what do you do? You're not a company anymore. So what we think is because we have this engine,
we have the opportunity to have some of those programs be done in partnership with others. Some of
those perhaps even be entirely outlicensed to others. While the engine continues to give us
additional insights, maybe even better insights, as we expand, for example,
into new indications using new technologies.
On the other hand, to think about it from the complementary perspective, some of the targets
that we find ourselves having emerged from our platform are ones around which there's already
a drug available because, you know, there's only 20,000 genes.
And so sometimes someone may have developed a drug but just didn't deploy it in the right
indication or the right patient population.
And we don't believe that the only thing that makes our existence worthwhile is if we come up
with a new chemical matter towards those targets.
So we might go to the asset owner and say,
hey, let's work together to bring that asset to patients faster.
And that can usually shave off, you know, two, three,
maybe even five years from the development of a program
because you've already made the drug.
Sometimes you've already put it in people.
You've shown that it's safe.
You have a good biomarker for when it's working
and when it's not. All of those are things that can really slow down a program
if you're starting from absolutely square one
with a brand new target.
And so we hope to be very pragmatic in terms of what we develop in house and what we develop
with others with a goal of really trying to maximize the impact that the platform can bring
to as many patients as possible.
How much work, if any, are you doing on the biomarker side?
Because I think one of the points that you just raised is really interesting.
When I look at a lot of clinical drug development, a lot of it is waiting for clinical
endpoints that may take months or years to really substantiate.
And so sometimes the FDA or others will be willing to accept certain clinical biomarkers as
sort of intermediary stabs, or things that tend to vary relative to the trait or the outcome.
Are you doing biomarker development as well?
Because that seems like such a great area for the applications of ML.
And yet it seems like there's so little work in terms of actually translating ML into the real world for biomarkers in particular.
And I completely agree.
And I think there's research that shows that drugs that have a biomarker are about twice as likely to be successful
in the clinic as ones that do not. By the way, there's also data that show that
drugs that have support in human genetics are twice as likely to succeed as ones that do not.
And so we are deep believers in both of those. And I think that because our focus is so much
on human data, a lot of the insights that come out of analysis of human clinical data does
actually give you a biomarker for which patients are likely to benefit from a particular
therapeutic intervention.
And so in some ways, you can think of clinical biomarkers as coming out almost for free,
if you will, not for free, but sort of as a consequence of the work that we're doing anyway,
as long as we pay attention.
And don't just say, as a lot of companies do, that, oh, we found the target, we're just
going to go and apply it in all comers. Because honestly, that is one of the big things that
causes drugs to fail: you are trying to apply it more broadly (if I'm being cynical,
sometimes so as to maximize the revenues that you can get from a drug) versus trying to figure out exactly in which patients it's going to work.
And one of the things you asked earlier, Elad, was what did I learn at Calico?
One of the things that I learned there: there were a lot of former Genentech people there, as one would expect, given the pedigree of the company.
One of them told me that one of the earliest precision oncology drugs was Herceptin, which goes after HER2-positive breast cancer patients, and that if they had tried to run a Herceptin clinical
trial in an all-comer breast cancer population, you would have needed a population of 10,000
in the clinical trial, which is a very large clinical trial. And even then, you might not have
seen a sufficiently strong, statistically significant signal, because the adverse side effects
(and every drug has adverse side effects) in the non-responders may have outweighed
the very strong benefits in the responders. So the fact that they had the right patient
population in the clinical development of Herceptin was absolutely critical to create
a successful and reasonably sized clinical trial.
And so I think that that is a pattern that many more people in the drug development industry
should be following.
And frankly, a lot of them have started to see the benefits of this, so we're not the only
ones going in there.
But I do think, to your point, Elad, that we have a differentiated technology stack that
will hopefully allow us to get even better, more accurate biomarkers via machine learning
on high content data.
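Daphne's Herceptin point is, at bottom, a statistical power argument: if only a fraction of patients respond, the population-level effect size shrinks, and the required trial size grows roughly with the inverse square of that fraction. A minimal back-of-the-envelope sketch, with my own illustrative numbers rather than figures from the conversation:

```python
# Hypothetical sketch (not Insitro's actual analysis): how diluting a drug's
# effect across non-responders inflates trial size. Uses the standard two-arm
# approximation n = 2 * ((z_alpha + z_beta) / d)^2 patients per arm, where d
# is the standardized effect size.
from math import ceil
from statistics import NormalDist

def patients_per_arm(effect_size, alpha=0.05, power=0.8):
    """Approximate patients per arm for a two-sided z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Illustrative assumptions only: responders see a moderate effect (d = 0.5)
# and make up ~20% of an all-comer population, with no benefit for the rest,
# so the population-wide effect is diluted to 0.2 * 0.5.
responder_effect = 0.5
responder_fraction = 0.2

enriched = patients_per_arm(responder_effect)
all_comers = patients_per_arm(responder_fraction * responder_effect)
print(enriched, all_comers)  # → 63 1570
```

Under these assumed numbers, the biomarker-enriched trial needs about 63 patients per arm while the all-comer trial needs about 1,570, roughly 1/f² = 25 times more, which is the dynamic that made an all-comer Herceptin trial impractically large.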
Yeah. You mentioned two really key points, I feel, to expediting drug development. There's the biomarker part, and then there's finding the right patients relative to the drug. And I think that actually also was very famous for the HRD drugs, where there's a specific set of pathways such that if you didn't actually select out the patients with specific mutations, the drugs didn't work. And the second you focused on that population, it worked extremely well. And so there's lots of examples of that, where you just have to figure out who you're actually targeting. There's a really great interview from a couple years ago with Paul Janssen, who started Janssen Pharmaceuticals.
He talked about how he felt that a lot of drug regulation, and the length of time it takes to develop drugs, was driven by an almost overly safety-first view of the world.
Like there wasn't a strong series of cost-benefit tradeoffs or willingness to sub-segment patient populations or really look at data in a rich way.
And we've seen recently with things like COVID that we can really expedite both drug development, vaccine development, everything, right?
We did things in six months that normally would take 10 years during COVID because we decided we could do it.
How much time do you think an ML-first company or an ML-first approach can really cut out
of drug development? Or do you think it's purely a regulatory issue in terms of those timelines?
I think that's a complicated question, and I think it has elements of both.
I think first there does need to be a discussion with the regulators around what might be
feasible from a regulatory approval perspective about different kinds of biomarkers.
There are also elements that I think are very legitimate questions,
like how do you collect the relevant biomarker in a robust, reproducible way from different
patients, and what kind of lab protocols one would need in order to have that be collected
robustly? That's not always trivial. You can have the most beautiful, sophisticated biomarker
that works in a very carefully designed research environment and it's not going to work in the
wild as part of the standard of care. So I think the regulator does have legitimate questions
that need to be answered there. But I do think that with that discussion, and especially if you
can front-load that and have the discussion with the regulators, not at the very end when you
show up with your whatever NDA package, but in an earlier state saying, okay, what would it take
in order to make this reasonable from your perspective? What questions would you like to see
answered? I think there is a legitimate opportunity to actually accelerate things. Having said
that, I think one needs to be realistic about what is and is not feasible. In COVID, we were in
the fortunate or unfortunate position that there were a lot of patients with COVID. It was rampant. And so you were able to fill your clinical trials relatively
quickly. And the disease progression was relatively fast. If you're doing an Alzheimer's trial,
the disease progression is what it is. And you need to wait long enough to see a delta in
the cognition curve in order to convince yourself that there is in fact a difference, that your
drug is making a difference. Now, I think there is an opportunity to try and create
proxy biomarkers. Amyloid beta is an example of that. There have been questions about whether it is the right
proxy for cognition or not. My guess would be that it is for some patients and probably not others, so
it's a mixed bag, to our earlier point about heterogeneity and finding the right patient population.
But I think that is a thing that we need to gain conviction around over time. And so
ultimately, there's only so much that you can speed up biology
in certain cases, because biology takes as long as it takes.
Yeah, it's interesting, because I feel like that's a mindset that those of us who have worked
in both computer science and biology have to learn, right?
You are so used to just being able to manipulate some data in the cloud and then you get
an answer versus waiting for years for a readout or to make progress.
When you think about how you built out the team at Insitro and how you built out the culture,
how did you think about having each side learn about the different aspects that the other
provides? And in general, how did you think about the culture of a company that could bridge both
things?
You know, it's really hard. And I think building the right culture is one of the most
challenging things that we had to do at Insitro. And at the same time, I think it's a big competitive
advantage because doing it is really not very easy. You have to bring in people who truly have
both a learning mindset on their own in terms of being interested enough to learn about
something that for many is a totally different set of concepts and even ways of thinking about
the world. So you need computer scientists who are willing to learn about this fuzzy ill-behaved
field of biology where things don't do what they're supposed to do. When you program a computer,
yeah, you can have bugs. But ultimately, assuming you did the right things, the same thing will
happen. And that's not true in biology. We just don't know that much.
Exactly. And these things are living beings, so they don't respond in the same
way, even day after day. And so it's really hard. And then conversely, you have
the engineering mindset that scientists sometimes get frustrated with: okay, we can take those building
blocks and put them together, and this is what will happen. And science is not like that. And so
you have to create a bridge between the different cultures, the different jargons, the different
mindsets, and really both get people who are willing to learn about the other discipline, but also
just engage in meaningful ways with people who are different to themselves.
What did you mean when you said science is just not like that, in terms of manipulating building blocks?
So there are so many variables that have a huge effect on the system that sometimes we only vaguely appreciate.
Sometimes we don't appreciate them at all.
A colleague told me an anecdote about an experiment where some days it went perfectly well.
And then the other days the cells just died.
And they tried to figure out what was going on.
And it turns out the days the cells died were the days when there was a particular technician who really had a fondness for onion sandwiches.
And so it turns out that the onion on his breath actually ended up, you know, making the cells less happy.
And so you just don't even think about these things if you're an engineer, right?
The other really interesting mindset difference between how scientists and how engineers approach the world is that when you show an engineer or
computer scientist a bunch of dots, usually the natural inclination is to try and find the
pattern, the thing that explains as many of the points as you can, because that is the thing around
which you will engineer your system. If you're a scientist, oftentimes what you look for
are the outliers, the exceptions, because those exceptions are often the beginnings of a scientific
discovery, because they're the beginning of a thread. It's like, why did this one behave differently
from everybody else? And that gives rise to a new discovery. So again, it's just the mindset.
So different.
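The two readings of the same scatter can be put in a few lines of code. This is a toy illustration of my own, not anything from Insitro: a least-squares fit gives the engineer's pattern, and the points the fit fails to explain give the scientist's exceptions.

```python
# Toy sketch: the same points, read two ways. The "engineer" fits the
# dominant trend; the "scientist" flags what the trend fails to explain.
def fit_line(points):
    """Ordinary least-squares fit y = a*x + b."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    a = (sum((x - mx) * (y - my) for x, y in points)
         / sum((x - mx) ** 2 for x, _ in points))
    return a, my - a * mx

def outliers(points, threshold=2.0):
    """Points whose residual exceeds `threshold` standard deviations."""
    a, b = fit_line(points)
    residuals = [y - (a * x + b) for x, y in points]
    mean = sum(residuals) / len(residuals)
    sd = (sum((r - mean) ** 2 for r in residuals) / len(residuals)) ** 0.5
    return [p for p, r in zip(points, residuals) if abs(r - mean) > threshold * sd]

# Nine points on y = 2x, plus one that "behaves differently".
data = [(x, 2 * x) for x in range(9)] + [(9, 40)]
print(outliers(data))  # → [(9, 40)]
```

The engineer's answer is the slope and intercept; the scientist's answer is the one point left over, the start of the "thread" that might become a discovery.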
Was there anything you did from a process perspective to help bridge these things?
So, for example, I remember at Color, we often tried to embed a bioinformatician with a team of
systems engineers, and they'd learn off of each other.
But then everybody on the team, you know, it could be a variant scientist, it could be somebody
else, would participate in a scrum, which was a concept that they weren't used to, right?
On the biology side, for example, it was more a way of saying that everybody does things
on weekly cadences, and you don't just do long-term planning. You also do way more short-term
planning than you normally would in a lab. There are different approaches to almost try and bridge
those divides. Were there any things that you specifically did along those lines, or were there
other approaches that you took from a tangible perspective? Well, so first of all, we do bring in people
with their different mindsets, and we try and create sort of bridges between them. So we have product
managers who do scrums and do these agile planning processes,
and we apply that also to our platform development, even on the biology side.
But at the same time, you know, drug discovery projects, which are years long, you don't do scrums.
You know, there is a timeline.
And when you have a, whatever, a 45-day differentiation for your iPS cells, it takes 45 days.
And there's no point to doing an agile scrum in the middle.
You just need to wait for the cells to do their thing.
And so we have project managers and we have product managers and we make sure they communicate with each other.
but they each deploy their discipline in their own way.
But to your question about one of the things that we did,
a lot of it comes down to really being deliberate about culture and values.
And so one of the things that we did at the very beginning of the company
is we laid out a set of behavioral norms, which you can think of as values.
And the one that is, I think, among my favorites, maybe my favorite,
is actually the last one.
They're ordered, not an order of importance, but from what we do to how we do it,
which is that we engage with each other
openly, constructively, and with respect. Each of the words matters. Engagement means we don't
silo ourselves and just sit with our tribe. We really have an engagement with others, openly
being open to asking naive questions, and at the same time, being open to naive suggestions
from someone from a discipline other than yourself, because sometimes the question of, why don't
we do things this way, is actually a really good idea when you don't come in with a preconceived
notion of, oh, because that's how we've always done it. Constructively means that when you
make these suggestions, it has to be with the goal of making the outcome better rather than being
the smartest person in the room, which is a big problem in companies. We have a lot of smart
people. And the respect is really the respect for what everyone brings to the table. And I think
that's really important because there's a lot of, and please forgive me, Elad, but a lot of tech people
who come into life sciences, and it's like, we have the silver bullet. We are the smartest.
We're machine learning. We're going to solve everything. And they don't respect the challenges
of the other discipline. Sometimes they don't even take the time to learn what the challenges
of the other discipline are. And that creates immediate hackle-raising on the other side.
And from there, the conversation can only get worse. So I think it's really important to have
that respect for all sides.
We have a lot of tech people, engineers, founders, researchers as listeners. What would you be working on if you weren't working on Insitro?
Like, what else are you paying attention to in digital bio or AI, assuming people are attuned
to having that culture of openness and respect and constructive thinking?
So, I think that's a great question.
And this really is the golden age of AI machine learning.
And there's just so many different ways in which that can be deployed in useful ways.
I mean, my personal compass has always been that we should be deploying this towards areas where we make life better for people.
So I've tried to veer towards applications that are really about improving life, improving health versus, you know, selling more ads or whatever.
Not that, you know, I mean, I guess selling ads is good too.
But for me, it's really about how do we make life better.
So I think there's a lot of really exciting opportunities right now.
I think that intersection or that interface, if you will, between biology and technology
is one of the richest areas that exist today because each of these fields has been making
a huge amount of progress in its own right.
We all hear about, you know, AI much more in the news because of ChatGPT and so on,
and it's something that everyone can really relate to and understand, but the toolkit that
biologists have available to them with CRISPR and pluripotent stem cells and the huge advances
in microscopy and such are maybe not quite as visible.
to the everyday person, but they are equally dramatic, I think, in terms of what they
unlock.
And so bringing those two together creates so many opportunities for change in not just a drug
discovery, which is where I happen to pick my own trajectory, but in agriculture technology,
in environmental technology, in energy, in biomaterials, maybe materials that are much
less destructive to the environment and such with better properties. In food tech, I think there's
just a tremendous wealth of directions that one can take those fields and bring them together in
interesting ways. Having said that, I think there's other really beneficial societal directions that
one can deploy this. I think we're only starting to see the applications of machine learning and
AI to, say, energy, other than things like biofuels, because the data just haven't been
as readily available, but I'm sure that will change. Similarly, I think going back to my
Coursera days and even my Stanford days, the benefits of machine learning in education and really
personalizing learning experiences to individual learners, maybe having a more beneficial experience
than just letting ChatGPT write their essays for them. I think there are a lot of opportunities
to really deepen and enhance learning experiences for students. So I think there's almost unlimited
things that one could do. One just needs to be committed to finding them versus falling into the
sort of comfortable place of going to one of the tech giants and just doing something that
earns you a lot of money, which is, I guess, nice for you, but maybe not as good in terms of
making the world better. You've worked with great success in areas that are perhaps traditionally
harder to make money in as a startup, ed tech, health tech. There's not traditionally a ton of budget,
or there's an impedance mismatch, you know, you have regulatory controls or whatever it is that
makes it more challenging traditionally than many other areas of software.
But what advice would you give to founders who want to work in these areas in particular?
So I think that there is, I'm hoping, a realization among investors that there are entire
untapped ecosystems where technology can make a difference and hasn't.
And so I think that as you look at what we did at Coursera, for example,
Edtech had always been a backwater of investment.
And yet we were very fortunate to have been able to attract fairly significant funding,
even at the very early stages, because we had an idea that our investors found compelling
and differentiated from what others had done.
So I guess I'm a believer, and maybe I'm an optimist, that if you have a really good idea
that is differentiated from what others have done where the impact is something you can make clear
as we were able to do with those first early MOOCs, people will have confidence that you can turn
that into something that is revenue bearing and will be willing to, you know, go with it for a while.
So that having been said, I would say that ultimately, and this is, I guess, how I feel about maybe
the other half of the question, which is, is this going to be the place where you make the most money
with the greatest amount of certainty? Maybe not. But I believe that we only have one life to live,
and that ultimately what you want to be able to do is to look back on your life at some point
and say, I have done something that's really worthwhile and important. And I think that's something that
is important for people to keep in mind as they decide where to spend their time.
Daphne, thanks for an incredible conversation and thank you for joining us on the podcast.
Thank you very much.
Thank you for listening to this week's episode of No Priors.
Follow No Priors for a new guest each week, and let us know online what you think and who in AI you want to hear from.
You can keep in touch with me and Conviction by following @Saranormous.
You can follow me on Twitter @EladGil. Thanks for listening.
No Priors is produced in partnership with Pod People.
Special thanks to our team, Synthel Galdia and Pranav Reddy and the production team at Pod People.
Alex McManus, Matt Saab, Amy Machado, Ashton Carter, Danielle Roth, Carter Wogan, and Billy Libby.
Also our parents, our children, the Academy, and Open Google Soft AI, the future employer of all of mankind.