No Priors: Artificial Intelligence | Technology | Startups - How AI can make drug discovery fail less, with Daphne Koller from Insitro
Episode Date: March 2, 2023
Life-saving therapeutics continue to grow more costly to discover. At the same time, recent advances in using machine learning for the life sciences and medicine are extraordinary. Are we on the verge of a paradigm shift in biotech? This week on the podcast, a pioneer in AI, Daphne Koller, joins Sarah Guo and Elad Gil to help us explore that question. Daphne is the CEO and founder of Insitro, a company that applies machine learning to pharma discovery and development, specifically by leveraging “induced pluripotent stem cells.” We explain Insitro’s approach, why they’re focused on generating their own data, why you can’t cure schizophrenia in mice, and how to design a culture that supports both research and engineering. Daphne was previously a computer science professor at Stanford, and co-founder and co-CEO of edutech company Coursera. Show Links: Insitro - About Video: AWS re:Invent 2019 – Daphne Koller of insitro Talks About Using AWS to Transform Drug Development Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @DaphneKoller Show Notes: [1:49] - How Daphne combined her biology and tech interests and ran a bifurcated lab at Stanford [4:34] - Why Daphne resigned an endowed chair at Stanford to build Coursera [14:14] - How insitro approaches target identification problems and training data [18:33] - What are pluripotent stem cells and how insitro identifies individual neurons [24:08] - How insitro operates as an engine for drug discovery and partners to create the drugs themselves [26:48] - Role of regulations, clinical trials and disease progression in drug delivery [33:19] - Building a team and workplace culture that can bridge both bio and computer sciences [39:50] - What Daphne is paying attention to in the so-called golden age of machine learning [43:12] - Advice for leading a startup in edtech and healthtech
Transcript
In some sense, the wealth of opportunities here is one of the biggest challenges
because everywhere you look there is a big opportunity for machine learning to be deployed
in a potentially quite significant way.
It's like computers.
You're going to use it everywhere and it's going to be transformative everywhere.
It's not going to be the silver bullet unless you figure out how to use it most effectively,
but the opportunities are pretty much endless.
This is the No Priors podcast. I'm Sarah Guo.
I'm Elad Gil.
We invest in, advise, and help start technology companies.
In this podcast, we're talking with the leading founders and researchers in AI about the biggest questions.
We've talked about computational biology for decades, but drugs keep getting more expensive to
discover, and at the same time, recent advances in using machine learning for the life sciences
and medicine are extraordinary. Are we on the verge of a paradigm shift in biotech? We're thrilled
to have a pioneer in AI, Daphne Koller, on the show to help us explore that question. She's CEO and
founder of Insitro, a company that applies machine learning to pharmaceutical discovery and
development, specifically by leveraging induced pluripotent stem cells, which we'll get into explaining.
Daphne was a computer science professor at Stanford, co-founder and CEO of Coursera, is a MacArthur
fellow, and was named by Time as one of the world's 100 most influential people. We could go through
all her other work, but we'd run out of time. Daphne, welcome to the podcast. Thank you, Sarah. It's a
pleasure to be here. As we were saying, we won't ask you to walk through every part of your
amazing life story, but you came to biology as a computer science application years into your career.
What sparked you going down that route? My initial interest in biology came from the technical side
in the sense that the data sets, this is way back when in the mid-90s, the data sets that were available to
machine learning research at the time were kind of boring and not very inspiring. So things like
classifying text into 20 different news groups. And I found that there were more interesting
data sets technically to be had on the biology side back then as we were starting, for example,
to measure the activity of genes across the entire genome in multiple samples. So initially it was
really more from a technological perspective. But then I ended up actually having an interest in
biology in its own right and ultimately ended up having a bifurcated lab at Stanford where half my lab
did core machine learning work published in traditional computer science venues. And the other half
did core biology work that was published in nature and cell and science. And what was really
interesting is that most of my computer science colleagues had no idea that I did biology.
Most of my life science colleagues had no idea I was in a computer science department. So it was a bit
of a bifurcated existence, but it was a lot of fun. One more historical question for you. You wrote
the book on probabilistic graphical models. When I asked a mutual friend what I should ask you,
he suggested, you know, asking what motivated that work and how that field has changed. Just like in most
fields, there is a swing of a pendulum. A lot of the early work in probabilistic graphical models
was hugely influential in bringing artificial intelligence more into the world of machine learning
and working with numerical data rather than just symbolic AI. And then I think the advent of
deep learning pushed that to the side a little bit, because there was so much power that could
be gained from basically that kind of pattern recognition from raw inputs, raw images,
text, and so on, without having to worry very much about interpretable representations.
What I think we're starting to see right now is a pendulum starting to swing back in the sense
that there is a greater understanding that you really need a bit of both.
You need that hugely powerful pattern recognition that we get from deep learning.
But you also need the ability to reason about things like causality, and you also need some interpretability of your deep learning models so that you can potentially convey to a clinician why you made the decision that you did.
And so what we're ending up with as a really powerful paradigm is some kind of synthesis of the ideas from both of these disciplines coming together.
You went from Stanford, I believe, to co-founding Coursera with Andrew Ng.
And then you went to Calico a few years after that.
I'm sort of curious, what made you decide to go into Calico?
You mentioned your career was split between life sciences and computer sciences, and so you went
down the computer science online learning route, and then you went back into biology. So I'm a little bit
curious what drove you back in. So actually, I'm going to go back and answer the earlier part
of that, which is what took me to Coursera in the first place, because I think it feeds into what
took me away. So throughout much of my career at Stanford, I had an increasing sense of urgency
that I needed to make an impact in the world, a real impact on real people, not something that
was at one step or two steps removed by training great students and having them
go and do amazing things, but by something that I get to experience myself.
And so when the work that I was doing at Stanford on technology-assisted education gave rise
to the launch of those first Stanford massive open online courses, and we saw just how much
impact those were having, I felt like it was too amazing of an opportunity to pass up and
just assume that if I didn't do this, then somehow other people would take up the flag and
carry it forward. I felt like there was an incredible need to go and actually have that impact
myself and make sure that it was done right. And so that led to my departure from Stanford on
what was supposed to be a two-year leave of absence to go and found Coursera, and I had the full
intention to go back to Stanford at some later point and resume my faculty life. That didn't
happen. Stanford has a very strict leave of absence policy. And when they came two years later and
said, so are you coming back? And I responded that it wasn't really the right time I needed to
see the project through for another year or so. And they said that that was not an option. I ended up
doing this completely crazy thing, which is resigning an endowed chair from Stanford and staying
at industry. My mother thought I was nuts. I think she still thinks I'm nuts. But I ended up staying at
Coursera for a total of about five years. And so five years was kind of a reasonable point to take a
step back and reflect. And when I did that, this was in early 2016. I realized that while I'd been
deep in the trenches building Coursera, the machine learning world had totally transformed. Because as a
reminder, I left Stanford for Coursera in late 2011, just before the machine learning revolution
really took off in 2012. And so I suddenly lifted my head, looked around me and said, wow,
machine learning really is transforming the world, but not really having much of an impact in the life
sciences. And so I left Coursera in good hands. Coursera is a wonderful company, but it's not
really a deep technology company and certainly not a science company and decided that where I could
have a really disproportionate impact was in bringing these two disciplines together, because there
are just not a lot of people who had the benefit, as I did, of spending
basically 20 years doing machine learning and maybe a decade doing biology and could really speak
both languages and figure out how to synthesize them. But since I've been in industry for five years
and away from science and even away from machine learning, I didn't quite know where I wanted to go
and what I wanted to do. And so I turned for advice actually more than anything else to Art Levinson,
who is the former CEO of Genentech, the former chairman of Google and Apple.
And I figured that if there was anyone who would know how to bring those two fields together,
he was probably uniquely qualified to do that.
And so I asked him for advice, and he was very, I think, admittedly self-serving in his advice.
He said, you should come to Calico.
And honestly, I didn't know much about what Calico did other than it worked on aging,
which seemed like a really important problem to think about.
But I did know that it's not many times that one has the opportunity to work with a luminary like Art Levinson and I'd also by that point met Hal Barron, who's another person I have tremendous respect for.
And I figured this was, you know, a really interesting way to spend some time and learn from these wonderful people.
And I learned a ton during my time at Calico.
It was only 18 months, because ultimately I realized that I didn't want to be at a company that focused on a particular biology, but rather one that really built a platform for doing drug
discovery differently, addressing some of the points that you, Sarah, made in your introduction
about how drug discovery is this incredibly fraught, largely unsuccessful and very expensive endeavor,
and so how could I make that happen differently? And it didn't seem like Calico was necessarily
the right place to take on what was a platform company to be built. So that's why I left and founded
Insitro. Were there any specific insights from Calico that drove the founding of Insitro?
Was it just more of the exposure to biopharmaceuticals and how things are developed that really
drove your thinking that maybe ML and AI would have a real application area there?
I think that it was really the exposure for the first time to how biopharmaceuticals were developed,
as you said. At Stanford, I'd worked a lot at the intersection of machine learning, data science,
and biology and realized just how much power these machine learning technologies can have when applied
even to small datasets, and certainly as the technology had evolved tremendously since then,
datasets were becoming considerably larger and richer, there was an even larger opportunity
to make a huge difference. And so that's what led my move back into that intersection and
then, therefore, to Calico. But I think the realization was really twofold.
One is that the way in which you turned insights into therapeutic interventions was so
old-fashioned and so unaccommodating of the use of data that I felt there had to be a better way
to do this, which I think that the industry has since started to demonstrate across the board
in many different companies. And I think the other thing that made me make that shift is that
whereas data in the life sciences is growing tremendously, data on aging, and specifically human aging,
is really hard to get, because human aging is a very long process. And in
order to get data on the longitudinal trajectory of human aging today, we would have needed to start collecting
data, you know, 20, 30 years ago, and the cohorts are rather small. And so I felt like
there was a huge opportunity in this intersection, but maybe aging wasn't the first place
where one could most beneficially apply it, from at least my perspective. Yeah, when you look
across drug development, I guess right now it costs a billion to a billion and a half
dollars to develop a drug successfully, and it takes a decade plus to actually get there.
When I look at the potential areas that are challenging in the industry, there's sort of the
initial small molecule selection and design or alternatively the pathway or cell type that you're
using. Separate from that, there's the clinical trial itself and how do you figure out who to
enroll and how to deal with the data and the patients and everything else. There's all the calibration
around diagnostics and endpoints and clinical endpoints and how you think. And all those places
seem like there could be real uses of AI. How did you choose what Insitro is actually going to do,
given how much room there actually is to innovate in this area relative to data, to your point?
I mean, it's just, it's shocking how little is done, right? It's like awful.
I completely agree. And yeah, in some sense, the wealth of opportunities here is one of the
biggest challenges because everywhere you look, there is a big opportunity for machine learning
to be deployed in a potentially quite significant way. Sometimes,
I have these discussions with the increasingly few people within biopharma who think that, yeah, this machine learning thing is a fad that will go away, or maybe that machine learning is going to be this thing that helps you in a particular point area like X-ray crystallography. It can improve this narrow little vertical, but that's pretty much all it's going to do. And my analogy is that it's not like X-ray crystallography. It's like computers. You're going to use it everywhere, and it's going to be transformative everywhere.
It's not going to be the silver bullet unless you figure out how to use it most effectively,
but the opportunities are pretty much endless across the entire process from beginning to end.
So with that, how did we pick what we ended up working on?
You know, I thought about this, and you could divide the process as many do into three large chunks.
One is the original biology discovery: which targets do we employ, in what indications, and maybe in what
patient population. That is kind of the first chunk. Then there's turning those targets into
therapeutic matter, which is a molecular design process. And then at the end, there is the
enablement of the clinical trials in terms of actually actualizing patient selection or
biomarkers for efficacy and things like that. And all of those are important and all of those
are valuable. But if you look at the actual numbers of what makes drug discovery so expensive,
it is the fact that 95% of drug programs fail.
They just do not succeed.
And the biggest reason why they don't succeed is not because the clinical trial was poorly designed.
That still happens, but it's not the biggest reason, nor is it because the molecule doesn't
hit its target and modulate it in the right way.
That too happens.
But again, it's an increasingly smaller number of situations because pharma companies have gotten
better and better at making therapeutic matter.
The place where most programs fail is because we're just not modulating the right thing.
It's the wrong target in the wrong indication or the wrong patient population.
So if you really want to bring down that $2.5 billion number, what you have to do is to bring
down this completely mind-blowing statistic of 95% of drug programs fail into something that is much
more manageable so that a successful program doesn't have to carry on its back all of the many failures,
expensive failures of all the things that didn't quite make it. And so I figured that it was maybe
the hardest thing to do, but also the thing that was going to be the most impactful.
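The "carrying on its back" arithmetic is easy to sketch. The per-program cost below is an illustrative assumption, not a figure from the episode; only the 95% failure rate comes from the conversation:

```python
# Back-of-envelope: how a 95% program failure rate inflates the cost
# per approved drug. The per-program cost is an illustrative assumption.
cost_per_program = 100e6   # assumed average spend per program attempt, USD
success_rate = 0.05        # ~5% of drug programs succeed (from the episode)

# Each approved drug must carry the spend on all the failed programs.
cost_per_approved_drug = cost_per_program / success_rate   # ~2.0e9 USD
failures_carried = 1 / success_rate - 1                    # ~19 failed programs

# Doubling the success rate to 10% halves the carried cost.
cost_if_doubled = cost_per_program / 0.10                  # ~1.0e9 USD
```

The point of the sketch is that the cost per approval scales as 1 / success rate, so improving target selection attacks the dominant term directly.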
So how do you approach that problem as a computer science and now computer science and biology
person, like the target identification problem? Yeah, you know, it's really hard, right? Because
when you think about it, it's the one area where you really don't have the right type of training
data, at least not obviously, because the question you're asking yourself is, if I make this
therapeutic intervention in this patient, what is it going to do clinically? And that is the thing about
which you don't have data until the very end of the process, which is called a clinical trial.
And so how do you train a machine learning model that doesn't have training data to train it,
right? And so the direction that we've chosen to take is actually a two-pronged approach
and it's the synthesis of the two that we think is particularly powerful. We bring in
data from two quite different sources. One is data from human individuals where we don't get to do
experiments, but we have experiments of nature. Each of us is an experiment of nature where nature
has modulated our genetics into, you know, different activity levels of individual
genes, where some of them behave this way and others behave that way. And we can look at that
mapping from genotype to phenotype as a surrogate of what a therapeutic intervention would do in
those humans. So that's great, but it limits you to those experiments of nature and the experiments
of nature are not necessarily the same as what a therapeutic intervention would do. And so what we've
done in parallel is to create our own data in our own wet lab where we make interventions in
cellular systems and measure the phenotypic consequences there, again, using very large-scale
data with very high content modalities. And so the machine learning is actually used, I would say,
in three different ways.
One is to interrogate the phenotypic consequences of genetic variation in a human,
looking at very high content data like imaging where we know machine learning works really well,
like different types of omic modalities, transcriptomics, proteomics, and so on,
to really understand that mapping between genetics and phenotype.
We similarly look at the mapping between genetic interventions,
which in this case we get to actually direct ourselves by doing genome editing of cells,
and ask, what are the phenotypic consequences of modulating this gene in this cell background,
reading out large, high-content data to really understand how cell state responds to these interventions.
And so the machine learning is used on each of those two separately and then also to bring them together
so that you can kind of think about building cellular models that are predictive of human clinical outcomes,
which is ultimately what we're looking to do: to replace the sort of untranslatable animal
models with something that is much more driven by human biology.
When you think about, again, the focusing of Insitro, what domains did you decide to
work in first? Because this approach should be quite horizontal, but of course, then you have
complexity of what that cellular model can be. It for sure is. And again, focusing has always
been a challenge in the sense that there's so many opportunities and how do we say no to some
of them. So what we've done is tried to go in areas where we think there is both a large
unmet need in the sense that the current tools that we're deploying are just not very effective
and at the same time where we think that the technologies that we are developing internally
provide us with a unique, differentiated advantage. So one of those areas has been neuroscience,
because as we know, the unmet need there is humongous. There are very few effective
therapeutic interventions in neuroscience, and that's partly because of the model systems that we've
been using, specifically animal models. While one can quibble about in which other therapeutic areas
they are more or less relevant, in neuroscience it is very clear that they're probably not. And that's
one of the reasons why things work so well in, whatever, curing mice of schizophrenia, whatever the heck
that means, while not having much of an impact in human schizophrenia, because it's not really even
the same disease, right? So that's on the unmet need side.
And on the opportunity side, we know that induced pluripotent stem cells are actually relatively
easily differentiated into neurons.
We have a mostly computer science, not biology, audience.
Can you just explain how you get a Daphne neuron at all?
Okay.
So in order to get a Daphne neuron in the lab, you take either a white blood cell from me or a skin cell
from me, and you go through a process called reprogramming, a technology
which received the Nobel Prize a number of years ago,
which allows you to turn it into what is basically a stem cell,
which means a cell that can then take on any lineage.
It doesn't have to form a skin cell, which is where it came from.
It can form a liver cell or a heart cell or a brain cell.
That's why it's called an induced pluripotent stem cell, or iPSC:
induced because you force it to be pluripotent,
which means it can go in any different direction.
And depending on what you do to that stem cell,
it can now be transformed, as I said,
into a neuron or a cardiomyocyte, which is a heart cell, and so on and so forth.
And so you can effectively get the effect of our genetics in these cellular systems.
And similarly, you can make an even more pointed change by editing those cells and say,
if there is a genetic variant that we know causes a particular disease or significantly increases
the chances of such a disease, we can introduce that into different genetic backgrounds and
then do a sort of almost in vitro case-control, which is the same cell with and without the
genetic variant: what are the differences? And, very carefully positioned for tech people, that's
like an A/B test. This in vitro A/B test is something that allows us to really get at those
differences that are specifically associated with this disease-causing variant. So that is one aspect
of the capability that drove us towards our therapeutic areas. The other is, as I said, that we have a two-pronged
strategy. One is the data that we produce in the lab, and one is data that we collect from
humans. So we also looked for areas in which data from humans is relatively readily available.
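The in vitro case-control ("A/B test") described above can be sketched as a simple two-sample comparison of a phenotypic readout between isogenic lines. Everything here (the feature name, effect size, and sample counts) is simulated and illustrative, not Insitro's actual pipeline:

```python
import math
import random
import statistics

random.seed(0)

# Simulated image-derived phenotype (a hypothetical "neurite length" feature)
# measured across wells of two isogenic iPSC-derived lines:
# one unedited, one with a disease-associated variant knocked in.
control = [random.gauss(10.0, 1.5) for _ in range(200)]  # unedited line
edited = [random.gauss(11.2, 1.5) for _ in range(200)]   # variant knocked in

def welch_t(a, b):
    """Welch's t-statistic for the difference in means of two samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(b) - statistics.mean(a)) / math.sqrt(
        va / len(a) + vb / len(b)
    )

t = welch_t(control, edited)
# A large |t| flags a phenotypic difference attributable to the variant,
# since the two lines share an otherwise identical genetic background.
```

In practice the readout would be a high-dimensional feature vector from imaging or omics rather than a single number, but the isogenic with/without comparison is the same idea.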
And in neuroscience, we have an increasing number of brain MRIs. I think there will be even more
now with the approval of some of the earliest Alzheimer's drugs, because it's going to be part of the
process by which people are either selected to receive the drug or not, depending on whether their
brain MRI shows certain aspects of disease. The other areas that we've gone into are metabolism
and oncology, because again, those are areas where disease-relevant data that is high
content, that is unbiased and truly informative about the disease state, is collected quite abundantly
as part of the standard of care. And so those are, again, we tried to look for areas where there is
large unmet need and where the two types of capabilities that we bring to bear can be deployed.
That makes sense. If you think about something like, you know, neurodegenerative diseases, Alzheimer's, et cetera: like, you know, is it single cell? Who can say, but it feels unlikely. What's beyond single cell? And do you guys do organoid research? Like, is that within the scope of Insitro?
Yeah, no, that's a great question. So a lot of complex diseases are not encompassed within a single cell lineage.
However, I think even there one can study, in many cases, not always, the disease state by looking at a cell type that is clearly relevant to disease and perhaps pushing it out of its comfort zone.
So, for example, in some of the work that we've done in metabolic disease, I mean, it's clear that hepatocytes are not the be-all and end-all of what it takes to make a diseased liver, but you can push the hepatocyte out of its comfort zone by putting
in the right combination of, you know, fatty acids and maybe various immune system factors
or whatever to create a disease state that is much more similar to what you see in its natural
environment. That having been said, it's clearly the case that we're not going to be able
to recapitulate the entire complexity of a disease state for a lot of those diseases. And so
one of the things that we do, and this is in the spirit of being pragmatic and prioritizing,
there's plenty of things that we can do today
where the disease does manifest sufficiently
in a single cell lineage.
And so we go after those first
and we defer some of the other ones to a later stage
because there are technologies such as organoids, for example,
that encompass multiple cell types in a single little micro-brain
or micro-liver or whatever,
and sometimes these things called organs-on-chips,
which allow you to create things
that are more than even a single organ;
they start to create sort of the flow between different organ
systems. Those are technologies that other people are currently developing. They're getting better by
the day. And so we feel like there's a lot of value that we can bring with the capabilities that are
out there, even if we know they're reductionist, even if we know they don't fully capture the disease,
because they capture enough of the disease that we can bring medicines to patients. And maybe in three
years, we'll have another tranche of diseases that are unlocked by the technological tidal wave
that we're all riding.
You mentioned there were sort of two areas of exploration for Insitro right now.
One was metabolic disease and cancer
(I guess that's really three areas),
and the second is neurological areas.
I was just sort of curious how far you want to take these
in terms of the actual development of drugs in-house versus partnering out.
And then I noticed you had things like relationships with BMS and others for ALS and
dementia and a few other areas.
So a little bit curious about how far you actually want to take the development of drugs
yourself versus partnering with others and how you think about that in the context of building a
company and culture? That's a great question. And the answer is that we are going to be
relatively pragmatic about this as well and what makes sense in terms of maximizing the impact
that we have on patients. So one of the things that we have going for us, I think, over a lot of
other companies is that what we've built is an engine for generating novel insights, novel
targets. So it's not the situation that a lot of companies are in, which is you have one program,
two programs. And if you kind of sell those off, then you're left with an empty cupboard. And then
what do you do? You're not a company anymore. So what we think is because we have this engine,
we have the opportunity to have some of those programs be done in partnership with others. Some of
those perhaps even be entirely outlicensed to others. While the engine continues to give us
additional insights, maybe even better insights, as we expand, for example,
into new indications using new technologies.
On the other hand, to think about it from the complementary perspective, some of the targets
that we find ourselves having emerged from our platform are ones around which there's already
a drug available because, you know, there's only 20,000 genes.
And so sometimes someone may have developed a drug but just didn't deploy it in the right
indication or the right patient population.
And we don't believe that the only thing that makes our existence worthwhile is if we come up
with a new chemical matter towards those targets.
So we might go to the asset owner and say,
hey, let's work together to bring that asset to patients faster.
And that can usually shave off, you know, two, three,
maybe even five years from the development of a program
because you've already made the drug.
Sometimes you've already put it in people.
You've shown that it's safe.
You have a good biomarker for when it's working
and when it's not. All of those are things that can really slow down a program
if you're starting from absolutely square one
with a brand new target.
And so we hope to be very pragmatic in terms of what we develop in house and what we develop
with others with a goal of really trying to maximize the impact that the platform can bring
to as many patients as possible.
How much work, if any, are you doing on the biomarker side?
Because I think one of the points that you just raised is really interesting.
When I look at a lot of clinical drug development, a lot of it is waiting for clinical
endpoints that may take months or years to really substantiate.
And so sometimes the FDA or others will be willing to accept certain clinical biomarkers as
sort of intermediary stabs, or things that tend to vary relative to the trait or the outcome.
Are you doing biomarker development as well?
Because that seems like such a great area for the applications of ML.
And yet it seems like there's so little work in terms of actually translating ML into the real world for biomarkers in particular.
And I completely agree.
And I think there's research that shows that drugs that have a biomarker are about twice as likely to be successful
in the clinic as ones that do not. By the way, there's also data that show that
drugs that have support in human genetics are twice as likely to succeed as ones that do not.
And so we are deep believers in both of those. And I think that because our focus is so much
on human data, a lot of the insights that come out of analysis of human clinical data does
actually give you a biomarker for which patients are likely to benefit from a particular
therapeutic intervention.
And so in some ways, you can think of clinical biomarkers as coming out almost for free,
if you will, not for free, but sort of as a consequence of the work that we're doing anyway,
as long as we pay attention.
And don't just say, as a lot of companies do, that, oh, we found the target, we're just
going to go and apply it in all comers. Because honestly, that is one of the big things that
causes drugs to fail: you are trying to apply it more broadly (if I'm being cynical,
sometimes so as to maximize the revenues that you can get from a drug) versus trying to figure out exactly in which patients it's going to work.
And one of the things you asked earlier, Elad, was what did I learn at Calico?
One of the things that I learned there: there were a lot of former Genentech people there, as one would expect, given the pedigree of the company.
One of them told me that one of the earliest precision oncology drugs was Herceptin, which goes after HER2-positive breast cancer patients, and that if they had tried to run a Herceptin clinical
trial in an all-comer breast cancer population, you would have needed a population of 10,000
in the clinical trial, which is a very large clinical trial. And even then, you might not have
seen a sufficiently strong, statistically significant signal, because the adverse side effects
(and every drug has adverse side effects) in the non-responders may have outweighed
the very strong benefits in the responders. So the fact that they had the right patient
population in the clinical development of Herceptin was absolutely critical to create
a successful and reasonably sized clinical trial.
And so I think that that is a pattern that many more people in the drug development industry
should be following.
And frankly, a lot of them have started to see the benefits of this, so we're not the only
ones going in there.
But I do think, to your point, Elad, that we have a differentiated technology stack that
will hopefully allow us to get even better, more accurate biomarkers via machine learning
on high content data.
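Daphne's Herceptin point is, at bottom, a statistical power argument: if only a fraction of patients respond, the population-level effect size shrinks, and the required trial size grows roughly with the inverse square of that fraction. A minimal back-of-the-envelope sketch, with my own illustrative numbers rather than figures from the conversation:

```python
# Hypothetical sketch (not Insitro's actual analysis): how diluting a drug's
# effect across non-responders inflates trial size. Uses the standard two-arm
# approximation n = 2 * ((z_alpha + z_beta) / d)^2 patients per arm, where d
# is the standardized effect size.
from math import ceil
from statistics import NormalDist

def patients_per_arm(effect_size, alpha=0.05, power=0.8):
    """Approximate patients per arm for a two-sided z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Illustrative assumptions only: responders see a moderate effect (d = 0.5)
# and make up ~20% of an all-comer population, with no benefit for the rest,
# so the population-wide effect is diluted to 0.2 * 0.5.
responder_effect = 0.5
responder_fraction = 0.2

enriched = patients_per_arm(responder_effect)
all_comers = patients_per_arm(responder_fraction * responder_effect)
print(enriched, all_comers)  # → 63 1570
```

Under these assumed numbers, the biomarker-enriched trial needs about 63 patients per arm while the all-comer trial needs about 1,570, roughly 1/f² = 25 times more, which is the dynamic that made an all-comer Herceptin trial impractically large.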
Yeah. You mentioned two really key points, I feel, to expediting drug development. There's the biomarker part, and then there's finding the right patients relative to the drug. And I think that actually also was very famous for the HRD drugs, where there's a specific set of pathways such that if you didn't actually select out the patients with specific mutations, the drugs didn't work. And the second you focused on that population, it worked extremely well. And so there's lots of examples of that, where you just have to figure out who you're actually targeting. There's a really great interview from a couple years ago with Paul Janssen, who started Janssen Pharmaceuticals.
He talked about how he felt that a lot of drug regulation, and the length of time it takes to develop drugs, was driven by an almost overly safety-first view of the world.
Like there wasn't a strong series of cost-benefit tradeoffs or willingness to sub-segment patient populations or really look at data in a rich way.
And we've seen recently with things like COVID that we can really expedite both drug development, vaccine development, everything, right?
We did things in six months that normally would take 10 years during COVID because we decided we could do it.
How much time do you think an ML-first company or an ML-first approach can really cut out
of drug development? Or do you think it's purely a regulatory issue in terms of those timelines?
I think that's a complicated question, and I think it has elements of both.
I think first there does need to be a discussion with the regulators around what might be
feasible from a regulatory approval perspective about different kinds of biomarkers.
There are also elements that I think are very legitimate questions,
like how do you collect the relevant biomarker in a robust, reproducible way from different
patients, and what kind of lab protocols one would need in order to have that be collected
robustly? That's not always trivial. You can have the most beautiful, sophisticated biomarker
that works in a very carefully designed research environment and it's not going to work in the
wild as part of the standard of care. So I think the regulator does have legitimate questions
that need to be answered there. But I do think that with that discussion, and especially if you
can front-load that and have the discussion with the regulators, not at the very end when you
show up with your whatever NDA package, but in an earlier state saying, okay, what would it take
in order to make this reasonable from your perspective? What questions would you like to see
answered? I think there is a legitimate opportunity to actually accelerate things. Having said
that, I think one needs to be realistic about what is and is not feasible. In COVID, we were in
the fortunate or unfortunate position that there were a lot of patients with COVID. It was rampant. And so you were able to fill your clinical trials relatively
quickly. And the disease progression was relatively fast. If you're doing an Alzheimer's trial,
the disease progression is what it is. And you need to wait long enough to see a delta in
the cognition curve in order to convince yourself that there is in fact a difference, that your
drug is making a difference. Now, I think there is an opportunity to try and create
proxy biomarkers. Amyloid beta is an example of that. There have been questions about whether it is the right
proxy for cognition or not. My guess would be that it is for some patients and probably not others, so
it's a mixed bag, to our earlier point about heterogeneity and finding the right patient population.
But I think that is a thing that we need to gain conviction around over time. And so
ultimately, there's only so much that you can speed up biology
in certain cases, because biology takes as long as it takes.
Yeah, it's interesting, because I feel like that's a mindset that those of us who have worked
in both computer science and biology have to learn, right?
You are so used to just being able to manipulate some data in the cloud and then you get
an answer versus waiting for years for a readout or to make progress.
When you think about how you built out the team at Insitro and how you built out the culture,
how did you think about having each side learn about the different aspects that the other
provides? And in general, how did you think about the culture of a company that could bridge both
things?
You know, it's really hard. And I think building the right culture is one of the most
challenging things that we had to do at Insitro. And at the same time, I think it's a big competitive
advantage because doing it is really not very easy. You have to bring in people who truly have
both a learning mindset on their own in terms of being interested enough to learn about
something that for many is a totally different set of concepts and even ways of thinking about
the world. So you need computer scientists who are willing to learn about this fuzzy ill-behaved
field of biology where things don't do what they're supposed to do. When you program a computer,
yeah, you can have bugs. But ultimately, assuming you did the right things, the same thing will
happen. And that's not true in biology. We just don't know that much.
Exactly. And these things are living beings, so they don't respond in the same
way, even day after day. And so it's really hard. And then conversely, you have
the engineering mindset that scientists sometimes get frustrated with: okay, we can take those building
blocks and put them together, and this is what will happen. And science is not like that. And so
you have to create a bridge between the different cultures, the different jargons, the different
mindsets, and really both get people who are willing to learn about the other discipline, but also
just engage in meaningful ways with people who are different to themselves.
What did you mean when you said science is just not like that, in terms of manipulating building blocks?
So there are so many variables that have a huge effect on the system that sometimes we only vaguely appreciate.
Sometimes we don't appreciate them at all.
A colleague told me an anecdote about an experiment where some days it went perfectly well.
And then the other days the cells just died.
And they tried to figure out what was going on.
And it turns out the days the cells died were the days when there was a particular technician who really had a fondness for onion sandwiches.
And so it turns out that the onion on his breath actually ended up, you know, making the cells less happy.
And so you just don't even think about these things if you're an engineer, right?
The other really interesting mindset difference between how scientists and how engineers approach the world is that when you show an engineer or
computer scientist a bunch of dots, usually the natural inclination is to try and find the
pattern, the thing that explains as many of the points as you can, because that is the thing around
which you will engineer your system. If you're a scientist, oftentimes what you look for
are the outliers, the exceptions, because those exceptions are often the beginnings of a scientific
discovery, because they're the beginning of a thread. It's like, why did this one behave differently
from everybody else? And that gives rise to a new discovery. So again, it's just the mindset.
So different.
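The two readings of the same scatter can be put in a few lines of code. This is a toy illustration of my own, not anything from Insitro: a least-squares fit gives the engineer's pattern, and the points the fit fails to explain give the scientist's exceptions.

```python
# Toy sketch: the same points, read two ways. The "engineer" fits the
# dominant trend; the "scientist" flags what the trend fails to explain.
def fit_line(points):
    """Ordinary least-squares fit y = a*x + b."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    a = (sum((x - mx) * (y - my) for x, y in points)
         / sum((x - mx) ** 2 for x, _ in points))
    return a, my - a * mx

def outliers(points, threshold=2.0):
    """Points whose residual exceeds `threshold` standard deviations."""
    a, b = fit_line(points)
    residuals = [y - (a * x + b) for x, y in points]
    mean = sum(residuals) / len(residuals)
    sd = (sum((r - mean) ** 2 for r in residuals) / len(residuals)) ** 0.5
    return [p for p, r in zip(points, residuals) if abs(r - mean) > threshold * sd]

# Nine points on y = 2x, plus one that "behaves differently".
data = [(x, 2 * x) for x in range(9)] + [(9, 40)]
print(outliers(data))  # → [(9, 40)]
```

The engineer's answer is the slope and intercept; the scientist's answer is the one point left over, the start of the "thread" that might become a discovery.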
Was there anything you did from a process perspective to help bridge these things?
So, for example, I remember at Color, we often tried to embed a bioinformatician with a team of
systems engineers, and they'd learn off of each other.
But then everybody on the team, you know, it could be a variant scientist, it could be somebody
else, would participate in a scrum, which was a concept that they weren't used to, right?
On the biology side, for example, it was more a way of saying that everybody does things
on weekly cadences, and you don't just do long-term planning. You also do way more short-term
planning than you normally would in a lab. There are different approaches to almost try and bridge
those divides. Were there any things that you specifically did along those lines, or were there
other approaches that you took from a tangible perspective? Well, so first of all, we do bring in people
with their different mindsets, and we try and create sort of bridges between them. So we have product
managers who do scrums and do these agile planning processes,
and we apply that also to our platform development, even on the biology side.
But at the same time, you know, drug discovery projects, which are years long, you don't do scrums.
You know, there is a timeline.
And when you have a, whatever, a 45-day differentiation for your iPS cells, it takes 45 days.
And there's no point to doing an agile scrum in the middle.
You just need to wait for the cells to do their thing.
And so we have project managers and we have product managers and we make sure they communicate with each other.
but they each deploy their discipline in their own way.
But to your question about one of the things that we did,
a lot of it comes down to really being deliberate about culture and values.
And so one of the things that we did at the very beginning of the company
is we laid out a set of behavioral norms, which you can think of as values.
And the one that is, I think, among my favorites, maybe my favorite,
is actually the last one.
They're ordered, not an order of importance, but from what we do to how we do it,
which is that we engage with each other
openly, constructively, and with respect. Each of the words matters. Engagement means we don't
silo ourselves and just sit with our tribe. We really have an engagement with others, openly
being open to asking naive questions, and at the same time, being open to naive suggestions
from someone from a discipline other than yourself, because sometimes the question of, why don't
we do things this way, is actually a really good idea when you don't come in with a preconceived
notion of, oh, because that's how we've always done it. Constructively means that when you
make these suggestions, it has to be with the goal of making the outcome better rather than being
the smartest person in the room, which is a big problem in companies. We have a lot of smart
people. And the respect is really the respect for what everyone brings to the table. And I think
that's really important because there's a lot of, and please forgive me, Elad, but a lot of tech people
who come into life sciences, and it's like, we have the silver bullet. We are the smartest.
We're machine learning. We're going to solve everything. And they don't respect the challenges
of the other discipline. Sometimes they don't even take the time to learn what the challenges
of the other discipline are. And that creates immediate hackle-raising on the other side.
And from there, the conversation can only get worse. So I think it's really important to have
that respect for all sides.
We have a lot of tech people, engineers, founders, researchers as listeners. What would you be working on if you weren't working on Insitro?
Like, what else are you paying attention to in digital bio or AI, assuming people are attuned
to having that culture of openness and respect and constructive thinking?
So, I think that's a great question.
And this really is the golden age of AI machine learning.
And there's just so many different ways in which that can be deployed in useful ways.
I mean, my personal compass has always been that we should be deploying this towards areas where we make life better for people.
So I've tried to veer towards applications that are really about improving life, improving health versus, you know, selling more ads or whatever.
Not that, you know, I mean, I guess selling ads is good too.
But for me, it's really about how do we make life better.
So I think there's a lot of really exciting opportunities right now.
I think that intersection or that interface, if you will, between biology and technology
is one of the richest areas that exist today because each of these fields has been making
a huge amount of progress in its own right.
We all hear about, you know, AI much more in the news because of ChatGPT and so on,
and it's something that everyone can really relate to and understand, but the toolkit that
biologists have available to them with CRISPR and pluripotent stem cells and the huge advances
in microscopy and such are maybe not quite as visible.
to the everyday person, but they are equally dramatic, I think, in terms of what they
unlock.
And so bringing those two together creates so many opportunities for change in not just a drug
discovery, which is where I happen to pick my own trajectory, but in agriculture technology,
in environmental technology, in energy, in biomaterials, maybe materials that are much
less destructive to the environment and such with better properties. In food tech, I think there's
just a tremendous wealth of directions that one can take those fields and bring them together in
interesting ways. Having said that, I think there's other really beneficial societal directions that
one can deploy this. I think we're only starting to see the applications of machine learning and
AI to, say, energy, other than things like biofuels, because the data just haven't been
as readily available, but I'm sure that will change. Similarly, I think going back to my
Coursera days and even my Stanford days, the benefits of machine learning in education and really
personalizing learning experiences to individual learners, maybe having a more beneficial experience
than just letting ChatGPT write their essays for them. I think there are a lot of opportunities
to really deepen and enhance learning experiences for students. So I think there's almost unlimited
things that one could do. One just needs to be committed to finding them versus falling into the
sort of comfortable place of going to one of the tech giants and just doing something that
earns you a lot of money, which is, I guess, nice for you, but maybe not as good in terms of
making the world better. You've worked with great success in areas that are perhaps traditionally
harder to make money in as a startup, ed tech, health tech. There's not traditionally a ton of budget,
or there's an impedance mismatch, you know, you have regulatory controls or whatever it is that
makes it more challenging traditionally than many other areas of software.
But what advice would you give to founders who want to work in these areas in particular?
So I think that there is, I'm hoping, a realization among investors that there are entire
untapped ecosystems where technology can make a difference and hasn't.
And so I think that as you look at what we did at Coursera, for example,
Edtech had always been a backwater of investment.
And yet we were very fortunate to have been able to attract fairly significant funding,
even at the very early stages, because we had an idea that our investors found compelling
and differentiated from what others had done.
So I guess I'm a believer, and maybe I'm an optimist, that if you have a really good idea
that is differentiated from what others have done where the impact is something you can make clear
as we were able to do with those first early MOOCs, people will have confidence that you can turn
that into something that is revenue bearing and will be willing to, you know, go with it for a while.
So that having been said, I would say that ultimately, and this is, I guess, how I feel about maybe
the other half of the question, which is, is this going to be the place where you make the most money
with the greatest amount of certainty? Maybe not. But I believe that we only have one life to live,
and that ultimately what you want to be able to do is to look back on your life at some point
and say, I have done something that's really worthwhile and important. And I think that's something that
is important for people to keep in mind as they decide where to spend their time.
Daphne, thanks for an incredible conversation and thank you for joining us on the podcast.
Thank you very much.
Thank you for listening to this week's episode of No Priors.
Follow No Priors for a new guest each week, and let us know online what you think and who in AI you want to hear from.
You can keep in touch with me and Conviction by following @Saranormous.
You can follow me on Twitter @EladGil. Thanks for listening.
No Priors is produced in partnership with Pod People.
Special thanks to our team, Synthel Galdia and Pranav Reddy and the production team at Pod People.
Alex McManus, Matt Saab, Amy Machado, Ashton Carter, Danielle Roth, Carter Wogan, and Billy Libby.
Also our parents, our children, the Academy, and Open Google Soft AI, the future employer of all of mankind.