a16z Podcast: Deep Learning for the Life Sciences
Episode Date: June 6, 2019, with Vijay Pande (@vijaypande) and Bharath Ramsundar. Deep learning has arrived in the life sciences: every week, it seems, a new published study comes out... with code on top. In this episode, a16z General Partner Vijay Pande and Bharath Ramsundar talk about how AI/ML is unlocking the field in a new way, in a conversation around their book, Deep Learning for the Life Sciences: Applying Deep Learning to Genomics, Microscopy, Drug Discovery, and More (also co-authored with Peter Eastman and Patrick Walters). So -- why now? ML is old, bio is certainly old. What is it about deep learning's evolution that is allowing it to finally make a major impact in the life sciences? What is the practical toolkit you need, the right kinds of problems to attack, and the right questions to ask? How is the hacker ethos coming to the world of biology? And what might “open source biology” look like in the future?
Transcript
Hi, and welcome to the A16Z podcast. I'm Hannah. Deep learning has come to the life sciences.
Lately, it seems every week a published study comes out with code on top. In this episode,
a16z general partner on the bio fund Vijay Pande and Bharath Ramsundar talk about how AI and ML are unlocking
the field in a new way in a conversation around their recently published book, Deep Learning for
the Life Sciences, written along with co-authors Peter Eastman and Patrick Walters.
The book aims to give developers and scientists a toolkit on how to use deep learning.
for genomics, chemistry, biophysics, microscopy, medical analysis, and other areas.
So why now? What is it about ML's development that is allowing it to finally make an impact in this
field? And what is the practical toolkit, the right problems to attack, the right questions to ask?
Above and beyond that, as this deep learning toolkit becomes more and more accessible, biology is
becoming democratized through ML. So how is the hacker ethos coming to the world of biology,
and what might open source biology truly look like?
So, Bart, we spend a lot of time thinking about deep learning and life sciences.
It's a great time, I think, for people to become practitioners in this space, especially
for people who maybe have never done machine learning before, coming from the life sciences side,
or maybe people from the machine learning side to get into life sciences.
But maybe even a place to kick it off is, you know, what's special about now?
Like, why should people be thinking about this?
The challenge of programming biology has been that we don't know biology, and we make
up theoretical models and the computers are wrong and, you know, biologists and chemists
understandably get grumpy and say, why are you wasting my time?
But with machine learning, the advantage is that we can actually learn from the raw data.
And all of a sudden we have this powerful new tool there.
It can find things that we didn't know before.
And this is why now is the time to get into it, really to enable that next wave of, you know,
breakthroughs in the core science.
The part that still blows me away is just how fast this field is moving.
And it feels like it's a combination of having the open source code on places like GitHub
and arXiv.
And there's like a paper a week that's impactful when it used to be maybe a paper a quarter
or a paper a year.
And the fact that code is coming with a paper, it's just layering on top.
I mean, that seems to me to be sort of the critical thing that's different now.
Yeah.
I think when you can clone a repo off GitHub, you all of a sudden have new insights, just
because you're using a new language.
And now that thousands of people are getting into it, I think all of a sudden you'll find
lots of semi-self-taught biologists who are really starting to find new, interesting things.
And that is why it's exciting.
It's like the hacker ethos, but kind of coming into the bio world, which has typically been
much more buttoned down.
Now, I think anyone who can clone a repo can start really making a difference.
I think that's going to be where the real long-term impact arises from these types of
efforts.
You don't need a journal subscription to get arXiv or to get the code, which
alone is kind of amazing.
It wasn't that long ago where a lot of academic software was sold,
you know, and it may be sold for $500, which is very material.
That's one piece.
You connect that to the concept of now AI or ML can unlock things in biology.
Then biology is becoming democratized, is kind of sort of your point, right?
And so let's talk about that because we're still learning biology collectively.
What is it about deep learning in biology now?
Because biology is old, machine learning is old, like what's new now?
Deep learning has this question all over the place.
Why does it work now?
The first neural nets kind of popped out in the 1950s.
And I think it's really a combination of things.
I think that part of it is the hardware, really.
The hardware, the software, the growth of kind of rapid linear algebra stacks that have made it accessible.
I think also an underappreciated part of it is, you know, the growth of the cloud and the internet, really.
You know, neural nets are about as janky now as they used to be in the 80s.
The difference is that I can now pull up a blog post where someone says,
oh, these things are janky, here are the 17 things I did.
I can copy, paste that into my code, and all of a sudden, I'm a neural net expert.
It's not quite that easy.
It turns into a tradecraft, almost, that you can learn by just working through it.
That's why the deep learning toolkit has become accessible.
Then you get to biology, and the question is, why biology, why now?
And actually, I think the question's a little deeper.
I think that it's really about, I think, representation learning.
So we have now reached this point where I think we can learn representations of molecules that are useful.
This has been something that, you know, in the science of chemistry, we've been doing a long time.
There's been all sorts of, you know, hand-encoded representations of parts of molecular behavior that we think are important.
But I think now using the new technology from, you know, image processing, from word processing, we can begin to learn molecular representations.
And, you know, to be fair, I actually don't think we've really broken through there.
If you look at, you know, what's happening in images or text, they're five years ahead of us.
Well, let me break in here because just for the listeners to give a sense for why representation is important, one of my pet examples, is that if I gave anybody, like, say, two, five-digit numbers to add, it'd be trivial.
If I gave you those same five-digit numbers in Roman numerals, and you want to add them, the representation there would make this like insane.
And what would you do?
Well, you would convert into an appropriate representation where the operations are
trivial or obvious. And then the operation is done, and maybe you re-encode back to the other
representation. So this is the problem: when you have a picture, representations are
obvious because it's pixels, right? And computers love pixels. And maybe even for DNA. Like DNA is like
a one-dimensional image. And so you have bases, but those are kind of like pixels. We used to joke early
days that we would just take a photograph of a small molecule and then use all the other stuff. But
that's kind of insane too. And so with the right representation, things become
transparent and obvious; with the wrong representation, things become hard.
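The Roman-numeral example above can be made concrete with a small sketch (my own illustration, not from the book; the conversion tables are just the standard Roman numeral rules): adding in the Roman representation is miserable, so you convert to integers, do the trivial operation, and re-encode.

```python
# Standard Roman numeral tables for the illustration.
SYMBOLS = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
VALUES = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
          (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
          (5, "V"), (4, "IV"), (1, "I")]

def roman_to_int(s):
    """Decode a Roman numeral: a smaller symbol before a larger one subtracts."""
    total = 0
    for i, ch in enumerate(s):
        v = SYMBOLS[ch]
        total += -v if i + 1 < len(s) and SYMBOLS[s[i + 1]] > v else v
    return total

def int_to_roman(n):
    """Encode an integer greedily, largest value first."""
    out = []
    for v, sym in VALUES:
        while n >= v:
            out.append(sym)
            n -= v
    return "".join(out)

# Adding directly on Roman numerals is painful; convert, add, convert back.
a, b = "MCMXCIV", "XLII"  # 1994 and 42
print(int_to_roman(roman_to_int(a) + roman_to_int(b)))  # MMXXXVI (2036)
```

All the work is in the representation change; the operation itself is a one-character `+`, which is exactly the point being made about featurization.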
This is really at the heart of machine learning.
It's that there's something about the world that I want to compute on.
But computers only accept, you know, very limited forms of input.
Zeros and ones, text strings, like simple structures.
Whereas, like, if you take a molecule,
a molecule is like a frighteningly complex entity.
So one thing that we often don't realize is that until 100 years ago,
like we barely had any idea what a molecule was.
It's this alarmingly strange concept that although we see little diagrams in 10th grade chemistry or whatever, that isn't what a molecule is.
It's a much weirder, weirder quantum object, dynamic, kind of shifting, flowing.
We barely understand it even now.
So then you just really start asking the question of, like, what is water, for example?
Is it the three characters, H2O?
Is it, you know, two hydrogens and oxygen?
Is it some quantum construct?
Is it like this dynamic vibrating thing?
Is it this bulk mass?
There's so many layers to kind of the science of it.
So what you really want to do is like you've got to pick one.
And this is where it gets really hard, right?
Like if I'm thirsty, what I care about in water is a glass of water.
If I'm trying to answer deep questions about, you know, the structure of Neptune,
I might want a slightly different representation of water.
The power of the new deep learning techniques is we don't necessarily have to pick a representation.
We don't have to say water is X or water is Y.
Instead, you say, let's do some math, and let's take that math and let the machine really learn the form of water that it needs to answer the question at hand.
So one form of mathematical construct is thinking of a molecule as a graph.
And if you do this, you can begin to do these graph deep learning algorithms that can really extract meaningful structure from the molecule itself.
We've learned finally that here's a general enough mathematical form we can use to extract meaningful insights about molecules
or these critical biological chemical entities that we can then use to answer real questions in the real world.
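To make the molecule-as-a-graph idea tangible, here is a toy sketch (my own illustration, not code from the book or any library): water as a graph of atoms and bonds, with one round of the neighbor aggregation that graph convolutions are built on. The tiny one-hot atom vocabulary is invented for the example.

```python
# Water (H2O) as a graph: atoms are nodes, bonds are edges.
water = {"atoms": ["O", "H", "H"], "bonds": [(0, 1), (0, 2)]}

# A minimal one-hot atom vocabulary, an assumption for this sketch.
VOCAB = {"H": [1, 0, 0], "C": [0, 1, 0], "O": [0, 0, 1]}

def graph_conv_step(mol):
    """One round of neighbor aggregation, the core move of a graph convolution:
    each atom's new feature vector is its own plus the sum of its neighbors'."""
    feats = [VOCAB[a][:] for a in mol["atoms"]]
    new = [f[:] for f in feats]
    for i, j in mol["bonds"]:
        for k in range(len(feats[i])):
            new[i][k] += feats[j][k]
            new[j][k] += feats[i][k]
    return new

print(graph_conv_step(water))
# [[2, 0, 1], [1, 0, 1], [1, 0, 1]] -- the oxygen row now "sees" its two hydrogens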
What I think is interesting here in particular is that so much has been developed on images
and there's a lot of biology that's images.
And so we could just spend the whole time talking about images and it could be microscopy or radiology
and tons of good stuff there.
But there's a lot of biology that's more than images.
and molecules a good example.
And for a long time, it seemed like deep learning
was being so successful in images
that that's all it really did.
And if you could sort of take your square peg
and put it in whatever holes you've got, it would work.
What you're talking about for graphs
is kind of an interesting evolution of this
because a graph in an image
are different types of representations.
But, you know, at a technical level,
convolutional networks for images
or graph convolutions for graphs
are kind of borrowing a concept
at a higher level.
The biology version of machine learning is starting to sort of grow up and starting to not just be a direct copy of what was done with images and in other areas, but now starting to be its own thing.
A five-year-old can really point out the critical points in an image, but you almost need a PhD to understand the critical points of a protein.
So you have this dual kind of weight or burden of understanding.
So it's taken a while for the biological machine learning approach to really mature.
Because we've had to spend so much time even figuring out the basics.
But now we're finally at this point where it feels like we are diverging a little bit from the core trunk of, you know, what people have done for images or text.
In another five years, I'm going to be blown away by what this thing does.
It's going to understand more deeply.
So we kind of have this sort of connection between democratization of ML, ML into biology, democratization into biology.
But I don't think we're there yet.
I think for ML, I think there really is a sense of democratization.
You could code on your phone and do some useful things
or certainly on a laptop, a cheap laptop.
I mean, but for biology, what is missing?
One is data.
And there's a fair bit of data.
In the book, we talk about the PDB, we talk about other data sets,
and there are publicly available data sets.
But somehow that doesn't get you into the big leagues.
So like if in this vision of democratizing biology,
what's left to be done?
In some ways, the democratization of ML is a teeny bit of an illusion even.
It's because the core constructs were mathematically invented,
that there is this convolutional neural net or its cousins, the LSTM,
or the other forms of core mathematical breakthroughs that have been designed,
that you can take these building blocks and just apply them straight out.
In biology, as you pointed out earlier, I think we don't have those core building blocks just yet.
We don't know what the Lego pieces are that would enable a newcomer to really start to do, you know, breakthrough work.
We're closer than we were.
I think we've had the beginnings of a toolbox, but we're not there yet.
Well, let's think about what happened on the ML side as inspiration for the bio side.
How much is it driven through academia, how much driven through companies?
Because what I'm getting at is that there's a lot of bio in academia.
I don't know if we're seeing that being open sourced by companies.
We're getting to this really weird set of influences where,
in order for companies to gain influence, they need to open source.
So this is why, you know, 10 years ago, I can't imagine that Google would have open sourced
TensorFlow.
It would have been core proprietary technology.
But now they know that if they don't do that, developers will shift to some other platform.
Right.
By some other company.
Exactly.
So it's weird that the competitive market forces are driving democratization.
Well, so PyTorch is basically sort of Facebook-based, right, and TensorFlow is from Google.
So let's say Google kept TensorFlow proprietary, what would be so bad for them if they did that?
What if everybody outside used PyTorch?
This actually, I think there's like a really neat kind of analogy to the financial sector.
A lot of financial banks have masses of functional programs that they keep under the hood, under the covers.
If you look at Jane Street or, I believe, Standard Chartered or a few of these other big institutions,
lots and lots of functional code hiding behind those walls.
But that really hasn't really infiltrated further out.
And this actually, I think, in the long run, weakens them because it's harder to train.
It's harder to find new talent.
It's more specialized.
A lot of the codebase at Google is proprietary; like, the original MapReduce was never put out there.
And this, I think, has actually caused them a little bit of a problem in that new developers coming in have to spend months and months and months getting up to speed of the Google stack.
Whereas, if you look at TensorFlow, it doesn't take any time at all.
someone could walk in and basically be able to write.
They've been using it for months to years.
Exactly.
And I think at the scale that big tech is at, this is just like, it's a powerful market
advantage.
You're almost outsourcing their education process.
Yeah, and I guess if they don't put it out, someone else will, and then they'll learn
on their platform.
Yes, but then maybe what is the missing part in biology?
We've got pharma, you know, a huge force there, but they have very specific goals.
A lot of agricultural companies, but it's much more distant, though.
Yeah.
It's dramatically hard to actually take an existing organization and turn it into an AI
machine learning organization.
And so one thing I've honestly been surprised by is that, you know, when I've seen companies
or organizations I know try to incorporate AI into their drug discovery process, it ends
up taking them years and years and years because they're fighting all these upstream battles,
to get their, you know, old computing system to upgrade to the right version of, you know,
their numerical library so they could even install TensorFlow. And then they had all these things
about who can actually, say, upgrade the core software, whether it's this department,
like how much do you need to talk to the biologists, to the chemists? The fact is that
pharma and, you know, existing big companies are not built this way. That's not their core expertise.
Whereas if you look at, you know, Facebook or Google,
they've been doing machine learning for almost two decades now,
like from the first AdWords model.
So in some sense, like they had to change very little about their culture.
Like, you know, there's a slight difference instead of like this function, use that function, but whatever.
But the core culture was there.
Exactly.
And I think the culture, the people, changing that is going to be dramatically hard.
So which is why I think it will really take, I think, 10 years and a generation of students who have been trained in the new way to come in and shift.
Yeah, well, Google was a startup too, right?
I think, you know, the thesis was that and is that startups will be able to build a new culture.
And I think the key thing that I think we're seeing sort of boots on the ground is that
that culture has to be not, here's your data scientists or machine learning people in one room
and your biologists in another room; they have to be the same team.
What's intriguing to me is just the size of the bio market.
Biology is health care, it's agriculture, it's food, it could be the future of manufacturing.
There are so many different places that biology plays a role to date and will play a role.
But it just means that I think to the point we're talking about,
these companies just are being built right now.
There's, I think, this whole host of challenges here because biology is hard.
And building kind of like that selective understanding of like, you know,
of the 10 best practices that existed, five are actually still best practices.
The other five, we need to toss out the window and stick in a deep learning model.
That kind of very painstaking process of experimentation and understanding, that I think is like where the really hard innovation is happening.
And that's going to take time.
You're never going to be able to replace like a world-class biologist with any machine learning program.
A world-class biologist is typically freaking brilliant.
And they often bring a set of understanding that no programmer or no computer scientists can.
Now, the flip side holds true.
And I think that merger, as you said, that's where, like, there's power for magic, right?
One really interesting factoid I heard from an entrepreneur in the space is that, you know, the best biologists that they could hire had a market rate that was lower than an introductory, intermediate, you know, front-end dev.
Yep.
And, you know, of course, front-end is very hard engineering.
I don't want to put that down.
But there are so many fewer of these biologists, so there's almost this market imbalance of how is it possible that, you know, you can take really a world-class
biologists of whom there's maybe a couple hundred in the world and not have them be valued
properly by the market.
So do you even out those pay scales in one company?
Do you like have two awkward pay ladders that coexist and create tension in your company?
These are the types of, like, really hard operational questions that almost have nothing to do
with the science, but at the heart of it they do.
Maybe it's interesting to talk about like how we can help people get there.
Yeah.
So what's like the training they should be doing?
Maybe we could even go like super nuts and bolts.
So I got my laptop.
What do I do?
So, I mean, like, I guess there's a couple key packages we install.
It's like TensorFlow, maybe DeepChem, something like that.
Python is often already installed, let's say, on a Mac.
Is that it?
And then we start going through papers and books and code?
I think the first place really is to...
You need to form an understanding of, like, what are the problems even that you can think
about?
I think if you're not trained as a biologist, and even if you are, you might not see that
that intersection of these are the problems where biological machine learning can or cannot
work. And that I think is really what the book tries to teach you, as in like, what's the
frame of thinking? What's the lens through which you look at this world and say, oh, that is data
coming out of a microscope, I should spend 30 minutes spinning up a convnet and iterating on that.
This is a really gnarly thing about how I prepare my, like, you know, C. elegans samples.
I don't think deep learning is going to help me here. And I think it's really
that blend of knowledge that the book tries to give you. It's like a guidebook. When you see a new
problem, you ask, is this a machine learning problem? If so, let me use these muscles. If it's not
a machine learning problem, well, I know that I need to talk to someone who does know these things.
And that's what we try to give. Andrew Ng has a great rule of thumb. If a human can do it in a
second, deep learning can probably figure it out. So start with something like, say, microscopy.
You have an image coming in, and an expert can probably eyeball and say, interesting, not interesting.
So there's this binary choice, and there's some arcane black box that was trained within the expert's head and experience.
That's actually the sort of thing machine learning is, like, made to solve.
So really ask yourself, like, when you see something like that, is there some type of perceptual input coming in?
Image, sound, text, and increasingly molecules, a weird new form of perception, almost
magnetic or quantum. But you have perceptual input coming in. And is there a simple right-wrong,
left-right, you know, intensity-type answer that you want from it? If you do, that's really
a machine learning problem at its heart. So that's one type of machine learning. And I think the
benefit there of what a human can do in a second, deep learning can do, especially since
in principle on the cloud you could spin up 10,000 servers. Suddenly you've got
10,000 people working to solve a problem, and then they go back to something
else. That's just something you can't do with people. Or you've got 10,000 people working 24-7
as necessary. Can't do that with people. But there's another type of machine learning, which is
to do things people can't. Or maybe more specific, do things individual people can't, but maybe
crowds could. So like we see this in radiology, right, where the machine learning can have
accuracies greater than an individual akin to what, let's say, the consensus would be, which
would be the gold standard. That's maybe the real exciting part, sort of the so-called superhuman
intelligence. Where are the boundaries of possibilities there?
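The "perceptual input in, simple binary answer out" framing above can be sketched with an entirely synthetic toy: fake "microscopy" images where the interesting ones contain a bright blob, and the simplest possible learned decision rule, a threshold on mean intensity, standing in for what a convnet would learn. Every name and number here is invented for illustration.

```python
import random
random.seed(0)

# Toy "microscopy" images: 8x8 grids of intensities. "Interesting" ones contain
# a bright 2x2 blob; the expert's one-second judgment is just a labeled example.
def make_image(interesting):
    img = [[random.random() * 0.4 for _ in range(8)] for _ in range(8)]
    if interesting:
        r, c = random.randint(0, 6), random.randint(0, 6)
        for dr in range(2):
            for dc in range(2):
                img[r + dr][c + dc] = 0.9
    return img

mean = lambda img: sum(sum(row) for row in img) / 64
train = [(make_image(y), y) for y in [0, 1] * 200]

# The simplest possible learned rule: a threshold on mean intensity halfway
# between the class means (a crude stand-in for what a convnet would learn).
m0 = sum(mean(i) for i, y in train if y == 0) / 200
m1 = sum(mean(i) for i, y in train if y == 1) / 200
threshold = (m0 + m1) / 2

test = [(make_image(y), y) for y in [0, 1] * 100]
acc = sum((mean(i) > threshold) == bool(y) for i, y in test) / len(test)
print(round(acc, 2))
```

The point is the shape of the problem, perceptual input and a binary label learned from the expert's snap judgments, not the particular rule; a real model would learn far richer features than a global mean.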
One of the biggest problems really with deep learning is that you have some like strange and
crazy prediction. Now, I think that there's a fallacy that people fall into of trusting the
machine too easily because 90% of the time that's going to be garbage. And I think really
kind of the challenge of picking out these bits of superhuman insight is to know how to shave
off the trash predictions.
Yeah.
Is 90% an exaggeration, or is it really 90%?
I like nice round numbers, so that might have just been something I picked out.
But there's like this great example, I think, in medicine.
So there's scans coming in, and the deep learning algorithm was doing, like, amazing at predicting
it.
And then, like, they dug into it, and it turned out that the scans came from three centers.
One of them had, like, some type of center label that was, like, the trauma center
or something. The others were the non-trauma centers.
The deep learning algorithm, like a kindergartner told to do this,
learned to identify the trauma flag and upweight those.
So if you did this, like, naive statistics of blending them all together,
you'd look amazing.
But really, it's looking for a sticker.
Yeah.
I mean, there's tons of examples like that,
one with the pathologist with the ruler in there,
and it's becoming a ruler detector and so on.
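The trauma-center and ruler stories are instances of shortcut learning, and the failure mode is easy to reproduce with synthetic data (a toy of my own, not the actual studies): give the training set a "sticker" feature that perfectly tracks the label, and a model that reads only the sticker aces training while collapsing to a coin flip on data from a hospital without the shortcut.

```python
import random
random.seed(0)

# Synthetic "scans": (real_signal, sticker, label). In the training hospital,
# a trauma-center sticker perfectly tracks the label; the real signal is only
# 70% informative.
def make_scan(sticker_tracks_label):
    label = random.randint(0, 1)
    signal = label if random.random() < 0.7 else 1 - label
    sticker = label if sticker_tracks_label else random.randint(0, 1)
    return signal, sticker, label

train = [make_scan(True) for _ in range(10_000)]
test = [make_scan(False) for _ in range(10_000)]  # new hospital: no shortcut

acc = lambda data, model: sum(model(s, st) == y for s, st, y in data) / len(data)
sticker_model = lambda signal, sticker: sticker  # the lazy shortcut learner
signal_model = lambda signal, sticker: signal    # the honest learner

print(acc(train, sticker_model))  # 1.0 -- suspiciously perfect
print(acc(test, sticker_model))   # ~0.5 -- collapses to a coin flip
print(acc(test, signal_model))    # ~0.7 -- modest, but it transfers
```

The suspiciously perfect training score is exactly the "too good to be true" signal discussed below: held-out data from a different site is what exposes the sticker.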
Like, you know, this AUC, like a sense of accuracy of close to 1.0.
We all got to be very suspicious of that
because just running a second experiment wouldn't predict
the first experiment with that type of accuracy.
Anything that's too good to be true probably is.
Yeah, I think then you get into the really subtle challenges,
which is that, you know, the algorithm tells me this molecule
should be non-toxic to a human
and should have effect on this, you know, indication.
Do I trust it?
Is it possible that there's a false pattern learned there?
Humans make these types of mistakes all the time, right?
Like, if you have any type of, like, actual biotech,
you know that there's going to be molecules made
or theses that are disproven.
So you're getting into the hard core of learning, which is that, is this real?
The reality is we don't have answers to these.
We're really kind of trending into the edge of machine learning today, which is that,
is this a causal mechanism?
Does A cause B?
Is it a spurious correlation?
And now we're getting to that place where humans aren't necessarily better.
We talk about some techniques for interpreting,
for looking at kind of what informed the decision of the deep learning algorithm.
And we do provide a few kind of tips and tricks
to start thinking about it.
But the reality is that's kind of the hard part
of machine learning. It's the edge.
The interpreting chapter is one of my favorite ones
because it's often sort of become
so-called common wisdom that machine learning
is a black box. But in fact, it doesn't
have to be and there's lots of things to do and we
are quite prescriptive there. So the
interpretability I think also is frankly
what's going to make human beings
more at peace with this. And this isn't
anything unique to machine learning. Like if you had
some guru who's like just spouting
off stuff and said, you know,
buy this stock X and short stock Y and put all your life savings into it, you probably would
be thinking, okay, well, maybe, but why? So I think this is just human nature and there's no
reason why our interaction with machines would be any different. What I think is interesting
is human beings are notoriously bad at causality. Like we kind of attribute things to be causal
when they're not causal at all. We do that in our lives from like, you know, why did that person
give me that cup of coffee to why did that drug fail? Like all these different reasons.
There's two big misconceptions about machine learning.
One is lack of interpretability.
The second one is correlation doesn't mean causation, which is true, but somehow people
take that to mean it's impossible to compute causality.
And that's the part that I think people have to really be educated on because there are now
numerous theories of causality, and you could use probabilistic graphical models, PGMs.
There's lots of ways to go after causality.
The whole trick, though, is you need time series data.
What's beautiful about biology, or at least in health care, is that we've got time series
data in many cases. So now perhaps finally there's the ability to really understand causality
in a way that human beings couldn't because we're so bad at it and machines are good at
it and we've got the data. Can you think of like a place where, you know, in your experience
the algorithms have succeeded in teasing out a causal structure that people missed? Yeah. So, you know,
I think in healthcare, we always think about what is leading to various changes like, you know,
this drug having adverse effects, this diet having, possibly,
positive or negative effects; all of these things are being understood, in the sort of the category
of real world evidence, which is a big deal in pharma these days.
And if you think about it, a clinical trial is really a poor man's surrogate for not
understanding causality.
Because if we don't understand causality, you've got to do this thing where it's double
blind, we start from scratch, and I'm following it in time, and we see it.
If you understood causality, you might be able to just get a lot of results from just mining
the data itself.
As a great example, you can't do clinical trials for all pairs of drugs.
I mean, just doing for a single drug is ridiculously expensive and important,
but all pairs of drugs would never happen.
But people take pairs of drugs all the time.
And so finding their adverse effects from real-world data is probably the only way to do it.
And we can actually get causality.
There's tons of interesting journal medicine papers saying, aha, we found this from doing data analyses.
I think that's just starting out.
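The trial-versus-mining-the-data point can be illustrated with a simulated confounder (a toy sketch of my own, not a real analysis): disease severity drives both who takes a drug and the outcome, so a naive comparison makes a harmless drug look harmful, while stratifying on the confounder makes the spurious effect vanish.

```python
import random
random.seed(0)

# Simulated observational data: disease severity (the confounder) drives both
# who takes the drug and the outcome; the drug itself has NO effect.
rows = []
for _ in range(50_000):
    severity = random.random()
    drug = 1 if random.random() < severity else 0   # sicker patients get the drug
    outcome = severity + random.gauss(0, 0.1)       # higher number = worse outcome
    rows.append((severity, drug, outcome))

def effect(subset):
    treated = [o for s, d, o in subset if d == 1]
    control = [o for s, d, o in subset if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

# Naive mining: drug takers do worse, so the drug "looks" harmful.
naive = effect(rows)

# Stratify on the confounder: within severity bands, the effect vanishes.
bands = [effect([r for r in rows if b / 10 <= r[0] < (b + 1) / 10])
         for b in range(10)]
adjusted = sum(bands) / len(bands)
print(round(naive, 2), round(adjusted, 2))
```

A randomized trial sidesteps this by breaking the severity-to-drug arrow by design; causal methods on observational data instead have to measure and adjust for the confounder, as the stratification does here.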
Honestly, I think that AI, bio-AI drug discovery, needs to take a page
from the self-driving car companies. In the neighboring self-driving car world,
simulators are all the rage, really because it's that same notion of
causality, almost like there's a structure to the world: pedestrians walk out,
chickens, alligators, whatever crazy things. I saw a picture of this; it happens.
Yeah, it happens. So I think there they've built this amazing infrastructure of being
able to run these repeated experiments, almost randomized clinical trials, but informed
by real data. We don't yet have that infrastructure in bio-world. And I know there's a couple of
exciting startups who are starting to kind of move towards that direction, but I think it's when
we can really probe the causality at scale. And then in addition to just probing it, when the
simulator is wrong, use the new data point that came in and have the simulator, you know,
learn to fix itself. That's when you get to this really amazing feedback loop that could really
revolutionize biology. So, you know, we talked about some basic nuts and bolts
about how to get started and the framing of questions,
which is a key part.
So let's say people, they're set up,
they've got their question, where do they go from there?
I mean, in a sense, we're talking about something
closer to open source biology.
And to the extent that biology is programmable,
and synthetic biology is, I think, very much,
you know, it's been around for a while,
but I think it's really starting to crest.
How do these pieces come together
such that we could finally get to this sort of open source
biology, democratization of biology?
The big part of this is really,
the growth of community.
There are people behind kind of all these, you know, GitHub pages that you see.
There are real, decentralized, powerful organizations: if you look at the Linux Foundation,
if you look at, say, the Bitcoin Core Foundation, there are networks of open source
contributors, really, that form this like brain trust.
It's very diffuse.
It's not centralized in the Stanford, Harvard Med Department or whatever.
And I think what we're going to see is the advent of similar decentralized brain trust
in the bio world,
as in a network of experts who are kind of spread across the world
and who kind of contribute through these code patches.
And that, I think, is not at all new to the software world.
We've seen that for decades.
It's totally new to biology.
It's alien.
You'd be surprised how much skepticism there can be at the idea
that a non-Harvard-trained, say, biologist can come up with a deep insight.
We know that to be a fact, right?
There are multiple PhDs' worth of work in just, like, the Linux kernel,
and that community really doesn't care to get that stamp of approval.
So I think we're going to see the similar parallel kind of knowledge base that grows organically.
But it takes time because you're talking about the building of another kind of almost educational structure,
which is this new and exciting direction.
Here's the challenge I worry about the most, which is that, like, if you're building a Linux kernel,
you can test whether it works or doesn't work relatively easily.
Even as it is, there's this huge reproducibility crisis in biology.
So how does one sort of become immune from that or at least not tainted by that?
How do you know what to trust?
And this is a really, really interesting question.
And this is kind of shading a little bit into the crypto world, right?
You could think about an experiment where you have a molecule and you don't know what's going to happen to it, but maybe you create a prediction market about the future of this molecule.
You could then begin to create these historical records of predictions.
We all know there are expert drug pickers at big pharma who can eyeball a molecule and say, that one is going through, that one is failing.
And five years later you're like, well, shit, okay, yes, I was right.
There are the beginnings of infrastructure for these feedback mechanisms, but it's a really hard problem.
Yeah.
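The feedback mechanism described here, recording probability forecasts and checking them against outcomes once they resolve, can be sketched in a few lines. A standard way to score such forecasts is the Brier score; all forecaster names and numbers below are made up for illustration:

```python
# Toy sketch: track probability forecasts for drug candidates and score
# forecasters with the Brier score once outcomes are known.
# Lower score = better-calibrated predictions. All data here is hypothetical.

def brier_score(forecasts):
    """Mean squared error between predicted probability and outcome (0 or 1)."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

# (predicted probability of success, actual outcome) pairs per forecaster
records = {
    "forecaster_a": [(0.9, 1), (0.2, 0), (0.7, 1)],  # confident and mostly right
    "forecaster_b": [(0.5, 1), (0.5, 0), (0.5, 1)],  # hedges everything at 50/50
}

for name, forecasts in records.items():
    print(name, round(brier_score(forecasts), 3))
```

Over time, such a ledger of scored predictions is exactly the "historical record" that would let a decentralized community decide whose calls to trust.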
I'm trying to think, though, what that would be.
The cute thing is, you could imagine a simple question like: is this drug soluble?
Someone might run a cheap software calculation.
Someone might do the experiment.
And there are different levels of cost for different levels of certainty.
You're essentially describing a decentralized research facility.
Maybe the question is who would use it?
This is, I think, the really hard part, because biopharma tends to be, for good reasons, a little more risk-averse than many other industries.
But I actually think that in the long run, this could be really interesting.
Because if you have multiple assets in a company, you could unbundle the assets.
Then you could start to get a much more granular understanding of which assets actually do well and which don't.
And if you make it okay for people to place a bet on these assets, all of a sudden it's de-risked, because if you're a big pharma and you say, I don't really believe that Alzheimer's molecule does what is claimed, but I'll put 15% odds on it going through, then I'll just invest 15% of what I would have in another world.
Well, the trick, especially since what we're talking about now is also the world of financial instruments, is that you have to know how to price the risk of an asset.
And so it could be that one of the first interesting applications of deep learning and machine learning is to use all the available data to give a maximum likelihood estimate of what we think this asset is going to be worth.
The ML prices the asset, and then people can go from there.
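The pricing idea in this exchange, scale your investment by an estimated probability of success (whether that probability comes from an expert or an ML model), reduces to a one-line expected-value calculation. The 15% figure comes from the conversation; the function name and dollar amount are invented for illustration:

```python
# Toy sketch of probability-weighted bet sizing: scale an investment by an
# estimated probability of success. Numbers are illustrative only.

def risk_adjusted_investment(full_investment, p_success):
    """Invest in proportion to the estimated odds the asset pays off."""
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be a probability in [0, 1]")
    return full_investment * p_success

# "15% odds that molecule goes through" -> commit 15% of the full bet
print(risk_adjusted_investment(100_000_000, 0.15))
```

A real pricing model would of course also account for payoff size, time horizon, and correlation across assets; this only captures the proportional-odds intuition from the conversation.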
It's kind of a fun world where the financial world, the machine learning world, and biology come together to decentralize and democratize it.
I think there are opportunities to allow for more risk, for the long tail to be played out.
You wouldn't have as many interesting hypotheses dead in the water because they weren't de-risked enough for a big bet.
So the big takeaway for me here is that there is that possible world.
I forget if this is the way you learned how to program, but the way many of us did, I learned when I was like 11 on, actually, a TI-99/4A,
and I was just playing around with it,
and I learned so much because I could just get my hands right in it.
And I think my hope for the book is that it's the equivalent in biology,
that people can get their hands in it.
I don't know where they're going to go with it.
It would be super cool if they go where we're describing,
and that's one of many possible futures.
But I think that's what we're hopefully able to give people:
we are opening up the sandbox. Here's what we've learned
in these very exclusive academic institutions.
Let's throw the gates open and say, here's as much as we know, distilled down as best we can,
and do what you will with it.
Open source means no permission needed, so go to town and hopefully do something good for the world. That's kind of the dream.
That sounds fantastic.
Well, thank you so much for joining us.
Thank you for having me.