The a16z Show - a16z Podcast: Revisiting the Gene

Starting point is 00:00:00 The content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any A16Z fund. For more details, please see A16Z.com slash disclosures. Hello and welcome to the A16Z podcast. Sequencing the human genome has dramatically changed how we understand how we as human beings are coded. We're now entering a phase of building an applications layer on top of the sequencing layer. So how do we make sense of and apply all this new information that genomics gives us? And what will this translate into as it meets the realities of the healthcare system?

Starting point is 00:00:40 This conversation, which took place at our annual summit event in November 2017, includes Carlos Araya, co-founder and CEO of Jungla, and Gabe Otte, co-founder and CEO of Freenome, and was moderated by A16Z general partner Jorge Conde. The first Human Genome Project took $3 billion over 13 years to generate a single human genome. Today, we can do that same amount of work for about a thousand bucks in a couple of days. In 1999, a writer by the name of Matt Ridley

Starting point is 00:01:10 writes a book called Genome: The Autobiography of a Species in 23 Chapters. That book was a fascinating example of the optimism that we all had as the first Human Genome Project was coming to an end. But what we've also learned is that reading the DNA isn't the same thing as understanding it. We've gone through this period

Starting point is 00:01:29 of really trying to make sense of all of this information. But what's extraordinary is that now that we can sequence DNA quickly and reliably and cheaply, we've created this incredible sequencing layer on top of which we can build applications. So I think one of the big questions is how do we think about actionability? How do we think about deriving meaning from our ability to interpret? Both of you are developing applications in the clinical space. So let's start with Jungla. Carlos, if memory serves, every time a new human

Starting point is 00:01:59 genome sequence is completed, there are on the order of 3 million new variants identified. That's right. So, you know, people talk about missing the forest for the trees. How do you make sense of all of that information? How do you figure out which trees matter in that forest? You start by really trying to form context. And so although we nowadays have access to our genomes, when we go to interpret our genomes, there's really only a certain number of places in the genome that are relevant for any given condition that we're considering. And putting the information in context of a condition, in context of a family history, in context of other tests, is really important.

Starting point is 00:02:43 And so you start there, and then you look at individual genes that are associated with these. Typically, it's genes. There are other types of elements. And we look for the effects of variation in there. And one of the interesting things is that while, yes, it costs roughly $1,000, and going down to the hundreds now, per genome to acquire the data, the cost of interpreting that data is actually really high. Although there are 3 million variants identified, there will be roughly 100 variants that are novel variants in disease-associated genes. So these are places of the genome that we know really matter. And interpreting each one of those under the current clinical practices costs $50 to $100. So we're talking about a hundred to thousand-fold increase in the cost of interpretation

Starting point is 00:03:27 relative to the cost of data acquisition. So we build models, computational and experimental, to provide variant interpretation teams guidance that can tell them or help them understand how these variants relate to molecular and cellular effects so they can interpret these. And for the most part, are you looking at variants that are associated with an increased risk of disease, or are you looking at variants that are causing a disease? For the most part, it's increasing risk of disease. There are, of course, causal variants.

Starting point is 00:03:59 For many genes, there's a spectrum. There are mutations that have very strong effects, and then a gradient of mutations that have lower effects. And understanding these differences is really, you know, it's an ongoing challenge. If we look across all of the disease-associated genes that we know today, we only have clinical interpretations for roughly 0.6% of the possible mutations in them.

Starting point is 00:04:20 So 0.6%. Point six percent. Wow. Yeah. How do you actually communicate this information to a physician? Because the vast majority of doctors out there are not geneticists. Agreed. And, you know, something interesting that's happened is that while it is, you know,

Starting point is 00:04:33 physicians and doctors who order the test, increasingly this interpretation is happening in the back end, in really the genetic test provider space, where you have dedicated teams of interpreters that are doing the work of classifying these mutations. And so the goal is really, I think, to give physicians very clear guidance on what the effects of mutations are. And when we don't know, we also need to say we really don't know. What we do is really we build these models that can say, okay, for all these other mutations, you know, the 99.4% of mutations that we don't know,

Starting point is 00:05:08 here is a subset for which we can make predictions. And this is how well the predictions do. And here are the diagnostic metrics of value. They're prospectively tracked and they can be audited. Speaking of diagnostics, Gabe, you're working on a diagnostic application as well, using DNA. So walk us through what you're doing. I want to understand how what we're doing now with this kind of technology is different than how we've all sort of historically viewed diagnostics. Sure. It's good to take a step back and talk about what do we mean by a gene or what do we

Starting point is 00:05:38 mean by the DNA right now, right? So a bunch of people got 23andMe done. The truth of the matter is, 23andMe is trying to capture information about you from less than 1% of your DNA. And that's just the static DNA, what we think of as the DNA that we're born with. But DNA is not static. DNA is actually incredibly dynamic and it changes in all sorts of ways. This is the reason why twins with the same DNA have very different outcomes. Think of it within your own bodies as well, right? In your bodies, there are neurons that are a meter long, right? Single cell, meter long. And then there are white blood cells that are literally turning over every day. But they have the same DNA. How does that happen? How do we get these radically different phenotypes? The truth of the matter is, what is in your DNA

Starting point is 00:06:23 is probably far less consequential than how your DNA is being used, and when these genes are being turned on and off. Because the truth of the matter is, less than 1% of your DNA is being used by any particular cell. And so it really matters what that 1% is that ultimately makes your cells what they are, makes you who you are. So when you're looking at it from sort of a non-static perspective, the DNA that's, for example, floating in your blood, that is turning over every 20 minutes, can give you an insight into what's happening in your body. Why are the cells in your body dying at that moment? And what is the composition of the cells in your body? These are all relevant, dynamic information that can be read out from your dynamic DNA that you can't get

Starting point is 00:07:13 from your static DNA. And the majority of the focus in the past 10, 20 years on our DNA has been really focused on that static DNA, not to mention a very small subfraction of that static DNA. What we're looking at at Freenome is that dynamic DNA, as assayed from a blood sample, that allows us to get an instantaneous snapshot of your molecular health, which will then allow us to know whether you have a particular disease like cancer. So this is an important distinction because I think one of the things that was so incredibly exciting and promising regarding this ability to have a sequencing layer on top of which we could build diagnostic applications was this idea that you sequence once and every time you want

Starting point is 00:07:57 to test it's just a software query. So the idea of every time you want to do a new test, you basically have to redo the biology, that goes away if you're looking at inherited risk. But what you're describing is that since DNA is dynamic, you would actually sequence in time periods, much like we get dental x-rays every year. Yeah, if you're really talking about being able to get a sense of how your body is changing over time, you'd have to sequence multiple times because the DNA that you're born with, that static DNA is not deterministic enough where you can predict everything from that single point of sequencing.

Starting point is 00:08:34 And so we're talking about how your body, how DNA is so dynamic in your body. Does that mean that the applications for what you could diagnose using this approach are very broad? Or is it just cancer? Well, I think it really depends on what type of DNA you're looking for, right? If you're looking at DNA fragments that are in your bloodstream that are coming from the cancer cells, and that's all you're focusing on, then presumably you can only detect cancer. What we're detecting is DNA fragments that are actually coming from the immune cells that are turning over in your body. And if you can capture how your immune system is changing at different times, it's sort of the common denominator to all disease conditions.

Starting point is 00:09:13 If there's something wrong with you, chances are your immune system is changing in some way. And yes, that signal is extremely convoluted, and it's very hard to figure out what type of change is specific to a particular disease state. But that's where things like machine learning and artificial intelligence come in to help us figure out the specific signal so that we can turn that into a specific diagnostic for a disease. But the underlying biology should theoretically enable us to detect any disease where there is an immune change. So you guys have laid out the promise of this approach. Let's talk a little bit about the potential peril. Carlos, you were talking about what to do when you find a variant that has an unknown clinical significance. And there was a lawsuit in Oregon where a woman had a hysterectomy

Starting point is 00:09:59 because her physician misinterpreted the genetic test. Yeah, that's right. And it's really unfortunate because this is a very common situation where she had a family history. She got a genetic test. The genetic test actually came back negative, but they included in the test information that said that there was a variant that was found that was of unknown significance. And so the test very clearly indicated there was no clinically significant mutation found by the sort of practicing guidelines. It was a negative test. Yet for, I think, reasons that to me are unknown, the recommendation for her was that she would have a bilateral mastectomy and a hysterectomy.

Starting point is 00:10:38 And I think it was particularly sad because the mutation was actually in a gene that's not associated strongly, at least, with breast cancer. It was in MLH1, I believe. And, you know, there's a strong association with colorectal cancer and endometrial cancer, but there really was no strong support for a breast cancer association. To me, this poses the challenges of how complex, really, this data has gotten to communicate even to physicians, and then to have that message go clearly to patients.

Starting point is 00:11:07 That's an incredible story. So, look, thinking about Freenome, you know, perhaps this is an unfair comparison or example, but if we think about early screening, we've been using mammograms for 30, 40 years, and the data suggests that while we've actually done a lot of early detection of breast cancer using mammography, the number of late-stage breast cancers actually hasn't gone down. So that implies that we've actually over-diagnosed things, or in some cases diagnosed the wrong things. Is there an analogy here, or is that a worry for you?

Starting point is 00:11:38 And if so, how do you control it? So there's a lot of concerns in terms of launching a diagnostic. I think what you're talking about is really on the technology, the science side of things, right? And this is one of the really interesting things for us, because breast cancer and specifically mammography as a screening method has a false positive rate of 50%. 50%. So from a false positive perspective, you're better off flipping a coin than doing a mammography, really. And there's a reason for this, which is that no clinical trial that we've done has ever been large enough.

Starting point is 00:12:10 You don't actually compensate for all the false positive cases that could potentially happen, all the false negative cases that could potentially happen in a clinical trial. And so once you launch these diagnostics into the market, the only direction the performance goes is down. It doesn't really go up, right? Because there are all these edge cases that you didn't account for. For the first time in our history, we're able to compensate for that problem, because we can now make a diagnostic that's fundamentally AI-based. Even after we launch a test into market, we can actually work with our partners to get results of the tests that we sell back so that we can teach the artificial intelligence that it made mistakes after we've sort of launched a test, and it never makes that mistake again. So for the first time, we have an opportunity to make the direction of the accuracy of a test after we launch it go up as opposed to go down.

Starting point is 00:13:08 And that's really the only way we're going to compensate for this, because the largest clinical trial that's ever been announced but hasn't been performed yet is 120,000 people. Last year, 35 million people should have gotten screened for colorectal cancer in the United States alone and didn't. 35 million. 35 million. Right.

Starting point is 00:13:25 So if you can capture even a fraction of that market and learn from that information and make sure that you don't make mistakes from those tests ever again, then all of a sudden you have a clinical trial that's an order of magnitude, if not two, greater than the largest clinical trial that's ever been announced. So what you're both doing is so potentially transformative for our health care system. Who pays for this? How does this get covered? Especially on the diagnostic side, reimbursement is a particularly tough issue. There are so many stakeholders in health care that it is not clear who pays for it. If you're looking at the average statistics, generally when you're launching a diagnostic test, only about 20% of the tests that you sell

Starting point is 00:14:05 actually get fully reimbursed. So 80% of the tests that you're selling are actually not being paid for properly. Long story short, most of the time, payers don't see a clear return on investment that the diagnostic tests represent. So if I pay $500 for this test now, am I actually going to make that money back because I'm detecting this disease earlier, so we don't have to spend as much money curing this person when the disease has progressed further? That's a question that we need to be able to answer clearly before we can get these tests paid for by the payers. I think we're stuck in this model where we're relying on payers to pay for these tests. There are new models coming out that are leveraging life insurance companies, that are leveraging these closed systems

Starting point is 00:14:56 where these hospitals are their own payers, that are easier to sort of convince of the value of the tests. I think we're going to see leveraging of these new models much more, but it's still in the early days. Excellent. Thank you, Carlos. Thank you, Gabe, for being here.

Starting point is 00:15:11 You're working in a fascinating space, and you clearly are going to change how we think about disease forever and for always. So thank you. Thank you, guys.
