The a16z Show - a16z Podcast: Revisiting the Gene
Episode Date: January 10, 2018
The complete sequencing of the human genome is one of the most powerful examples of technology and science in action: We've gone from needing $3 billion and over 13 years to read a single human genome... to today, to where we can do that same amount of work for about $1,000 in roughly 2 days -- and the price will only continue to drop. But beyond pricing, what does understanding the gene -- and moving from the sequencing layer to the applications layer -- mean to us; what new questions arise now that we can sequence DNA quickly, reliably, and cheaply? This conversation -- with co-founder and CEO of Jungla Carlos Araya and co-founder and CEO of Freenome Gabe Otte, moderated by a16z General Partner Jorge Conde (based on a discussion that took place at a16z’s annual Summit in November 2017) -- takes a step back and considers all these questions. Every time a human genome sequence is completed, there are on the order of 3,000,000 new variants identified. So how do we think about interpreting all that data? Actionability? And how do we derive meaning from all this, for applications in the clinical space?
–––
The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters.
References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments and certain publicly traded cryptocurrencies/ digital assets for which the issuer has not provided permission for a16z to disclose publicly) is available at https://a16z.com/investments/. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information. 
Transcript
The content here is for informational purposes only, should not be taken as legal, business, tax,
or investment advice, or be used to evaluate any investment or security, and is not directed at any
investors or potential investors in any a16z fund. For more details, please see a16z.com/disclosures.
Hello and welcome to the a16z podcast. Sequencing the human genome has dramatically
changed how we understand how we as human beings are coded. We're now entering a phase of building
an applications layer on top of the sequencing layer.
So how do we make sense of and apply all this new information that genomics gives us?
And what will this translate into as it meets the realities of the healthcare system?
This conversation, which took place at our annual summit event in November 2017, includes
Carlos Araya, co-founder and CEO of Jungla, and Gabe Otte, co-founder and CEO of Freenome,
and was moderated by a16z general partner Jorge Conde.
The first Human Genome Project took $3 billion and over 13 years
to generate a single human genome.
Today, we can do that same amount of work
for about a thousand bucks in a couple of days.
In 1999, a writer by the name of Matt Ridley
wrote a book called Genome:
The Autobiography of a Species in 23 Chapters.
That book was a fascinating example of the optimism
that we all had as the first human genome project
was coming to an end.
But what we've also learned is that reading the DNA
isn't the same thing as understanding it.
We've gone through this period
of really trying to make sense of all of this information.
But what's extraordinary is that now that we can sequence DNA quickly and reliably and
cheaply, we've created this incredible sequencing layer on top of which we can build applications.
So I think one of the big questions is: how do we think about actionability?
How do we think about deriving meaning from our ability to interpret?
Both of you are developing applications in the clinical space.
So let's start with Jungla.
Carlos, if memory serves, every time a new human
genome sequence is completed, there are on the order of 3 million new variants identified.
That's right. So, you know, people talk about missing the forest for the trees. How do you make
sense of all of that information? How do you figure out which trees matter in that forest?
You start by really trying to form context. And so although we nowadays have access to our genomes,
when we go to interpret our genomes, there's really only a certain number of places in the genome
that are relevant for any given condition that we're considering.
And putting the information in context of a condition, in context of a family history,
in context of other tests, is really important.
And so you start there, and then you look at individual genes that are associated with these.
Typically, it's genes. There are other types of elements.
And we look for the effects of variation in there.
And one of the interesting things is that while, yes, it costs roughly $1,000 and going down to the hundreds now per genome to acquire the data, the cost of interpreting that data is actually really high.
Although there are 3 million variants identified, there will be roughly 100 variants that are novel variants in disease-associated genes.
So these are places of the genome that we know really matter.
And interpreting each one of those under the current clinical practices costs $50 to $100.
So we're talking about a hundred to thousand-fold increase in the cost of interpretation
relative to the cost of data acquisition.
So we build models, computational and experimental, to give variant interpretation teams guidance
that helps them understand how these variants relate to molecular and cellular
effects so they can interpret them.
And for the most part, are you looking at variants that are associated with an increased
risk of disease, or are you looking at variants that are causing a disease?
For the most part, it's increased risk of disease.
There are, of course, causal variants.
For many genes, there's a spectrum.
There are mutations that have very strong effects,
and then a gradient of mutations that have lower effects.
And understanding these differences is really, you know,
it's an ongoing challenge.
If we look across all of the disease-associated genes that we know today,
we only have clinical interpretations
for roughly 0.6% of the possible mutations in them.
So 0.6%.
Point six percent.
Wow.
Yeah.
How do you actually communicate this information to a physician?
Because the vast majority of doctors out there are not geneticists.
Agreed.
And, you know, something interesting that's happened is that while it is, you know,
physicians and doctors who order the test, increasingly this interpretation is happening
in the back end in really the genetic test provider space, where you have dedicated teams
of interpreters that are doing the work of classifying these mutations.
And so the goal is really,
I think, to give physicians very clear guidance on what the effects of mutations are.
And when we don't know, we also need to say we really don't know.
What we do is build these models that can say, okay, for all these other mutations,
you know, the 99.4% of mutations that we don't know,
here is a subset for which we can make predictions.
And this is how well the predictions do.
And here are the diagnostic metrics of value.
They're prospectively tracked and they can be audited.
Speaking of diagnostics, Gabe, you're working on a diagnostic application as well,
using DNA. So walk us through what you're doing. I want to understand how what we're doing now
with this kind of technology is different than how we've all sort of historically viewed diagnostics.
Sure. It's good to take a step back and talk about what we mean by a gene or what we
mean by DNA right now, right? So a bunch of people got 23andMe done. The truth of the matter is,
what 23andMe sequences is trying to capture information about you from less than 1%. And that's just
the static DNA, what we think of as like DNA that we're born with. But DNA is not static. DNA is
actually incredibly dynamic and it changes in all sorts of ways. This is the reason why twins with
the same DNA have very different outcomes. Think of it within your own bodies as well, right?
In your bodies, there are neurons that are a meter long, right? Single cell, meter long. And then there
are white blood cells that are literally turning over every day. But they have the same DNA. How does that happen? How do we
get these radically different phenotypes? The truth of the matter is, what is in your DNA
is probably far less consequential than how your DNA is being used, and when these genes are being turned
on and off. Because the truth of the matter is, less than 1% of your DNA is being used by any
particular cell. And so it really matters what that 1% is that ultimately makes your cells what
they are, makes you who you are. So when you're looking at it from sort of a non-static
perspective, the DNA that's, for example, floating in your blood that is turning over every
20 minutes can give you an insight of what's happening in your body. Why are the cells in your body
dying at that moment? And what is the composition of the cells in your body? These are all
relevant, dynamic information that can be read out from your dynamic DNA that you can't get
from your static DNA. And the majority of the focus in the past 10, 20 years on our DNA has been
really focused on that static DNA, not to mention a very small subfraction of that static DNA.
What we're looking at at Freenome is that dynamic DNA as assayed from our blood sample
that allows us to get an instantaneous snapshot of your molecular health, which will then
allow us to know whether you have a particular disease like cancer.
So this is an important distinction because I think one of the things that was so incredibly
exciting and promising regarding this ability to have a sequencing layer on top of which we could
build diagnostic applications was this idea that you sequence once and every time you want
to test it's just a software query. So the idea of every time you want to do a new test,
you basically have to redo the biology, that goes away if you're looking at inherited risk.
But what you're describing is that since DNA is dynamic, you would actually sequence
in time periods, much like we get dental x-rays every year.
Yeah, if you're really talking about being able to get a sense of how your body is changing
over time, you'd have to sequence multiple times because the DNA that you're born with,
that static DNA is not deterministic enough where you can predict everything from that single
point of sequencing.
And so we're talking about how your body, how DNA is so dynamic in your body.
Does that mean that the applications for what you could diagnose using this approach are very
broad? Or is it just cancer? Well, I think it really depends on what type of DNA you're looking for,
right? If you're looking at DNA fragments that are in your bloodstream that are coming from the cancer
cells, and that's all you're focusing on, then presumably you can only detect cancer. What we're
detecting is DNA fragments that are actually coming from the immune cells that are turning over
in your body. And if you can capture how your immune system is changing at different times,
it's sort of the common denominator to all disease conditions.
If there's something wrong with you, chances are your immune system is changing in some way.
And yes, that signal is extremely convoluted, and it's very hard to figure out what type of change is specific to a particular disease state.
But that's where things like machine learning and artificial intelligence come in to help us figure out the specific signal
so that we can turn that into a specific diagnostic for a disease.
But the underlying biology should theoretically enable us to detect any disease where there is an
immune change. So you guys have laid out the promise of this approach. Let's talk a little bit about
the potential peril. Carlos, you were talking about what to do when you find a variant that has an
unknown clinical significance. And there was a lawsuit in Oregon where a woman had a hysterectomy
because her physician misinterpreted the genetic test. Yeah, that's right. And it's really
unfortunate because this is a very common situation where she had a family history. She got a
genetic test. The genetic test actually came back negative, but they included in the test
information that said that there was a variant that was found that was of unknown significance.
And so the test very clearly indicated there was no clinically significant mutation found
by the sort of practicing guidelines. It was a negative test. Yet for, I think, reasons that
at least to me are unknown, the recommendation for her was
that she would have a bilateral mastectomy and a hysterectomy.
And I think it was particularly sad because the mutation was actually in a gene that's not
associated strongly, at least, with breast cancer.
It was in MLH1, I believe.
And, you know, there's a strong association with colorectal cancer and endometrial cancer,
but there really was no strong support for a breast cancer association.
To me, this shows the challenge of how complex, really,
this data has become to communicate, even to physicians,
and then to have that message go clearly to patients.
That's an incredible story.
So, look, thinking about Freenome, you know, perhaps this is an unfair comparison or example,
but if we think about early screening, we've been using mammograms for 30, 40 years,
and the data suggests that while we've actually done a lot of early detection of breast cancer using mammography,
the number of late-stage cancers, breast cancers, actually hasn't gone down.
So that implies that we've actually over-diagnosed things
or in some cases diagnosed the wrong things.
Is there an analogy here, or is that a worry for you?
And if so, how do you control it?
So there's a lot of concerns in terms of launching a diagnostic.
I think what you're talking about is really on the technology, the science side of things, right?
And this is one of the really interesting things for us,
because breast cancer and specifically mammography as a screening method has a false positive rate of 50%.
50%.
So from a false positive perspective, you're better off flipping a coin than doing a mammography, really.
And there's a reason for this, which is that no clinical trial that we've done has ever been large enough.
You don't actually compensate for all the false positive cases that could potentially happen,
all the false negative cases that could potentially happen in a clinical trial.
And so once you launch these diagnostics into the market, the only direction the performance goes is down.
It doesn't really go up, right?
Because there are all these edge cases that you didn't account for.
For the first time in our history, we're able to compensate for that problem, because we can now make a diagnostic that's fundamentally AI-based. Even after we launch a test into market, we can work with our partners to get results of the tests that we sell back, so that we can teach the artificial intelligence that it made mistakes after we've launched a test, and it never makes that mistake again.
So for the first time, we have an opportunity to make the direction of the accuracy of a test
after we launch it go up as opposed to go down.
And that's really the only way we're going to compensate for this,
because the largest clinical trial that's ever been announced but hasn't been performed yet
is 120,000 people.
Last year, 35 million people should have gotten screened for colorectal cancer in the United States alone
and didn't.
35 million.
35 million.
Right.
So if you can capture even a fraction of that market and learn from that,
information and make sure that you don't make mistakes from those tests ever again, then all of a
sudden you have a clinical trial that's an order of magnitude, if not two, greater than the largest
clinical trial that's ever been announced. So what you're both doing is so potentially transformative
for our health care system. Who pays for this? How does this get covered? Especially on the
diagnostic side, reimbursement is a particularly tough issue. There's so many stakeholders in
health care that it is not clear who pays for it. If you're looking at the average statistics,
generally when you're launching a diagnostic test, only about 20% of the tests that you sell
actually get fully reimbursed. So 80% of the tests that you're selling are actually not being
paid for properly. Long story short, most of the time, payers don't see a clear return on
investment that the diagnostic tests represent. So if I pay $500 for this test now, am I actually
going to make that money back because I'm detecting this disease earlier, so we don't have to spend
as much money curing this person when the disease has progressed further? That's a question that
we need to be able to answer clearly before we can get these tests paid for by the payers. I think
we're stuck in this model where we're relying on payers to pay for these tests. There are new models
coming out that leverage life insurance companies, or that leverage these closed systems
where these hospitals are their own payers,
that are easier to sort of convince
of the value of the tests.
I think we're going to see leveraging
of these new models much more,
but it's still in the early days.
Excellent. Thank you, Carlos.
Thank you, Gabe, for being here.
You're working in a fascinating space,
and it's clearly going to change
how we think about disease forever and for always.
So thank you.
Thank you, guys.
