a16z Podcast - a16z Podcast: Revisiting the Gene
Episode Date: January 10, 2018
The complete sequencing of the human genome is one of the most powerful examples of technology and science in action: We've gone from needing $3 billion and over 13 years to read a single human genome... to today, to where we can do that same amount of work for about $1,000 in roughly 2 days -- and the price will only continue to drop. But beyond pricing, what does understanding the gene -- and moving from the sequencing layer to the applications layer -- mean to us; what new questions arise now that we can sequence DNA quickly, reliably, and cheaply? This conversation -- with co-founder and CEO of Jungla Carlos Araya and co-founder and CEO of Freenome Gabe Otte, moderated by a16z General Partner Jorge Conde (based on a discussion that took place at a16z’s annual Summit in November 2017) -- takes a step back and considers all these questions. Every time a human genome sequence is completed, there are on the order of 3,000,000 new variants identified. So how do we think about interpreting all that data? Actionability? And how do we derive meaning from all this, for applications in the clinical space?
–––
The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters.
References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments and certain publicly traded cryptocurrencies/ digital assets for which the issuer has not provided permission for a16z to disclose publicly) is available at https://a16z.com/investments/. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.
Transcript
The content here is for informational purposes only, should not be taken as legal, business, tax,
or investment advice or be used to evaluate any investment or security, and is not directed
at any investors or potential investors in any a16z fund. For more details, please see
a16z.com/disclosures. Hello and welcome to the a16z Podcast. Sequencing the human genome has
dramatically changed how we understand how we as human beings are coded. We're now entering a phase
of building an applications layer on top of the sequencing layer. So how do we make sense of
and apply all this new information that genomics gives us? And what will this translate into as it
meets the realities of the health care system? This conversation, which took place at our annual
summit event in November 2017, includes Carlos Araya, co-founder and CEO of Jungla, and Gabe Otte,
co-founder and CEO of Freenome, and was moderated by a16z general partner Jorge Conde.
The first Human Genome Project took $3 billion over 13 years to generate a single human genome.
Today we can do that same amount of work for about $1,000 in a couple of days.
In 1999, a writer by the name of Matt Ridley writes a book called Genome:
The Autobiography of a Species in 23 Chapters.
That book was a fascinating example of the optimism that we all had as the first human genome project was coming to an end.
But what we've also learned is that reading the DNA isn't the same thing as understanding it.
We've gone through this period of really trying to make sense of all of this information.
But what's extraordinary is that now that we can sequence DNA quickly and reliably and cheaply,
we've created this incredible sequencing layer on top of which we can build applications.
So I think one of the big questions is, how do we think about actionability?
How do we think about deriving meaning from our ability to interpret?
Both of you are developing applications
in the clinical space.
So let's start with Jungla.
Carlos, if memory serves, every time a new human genome sequence is completed, there are on the
order of three million new variants identified.
That's right.
So, you know, people talk about missing the forest for the trees.
How do you make sense of all of that information?
How do you figure out which trees matter in that forest?
You start by really trying to form context.
And so although we nowadays have access to
our genomes, when we go to interpret our genomes, there's really, you know, only a certain number of
places in the genome that are relevant for any given condition that we're considering. And putting
the information in context of a condition, in context of a family history, in context of other tests,
is really important. And so you start there, and then you look at individual genes that are
associated with these. Typically, it's genes. There are other types of elements. And we look for
the effects of variation in there. And one of the interesting things is that while, yes, it costs
roughly a thousand dollars and going down to the hundreds now per genome to acquire the data,
the cost of interpreting that data is actually really high. Although there are three million
variants identified, there will be roughly a hundred variants that are novel variants in
disease-associated genes. So these are places of the genome that we know really matter. And interpreting
each one of those under the current clinical practices costs $50 to $100.
So we're talking about a hundred to thousand-fold increase in the cost of interpretation
relative to the cost of data acquisition.
So we build models, computational and experimental, to provide variant interpretation teams guidance
that can tell them or help them understand how these variants relate to molecular and cellular
effects so they can interpret these.
And for the most part, are you looking at variants that are associated with an increased
risk of disease, or are you looking at variants that are causing a disease? For the most part, it's
increasing risk of disease. There are, of course, causal variants. For many genes, there's a
spectrum. There are mutations that have very strong effects, and then a gradient of mutations
that have lower effects. And understanding these differences is really, you know, it's an ongoing
challenge. If we look across all of the disease-associated genes that we know today, we only have
clinical interpretations for roughly 0.6% of the possible mutations in them. So 0.6%.
0.6%. Wow. Yeah. How do you actually communicate this information to a physician? Because the vast
majority of doctors out there are not geneticists. Agreed. And, you know, something interesting
that's happened is that while it is, you know, physicians and doctors who order the test,
increasingly this interpretation is happening in the back end in really the genetic test provider space,
where you have dedicated teams of interpreters that are doing the work of classifying these mutations.
And so the goal is really, I think, to give to physicians really very clear guidance on what the effects of mutations are.
And when we don't know, we need to say, also, we really don't know.
What we do is really we build these models that can say, okay, for all these other mutations, you know, the 99.4% of mutations that we don't know,
here is a subset for which we can make predictions,
and this is how well the predictions do,
and here are the diagnostic metrics of value.
They're prospectively tracked and they can be audited.
Speaking of diagnostics, Gabe, you're working on a diagnostic application as well using DNA.
So walk us through what you're doing.
I want to understand how what we're doing now with this kind of technology is different
than how we've all sort of historically viewed diagnostics.
Sure.
It's good to take a step back and talk about, like, what do we mean by a gene, or what do we mean by the DNA right now, right?
So a bunch of people got 23andMe done.
The truth of the matter is that what 23andMe sequences
is trying to capture information about you from less than 1% of your DNA.
And that's just the static DNA,
what we think of as like DNA that we're born with.
But DNA is not static.
DNA is actually incredibly dynamic,
and it changes in all sorts of ways.
This is the reason why twins with the same DNA
have very different outcomes.
Think of it within your own bodies as well, right?
In your bodies, there are neurons that are a meter long,
a single cell a meter long, and then there are white blood cells that are literally turning over every day.
But they have the same DNA. How does that happen? How do we get these radically different phenotypes?
The truth of the matter is that what is in your DNA is probably far less consequential than how
your DNA is being used and when these genes are being turned on and off, because the truth of the matter
is that less than 1% of your DNA is being used by any particular cell. And so it really matters
what that 1% is that ultimately makes your cells what they are, makes you who you are.
So when you're looking at it from sort of a non-static perspective, the DNA that's, for example,
floating in your blood that is turning over every 20 minutes can give you an insight of what's
happening in your body. Why are the cells in your body dying at that moment? And what is the
composition of the cells in your body? This is all relevant dynamic information that can be
read out from your dynamic DNA that you can't get from your static DNA. And the majority of the
focus in the past 10, 20 years on our DNA has been on that static DNA, not to
mention a very small subfraction of that static DNA. What we're looking at at Freenome is that
dynamic DNA, as assayed from a blood sample, which allows us to get an instantaneous snapshot of
your molecular health, which will then allow us to know whether you have a particular disease
like cancer. So this is an important distinction because I think one of the things that was so
incredibly exciting and promising regarding this ability to have a sequencing layer on top of which
we could build diagnostic applications was this idea that you sequence once and every time you
want to test, it's just a software query. So the idea that every time you want to do a new test,
you basically have to redo the biology goes away if you're looking at inherited
risk. But what you're describing is that since DNA is dynamic, you would actually sequence
in time periods, much like we get, you know, dental x-rays every year. Yeah, if you're really talking
about being able to get a sense of how your body is changing over time, you'd have to sequence
multiple times because the DNA that you're born with, that static DNA, is not deterministic enough
that you can predict everything from that single point of sequencing. And so we're talking about how
DNA is so dynamic in your body. Does that mean that the applications for what
you could diagnose using this approach are very broad? Or is it just cancer? Well, I think it really
depends on what type of DNA you're looking for, right? If you're looking at DNA fragments that are
in your bloodstream that are coming from the cancer cells, and that's all you're focusing on,
then presumably you can only detect cancer. What we're detecting is DNA fragments that are
actually coming from the immune cells that are turning over in your body. And if you can
capture how your immune system is changing at different times, it's sort of the common denominator
to all disease conditions. If there's something wrong with you, chances are your immune system
is changing in some way. And yes, that signal is extremely convoluted, and it's very hard to figure
out what type of change is specific to a particular disease state. But that's where things like
machine learning and artificial intelligence come in to help us figure out the specific signal
so that we can turn that into a specific diagnostic for a disease.
But the underlying biology should theoretically enable us to detect any disease where there is an immune change.
So you guys have laid out the promise of this approach.
Let's talk a little bit about the potential peril.
Carlos, you were talking about what to do when you find a variant that has an unknown clinical significance.
And there was a lawsuit in Oregon where a woman had a hysterectomy because her physician misinterpreted
the genetic test. Yeah, that's right. And it's really unfortunate because this is a very common
situation where she had a family history. She got a genetic test. The genetic test actually
came back negative, but they included in the test information that said that there was a variant
that was found that was of unknown significance. And so the test very clearly indicated there
was no clinically significant mutation found by the sort of practicing guidelines. It was a negative
test. Yet for reasons that, at least to me, are unknown, the recommendation for her
was that she have a bilateral mastectomy and a hysterectomy. And I think it was
particularly sad because the mutation was actually in a gene that's not associated, strongly
at least, with breast cancer. It was in MLH1, I believe. And, you know, there's a strong association
with colorectal cancer and endometrial cancer, but there really was no strong support for a breast
cancer association. To me, this shows how complex this data has gotten
to communicate, even to physicians, and then to have that message go clearly to patients.
That's an incredible story. Yeah. So, look, thinking about Freenome, you know, perhaps this is
an unfair comparison or example, but if we think about early screening, we've been using mammograms
for 30, 40 years, and the data suggests that while we've actually done a lot of early detection
of breast cancer using mammography, the number of late-stage cancers, breast cancers,
actually hasn't gone down. So that implies that we've actually over-diagnosed things,
or in some cases diagnosed the wrong things. Is there an analogy here, or is that a worry for you,
and if so, how do you control it? So there's a lot of concerns in terms of launching a diagnostic.
I think what you're talking about is really on the technology, the science side of things, right?
And this is one of the really interesting things for us, because breast cancer and specifically
mammography as a screening method has a false positive rate of 50%.
50%. So from a false positive perspective, you're better off flipping a coin than, you know,
doing a mammography, really. And there's a reason for this, which is that no clinical trial that
we've done has ever been large enough. You don't actually compensate for all the false positive
cases that could potentially happen, all the false negative cases that could potentially happen
in a clinical trial.
And so once you launch these diagnostics into the market,
the only direction the performance goes is down.
It doesn't really go up, right?
Because there are all these edge cases that you didn't account for.
For the first time in our history, we're able to compensate for that problem,
because we can now make a diagnostic that's fundamentally AI-based.
Even after we launch a test into the market, we can actually work with our partners
to get results of the tests that we sell back, so that we can teach the artificial intelligence that it made mistakes
after we've launched a test, and it never makes that mistake again. So for the first time,
we have an opportunity to make the direction of the accuracy of a test after we launch it go up
as opposed to go down. And that's really the only way we're going to compensate for this because
the largest clinical trial that's ever been announced but hasn't been performed yet is 120,000
people. Last year, 35 million people should have gotten screened for colorectal cancer
in the United States alone and didn't. 35 million. 35 million. Right. So if you can capture
even a fraction of that market and learn from that information and make sure that you don't make
mistakes from those tests ever again, then all of a sudden you have a clinical trial that's
an order of magnitude, if not two, greater than the largest clinical trial that's ever been
announced. So what you're both doing is so potentially transformative for our health
system. Who pays for this? How does this get covered? Especially on the diagnostic side, reimbursement
is a particularly tough issue. There are so many stakeholders in health care that it is not clear
who pays for it. If you're looking at the average statistics, generally when you're launching a
diagnostic test, only about 20% of the tests that you sell actually get fully reimbursed.
So 80% of the tests that you're selling are actually not being paid for properly. Long story short,
most of the time, payers don't see a clear return on investment that the diagnostic tests
represent. So if I pay $500 for this test now, am I actually going to make that money back because I'm
detecting this disease earlier so we don't have to spend as much money curing this person
when the disease has progressed further? That's a question that we need to be able to answer
clearly before we can get these tests paid for by the payers. I think we're stuck in this model
where we're relying on payers to pay for these tests. There are new models coming out
that are leveraging life insurance companies, or leveraging these closed systems where the
hospitals are their own payers, which are easier to convince of the value of the tests.
I think we're going to see leveraging of these new models much more, but it's still early
days. Excellent. Thank you, Carlos. Thank you, Gabe, for being here. You're working in a fascinating
space, and it's clear you're going to change how we think about disease forever and for always.
So thank you. Thank you. Thank you, guys.