a16z Podcast - 2024 Big Ideas: Miracle Drugs, Programmable Medicine, and AI Interpretability

Episode Date: December 8, 2023

Smart energy grids. Voice-first companion apps. Programmable medicines. AI tools for kids.
We asked over 40 partners across a16z to preview one big idea they believe will drive innovation in 2024. Here in our 3-part series, you’ll hear directly from partners across all our verticals, as we dive even more deeply into these ideas. What’s the why now? Who is already building in these spaces? What opportunities and challenges are on the horizon? And how can you get involved?

View all 40+ big ideas: https://a16z.com/bigideas2024

Stay Updated:
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

Transcript
Starting point is 00:00:00 GLP-1s are to obesity what eyeglasses are to nearsightedness. Where are the reusable rockets for biotech? Traditional drug development is painstakingly time-consuming, risky, and expensive. One molecule has no bearing on the next molecule that gets developed. Like traditional rockets, they're one-time use only. That's changing. Now, as these models begin to be deployed in real-world situations, the big question is why?
Starting point is 00:00:26 Why do these models say the things they do? Why do some prompts produce better results than others? And perhaps most importantly, how do we control what they do? Lots of companies are potentially going to go bankrupt if they have to pay for these drugs, if they have even one employee who ends up being eligible for one of these therapies. And so I do think there is a siren call already from the industry that will sort of ignite a cycle of innovation. In a lot of ways, these new programmable medicines are just a fundamentally new superpower.
Starting point is 00:00:56 Smart energy grids, programmable medicines, voice-first companion apps, and crime-detecting computer vision. We asked our investment partners across A16Z to preview one big idea that they believe will spur innovation in 2024. Now, our team compiled a list of 40-plus builder-worthy pursuits for the coming year that you can now find at A16Z.com slash big ideas 2024. Or you can click the link in our description. But here in our three-part series, you will hear directly from our partners across all our verticals, from bio and health to games to American Dynamism and more, as we dive even more deeply into these ideas.
Starting point is 00:01:39 We'll cover the why now, who's already building in these spaces, what opportunities and challenges are on the horizon, and of course, how you can get involved. On deck today, we'll cover what it'll take to democratize, quote, miracle drugs like GLP-1s, but also how programmable medicine is taking a page out of the reusable rocket playbook and whether we can take AI from black box to clear box. Let's dive in. As a reminder, the content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice,
Starting point is 00:02:14 or be used to evaluate any investment or security and is not directed at any investors or potential investors in any A16Z fund. Please note that A16Z and its affiliates may also maintain investments in the companies discussed in this podcast. For more details, including a link to our investments, please see A16Z.com slash disclosures. First up, can miracle drugs like GLP-1s make it to mass market? Or will they, alongside other new curative therapies, break or bankrupt our system? Let's find out. Hi, everyone. I'm Julie Yoo, general partner on the bio and health team here at Andreessen Horowitz, and this is my big idea
Starting point is 00:02:55 for 2024. It is about democratizing miracle drugs. So in 2023, a wave of therapies hailed as miracle drugs, including GLP-1s and cell and gene therapies, had a profound impact on patients' lives. But our current health care insurance system is just not set up to bear the cost of these therapies or to accurately gauge their value, given that some are curative. Nor are our healthcare providers prepared to manage the complex logistics, data collection, and clinical operations needed to realize the full benefits of these therapies. We look forward to seeing builders innovating at the intersection of policy, biopharmaceutical manufacturing, financing, and clinical operations so that we have a viable means to bring these miracle drugs to market
Starting point is 00:03:42 without bankrupting or breaking the system. All right, Julie, so I feel like this will be a rarity, but for listeners who may be unfamiliar, let's start with GLP-1s. Why are people touting these as miracle drugs? Yeah, so GLP-1s are all the rage, as you're implying. So the premise of GLP-1s is that a GLP-1 is actually a hormone that is naturally occurring in our bodies, and it's generally secreted in response to food intake. So when we eat, this hormone is secreted by our intestinal tract. And the main job of that hormone is to help manage blood sugar levels. And so the initial core use case for the drug form of this hormone was to treat type two diabetes. And obviously type two diabetes has huge prevalence in the American population. It affects over
Starting point is 00:04:25 11% of us. And so one in 10 of the people that you know are likely to be diagnosed with type two diabetes at some point in their life. So that in and of itself is obviously hugely impactful. But I think probably what's driving more of the sort of popular awareness of GLP-1s is that it has this quote unquote side effect that it also has been shown to lead to weight loss. And part of the reason for that is that it actually suppresses appetite as part of the mechanism of action. And so, therefore, on the basis of that observation, a subset of GLP-1s have actually been approved for treating obesity. And obesity opens up the aperture even broader in terms of the applicability across our population. Sadly, it affects 42% of the American population, which is remarkable.
Starting point is 00:05:09 And there's been a wave of celebrities on TikTok touting their ability to lose upwards of 20% of their weight from taking these drugs. And there's a popular analogy that people say GLP-1s are to obesity what eyeglasses are to nearsightedness. And so it has that level of sort of societal impact in terms of fixing a disease that has such negative implications for the health of our population. So there's been a lot of hype about these drugs. I think the miracle part of it sort of stems from two things. One is its weight loss capabilities. And obviously we as a society are just obsessed with weight loss in general. And so that's obviously been one reason that these drugs are part of the zeitgeist. But then, two, from a clinical perspective, I think what we are all so excited about is that
Starting point is 00:05:49 there appears to be a really compelling set of side benefits related to all of the comorbidities of obesity. So if you are obese, you're very likely to have other illnesses, namely things like cardiovascular disease that lead to heart attacks and strokes and early death, basically. And there's been a number of recent studies that show potentially very significant benefits of these drugs as it relates to cardiovascular outcomes, which, as we'll get to, has very significant implications in terms of how health insurance companies would look at this and in general how we think about this as a benefit to society from a health care perspective. So you mentioned one in 10 for type 2 diabetes and 40 plus percent
Starting point is 00:06:29 of Americans are obese. How does that compare to the number of people who are actually on these drugs, and maybe where does cost or insurance come into play there as it relates to how many people have access? So just to baseline everyone's sort of view on this, there's a drug called Humira, which was the best-selling drug ever in the history of the American drug industry. And that drug was used by roughly 300,000 Americans each year. And so it had a huge level of impact, but a fairly finite number of humans who were impacted by it. In the case of GLP-1s, there was a study that showed in 2022, based on prescription claims data analysis, that there were roughly 3.6 million prescription claims for GLP-1s in that year. And the interesting thing is that I think we're just
Starting point is 00:07:15 getting started, right? So that's obviously a small subset of those who are clinically eligible, technically speaking, as far as type two diabetes and obesity go. But the insurance policies for covering these drugs are still very, very immature. And that's what a lot of the sort of recent media attention that's been paid to this drug class is about: you hear about all these insurance denials and, you know, people struggling to get coverage for these drugs through their employer-sponsored benefit plans and even on Medicare and Medicaid. And so there's still a ton of questions about under what circumstances it actually makes sense to reimburse for these drugs, because they are very, very expensive, as we'll talk about. And these drugs also, if you stop taking them,
Starting point is 00:07:55 are likely to very much lose their benefits. So you could revert in terms of your weight loss and your type two diabetes management. And so there needs to be probably some degree of demonstration of compliance with the drugs in order for health insurance companies to get comfortable reimbursing them into perpetuity over the course of your life. And so that same study that I referred to showed that it is probably only about a fourth, 25%, of employer-sponsored insurance plans that today cover these drugs in terms of the benefit. And so I don't think the floodgates have even come close to opening on these drugs being made accessible to all those who might actually medically qualify for them. Absolutely. You also mentioned curative cell
Starting point is 00:08:33 and gene therapies. Can you give a couple more examples of those? Because it sounds like maybe it's not just GLP-1s that are poised to potentially make a greater impact in a way that maybe our providers or insurers aren't quite ready for. Yeah, I would even argue that cell and gene therapies are probably the more extreme version of what we just described with GLP-1s, both in terms of the miraculous nature of what they do, but also in terms of the cost burden and the clinical burden to our current healthcare system. So there are many chronic or fatal diseases that stem from genetic code mishaps. So you might be born with a gene mutation in your actual native genetic code that results in some kind of debilitating disease. One common example is sickle cell anemia
Starting point is 00:09:14 in which there's a single gene mutation that arises when you're born and that impacts the shape of your red blood cells and has all sorts of very negative implications on your health status throughout your life. And you actually have to get sort of ongoing blood transfusions basically is kind of the current state of the art of how we treat that. So it's both expensive but also extremely taxing for patients and providers. There's also other types of diseases like cancer in which mutations might arise during your life, so after you're born. And so historically, we've had a thesis that if you were to be able to program drugs, cells, and genes essentially to be able to address those kinds of diseases that you could actually entirely cure those diseases so that you
Starting point is 00:09:54 no longer have to deal with that for the rest of your life. And so we now are in an era where we finally have the first versions of those drugs that are now available on the market. These are programmable medicines, as we call them. So you're either programming a cell to go target a certain cancer type within your body, and it'll be programmed to kill those cancer cells to completely eradicate it from your body. Or you can do a gene therapy which actually changes the genetic makeup of your body so that the disease that you were born with is completely gone. And so those therapies are, generally speaking, orders of magnitude more expensive than really anything that we've seen in society to date.
Starting point is 00:10:35 And they're also very, very complex to administer. But effectively, the way I'd put it is that our current system just is not designed at all to be able to handle the adoption and absorption and access to these kinds of therapies today. The ones that are available on the market today are administered in a very bespoke and kind of one-off fashion, in terms of how insurance companies cover them, in terms of how doctors are sort of managing the therapeutic administration piece, and how patients are being handled as far as their patient journeys. And so it's a huge problem. And there are dozens of drugs of this ilk that are projected to be approved and brought out to the market in the next several years. And so we think there's a very sort of imminent why-now dynamic around
Starting point is 00:11:19 why we need to solve for this sooner rather than later. Yeah, and I mean, these are the kind of drugs that we've always wanted, right? We've always wished for them. You used the term miracle drugs. So why is it that our health care system isn't really poised to cushion the introduction of these drugs and the GLP-1s that we talked about earlier? What really needs to change in order for the insurers and the providers and the patients to get access, but also, as you said, not completely steamroll or break down the system? So the original sort of blurb here referred to two things. One is that we could bankrupt the system and one is that we could break the system. So on the bankrupting of the system side, some of these cell and gene therapies that I mentioned can be a one-time cost of two to three million dollars and sometimes more. And so if I'm a health insurance company and you're telling me, okay, I've got this individual. I obviously would love to administer a drug that can literally save their life and eradicate this disease that they would have to deal with for the rest of their life. But it's going to cost me $3 million up front.
Starting point is 00:12:19 And as an employer who covers the health insurance benefit for my employees, the average tenure of any employee in my company is maybe three to four years on average and even lower for tech companies. And so what incentive do I have to pay this upfront fee for something that will benefit this person over the entirety of their life? But that person is going to leave my insurance plan in maybe two or three years. And so I'm not actually going to reap the benefits of that upfront cost for many, many years after that person's gone. And so that's kind of the premise of how the current insurance system is designed and why it's not set up to incentivize people to be willing to pay for these drugs up front. And so that's one huge area where there just
Starting point is 00:12:59 needs to be sort of fundamental innovation on just financing mechanisms to underwrite that risk profile in such a way that any individual payer only really has to pay their fair share, let's call it, while that person is on their plan, and that you can sort of spread the risk as the person moves across insurance plans or create a portable product that sort of follows that person throughout their entire life. And that just requires a lot of, again, innovation on the underwriting side in terms of the services that need to wrap around that and lots of other things that relate to that. So that's kind of the financing piece. And then on the sort of breaking the system piece, these drugs, as I mentioned earlier, are very, very complicated to administer.
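As a rough sketch of that fair-share idea, using an entirely hypothetical benefit horizon and coverage history rather than any actual financing product, the arithmetic might look something like this:

# Hypothetical sketch of the "fair share" financing idea: amortize a one-time
# curative therapy over the years the patient benefits, and bill each insurer
# only for the years the patient is actually on its plan.

THERAPY_COST = 3_000_000    # one-time cell/gene therapy cost cited in the episode
BENEFIT_YEARS = 30          # assumed horizon over which the cure delivers value

annual_share = THERAPY_COST / BENEFIT_YEARS    # $100,000 per benefit-year

# Hypothetical coverage history: (payer, years the patient is on that plan)
coverage = [("Employer A", 3), ("Employer B", 4), ("Medicare", 23)]

for payer, years in coverage:
    print(f"{payer}: {years * annual_share:,.0f} over {years} years")
# Employer A: 300,000 over 3 years
# Employer B: 400,000 over 4 years
# Medicare: 2,300,000 over 23 years

The point is simply that no single payer fronts the full cost; each one pays only for the years the patient is on its plan.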
Starting point is 00:13:40 So in the case of cell and gene therapies, which are probably the most extreme case, you are literally, you know, removing cells from a person's body who has been diagnosed with cancer. You need to transport them to a manufacturing facility where you reprogram the cells. You genetically engineer them. You regrow those cells and then you package them to be sent back to a hospital that is trained and qualified and certified to actually deliver this highly complex therapy. These cells are infused into the patient, and you need to really monitor the patient while this is happening because you might have an immune response that could be really severe and things of that sort. And then that patient generally needs to be monitored in the hospital for many, many weeks subsequently so that you can see
Starting point is 00:14:22 that the drug is working. And all of that requires highly specialized expertise on the clinical side, on the operational logistics side, on the manufacturing side. There are entire companies that are being built just to create competencies around manufacturing these kinds of therapies because it's so fundamentally different from any of our previous sort of pill-based therapies or even the injectable therapies; it's just a very different paradigm. And so that whole landscape is also something that, again, the current status quo biopharma value chain is just not suited to handle, and therefore we need new capabilities to handle it. Yeah, absolutely. Could you speak to maybe the GLP-1s a little bit there, just in terms of the therapies that you just mentioned are curative, and that introduces all
Starting point is 00:15:03 types of questions around like, what is a life worth? And like you said, who gets access and how does that get kind of rationed over a lifetime. But in the case of something like a GLP-1, can you just speak to maybe like the logistics or insurance changes that are required when we're talking about something that is not necessarily curative, but certainly helpful and certainly also has downstream effects, right? If someone's no longer obese, like you said, their cardiovascular risk is lower, but also the way that they operate in the world, what they're buying, what they're involved in, also changes. So what are your thoughts around maybe what needs to happen there? Yeah, so GLP-1s are also relatively expensive, cheaper than cell and gene therapies, but we're talking a thousand dollars a month, roughly speaking, for this class of drugs. And so tens of thousands of dollars over the course of someone's tenure at a given employer that you'd have to cover with the hope that, again, in that time horizon, you are also going to see the cost savings associated with that person avoiding certain downstream health implications. And so there is probably a certain price point at which it becomes a no-brainer. So if you think about something like a high blood pressure drug,
Starting point is 00:16:08 which is one of these pills that, you know, a vast majority of the population in the U.S. is popping on a daily basis. They cost maybe a few dollars per month for the health plan. And so at that price point, it's really a no-brainer, because over a whole population, the threshold for that financial benefit is relatively low. And so there is actually hope, in that there's a whole pipeline of cheaper versions of the current GLP-1 therapies expected to be approved in the next several years. But between now and then, it's really that price point that makes it really, really challenging to justify, again, paying this annual cost when you're not sure that you're going to actually see the cost savings associated with it within the time horizon that the person is on that given plan.
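To make that break-even intuition concrete, here is a back-of-the-envelope sketch that uses the rough thousand-dollars-a-month price mentioned above and a purely hypothetical savings figure:

# Back-of-the-envelope coverage math (the savings number below is a hypothetical
# assumption for illustration, not a figure from the episode).

monthly_price = 1_000            # rough GLP-1 price cited above
tenure_years = 4                 # typical years a member stays on an employer plan
drug_cost = monthly_price * 12 * tenure_years              # 48,000 paid during the tenure

assumed_annual_savings = 3_000   # hypothetical avoided downstream costs per year
realized_savings = assumed_annual_savings * tenure_years   # 12,000 realized during the tenure

print(drug_cost, realized_savings)   # 48000 12000 -> far from break-even at this price
# At a few dollars a month, as with a generic blood-pressure pill, the same
# calculation flips and coverage becomes a no-brainer.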
Starting point is 00:16:47 That makes sense. Well, maybe as a final question, we are looking to 2024 for this big idea. So I guess I'm going to bunch a few questions in here. What solutions are you already seeing built? What roadblocks do you expect on the road to implementing some of these changes? And also, where do you think policy might play a role? What do you see really coming in 2024? Yeah. So the exciting thing is that we've already seen quite a number of great entrepreneurs who are out there building various aspects of solutions that address all the different components of that problem space that I just described. So there are companies that are certainly innovating on the manufacturing side, as I mentioned,
Starting point is 00:17:27 to systematize and really industrialize what today is a highly bespoke process to actually produce these cell and gene therapies in a scalable fashion. There are companies who are helping hospitals and physicians deliver the operational and clinical logistics services related to everything from transporting the drugs to kind of the care management services that the patients need, both pre and post the administration of these drugs, even things like hardware for remote monitoring of these patients such that they don't have to stay in the hospital for many weeks on end. There are companies really addressing the data aspect of this. So one of the things that I always think about as an ex-product person is like imagine as a product manager, if you didn't have any
Starting point is 00:18:06 closed loop of data on how your product was being used in the real world or any feedback from your end users. That is unfortunately the state of affairs for many drugs that we have on the market today: once the drug is prescribed and it's out there, there aren't really great ways for the manufacturing companies to get feedback. But in the case of these really expensive therapies, it's obviously critical to have some degree of feedback loop, both to continually justify the price of those therapies and make sure that they're working, but also, frankly, to iterate and just make sure that you understand any of the side effects and potential implications of those drugs. And so there are companies that are building, really, data infrastructure to enable the collection in a continuous fashion of how these drugs are performing post-market.
Starting point is 00:18:46 And then the last piece is on the financing side. I think we've seen both traditional incumbent insurance companies trying to spin up new solutions for this area, but they still tend to be pretty one-off and not systematic and scalable across the full class of drugs that could benefit from those approaches. And so we are seeing a number of startups start to say, okay, how can we design sort of new fintech approaches, basically, to be able to spread this risk in a very different way. So all of those things are out there today in sort of what I would call kind of point-solution form. And I think the big opportunity, and why this is kind of a hard problem space to go after, is that there is a bit of a cold start problem, right? To actually build kind of a holistic solution that solves the entirety of this problem space, you actually need to convene payers, you need to convene providers, you need to convene manufacturers, and obviously also the patients themselves, who require a ton of services to get this right. And so I think that remains sort of the big unmet need and huge opportunity for
Starting point is 00:19:41 entrepreneurs to go after: can you thread the needle on bringing together these multiple players with a holistic solution that can actually unlock this cold start problem to be able to address this at scale across the entire industry? Do you think innovation will be able to basically thread that needle and ensure that people who need these drugs get them? Or, coming back to that question around policy, do you think that there needs to be a role played there, which basically says that we need to get these drugs out to the people that need them? Yeah, I think policy will definitely play a critical role in the long run. And we already have seen certain sort of guidance briefs come out from various government bodies around this particular issue, a lot of requests for
Starting point is 00:20:22 feedback from the sort of private sector on how they should be approaching this. But I think probably namely two bodies. So the FDA: I think there's going to need to be a lot of change in terms of how they handle approvals of these drugs, because they are so different, and also ongoing monitoring of these drugs, to be able to design the right type of approach for these therapies. And then CMS, which administers Medicare and Medicaid, the government-sponsored insurance programs in our country; they generally tend to be at the tip of the spear of any payment innovation for new classes of drugs and new classes of services. And so the expectation is that they will likely roll out some sort of program that teaches the industry how they should be approaching
Starting point is 00:21:02 these kinds of therapies and financing them. But we're in an election year, and so I think expectations are that that cycle will be slow. However, we are, I think, getting to a tipping point, where the number of these kinds of drugs that have already been approved and are already on the market is already crippling individual companies and employers and insurance companies and even the manufacturers. Lots of companies are potentially going to go bankrupt if they have to pay for these drugs, if they have even one employee who ends up being eligible for one of these therapies. And so I do think there is a siren call already from the industry that will sort of ignite a cycle of innovation, as we're already
Starting point is 00:21:40 seeing, as I mentioned, with a lot of startups that are spinning up in this space. So I think, as with most things in healthcare, it'll be a combination of both sort of top-down regulation and policy combined with some of the bottoms-up activity that we already see starting to happen from those who are just feeling the pain today and just need to kick into action. Next up, we dive even more deeply into some of these therapies, addressing whether changes in technology and policy can usher in a new era of programmable medicine. Hi, my name is Jorge Conde. I'm one of the general partners here at Andreessen Horowitz. I'm on the bio and health team where I focus on investments in the life sciences. My big idea for the year is
Starting point is 00:22:19 programming medicine's final frontier. Where are the reusable rockets for biotech? Traditional drug development is painstakingly time-consuming, risky, and expensive. It's highly bespoke, too. One molecule has no bearing on the next molecule that gets developed. Like traditional rockets, they're one-time use only. That's changing. SpaceX's rocket reusability has transformed space travel, lowering costs and expanding horizons. Similarly, potentially curative programmable medicines like gene therapy can reuse components
Starting point is 00:22:53 like the delivery vehicles used to target specific cells while swapping out the genetic cargo. The next mission uses the same rocket to deliver a different payload to a new destination. The FDA is looking to the skies and taking a page out of the FAA's approach to aviation safety, rigorous yet adaptive, recently launching its own new office for therapeutic products and piloting Operation Warp Speed for Rare Disease to create more transparent and flexible processes for evaluating and approving programmable medicines. Imagine a future where we redeploy, not reinvent innovation. It will revolutionize how we make medicines and where these medicines can take us.
Starting point is 00:23:34 All right. So Jorge, I feel like this big idea is so compelling. This idea of programming medicine certainly sounds like something that the whole world could benefit from. But before we get into that, maybe you could just break down a little bit further why traditional drug development is, as you say, so painstakingly time-consuming, risky, and expensive. And maybe also just put a number to that, like how long
Starting point is 00:24:03 So first of all, the reason why it's so painstaking, it's so time consuming, it's risky and expensive is because we're putting something into human beings. And so, of course, the bar of what we're going to do should be and is exceedingly high. In terms of how long it takes, these are averages of averages, of course, but on average, it could take anywhere between 10 to 15 years to develop a drug to get it to patients. And that's obviously a very, very long time, especially in diseases where people are desperately in need for better treatments. So why does it take 10 to 15 years, typically?
Starting point is 00:24:39 Well, typically there are three stages in developing a medicine, right? The first one is what we would call the actual drug discovery stage, which is the work that goes into finding a target in a disease that you would like to hit with a medicine. In some cases, that target is already known. In some cases, you're looking to discover new targets to go after to have better treatment options for a given disease. That can take many, many years, even in that first phase. The second phase is what we call preclinical development.
Starting point is 00:25:11 Once you have a target you think is worth hitting and you have a molecule that you think hits that target, you have to do all of the work to develop that molecule into a medicine, ensuring that it has all the qualities of what you want a medicine to have in terms of how it's absorbed, how it's metabolized, where it goes in the body, and whether or not it's toxic, and if so, how much of it is toxic. All of that work we do in what we call preclinical development, outside of humans. We do this in dishes and cells, and we can do this in animal models like mice or even monkeys. And that, as you can imagine, also takes many, many years. Yes. And then of course there's the third phase, which is the most important phase, which is what we call clinical development. These are the terms that many would be familiar with in terms of human clinical trials, where there's a phase one trial, a phase two trial, and a phase three trial. And that process can take anywhere between five to seven years. And so when you add up all of those phases, the drug discovery phase, the preclinical development phase, the clinical development phase, that's how you get to these 10 to 15 years. And I should add that once you're done with clinical trials,
Starting point is 00:26:18 you now have to file for regulatory approval. So here in the U.S. you file with the FDA. The process by which these applications get reviewed by the FDA, despite the FDA's best efforts, can take one, sometimes even two years to go through. So again, when you sum all of that up, you can see why it can often take well over a decade to make a medicine. Absolutely. And to your point, the bar being high is a very reasonable concept. But where does programmable medicine come into play here? And how does that maybe change the paradigm across each of those stages? Or where does it have the potential to really reshape that arc?
Starting point is 00:26:57 Yeah. So I think this is where the concept of a programmable medicine potentially could be very transformative to how we think about developing drugs. And that is because in a programmable medicine, there are multiple components that could be, as I described, redeployed for different applications. So let me give you an example. A gene therapy is this idea that you can deliver a genetic payload. In other words, you can deliver a gene to a cell that has a defective version of that gene.
Starting point is 00:27:26 If we're able to make a medicine that does that, that can deliver one gene to a given cell type, it becomes increasingly likely that we'll be able to deliver a different gene to a different cell type for a different disease. And that's very different than the way we traditionally make medicines. If you think about a traditional medicine, like a chemical, a molecule, that is a chemical that is tested and designed for a specific disease, a specific target. The second you switch the target, we don't reuse the atoms in the molecule and try to fit them into a new target. We just design a different molecule.
Starting point is 00:28:04 And that's why I say in traditional drug development, one molecule has very little bearing on the next molecule you design. But in the case of these programmable medicines where all the components can be reused, you just have to essentially redirect or redeploy components like delivery vehicles for gene therapies, or in the case of a gene editing medicine, just redirect what edit you want the enzyme or the protein to make. And that is where the programmability comes in. And that sounds huge, right? Just so I understand you correctly, when you're talking about the reusability of these medicines, it's basically the equivalent of the reusable rocket: it's almost like you basically would have already gotten a certain drug approved to be
Starting point is 00:28:52 able to deliver something. Each time you're iterating on that, would you then need to have the specific genetic, almost like, payload in there reapproved? But most of that legwork has already been done. Am I thinking about that correctly? That's right. In this analogy, what we would love to see become possible is that the rocket, in this case, is the vehicle by which you ensure that your payload is delivered, right? And some of the most common ones that we know, in terms of what would be rockets here, are the LNP molecules in which all of the COVID vaccines that many of us received were delivered. The COVID vaccine was mRNA. The mRNA had the instruction for what it wanted your body to make, and that was encapsulated in this LNP
Starting point is 00:29:35 particle, a lipid nanoparticle. It's like a little ball of fat. Similarly, for a lot of gene therapies, instead of using an LNP particle, we use something called an AAV, which is an adapted virus. It stands for adeno-associated virus. There have been examples in the clinic where payloads have been delivered with LNP or payloads have been delivered with AAV. For different applications, all you're swapping out is the cargo: the instruction of mRNA that's in that LNP, or the instructions, the genetic cargo, that's in that AAV. And this is very timely because the FDA is starting to signal that they are looking for ways to be increasingly adaptive to ensure that they can adequately
Starting point is 00:30:21 review these therapies that are reusing components, but do so in a way that will be both rigorous but also adaptive and therefore hopefully more speedy. And more people don't have to start from scratch. And you don't have to start from scratch. Because I already have that rocket that is re-applicable. So tell us a little more about that, where the FDA seems to be taking maybe some inspiration, or at least that's what you allude to with the FAA. Right. So what can they learn from the rocket reusability in this analogy? And what also may be fundamentally different? And what are you taking from some of the new announcements like the new office of therapeutic products? So I think there's a lot of things to take from this. The first one is that I think there's
Starting point is 00:31:01 real recognition here that we are seeing meaningful innovation in terms of the kinds of medicines we can make. And the FDA is appropriately trying to find ways to retain their rigor, because again, they have an extraordinarily high obligation to ensure safety and efficacy in patients, but to move hopefully more quickly to attempt to keep up with all the innovation we're seeing. And so they've pushed forward on several fronts that I think are already pointing us in that direction. The first one is they've announced an office of therapeutic products whose mandate and mission is to find ways to do exactly what we've been describing. And so I think the industry, the drug development industry, now has a partner
Starting point is 00:31:48 in the FDA in trying to find the best paths forward along the lines of what we talked about. The second is they are also trying to innovate themselves. The FDA is looking to be more nimble and is running experiments to find ways to do exactly that. So, for example, many of us would be familiar with Operation Warp Speed, which was the effort by the government to try to ensure that the COVID vaccines could reach all of us in a very, very timely manner, given the nature of the pandemic. Well, the FDA has said there are so many intractable diseases, many rare genetic diseases, that have no good therapeutic or treatment options.
Starting point is 00:32:30 So they are launching a pilot program mirroring the concept of Operation Warp Speed, but to apply that to rare diseases where they are going to run some experiments to see different ways that the agency can interact with industry to get these medicines more quickly to the patients that so desperately need them. And so I think those are very, very promising signs that both the industry and the government are looking for ways to ensure these innovations reach patients in a timely and responsible manner. And there's also other proof points that we can point to that are happening every day right now. So for example, several companies that are developing cutting edge gene editing medicines are starting to get approvals by the FDA to move forward with some of
Starting point is 00:33:14 their key clinical trials. And there had been a moment in time where the FDA was being more hesitant, but I think as they've started to evaluate these technologies more carefully, they've started to develop a path forward for these medicines to continue to get developed. And the breaking news is that the first CRISPR therapy has been approved. This is a therapy for sickle cell anemia and beta thalassemia that was developed by a pharmaceutical company called Vertex Pharmaceuticals, working with another company called CRISPR therapeutics. And this is a therapy for essentially curing a genetic disease, in this case sickle cell anemia, by taking the relevant cells out of the body, editing them with CRISPR gene editing technology
Starting point is 00:33:58 and putting them back in the body in a way that results in a functional cure. And this is a big deal for so many reasons. The first one is it's the first time that a CRISPR therapy has been approved as a medicine. The second reason why this is a big deal is I think it's an important milestone to this point about the future
Starting point is 00:34:14 of what programmable medicines could look like. Now that you have a first approval, you've demonstrated that a CRISPR editor could be safely used as a medicine. And so now, if it edits something different, the process for getting that approved should hopefully be shorter and faster. And the third reason I think that this is a big deal in terms of a milestone is, relatively speaking, how quickly this has happened.
Starting point is 00:34:37 So you asked me at the beginning, how long does it take to make a drug? And I said about 10 to 15 years, on average, it's barely been 10 years since the concept of CRISPR was discovered and described in the scientific literature. Right. So we went from the initial discovery of CRISPR as a potential use as a medicine, all the way to an approval in just over 10 years. That is lightning fast in this world. So it's just an exciting moment in time, I think, for the industry and hopefully for all the various patients that are looking for better solutions and treatments for their disease. Absolutely.
Starting point is 00:35:13 And I mean, it really does feel like a new paradigm, or at least we're moving towards that, where you see this combination of changes in innovation and regulation coming together, where you see things like the news that you just shared. Maybe you could just speak a little more to that. Like, if we are able to see this confidence that you're discussing and this idea of programmable medicine becoming a reality, what does that really mean in terms of like the speed of therapies coming, the number of them, the new business models that might be unlocked, how this might ultimately end up impacting patients?
Starting point is 00:35:45 Any just kind of high-level thoughts about if we are moving to this new paradigm, what that really means? Yeah, I think it means a couple of things. The first one is, I think it means that for the benefit of patients, every time we run a clinical trial with these types of medicines, we're not starting from square one because we already know something about the various components in the therapy. So that's the first thing. The second thing is that, generally speaking, these programmable medicines are going after diseases
Starting point is 00:36:16 where the cause of the disease is very well known. In other words, it's a known mutation, and it's the ability to intervene in an effective way that has been elusive for us. So in a lot of ways, these new programmable medicines are just a fundamentally new superpower. We can go after diseases that we weren't able to go after before. And as a result, the third thing that I think this means for all of us is that we may be on the cusp where the elusive C word is a reality, that we might actually have cures for lots of very intractable problems. And that is a very new day indeed, right? Like that is something
Starting point is 00:36:52 that it just has not been very common in our industry. So I think there's lots of reasons to be excited. Yeah. I mean, as you're talking, I'm like actually smiling because it's impossible not to be excited. But I guess just to close things off, you've painted this beautiful picture of what may be to come. Could you share a little bit more about the blockers, if any, whether regulatory, whether it has to do with the fundamental science that's coming on board? What would you say if this reality that you're painting were to not come to pass, what would those reasons be? And also, maybe for those listening, how can builders get involved? How can they actually help make this reality come true? First of all, in terms of the blockers, I think there are several,
Starting point is 00:37:33 and I think they are important. And some of the blockers should be there. So the first one is, everything I'm describing in terms of these programmable medicines has another side of the coin to it, which is these are, largely speaking, permanent medicines as well. So if you take a pill and you have a bad reaction or adverse event or toxic reaction to that pill, you just stop taking the pill. And eventually your body will clear it and hopefully the toxicity has been addressed. In the case of making a permanent edit in DNA, if there is an error, if there is a toxicity that comes from making that edit, something inadvertent,
Starting point is 00:38:13 it's permanent, or at least it has the potential to be permanent. And so for that reason, the FDA appropriately has an extraordinarily high bar for how they think about evaluating the safety of these medicines and how they think about which diseases are probably the most appropriate to go after with something that is potentially a cure, but also potentially permanent. Yes. So that's one blocker. It's going to need to be addressed systematically. And it should be. A second blocker: the only medicines that work are the ones that get to patients. And these programmable medicines have a couple of challenges in terms of accessibility. The first one is these are not pills you get over the counter. These are very complex medicines. And so therefore, the process by which
Starting point is 00:39:02 you get treated can be a very long and difficult process. So I just described the approval of the CRISPR therapy for sickle cell anemia; the process by which you get the treatment could take months, because you have to get your cells removed and the cells get edited. In this case, you actually have to get a form of therapy to kill your internal cells so you can replace them with the corrected ones afterwards. And that could be a long stay at the hospital. So that whole process could take several months. So the accessibility of these therapies will limit how many people can get them and when. And the second element to accessibility is cost.
Starting point is 00:39:48 Yes. And today, these therapies can cost on the order of millions of dollars. Now, on the one hand, that millions of dollars of cost is justified because a lot of R&D went into it; they're very expensive. It's also very expensive to make and manufacture all these components I'm describing. And that's number two. And then number three, they provide a fair amount of benefit. In the case of treating a baby that had something like spinal muscular atrophy and would have otherwise died, a one-and-done gene therapy from Novartis, a therapy that costs about $2.1 million, effectively saves that baby's life.
Starting point is 00:40:28 So there's an incredible amount of value that comes from that. Absolutely. But the cost of discovering, developing, and manufacturing these therapies, and the benefit that comes from them, also comes with a very hefty price tag, and that's going to limit accessibility. So that's one of the other key blockers, I would say, that we're seeing in this space. Now, how can builders help? I think builders can help in a very important way, which is if there is one technology on this planet that can scale better than anything else we know of, it's biology.
Starting point is 00:40:59 And so all of the things I'm describing in terms of blockers to access at some point can be addressed by improving our ability to engineer biology to address some of these limitations. So improved biology can make manufacturing much more scalable and therefore much less expensive. Improving the kinds of interventions and the precision with which these programmable medicines work can address some of these questions of permanence or toxicity. So where the builders can really help is to just become better programmers of biology. And from that, we will get better applications at higher scale and at lower cost and hopefully get them in the hands of patients more quickly. A lot of our listeners are familiar with the idea of exponentially decreasing cost in things like software.
Starting point is 00:41:48 Do you see that same kind of curve being applied here when you're talking about decreasing costs? Is that really the future that you're painting where these things, instead of costing millions of dollars, we're talking thousands? Is that really in the future? That's the hope. And the possibility is there. Because again, biology can scale exponentially. We all did come from one cell, and here we are. We're sitting here.
Starting point is 00:42:08 That's such a great point. Yeah. But there's work to be done there. We haven't seen it yet. But that's the promise and that's the hope. So as our health system looks to the skies for inspiration, what inspiration can we take from our own biology to understand how large language models work? And will we ever move from Black Box to Clearbox?
Starting point is 00:42:29 My name is Anjney Midha. I'm a general partner here at A16Z, and I'm talking to you today about AI interpretability, which is just a complex way of saying reverse engineering AI models. Over the last few years, AI has been dominated by scaling, which is a quest to see what was possible if you threw a ton of compute and data at training these large models. But now, as these models begin to be deployed in real-world situations,
Starting point is 00:42:56 the big question on everyone's mind is why? Why do these models say the things they do? Why do some prompts produce better results than others? And perhaps most importantly, how do we control them? Anjane, I feel like most people don't need convincing that this is a worthwhile endeavor for us to understand these models a little better. But maybe you could share where we're at in that journey. What do we or don't we understand about these LLM black boxes and their interpretability?
Starting point is 00:43:24 It might help to reason by analogy here, because this is a set of abstract ideas, but to make it a little bit more concrete, pretend one of these AI models is like a big kitchen with hundreds of cooks. And when you ask the kitchen to make something, each cook knows how to make certain foods. And when you give the kitchen ingredients and you say, go cook a meal, all the different cooks debate about what to make. And eventually they come to an agreement on a meal to prepare based on these ingredients. Now, the problem, where we are in the industry right now, is that from the outside, we can't really see what's happening in these kitchens. So you have no idea how they made that decision
Starting point is 00:44:02 on the meal. You just get the cake. You just get the cake or taco or whatever it might be. Right. So if you ask the kitchen, hey, why did you choose to make lasagna? It's really hard to get a straight answer because the individual cooks don't actually represent a clear concept like a dish or a cuisine. And so the big idea here is what if you could train a team of head chefs to oversee these groups of cooks? And each head chef would specialize in one cuisine. So you'd have the Italian head chef who controls all the pasta and pizza cooks. And then you have the baking head chef in charge of cakes and pies. And now when you ask why lasagna, the Italian head chef raises his hand and says, I instructed the cooks to make a hearty Italian meal. And these head chefs represent clear, interpretable concepts inside the neural
Starting point is 00:44:45 network. And so this breakthrough is like finally understanding all the cooks in that messy kitchen by training these head chefs to organize them and tie them to sort of cuisine categories. And we can't control every individual cook, but now we can get insights into the bigger, more meaningful decisions that determine what meal the AI chooses to make. Does that make sense? It does, but are you saying that we do actually have a sense now of those like head chefs or the people responsible for parts of what might be happening within the AI? Obviously, it's not people in this case, but have we actually unlocked some of that information with some of the new releases or new papers that have come out? We have. We have. And you can break the world of interpretability
Starting point is 00:45:27 down into a pre-2023 and a post-2023 world, in my opinion, because there's been such a massive breakthrough in that specific domain of understanding which cooks are doing what. More specifically, what's happening is that these models are made up of neurons, right? A neuron refers to an individual node in the neural network, and it's just a single computational unit. And historically, the industry sort of tried to analyze and interpret and explain these models by trying to understand what each neuron was doing, what each cook was doing in that situation. A feature, on the other hand, which is this new atomic unit that the industry is proposing now as an alternative to the neuron, refers to a specific pattern of activations across multiple
Starting point is 00:46:09 neurons. Okay. And so while a single neuron might activate in all kinds of unrelated contexts, like whether you're asking for lasagna or you're asking for a pastry, a feature, which is this new atomic unit, represents a specific concept that consistently activates a particular set of neurons. And so to explain the difference using the cooking analogy, a neuron is like an individual cook in the kitchen. Each one knows how to make certain dishes, but doesn't represent a clear concept. A feature would be like a cuisine specialty controlled by a head chef. So, for example,
Starting point is 00:46:39 the Italian cuisine feature is active whenever the Italian head chef and all the cooks they oversee are working on an Italian dish. And that feature has a consistent interpretation, which in this case is Italian food, while individual cooks do not. And so, in summary, these neurons are individual computational units that don't map neatly to concepts. These features are patterns of activations across multiple neurons that do represent clear interpretable concepts. And so the breakthrough here was that now we've learned how to decompose a neural network into these interpretable features, when previous approaches focused on interpreting single neurons.
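As a minimal illustrative sketch of that dictionary-learning idea, and not the actual setup from the Anthropic work, a sparse autoencoder trained on a model's activations might look roughly like this in PyTorch, with layer sizes and training data as placeholders:

import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Dictionary learning over a model's activations: raw neurons in, interpretable features out."""
    def __init__(self, n_neurons: int, n_features: int):
        super().__init__()
        # Encoder maps raw neuron activations to an overcomplete set of feature activations.
        self.encoder = nn.Linear(n_neurons, n_features)
        # Decoder rows act as the learned "dictionary" used to reconstruct the activations.
        self.decoder = nn.Linear(n_features, n_neurons, bias=False)

    def forward(self, activations: torch.Tensor):
        features = torch.relu(self.encoder(activations))  # sparse, non-negative feature activations
        reconstruction = self.decoder(features)
        return features, reconstruction

def sae_loss(activations, reconstruction, features, l1_coeff=1e-3):
    # Reconstruction keeps features faithful to what the model actually computed; the L1 penalty
    # pushes most features to zero on any input, so each one fires only for one consistent pattern.
    recon = (activations - reconstruction).pow(2).mean()
    sparsity = features.abs().mean()
    return recon + l1_coeff * sparsity

# Placeholder data: in practice these would be activations collected from one layer
# of a small language model while it reads a large corpus of text.
acts = torch.randn(4096, 512)                        # (tokens, neurons)
sae = SparseAutoencoder(n_neurons=512, n_features=4096)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)

for step in range(100):                              # a real run trains far longer
    feats, recon = sae(acts)
    loss = sae_loss(acts, recon, feats)
    opt.zero_grad()
    loss.backward()
    opt.step()

# After training, sorting the corpus by a single feature's activation is how you would go
# looking for an interpretable concept (religious text, DNA sequences, and so on).

The design choice that matters is the overcomplete, sparse feature basis: many more features than neurons, with only a few active on any given input, which is what lets individual features line up with human-interpretable concepts.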
Starting point is 00:47:16 And so the short answer is, yes, we have a massive breakthrough where we actually now know how to trace what was happening in the kitchen when a dish was being made. And maybe could you give an example that's specific to these LLMs when we're talking about a feature? I know there's still so much research to be done, but what's an example of a feature that you actually see represented in the outputs from an LLM? Yeah, this is a great question. So I think if you actually look at the paper that moved the industry forward a bunch earlier this year, there's a paper called Decomposing Language Models with Dictionary Learning. This came out of Anthropic. Interpretability is a large field, but this paper, I think, took a specific approach called
Starting point is 00:47:49 Interpretability is a large field, but this paper, I think, took a specific approach called Mechlor. mechanistic interpretability. And the paper has a number of examples of features that they discovered in a very small, almost toy-like model because smaller models prove to be very useful petri dishes for these experiments. And I think an example of one of these features was a god feature where when you talk to the model about religious concepts, then a specific God feature that mapped to this concept of a god fired over and over again. And they found when they talked to the model about a different type of concept like biology or DNA
Starting point is 00:48:26 or, I think, biological concepts was one of the sorts of questions they were asking the model, a different feature that was unrelated to the God feature fired, whereas the same neurons were firing for both those concepts. And so the feature-level analysis
Starting point is 00:48:42 allowed them to decompose and break apart the idea or the concept of religion from biology, which is something that wasn't possible to tease apart in the neuron work. Yeah, and maybe you could speak a little bit more to why this is helpful. I mean, maybe it's obvious for folks listening, but now that we have these concepts that we see and maybe can also link pretty intuitively, like, oh, okay, I understand biology. Oh, I understand religion as a concept that's coming out of these LLMs. Now that we understand these linkages a little more, what does that mean? Like, why does this now open things up? Are we in a new environment? You kind of said pre some of these unlocks and now we're post. What does post look like?
Starting point is 00:49:24 Yeah, this is a great question. So I think there's three big things that are sort of the so-whats from this breakthrough. The first is that interpretability is now an engineering problem as opposed to an open-ended research problem. And that's a huge sort of sea change for the industry because up until now, there were a number of hypotheses on how to interpret how these models were behaving and explain why. But it wasn't quite concrete. It wasn't quite understood which one of those approaches would work best to actually
Starting point is 00:50:00 shows that actually the relationships are so easily observable at a small scale that the bulk of the challenge now is to scale up this approach, which is an engineering challenge. And I think that's massive, because the engineering is largely a function of the resources and the investment that goes into scaling these models, whereas research can be fairly open-ended. And so I think one big takeaway from 2023 is that interpretability has gone from being a research area to being an engineering area. I think the second is that if we actually can get this approach to work at scale, then we can control these models. In the same way, if you understood how a kitchen made a dish and you wanted a different outcome, say less lasagna and more pasta the next time you had
Starting point is 00:50:44 the kitchen come together, now you can go to the Italian chef and say, could you please make that change next time around? And so that allows controllability. And that's really important, because as these models get deployed in really important, sort of mission-critical situations like healthcare and finance and defense applications, you need to be able to control these models very precisely, which unfortunately today just isn't the case. We have very blunt tools to control these models, but nothing precise enough for those mission-critical situations. So I think controllability is a big piece that this unlocks. And the third is sort of the byproduct of having controllability, which is: once you can control these models, you can rely on them more.
Starting point is 00:51:21 Increased reliability means not only good things for the customers and the users and developers using these models, but also from a policy and regulatory perspective, we can now have a very concrete, grounded debate about what models are safe and not, how to govern them, how to make sure that the space develops in a concrete, empirically grounded way, as opposed to reasoning about these models in the abstract without a lot of evidence. I think one of the problems we've had as an industry is that because there hasn't been a concrete way
Starting point is 00:51:52 to show or demonstrate that we understand these black boxes and how they work, that a lot of the policy work and policy thinking is sort of worst-case analysis. And worst-case analysis can be fairly open to fear-mongering and a ton of FUD. And I think instead, now we have an empirical basis to say, here are the real risks of these models
Starting point is 00:52:09 and here's how policy should address them. And I think that's a big improvement or big advance as well for us all. Totally. I mean, it's huge. And it's kind of interesting, because we don't know every little piece of physics, but we're able to apply what we do know in extremely effective ways and build all of the things around us through that understanding that has grown over time. And so it's really exciting that these early building blocks are getting in place.
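One way to picture the controllability point is feature-level steering: once activations have been decomposed into features, you can dial a single feature up or down before recombining them, roughly the "ask the chef for less lasagna" move. The sketch below is hypothetical and uses random stand-in weights purely for illustration; it names a "lasagna" feature index that is invented for the example.

```python
# Hypothetical feature-steering sketch: given a (toy, untrained) sparse
# autoencoder over model activations, dial one feature up or down and
# rebuild the activation, leaving every other feature untouched.
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_features = 64, 256

# Stand-ins for a trained sparse autoencoder (see the earlier sketch);
# random weights here just to keep the example self-contained.
W_enc = rng.normal(scale=0.1, size=(n_neurons, n_features))
b_enc = np.zeros(n_features)
W_dec = rng.normal(scale=0.1, size=(n_features, n_neurons))

def steer(activation, feature_idx, scale):
    """Scale one feature's contribution inside an activation vector."""
    f = np.maximum(activation @ W_enc + b_enc, 0.0)  # decompose into features
    f[feature_idx] *= scale                          # e.g. 0.0 to suppress, 3.0 to amplify
    return f @ W_dec                                 # recombine into an activation

activation = rng.normal(size=n_neurons)   # pretend this came from the model
lasagna_feature = 17                      # hypothetical index of a learned concept
steered = steer(activation, lasagna_feature, scale=0.0)   # "less lasagna"
print("change in activation norm:", float(np.linalg.norm(steered - activation)))
```

In practice you would swap the random weights for a trained autoencoder's and write the steered activation back into the model's forward pass at the layer it came from; the point here is only the shape of the intervention.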
Starting point is 00:52:35 Maybe you can just speak to that engineering challenge, or the flippening that you said happened, where we previously had a research challenge, which was somewhat TBD. When is this going to be unlocked? How is it going to be unlocked? And now we have, again, those early building blocks where we're now talking about scale. And I'll just read out a quick tweet from Chris, who I believe is on the Anthropic team. And he said, if you asked me a year ago, superposition would have been by far the reason I was most worried that mechanistic interpretability would hit a dead end. I'm now very optimistic. I go as far as saying it's now primarily an engineering problem, hard, but less fundamental risks. So I think it captures what you were just mentioning. But maybe you can speak a little bit more to the papers and the scope within which they've done this feature analysis, and what the steps would be to do this when we're talking about those much, much larger foundational models.
Starting point is 00:53:13 So I think, stepping back, the way science in this space is often done is you start with a small, almost toy-like model of your problem, see if some solution is promising, and then you decide to scale it up to a bigger and bigger level, because if you can't get it to work at a really small scale, rarely do these systems work at large scale. And so I think while, of course, the Holy Grail challenge with interpretability is explaining how frontier models work, the GPT-4s and Claude 2s and Bards of the world, which are several hundred billion parameters in scale, I think that one of the challenges with trying to attack interpretability
Starting point is 00:54:09 of those models directly is that they're so large and such complex systems, it is very intractable to try to tease apart all the different neurons in these models at that scale. And so I think the classic scientific method of identifying a petri dish at a really small scale and getting a successful set of experiments to work at that scale, before then figuring out how to increase the scope of those experiments, has largely worked well for AI. That's how scaling laws worked in the first place. We had GPT-2 before we had GPT-3.5, because that was a much more tractable problem and scale at which to demonstrate that these models are capable of very high-quality next-token prediction. And I think that's what's happening with interpretability
Starting point is 00:54:52 as well. And so the current breakthroughs we have as an industry have been demonstrated at this sort of toy prototype level, with models that are in the tens of millions of parameters. And I think the next step would be to figure out how to get these to work in the hundreds of millions of parameters, and then get to the billions of parameters. And I think, from the outside looking in, the state of interpretability can often seem underwhelming because these models are so small right now. But I think that's a little bit misleading, because once we can get them to work at small scale, the industry has usually been fairly good at then replicating those approaches at larger and larger scale. Now, I should be clear, it's not easy. And there are a ton of unsolved problems in this
Starting point is 00:55:36 scaling part of the journey as well. Yeah, if I could just interrupt real quick, you mentioned the scaling laws, and those have continued to scale, but we didn't necessarily know if that would be the case. It has, of course, proven to be the case as we move forward. But what are the challenges that you see that might be outstanding as we look to scale up some of this mechanistic interpretability research? What open challenges do you see on that path? To borrow our kitchen analogy from earlier, I think we as an industry now have a model of what's going on, and some proof of what's going on with these features, in a kitchen which has, let's say, three or four chefs. And so to figure out if this would work at frontier scale, where you have thousands and thousands of chefs in each kitchen, and in the case of a model, you have billions of parameters, I think there are two big open problems that need to be solved in order for the current approach to work at that scale. The first is scaling up the autoencoder, which conceptually you can think of as the model that makes sense of what's going on with each feature.
Starting point is 00:56:42 And the autoencoder here is pretty small in the paper that came out in October. And so I think there's a big challenge where the researchers in the space have to figure out how to scale up the autoencoder on the order of almost a 100x expansion factor. And so that's a lot, and that's pretty difficult, because training the underlying base model itself often requires hundreds of millions, and often billions, of dollars' worth of compute. And so I do think it's a fairly difficult and compute-intensive challenge to scale the autoencoder. Now, I think there are a ton of promising approaches on how to do that scaling without needing tons and tons of compute, but those are pretty open-ended engineering problems. I think the second is to actually scale the interpretation
Starting point is 00:57:28 of these networks. And so as an example, if you find all the neurons and all the features related to, let's say, pasta or Italian cuisine, and then you have a separate set of features
Starting point is 00:57:39 that map to pastries, right? Now the question is, how do you answer a complex query? Say you ask the AI, hey, if I asked you a provocative question
Starting point is 00:57:51 about whether people of a certain ethnicity enjoy Italian cuisine or not, right? You need to figure out how those two features actually interact with each other at some meaningful scale. And that is a pretty difficult challenge to reason about too. And I think that's the second big open problem that the researchers
Starting point is 00:58:08 call out in their work. And so I think the combinatorial complexity of each of those sets of features interacting with each other at increasing scales is a nonlinear increase in complexity that has to be interpreted. And so, at least at the moment, these are the two clear engineering problems that need to be solved: scaling up the autoencoder and scaling up the interpretation. But there is probably a list of long-tail questions as well that I'm not addressing here. But those are sort of the two big ones that need to get solved before this black box world has entirely moved to a transparent, clear box world. Yeah, we've got a lot of steps along the way.
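To get a feel for why scaling up the autoencoder is so compute-intensive, here is a rough back-of-the-envelope sizing sketch. The activation width and expansion factors are illustrative assumptions, not figures from any specific model or paper.

```python
# Rough, illustrative sizing of a sparse autoencoder as the expansion factor
# grows. All numbers are assumptions for a hypothetical frontier-scale model.
d_model = 8192                 # assumed width of the activations being decomposed
expansion_factors = [8, 32, 100]

for k in expansion_factors:
    n_features = k * d_model                     # size of the feature dictionary
    # encoder (d_model x n_features) + decoder (n_features x d_model) weights
    params = 2 * d_model * n_features
    print(f"expansion {k:>3}x: {n_features:>9,} features, "
          f"~{params / 1e9:.1f}B autoencoder parameters")

# And that is per layer or site you want to interpret; training each one also
# needs a large corpus of stored activations, which is where the compute bill grows.
```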
Starting point is 00:58:39 And something that dawned on me as you were sharing that is how compute-intensive this might be, which also makes it very costly. Who is funding this? Like, obviously the paper we talked about is coming out of Anthropic, and other companies like OpenAI are also interested in understanding their technology better. But it's very easy to say, hey, we're putting together compute and economic resources to build a model, because there's the consumer on the other side who uses that model. Right. But what is the justification for folks to invest in building this mechanistic interpretability?
Starting point is 00:59:13 Is it a policy thing? Or is it that actually having this interpretability makes the models better as well? Everybody who needs reliable AIs is essentially incredibly incentivized to invest in this area, because, as we were talking about earlier, if you are in the business of buying cakes from a kitchen and you don't like the cakes they're producing, but you can't tell them what to change about it and you can't tell them how to improve the cakes, it can be pretty hard for you to get to the outcomes you want. And so I think what's happening right now in the industry is that interpretability research is largely done at labs that have the resources to do it and are
Starting point is 00:59:55 incentivized to make their models more reliable and more steerable. I think there's a separate sort of body of work that's being done by academics and independent researchers, which is really important to support, which is safety research that is really hard for folks outside of large labs to do unless the open source ecosystem becomes more and more vibrant. I think these scaling problems are difficult to reason about unless you have access today to a lot of compute to actually scale up the models and then study them. And I think one of the most exciting pieces of work that's being done around interpretability is by academic researchers, independent researchers who are reading what's happening at some
Starting point is 01:00:39 of the large labs and then trying to replicate that work on open source models. And so I think there are a number of really interesting experiments online by independent researchers where they've taken open source models like Llama 2 at 70B scale and have started to try to get interpretability experiments to hold at that scale. And I think that's an increasingly important body of work, because if we can't have the entire academic community and the independent researcher community participate in this interpretability research, we're going to end up in a situation where only a very limited number of people have access to doing that research. And so it's going to go slower, almost definitionally.
Starting point is 01:01:17 So at the moment, the funding and the resources are coming from the largest labs. And I think there's a glimmer of hope where more and more open source work is being done. But we've got to make sure that continues and grows in volume and doesn't get paused. Completely. Yeah. And maybe one other way to ask the question that I just asked is, what does this all unlock, right? If we have these new tools, if we turn black box to clear box, how does this change the game? And maybe you could speak to what you're excited for specifically coming into 2024 as it relates to mechanistic interpretability. Yeah. So to be clear, I'm excited about all kinds of interpretability, or explainability. I'm broadly very excited about 2024 as the first time,
Starting point is 01:01:54 or at least the year, where the most interest and attention is being paid to explainability. The last few years, the attention was all on the how and the what. People were just incredulous at the capabilities of these models. Can we get them to be smarter? Can we get them to reason about entirely new topics that maybe weren't in the original pre-training data set? And that's been totally reasonable. But I think the lack of attention on the why of these models, on explaining how they work, has been the big blocker on these models getting deployed outside of just a few consumer use cases where the
Starting point is 01:02:31 costs of the model not being as reliable or steerable are low. And so low-precision environments, consumer use cases where people are more forgiving and tolerant of mistakes by the model, and so on, are largely where the bulk of the value has been generated in AI today. But I think if you want to see these models take over some of the most impactful parts of our lives that they currently aren't deployed in, things like healthcare, things like financial parts of the world where a lot of really tedious work is being done by folks who would love to have these models automate a lot of our decisions, I think that those mission-critical situations require a lot more reliability and predictability. And that's what interpretability ultimately unlocks. If you can explain why
Starting point is 01:03:15 the kitchen does something, then you can control what it does. And that makes it much more reliable, and therefore it's going to be used in more and more situations, in more use cases, and in more and more impactful customer journeys where today a lot of the models don't actually make the cut. No, it's so true. Actually, something that also just dawned on me as you were talking is that almost everything in this world has margin for error, right? There is error inherently in most things.
Starting point is 01:03:40 However, if you can understand and explain that error and constrain it to something that other people can get behind, it's just much more likely that people will want to engage with that thing, because they can at least understand what is coming out of it. And so, yeah, I feel like that picture is very compelling, and I hope we can get there. I hope so too. I think, to be clear, we're not there yet, but we've got the glimmers now of approaches that might work. And I think what I'm excited about for 2024 is a lot more investment, a lot more energy, a lot more of the best researchers in this space spending their time on interpretability. Yeah. Well, we have some of the smartest
Starting point is 01:04:15 people in the world working on AI, and we saw how quickly things moved in 2022 and 2023. So hopefully in 2024, some of this interpretability work moves just as quickly. I hope so. I've got my fingers crossed. All right, I hope you enjoyed these three big ideas. We've got a lot more on the way, including the new age of maritime exploration that takes advantage of AI and computer vision, AI-first games that never end, and whether voice-first apps may finally be having their moment. If you can't wait and want to see our full list of 40-plus big ideas today, you can head on over to a16z.com slash big ideas 2024. It's time to build.
