a16z Podcast - AI Revolution: Disruption, Alignment, and Opportunity

Episode Date: September 28, 2023

The AI Revolution is here. In this episode, you'll learn which themes some of the world's most prominent AI builders – from OpenAI, Anthropic, Character.AI, Roblox, and more – are paying attention to. You'll hear discussion of the real-world impact of this revolution on industries ranging from gaming to design, and the considerations around alignment along the way. This footage is from an exclusive event, AI Revolution, that a16z ran in San Francisco recently. If you'd like to access all the talks in full, visit a16z.com/airevolution.

Topics Covered:
00:00 - AI Revolution
02:39 - Putting technology in users' hands
08:21 - AI alignment and safety
21:44 - Future opportunities

Resources:
Catch all the talks at https://a16z.com/airevolution

Stay Updated:
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

Transcript
Starting point is 00:00:00 Experiences where creation amongst 65 or 70 million people is just part of the way it goes. You really want to think about, okay, what's somewhat possible today? What do you see glimpses of today? If this is like you have to fill out a thousand pages of paperwork and get 15 different licenses from different bodies to make an AI system, that's never going to work. Entertainment is like this $2 trillion a year industry. And like the dirty secret is that entertainment is imaginary friends that don't know you exist. And I wouldn't bet against startups in a general sense there.
Starting point is 00:00:39 It's so early. The AI revolution is here. But as we collectively try to navigate this game-changing technology, there are still many questions that even the top builders in the world are grappling to answer. That is why a16z recently brought together some of the most influential founders from OpenAI, Anthropic, Character.AI, Roblox, and more to an exclusive event called AI Revolution in San Francisco. Today's episode continues our coverage of this event as we discuss the very real-world impact of this revolution on industries ranging from gaming
Starting point is 00:01:16 to design and the considerations around alignment along the way. Now, if you missed part one, do yourself a favor and queue that up next so that you can eavesdrop on these top builders breaking down the current economics of this wave. Plus, whether scaling laws will continue and how these models will evolve to capture more of the world around us. Plus, if you'd like to listen to all the talks in full today, head on over to a16z.com/airevolution.
Starting point is 00:01:49 As a reminder, the content here is for informational purposes only. It should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. Please note that a16z and its affiliates may also maintain investments in the companies discussed in this podcast.
Starting point is 00:02:09 For more details, including a link to our investments, please see a16z.com/disclosures. As this wave continues to unfold, it is worth reflecting on just how wide-reaching it is. So in this episode, we start with Mira Murati, CTO of OpenAI, explaining how she ended up focusing her career here of all places, especially after a degree in mechanical engineering and after working as an aerospace engineer. There's not going to be a more important technology that we all build than building intelligence.
Starting point is 00:02:45 It is such a core unit, and the universality, it affects everything. But in order for AI to impact everything, we'll need a lot of compute. The good news is that it's on the way. Here is Noam Shazeer, co-founder of Character.AI and lead author on the seminal 2017 Transformer paper, calculating just how much compute will soon be available. I think I saw an article yesterday, like Nvidia is going to build like another 1.5 million H100s like next year. So that's roughly a quarter of a trillion operations per second per person, which means that could be processing on the order of one word per second on a 100 billion parameter model for everyone on Earth.
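For the curious, Noam's back-of-envelope arithmetic checks out. Here is a quick sketch of the calculation; all figures are rough, round-number assumptions implied by the quote (chip count, per-chip throughput, world population), not sourced hardware specs.

```python
# Back-of-envelope version of Noam's estimate. All constants are
# rough assumptions: ~1.5M H100-class chips, each delivering on the
# order of 1e15 low-precision ops/sec, shared across ~8 billion people.
chips = 1.5e6
ops_per_chip = 1e15     # order-of-magnitude per-chip throughput (assumption)
people = 8e9

ops_per_person = chips * ops_per_chip / people
print(f"{ops_per_person:.2e} ops/sec/person")   # ~1.9e11, roughly a quarter trillion

# A forward pass costs roughly 2 ops per parameter per token,
# so for a 100-billion-parameter model:
params = 100e9
tokens_per_sec = ops_per_person / (2 * params)
print(f"{tokens_per_sec:.1f} tokens/sec/person")  # just under one word per second
```

The exact numbers shift with precision and utilization, but the order of magnitude is the point: global compute on that scale is roughly "one word per second, per person, from a frontier-sized model."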
Starting point is 00:03:33 It's often not hard to convince people that this compute is coming or that it'll impact a wide variety of industries. But what can be hard to convince people of is that this disruption is a positive thing. But instead of pontificating, let's take a look back at what the democratization of technology has yielded in the past, from the printing press to platforms with millions of users. We'll start with Dylan Field, co-founder and CEO of Figma, commenting on how design has been shaped by technology for years. a16z general partner David George coming in with the fiery question: is AI actually going to take the job of the designer in the future? You know, it's kind of interesting, the way you put the question is like, okay, will there be less things to design, or is AI going to do all the design work, right?
Starting point is 00:04:19 So it's like, you're on one of those paths, maybe? On the first one, of will there be less things to design: if you look at every technological shift or platform shift so far, it's resulted in more things to design. So you got like the printing press,
Starting point is 00:04:34 and then you have to figure out what you put on a page. And you've got, even more recently, mobile. You would think, okay, less pixels, less designers, right? But no, that's when we saw the biggest explosion of designers. And so maybe if you'd asked me this at the beginning of the year, you might have said, okay, well, we'll all have these chat boxes
Starting point is 00:04:53 and people will be asking questions in them, and that's going to be our interface for everything. You know, look at OpenAI. They're on a hiring and acquisition spree, trying to get product people and designers right now so that they're able to make great consumer products. It turns out design kind of matters. The second one, of will AI be doing the design, is,
Starting point is 00:05:12 I think, pretty interesting. So far, we're not there. Right now we're at a place where AI might be doing the first draft. Right. And getting from first draft to final product actually turns out that's kind of hard. And usually it takes a team. But if you could get AI to start to suggest interface elements to people
Starting point is 00:05:29 and do that in a way that actually makes sense, I think that could unlock a whole new era of design in terms of creating contextual designs, designs that are responsive to what the user's intent is at that moment. And I think that would be a fascinating era for sort of all designers to be working in, but I don't think it replaces the need for human designers. So fewer pixels to design does not actually equate to fewer designers.
Starting point is 00:05:55 And it turns out that many of the experiences where people are already spending hours a day have a lot of room for upside. Noam here on how AI can drastically improve entertainment. Entertainment is like this $2 trillion a year industry. And like the dirty secret is that entertainment is imaginary friends that don't know you exist. Like, the reason people interact with TV or any of these other things, it's called, like, these parasocial relationships, like your relationship with TV characters or book characters or celebrities. And everybody does it. It's actually a cool first use case for AGI.
Starting point is 00:06:36 Like essentially, there was the option to, like, go into lots of different sorts of applications. And a lot of them have a lot of, like, overhead and requirements. Like, if you want to launch something that's a doctor, it's going to be a lot slower, because you want to be really, really, really careful about not providing false information. But a friend, you can do really fast. It's just entertainment. It makes things up. That's a feature. And we likely won't build these fundamentally new experiences by dreaming them up in some lab. We'll get to create by iterating and putting these products into the hands of users. Here's Mira on how this approach underpinned ChatGPT's success thus far.
Starting point is 00:07:17 We did make a strategic decision a couple of years ago to pursue product. And we did this because we thought it was actually crucial to figure out how to deploy these models in the real world. And it would not be possible to just sit in the lab and develop this thing in a vacuum without feedback from users in the real world. And also with ChatGPT, you know, the week before, we were worried that it wasn't good enough. And we put it out there, and then people told us it is good enough to discover new use cases. And you see all these emergent use cases that I know you've written about.
Starting point is 00:07:57 And that's what happens when you make this stuff accessible and easy to use and put it in the hands of everyone. There is beauty in putting such powerful tools into the hands of everyone. But how do we measure how powerful these tools are? Since 1950, people have looked to the Turing test as one guidepost, but it turns out that the popular benchmark had its flaws. For one, it's surprisingly easy to trick humans. Now, as the AI community looks for new guideposts and benchmarks,
Starting point is 00:08:28 here is David Baszucki, co-founder and CEO of Roblox, proposing a, quote, new Turing test, vetting whether an AI can reason past the explicit data it's trained on. I have a Turing test question for AI, and that would be: if we took AI in 1633 and trained on all the available information at that time, would it predict the Earth or the sun is the center of the solar system, even though 99.9% of the information is saying
Starting point is 00:08:56 the Earth is the center of the solar system. So I think five years is right at the fringe of, if we were to run that AI Turing test, it might say the sun. Interesting. Do you have a different answer if it was 10 years? 10 years, I think it'll say the sun. Now, here's Dylan's version.
Starting point is 00:09:14 What's the modern-day Turing test? And I feel like this question kind of comes up everywhere now. And we're now seeing from these systems that it's easy to convince a human that you're human. It's hard to actually make good things. Yeah. Like, I could have GPT-4 create a business plan and come pitch you. That doesn't mean you're going to invest. When you actually have two businesses side by side and they're competing, and one of them is run by an AI and another one's run by a human, and you invest in the AI business, then I'm worried.
Starting point is 00:09:44 Yeah. We're not there yet. Finally, here's Mira commenting on how OpenAI thinks about the threshold for AGI. How do you define AGI? In our OpenAI charter, we define it as a computer system, basically, that is able to perform autonomously the majority of intellectual work. Passing the Turing test is one thing, but ensuring these models perform the goals that humans intend is another.
Starting point is 00:10:12 Here, Mira shares how arguably the most successful AI product, ChatGPT, was born out of OpenAI trying to align the underlying model using reinforcement learning with human feedback. If you consider how ChatGPT was born, it was not born as a product that we wanted to put out there. In fact, the real roots of it go back to more than five years ago, when we were thinking about how you make these safe AI systems. You know, you don't necessarily want humans to actually write the goal functions, because you don't want to use proxies for complex goal functions, or you don't want to get it wrong. And so this is where reinforcement learning with human
Starting point is 00:10:58 feedback was developed. What we were really trying to achieve was to align the AI system to human values and get it to receive human feedback. And based on that human feedback, it would be more likely to do the right thing, less likely to do the thing that you don't want it to do. And after we developed GPT-3 and we put it out there in the API, this was the first time that we actually had safety research become practical in the real world. And this happened through instruction-following models. So we used this method to basically take prompts from customers using the API.
Starting point is 00:11:39 And then we had contractors generate feedback for the model to learn from, and we fine-tuned the model on this data and built instruction-following models. They were much more likely to follow the intent of the user and to do the thing that you actually wanted them to do. And so this was very powerful, because AI safety was not just this theoretical concept that you sit around and talk about,
Starting point is 00:12:07 but it actually became, you know, sort of, how do you integrate this into the real world? And obviously, with large language models, we see great representation of concepts and ideas of the real world. But on the output front, there are a lot of issues. And one of the biggest ones is obviously hallucinations. So how do you get these models to express uncertainty? And the precursor to ChatGPT was actually another project that we called WebGPT. And it used retrieval
Starting point is 00:12:44 to be able to get information and cite sources. And so this project then eventually turned into ChatGPT, because we thought the dialogue was really special, because it allows you to sort of ask questions, to correct the other person, to express uncertainty. There's just so much. You found the error because you're interacting. Exactly. There is this interaction, and you can get to a deeper truth.
Starting point is 00:13:08 We started going down this path, and at the time we were doing this with GPT-3 and then GPT-3.5. But, you know, one thing that people forget is that, actually, at this time, we had already trained GPT-4. And so internally, at OpenAI, we were very excited about GPT-4 and had sort of put ChatGPT in the rearview mirror. And we kind of realized, okay, we're going to take six months to focus on alignment and safety of GPT-4. And we started thinking about things that we could do. And one of the main things was actually to put ChatGPT in the hands of researchers out there that could give us feedback, since we had this dialogue modality.
Starting point is 00:13:53 And so this was the original intent: to actually get feedback from researchers and use it to make GPT-4 more aligned and safer and more robust, more reliable, and eventually plan. I mean, just for clarity, when you say alignment and safety, do you include in that, like, correct, and does what it wants? So do you mean actual, like, protecting from some sort of harm? By alignment, I generally mean that it aligns with the user's intent. So it does exactly the thing that you wanted it to do. But safety includes other things as well, like misuse, where the user is intentionally trying to use the model to create harmful outputs.
Starting point is 00:14:34 In this case, with ChatGPT, we were actually trying to make the model more likely to do the thing that you want it to do, to make it more aligned. And we also wanted to figure out the issue of hallucinations, which is always an extremely hard problem. But I do think that with this method of reinforcement learning with human feedback, maybe that is all we need if we push this hard enough. Given that this field is so early, so are the methods of alignment. Here's another approach that Dario from Anthropic has proposed, one that involves a guiding constitution and an AI that reinforces those principles. Here's Dario in conversation with a16z general partner Anjney Midha.
Starting point is 00:15:17 The method that's been kind of dominant for steering the values and the outputs of AI systems up until recently has been RL from human feedback. I was one of the co-inventors of that at OpenAI, but since then it's been, you know, improved to power ChatGPT. And the way that method works is that humans give feedback on model outputs, say which model outputs they like better. And over time, the model learns what the humans want and learns to emulate what the humans want. Constitutional AI, you can think of it as the AI itself giving the feedback. So instead of human raters, you have a set of principles. And, you know, our set of principles is in our constitution. It's very short. It's five pages. We're constantly updating it. There could be
Starting point is 00:16:02 different constitutions for different use cases, but this is where we're starting from. And whenever you train the model, you simply have the AI system read the Constitution, look at some task, like, you know, summarize this content or give your opinion on X, and the AI system will complete the task. And then you have another copy of the AI system say, okay, was this in line with the Constitution, or was it not? At the end of this, if you train it, the hope is that the model acts in line with this guide-star set of principles. So as a result of that approach, you know, the seed of the Constitution captures some set of values
Starting point is 00:16:38 of the constitutional authors, right? How are you grappling with the debate that that means you are imposing your values on the constitutional system? Yeah, a couple directions on that. So first, when we wrote the original Constitution, you know, we tried to add as little of our own content as possible.
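For readers following along, the loop Dario describes, generate, have a second copy of the model critique against written principles, keep what passes, can be sketched in a few lines. Everything below is a stand-in: the two model calls are stubs (a real system would use an LLM for both the generator and the critic), and the principles are paraphrased from the talk. Only the shape of the loop is the point.

```python
# Toy sketch of the constitutional-AI feedback loop: generate,
# self-critique against written principles, keep outputs that pass.
# Both "model" calls are stubs, not real LLM calls.

CONSTITUTION = [
    "Produce content that would be acceptable if shown to children.",
    "Do not violate fundamental human rights.",
]

def generate(task: str) -> str:
    """Stand-in for a model completion."""
    return f"Draft response to: {task}"

def critique(output: str, principles: list[str]) -> bool:
    """Stand-in for a second copy of the model judging the output against
    the constitution; here it just screens for a toy marker string."""
    return "disallowed" not in output.lower()

def constitutional_step(task: str) -> str:
    """One generate-critique round; in training, passing outputs would
    become data that pushes the model toward the principles."""
    draft = generate(task)
    if critique(draft, CONSTITUTION):
        return draft
    return "[revise or discard under the constitution]"

print(constitutional_step("summarize this content"))
```

The substantive design choice, as Dario notes, is in the principles themselves, which is why the next part of the conversation turns to where those principles come from.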
Starting point is 00:16:53 We added things from, like, the UN Declaration of Human Rights, just kind of, like, generally agreed-upon, you know, deliberative principles, some principles from, like, Apple's terms of service. They're very vanilla. They're things like, you know, produce content that would be acceptable if shown to children, or things like this, or, you know, don't violate fundamental human rights. I think from there, we're going in two directions. One is that different use cases, I think,
Starting point is 00:17:18 demand different operating principles and maybe even different values. Like, a psychotherapist probably behaves in a very different way from a lawyer. So the idea of having a kind of very simple core and then specializing from there in different directions is kind of a way not to have this kind of mono-constitution that applies to everyone. Second, we're looking into the idea of, I don't want to say crowdsourcing, but some kind of deliberative democratic process whereby people can design constitutions. To folks who aren't sort of privy to what's going on inside of Anthropic, you can often seem paradoxical. Because you found a way to efficiently sort of scale and keep the scaling laws proceeding. At the same time, you're big advocates of making
Starting point is 00:17:59 sure that this doesn't happen very fast. What is the thinking behind that paradox? Yeah, a few points on that. I think it's just kind of an inherently tricky situation with a bunch of tradeoffs. I think one of the things that most drives the tradeoffs is, and you see it a bit in constitutional AI, that the solution to a lot of the safety problems, the best solutions we've found, almost always involve AI itself. So, you know, there's a community of very theoretically oriented people who try to work on AI safety kind of separate from the development of AI. And at least my assessment of this, I don't know if others would say it was fair, is that that hasn't been that successful. And the things that have been successful, even though there's
Starting point is 00:18:40 much more to do, we've only made limited progress so far, are areas where AI has kind of helped us to make AI safe. Now, why would that happen? Well, as AI gets more powerful, it gets better at most cognitive tasks. One of the relevant cognitive tasks is judging the safety of AI systems, eventually doing safety research. So there's this kind of self-referential component to it. And then we even see it with areas like interpretability looking inside the neural nets where we thought at the beginning, we've had a team on that since the beginning, that that would be very separate. But I think it's converged in two ways. One is that powerful AI systems can help us to interpret the neurons of weaker AI systems. So again, there's that recursive process. And second,
Starting point is 00:19:21 that interpretability insights often tell us a bit about how models work. And when they tell us how models work, they often suggest ways that those models could be better or more efficient. As the industry continues to explore alignment and safety, we're already seeing this technology completely reshape industries, with a lot of opportunity on the horizon, even if AI systems are not always reliable yet. In general, I think we're sort of amongst tasks
Starting point is 00:19:51 kind of like intern level, I would say. That's what I generally say. The issue is reliability, right? Of course. You know, you can't fully rely on this system to do the thing that you wanted to do all the time. And, you know, how do you increase that reliability over time and then how do you obviously expand the capabilities,
Starting point is 00:20:12 the new, the emergent capabilities, the new things that these models can do? I think, though, that it's important to pay attention to these emergent capabilities, even if they're highly unreliable. And especially for people that are building companies today, you really want to think about what's somewhat possible today. What do you see glimpses of today?
Starting point is 00:20:34 Because very quickly, these models could become reliable. Here are some glimpses of what may be to come. First, in games. I think there's three categories. There is one category where people on our platform don't even think of it as AI, even though it's been going on for two or three or four years. Quality of personalized discovery, quality of safety, civility, voice and text monitoring,
Starting point is 00:21:02 asset monitoring, quality of real-time natural translation, how good is our translation versus others? So that's the one that people don't notice. The next one is, I think, the one that's really exciting right now, which is generative: either code-generative, 3D-object-generative, avatar-generative, game-generative, which is very interesting. And then the future one, which is really exciting, is how far do we get to a virtual doppelganger or a general intelligence agent inside of a
Starting point is 00:21:32 virtual environment that's very easy to create by a user? You want George Washington in your 12-year-old's school project. How good is George Washington? Or, I'm not on Tinder, but if someday Tinder has a Roblox app, can I send my virtual doppelganger for the first 3D meeting kind of thing? So I think going all the way from the things we don't notice, to the things that are exciting around generative, to future general intelligence, these are all going to change the way this works. When you think about the parts that go into building a game today, there are just so many pieces, right? There's the concepting, there's storyboarding, there's the writing, there's the creation of the 2D images, the 3D assets, and there's the code and the physics engine. And so Roblox has built many of these pieces into its own studio and its platform. What parts do you think will be most affected by this new generation of generative models that you just spoke about?
Starting point is 00:22:27 Yeah, it's almost worth saying the antithesis: what will not be affected? Because ultimately there will be acceleration on all of these. We have a bit of an optimistic viewpoint right now because of the, say, 65 million people on Roblox. Most of them are not creating at the level they would want to. And we, for a long time, imagined a simulation of Project Runway, where in the early days of Roblox we imagined Project Runway as just pretty skeuomorphic. You have
Starting point is 00:22:56 sewing machines and fabrics, and it's all 3D simulated, and that's how you would do it. But when we think about it, even that's kind of complex for most of us. And I think now, when Project Runway shows up on Roblox, it will be text prompt, image prompt,
Starting point is 00:23:12 voice prompt, whatever you want, as if you're sitting there. And if I was helping you make that, I'd say, I want kind of a blue denim shirt, I want some cool things, I want some buttons, make it a little more trim, fitted. We'll see those kinds of creation. So I actually think we're going to see an acceleration of creation. For example, experiences where creation amongst 65 or 70 million people is just part of the way it goes have not been possible. An experience where there's millions of people acting as
Starting point is 00:23:45 fashion designers and voting and picking who's got the best stuff. And then possibly, imagine some of that, you know, going off and being produced in real life, or some of them being plucked up by Parsons and told, okay, you're the future designer. You can imagine other genres like this, where you actually create on platform and then get identified as a future star. Most of the AI tools today operate in the second dimension. But naturally, the Roblox team has its sights set on the third. I think one area we're really watching, that's a very difficult problem right now, is true high-quality 3D generation as opposed to 2D generation.
Starting point is 00:24:27 There's lots of wonderful 2D generation stuff out there. We're really doubling down on 3D generation. A couple of weeks ago, you had tweeted that the Roblox app on Meta Quest had actually hit a million downloads in just the first five days of its beta form being out on the actual Oculus store. So what are your thoughts on VR, spatial computing?
Starting point is 00:24:48 Yeah, so our thesis has been that just as when the iPhone shipped, all of a sudden we had 2D HTML consumable on a small screen rather than a large screen, with the pinch and zoom, and now we take it for granted. I think my kids probably don't realize there was some cheesy mobile web thing 10 years ago, pre-iPhone, where browsers were large-screen things. Now we just assume 2D HTML is everywhere. I think 3D, we feel, is the same. It's immersive multiplayer in the cloud,
Starting point is 00:25:20 simulated 3D. And because of that, every device has better, a camera optimal for the device, user interaction optimal for the device, different levels of immersiveness. Your phone is not as immersive as your VR headset, but your phone is more spontaneous. So I think we felt that. And we think the market ultimately figures out which device you consume this with. For any founders excited to build at the intersection of gaming and AI, here are some themes that are top of mind at Roblox.
Starting point is 00:25:56 What's the future of training cheaply at mega volume? What's the future of running inference cheaply at mega volume? What types of technology abstract away different hardware devices? How can you run a mixed CPU-GPU environment over time? We're very interested in that. So I think we're watching those types of tech stacks a lot. Another area of opportunity is a newfound ability to interact with unstructured data, especially as context windows lengthen.
Starting point is 00:26:31 Here's Dario. One thing that I think people are starting to realize, but I think is still underappreciated, is the longer contexts and things that come along with that we're working on, you know, things in the direction of retrieval or search, really open up the ability of the models to talk to very large databases. You know, one thing we say is like, oh, yeah, you can talk to a book, you can talk to a legal document, you can talk to a financial statement. And I think people have this, there's this chatbot. I ask it a question, and it answers the question. But the idea
Starting point is 00:27:06 that you can upload a legal contract and say, you know, what are the five most unusual terms in this legal contract? Or upload a financial statement and say, summarize the position of this company. What is surprising relative to what this analyst said two weeks ago? So all this kind of knowledge manipulation and processing of large bodies of data that take hours for people to read, I think much more is possible with that than what people are doing. We're just at the beginning of it. And, you know, that's an area I'm excited about. I'm particularly excited about it because it's an area where I think there are a lot of benefits and not all of the costs that we've talked about. And what about infinite context windows?
Starting point is 00:27:43 Really, the main thing holding back infinite context windows is just, you know, as you make the context window longer and longer, of course, the majority of the compute starts to be in the context window. So at some point, it just becomes too expensive in terms of compute. So we'll never have literally infinite context windows, but we are interested in continuing to extend the context windows and to provide other means of interfacing with large amounts of data. Another area of interest for Dylan, science.
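Dario's point about compute shifting into the context window can be illustrated with a toy cost model. The shapes and constants below are illustrative assumptions, not Anthropic's numbers: per generated token, attention over the cached context grows linearly with context length, while the dense projection and MLP work stays roughly constant, so attention eventually dominates.

```python
# Toy per-token cost model for one transformer layer. Illustrative
# assumptions: attending over n_ctx cached positions costs about
# 2 * n_ctx * d_model ops, while the QKV/output projections plus the
# MLP cost about 12 * d_model**2 ops regardless of context length.
def attention_share(n_ctx: int, d_model: int = 8192) -> float:
    attn = 2 * n_ctx * d_model
    dense = 12 * d_model ** 2
    return attn / (attn + dense)

for n in (4_000, 100_000, 1_000_000):
    print(f"{n:>9}-token context -> "
          f"{attention_share(n):.0%} of per-token compute in attention")
```

With these assumed shapes, attention is a small fraction of the work at a few thousand tokens of context but the large majority at a million, which is the cost curve behind "we'll never have literally infinite context windows."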
Starting point is 00:28:13 I feel like when it comes to science, just the applications of all this technology that's happening right now are still completely under-explored. Whether it's using deep learning to get approximations of systems faster or figuring out how we can just accelerate human progress in general. And why it is still time to build. I understand the arguments for why incumbents may benefit in a disproportionate way.
Starting point is 00:28:39 Basically, every platform shift that's happened, people have claimed that, and then it hasn't been the case. Yeah. And so I think that if you're a startup, this is a pretty good time to basically pick the area that you think could really benefit from the technology
Starting point is 00:28:52 and go after it. And I wouldn't bet against startups in a general sense there. It's so early, and most of what I see coming right now is still at the foundational slash sort of base model area. And, you know, if it's not that,
Starting point is 00:29:07 it's like infrastructure or dev tools, and not at all, like, okay, how do we use this all the way up the stack? And so I think that enterprise is coming. You know, there's a lot of stuff that will show up in all these areas, but it's going to take some time. Plus, here is Dario's take on why you can still take part, even if you don't have a deep background in AI. So my view is basically that there are two kinds of fields at any given point in time. There are fields where an enormous edifice of experience and accumulated knowledge has been built up, and you need many years to become an expert in that field. The canonical example of that would be biology. Very hard to, you know, contribute groundbreaking
Starting point is 00:29:48 or Nobel Prize work in biology if you've only been a biologist for six months. Then there are fields that are very young or that are moving very fast. AI was, and still is to some extent, very young, and is definitely moving very fast. And so when that's the case, really talented generalists can often outperform those who have been in the field for a long time, because things are being shaken up so much. If anything, having a lot of prior knowledge can be a disadvantage. Finally, if you needed any more convincing, a timely reminder from a16z general partner Martin Casado. The punchline is, if you've ever wanted to start a startup or join a startup, now is a great time to do it. All right, thank you so much for listening to part two of our coverage of AI Revolution. We really hope you leave inspired to build
Starting point is 00:30:36 and be a part of this wave. And if you'd like to listen to all the talks in full today, don't forget to visit a16z.com/airevolution. We will be back soon with two more episodes covering how AI is, or isn't, impacting the enterprise, and the timely collision between machine learning and genomics. We'll see you then.
Starting point is 00:30:59 If you liked this episode, if you made it this far, help us grow the show. Share with a friend, or if you're feeling really ambitious, you can leave us a review at ratethispodcast.com/a16z. You know, candidly, producing a podcast can sometimes feel like you're just talking into a void. And so if you did like this episode, if you liked any of our episodes, please let us know. I'll see you next time.
