No Priors: Artificial Intelligence | Technology | Startups - Waymo’s Journey to Full Autonomy: AI Breakthroughs, Safety, and Scaling
Episode Date: October 24, 2024

In this episode of No Priors, Dmitri Dolgov, Co-CEO of Waymo, joins Sarah and Elad to explore the evolution and advancements of Waymo's self-driving technology from its inception at Google to its current real-world deployment. Dmitri also shares insights into the technological breakthroughs and complexities of achieving full autonomy, the design innovations of Waymo's sixth-generation driverless cars, and the broader applications of Waymo's advanced technology. They also discuss Waymo's strategic approach to scaling amidst regulation, deployment in cities like Phoenix and San Francisco, and the transformative potential of autonomous driving on car ownership and urban infrastructure.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @Dmitri_Dolgov

Shownotes:
00:00 Introduction
00:15 History of Self-Driving at Google
00:29 DARPA Challenges and Early Involvement
01:39 Formation of Waymo
01:53 Industry Lineage and Early Skepticism
03:05 Initial Goals and Milestones
04:33 Pivot to Full Autonomy
04:50 Scaling and Deployment
05:29 Generational Breakthroughs
06:59 Choosing Deployment Cities
09:26 Technological Advancements
11:01 Evaluating Safety
14:41 Regulatory Stance and Trust
16:52 Future of Autonomous Driving
23:19 Business Strategy and Partnerships
26:06 Changing Urban Mobility Trends
26:40 Challenges and Misconceptions in Self-Driving Timelines
28:43 The Role of Traditional OEMs in an Autonomous Future
30:54 Designing Cars for Autonomous Ride-Hailing
33:42 Scaling Responsibly
35:18 Generalizability and Future Applications of AI
37:10 The Complexity of Achieving Full Autonomy
42:58 The Importance of Data and Iteration in AI Development
46:13 Reflecting on the Journey and Future of Waymo
Transcript
Hi, listeners, welcome to No Priors.
Today, we're hanging out with Dmitri Dolgov, co-CEO of Waymo.
Waymo started as the chauffeur project within Google back in 2009
and eventually spun off as its own company.
Now, it provides over 100,000 paid rides each week
across San Francisco, L.A., Austin, and Phoenix.
I love taking Waymos, and I'm regularly campaigning for better South Bay coverage.
We're excited to dig into all things robotaxis, self-driving, what it takes to deploy this technology on a mass scale, and what's next for Waymo.
Well, Dmitri, thank you so much for joining us today.
Thank you for having me.
Maybe we can start off with just a little bit of a history of self-driving at Google, how you got involved, and how things have evolved over time.
I've been doing this for quite a few years, I think about 18 now. I got started in around 2006.
This was the time of the DARPA Grand Challenges.
This is when DARPA organized a few competitions that they called the grand challenge in robotics for the purpose of advancing research in autonomous vehicles.
So the first competition they had was the first Grand Challenge. The challenge there was to create a car that could drive autonomously in a desert. It had to complete this track in the desert and drive for about 100 miles.
Nobody succeeded, but there was a lot of great progress made, and then they repeated the challenge and a few teams succeeded. So on the heels of that, they created another challenge called the DARPA Urban Challenge, where the setup was kind of a mock city that was supposed
to imitate what driving on public roads is like. And that's the one that I was involved
in. I was on Stanford's team. This was kind of my, you know, a moment where it clicked
for me. I saw the future and the benefits and never looked back. That's what I've been
doing ever since. And then we started this project at Google.
in 2009, a small group of us, and then that grew into what is now Waymo, when we started the company at the very beginning of 2017.
It seems like a lot of the lineage or history of this field all traces back to a handful
of labs.
You know, it's like Sebastian Thrun's lab at Stanford and a few others, and it seems like the founders
of a lot of the companies that ended up eventually existing in this ecosystem all came
out of the same sort of cohort of people, which I always think is fascinating to think about in terms
of lineages.
Yeah, yeah, definitely a small world.
The CMU team and the Stanford team,
a few people who came and started this project at Google in 2009 came from those teams.
I think when you started working on this,
this was considered a little bit of a crazy thing to do, right?
It was early on; a lot of the waves of deep learning hadn't really quite happened yet in terms of applications across all sorts of areas. Like AlexNet didn't exist yet, and all these other sorts of things that...
Nothing existed, right?
And you're absolutely right.
People, you know, we heard a lot about us being crazy and how it's never going to work. In full honesty with ourselves, we weren't exactly sure if we were just a little bit crazy or completely insane to go after this problem.
Did you treat it as an open-ended research project, or did you treat it as, like, hey, you had an idea in your mind of an endpoint date where this would be viable on public roads?
More of the former,
but it was not research, right?
So, you know, in the university DARPA Challenge days,
it was a research project.
Then when we started at Google,
it was under the belief that we can make it work.
And if we can, then the impact, the positive impact of this technology on the world and the mission is worth it.
But it was early days.
So we actually had very little data to go by in terms of thinking about how long it was going to take and how hard the problem was.
How long did you think it was going to take?
Like at the time that you started looking at that.
Oh, I don't know if we had a specific date,
but that was actually the first question that we posed.
It was not to build a product, right? In the first couple of years or so, we didn't have a product in mind or a target date in mind.
The first order of business was to explore the space.
So we, you know, towards that end,
we created some milestones for ourselves
with the goal of prototyping and learning and just understanding.
So after those two years, we said,
ah, okay, you know, there's something there.
Let's start talking about, you know,
what a product could be.
and actually our first product that we thought was going to be viable was, you know,
what nowadays you would call kind of an advanced driver assist system, right? And we had some expectations
of, you know, a small number of years that it would take for us to get there.
When, you know, after working on it for a while and making more progress on kind of the core of the technology, we decided that was not the right path for us, that we wanted to go after full autonomy. We made that pivot around 2013.
You're now doing something like 100,000 rides a week. So five million
rides a year sort of annualized out, which is incredible. What was
the inflection point or what suddenly caused that sort of volume to happen or all these things
to come together? Because it feels like a reasonably recent phenomenon in some sense.
Yeah, you're right. But there were, you know, some discrete jumps. So, you know,
if, you know, we rewind the clock a little bit, I think in my mind, there were a few,
kind of generational, discontinuous steps on kind of that progression from, you know, that point
in 2013 when we said, let's go for it, to where we are today. It's exactly right: 100,000 trips per week, more than a million miles per week, and growing, you know, exponentially.
So some interesting ones were, you know, in 2015, that was kind of our zero-to-one moment.
This is when, you know, for the first time, we put a car on the road with, you know, what we called the third generation of our system.
This was the third generation of our, you know, self-driving hardware suite, you know,
sensor of the computer, and we put it on a custom-designed vehicle that we called the Firefly.
We took a few rides with nobody behind the wheel: a zero-to-one moment.
Then the next evolution, that kind of generational skip, was the fourth generation of our driver. Those are the Pacifica minivans with the fourth generation of the Waymo hardware suite, and we deployed those in fully autonomous mode in Arizona, in Chandler, and we actually opened it up to the public in 2020. But at that point, the focus was on doing it repeatedly, and the focus was on maturing the technology, the building of the driver, the evaluation of the driver, and doing releases in a regular cadence, getting it out to real customers, hearing from the customers,
understanding the feedback and iterating. So that was the focus of that fourth generation,
was not, you know, to grow and scale and capture the market. And then at that point, we made the
decision to jump to what, you know, we now call the fifth generation of the Waymo driver. It's on
the Jaguar I-PACE. This is what you see in, you know, the fleet today in those four cities: Phoenix, San Francisco, LA, and, you know, Austin.
I thought it was very smart, actually, that you started in Arizona.
versus in California.
And I think, you know, for example,
I think Cruise ran into some issues in San Francisco where there were activists putting cones on the cars and trying to stop them and doing other things.
And so it seems smart to start in Arizona.
I was just sort of curious,
what are the criteria that led you to start that
as a sort of a test bed or a place to...
So I guess, you know, it depends on the time horizon.
So in the fourth generation,
we picked a deployment area
that was kind of medium complexity.
And the goal there, as I mentioned, was to go end-to-end. So we picked an environment where we thought it ticked enough of the boxes to help us learn the most important things that we wanted to learn and de-risk, and that was the deployment.
And then there's the development of the system.
So for the development of the system,
you want to go after the hardest problems possible, right?
You want to go after the densest environments.
You want to go after the harshest weather.
So we've kind of in parallel been doing that.
So we've made a decision to deploy in Chandler in 2020,
to learn from the end-to-end system
while pushing on development of the system. Then, when we had learned enough, we made that discontinuous jump to the fifth generation of our driver, and we said, okay, that's the platform we believe we want to take to scale.
What's the hardest environment for a self-driving car?
So density matters, speed matters, weather matters.
Okay.
So where you have those come together is where the most complexity is.
So New York in the winter is really bad and then, you know.
That's right.
Yeah, that's right.
But then this is why we picked San Francisco.
It's good for learning and advancing the driver.
It is a very interesting commercial market.
We also at the same time went after downtown Phoenix, which has a little different mix, more of the higher-speed roads.
And that gives us that mix. The way we think about it, in terms of the development and evaluation of the driver, is kind of in the space of the operating domain, not necessarily areas or zip codes. And then you kind of take cities and you map them to the operating domain and you deploy, right?
And then looking forward, in terms of how we think about future cities, there are, you know, a few lenses. There is the market: is there a good market from the commercial perspective? What is the technical complexity? You know, what is the regulatory environment? That's kind of the lens that we apply.
Yeah.
What were the big technology breakthroughs that got you to this fifth-generation driver
that is really the one that you think you can scale now?
So the biggest ones were in AI, probably, you know, not surprising. Yeah, there were generational breakthroughs. And, you know, with every generation of the driver there's new hardware, so it's getting more capable, it's getting simpler, it's getting, you know, less expensive, and there's a lot of simplification. But beyond that boost, really it's all about, you know, AI, as you mentioned.
And what specifically in AI was the shift or change that was important?
Was it moving to sort of end-to-end DL for everything?
Was it a transformer backbone?
I'm just sort of curious, like, was there a specific thing there?
So for that last jump, the models, you know, transformers, as you mentioned, played a huge role. Before that, you know, the big breakthrough was convnets. You mentioned AlexNet; that was around 2013, so that gave us a big boost, but it still kind of plateaued at the wrong place.
And then it was a few of those things coming together.
It is transformers, it is bigger models,
more compute, coupled with kind of the whole evaluation.
We often talk about the architectures, but what's really more important is the machinery surrounding the architecture. The architecture is an enabler, but really to make it work at the level that we care about, you need everything around it: the data engine, the flywheel of training the system and evaluating it. And you kind of have to think about the problem of evaluating the driver in tandem with building it, right?
And, you know, you need the simulator, you need data.
So all of those coming together, I think is what leads to, you know, the breakthrough and discontinuity that you're seeing with where we are today.
Can you explain how you guys think about evaluation internally and then also, you know, how that might differ from how regulators evaluate this from the safety case perspective?
So that is a big question. But I'm glad you're asking about it, because it is a super important and insanely difficult problem, right?
We often talk again about the building of the driver,
but there's two problems, the building of the driver
and the evaluation of it, and they go hand in hand.
So internally, it starts with figuring out
what metrics you care about.
Then bringing the data to support the evaluation
of those metrics, of which we have hundreds.
Then you need all the infrastructure,
including things like the simulation.
Some things you can evaluate in open loop, and some things need closed-loop simulation, so you need to build a realistic, scalable simulator to support all of that.
There are all of these metrics that guide our development of the system that help us improve
and kind of train the Waymo AI, the Waymo driver.
And then it funnels into what we call validation and evaluation.
The aggregate of the evaluation and validation methodologies is what we call the readiness and safety framework.
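As a rough illustration of the open-loop versus closed-loop split described here, below is a minimal sketch; the planner and simulator interfaces are hypothetical stand-ins, not Waymo's actual evaluation framework.

```python
# A minimal sketch (not Waymo code) of open-loop vs. closed-loop evaluation.
# Planner/simulator interfaces are hypothetical.
from math import hypot
from statistics import mean

def open_loop_eval(planner, logged_scenes):
    # Open loop: replay recorded scenes and compare the planner's trajectory
    # to what was actually driven. Nothing in the scene reacts back.
    errs = []
    for scene in logged_scenes:
        plan = planner.plan(scene["observations"])            # list of (x, y)
        logged = scene["logged_trajectory"]                   # list of (x, y)
        errs.append(mean(hypot(px - lx, py - ly)
                         for (px, py), (lx, ly) in zip(plan, logged)))
    return mean(errs)                                         # avg displacement error

def closed_loop_eval(planner, sim, scenarios):
    # Closed loop: roll the planner out inside a reactive simulator, so other
    # agents respond to it and interaction effects (and failures) can emerge.
    results = []
    for scenario in scenarios:
        state = sim.reset(scenario)
        while not sim.done(state):
            state = sim.step(state, planner.plan(sim.observe(state)))
        results.append(sim.metrics(state))   # e.g. collisions, progress, comfort
    return {k: mean(r[k] for r in results) for k in results[0]}
```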
You know, if you look at miles driven in an urban setting or, you know, some comparison to human drivers in a given city that you operate in: what is the relative safety level of what Waymo is doing versus a human driver at this point?
We are pretty proud of our record. Now that we've driven tens of millions of miles in fully autonomous, what we call rider-only mode, and are driving today more than a million miles per week, we can, you know, with pretty good confidence and empirical data, say that we are, you know, better than human benchmarks. So we published some of that, you know, very recently
on what we call the Safety Hub. The latest data point that we shared was based on 22 million autonomous, rider-only miles. And we compared our performance versus human benchmarks across collisions of different severity levels. So, depending on the severity level, we see that, you know, for the lower-severity outcomes, we're about, you know, a factor of two better than the human benchmarks. And as you look at more severe outcomes, the gap increases, which is exactly what we want to see. So for airbag-deployment types of collisions, we're about a factor of six better than human drivers. And that's without any notion or attribution of causality or fault. If you bring that into the picture, you know, most of those were unavoidable; there was very little it could do.
So we have done a study with Swiss Re, I think the biggest global reinsurer in the world. We partnered with them, we shared the data, they did the analysis, and they found that, you know, for damage claims, we had about a 4x reduction versus the human baseline. And for bodily injury claims, we had a 100% reduction. It was a smaller data set; this was earlier, so it was based on just a little bit less than 4 million miles, but it was, from their point of view, statistically significant. And again, we feel pretty proud of that.
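To make the framing concrete, here is a minimal sketch of how incidents per million miles can be compared against a human benchmark by severity bucket. The incident counts and benchmark rates below are made-up placeholders chosen only to echo the ballpark factors mentioned; they are not Waymo's published figures.

```python
# Illustrative only: how a per-severity comparison against a human benchmark
# can be framed. The counts and rates below are made-up placeholders.

def rate_per_million(incidents, miles):
    return incidents / (miles / 1_000_000)

def safety_ratio(av_incidents, av_miles, human_rate_per_million):
    # >1.0 means the AV fleet has fewer incidents per mile than the benchmark.
    return human_rate_per_million / rate_per_million(av_incidents, av_miles)

if __name__ == "__main__":
    av_miles = 22_000_000  # rider-only miles, as mentioned in the episode
    # Hypothetical incident counts per severity bucket (not real figures):
    av_counts = {"any_contact": 40, "airbag_deployment": 3}
    # Hypothetical human benchmark rates per million miles (not real figures):
    human_benchmark = {"any_contact": 3.6, "airbag_deployment": 0.8}
    for bucket, count in av_counts.items():
        print(bucket, round(safety_ratio(count, av_miles, human_benchmark[bucket]), 1))
```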
Yeah, it's amazing.
Given that, what do you think the regulatory stance should be?
I think we want to make sure that, yeah, we enable this technology in service of the mission of making roads safe. And, you know, we've been engaged with regulators for many years and have that dialogue. And, you know, so far we've had good success getting all the necessary permits and being allowed to scale. The way we think about it internally, and that's how we have the dialogues with regulators, with communities, with riders: it needs to be based on transparency, and it needs to be a responsible, iterative, gradual process. Because this thing is very new. The technology is very new. The product is very different.
If you're doing a million miles a week now, what prevents faster rollout?
In other words, it seems like you've proven out that this is a safe solution.
It's working very well at scale.
You have your Generation 5 driver that seems to be quite performant.
Why not go bigger or faster?
We are scaling exponentially.
It took us about three months to get from 50,000 rides to 100,000 rides per week.
So, you know, we're moving at a good rate.
But the main thing is...
What is the total number of miles that are driven in the U.S. per year? I'm sort of curious what fraction...
Of all vehicles, I think it's, you know, in different modalities, you know, large vehicles, small things, I think it's just over 3 trillion miles.
Okay. So quite a bit.
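A quick back-of-envelope using the figures mentioned in the conversation (over a million autonomous miles per week against roughly three trillion U.S. vehicle miles per year) shows how small a fraction that still is:

```python
# Back-of-envelope using figures mentioned in the conversation.
weekly_av_miles = 1_000_000          # "more than a million miles per week"
annual_av_miles = weekly_av_miles * 52
us_annual_miles = 3_000_000_000_000  # "just over 3 trillion miles" per year

print(annual_av_miles)                    # 52,000,000 miles per year
print(annual_av_miles / us_annual_miles)  # ~1.7e-05, i.e. roughly 0.002%
```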
Yeah, we have a ways to go.
But the way, again, you know,
the way we think about it is that it's important
for this to be an iterative process where we earn trust.
It is not a thing where, you know, you build it and then, you know, just turn on the switch when you're ready. It's new, it's different; it needs to be a dial. Like we talked about establishing the safety record, right? And we needed to build up to that, right? We collected tens of millions of miles; now we feel pretty good about that. So that gives us confidence, that earns us, you know, trust. So you have to be transparent about where we are, and then we, you know, grow our scale.
What do you think is the biggest shift that's coming from a technology basis going forward?
We had Andrej Karpathy on our podcast a couple weeks ago, and he sort of contrasted what
he viewed as the Tesla approach, which is more, in his view, sort of software driven to
the Waymo approach, which was more sort of hardware-centric.
Do you think that's a correct characterization?
And also, how do you view things sort of changing over the next, you know, year or two relative to where you are right now?
I would say, yeah, I mean, it's all about AI, full stop. We talked about, you know, a few big breakthroughs that allowed us to get to where we are today: convnets, transformers, bigger models, and now, most recently, kind of combining the Waymo AI with the general world knowledge of, you know, VLMs.
This is at the core of it.
And, you know, hardware matters, right?
We have to operate in the physical world.
So the hardware that we have on our cars gives us,
I tend to think about it as an advantage, right?
Like, you have to see well, right?
So, you know, as a human driver, right? Like, if you can't see well, if you close one eye, you lose depth; if your vision is not 20/20 and you don't have your glasses, you'll still drive, right? You'll drive okay, but, you know, you're not going to be as good of a driver.
But the point, I guess, the way I see it, is it's all about AI,
it's all about building the system,
building the AI, building the software,
and being able to evaluate it.
And that's what we have today.
So far we have achieved good quality of the driver.
We have built all the machinery to evaluate it and know what it takes.
So now for us, it's an optimization.
It's an optimization, simplification.
And if you have the thing that works, and you have a good mechanism to evaluate it, then it really is drastically different in terms of how fast you can move.
I guess we've been in this other mode for years, where you have not cracked the nut. You're kind of hypothesizing what it would take and, like, you know, what the yield of a given technical breakthrough will be. And you're kind of climbing uphill, right? And there's this old analogy of climbing a mountain: you see the peak, you get there, and you thought it was the summit, but no, now you see the rest of the landscape. That's what it felt like for us, you know, for many years, and that's kind of the trap the industry usually falls into.
And I just find that it's a qualitatively different place to be in when you've cracked the nut and you have the evaluation, and now you can optimize and scale.
As part of your optimization, do you think of reducing or simplifying the sensor suite as an important priority?
Every generation of the hardware would increase capability, but also simplify drastically.
And then the cost comes down every generation.
So that was true for the previous generations: we made a big jump from one generation to the next, then from the fourth to the fifth, and now going to the sixth generation.
This was the primary focus: simplification and bringing down the cost, as well as, with the vehicle, making it about the user experience. So absolutely. And then, of course, there are the economies of scale, right? There's nothing, like, fundamental there: look at the components that we use, you know, in our hardware; all of them are, you know, fairly commoditized, and with scale you just, you know, get to ride the typical cost curves of the hardware. Radars used to be expensive in the past, right? When you put them on all the cars, you bring down the cost. The same with computers, right?
I guess, qualitatively, we know that
people can drive well with like a very simple vision system, right?
And obviously, in the self-driving world,
we have dramatically more and different types of sensors. Is there any sort of analytical framework that you all use in terms of the amount of data slash types of data you need to collect in order to have a performant system, relative to the curve on the AI side? In other words, can you follow some sort of scaling curve on AI such that you can predict the set of sensors that you can do without?
To jump to the core of the answer to your question: now that we have, you know, cracked the nut of the driver and the evaluation, we can answer that question with data, right? So, you know, we talked about the different sensing modalities. You start with humans: we have just two, you know, cameras, but, like, on a pivot, right? And we can drive kind of okay-ish, right?
And then, you know, if your vision is not very good, you may not drive as well, right? So that's why we have vision, we have cameras, we have lidars, we have radars, and they give us, you know, some benefits. And we can talk about, you know, the pros and cons of each and what they bring to the table, the data they bring to the table, how they're nicely complementary, and what you get from the different sensing modalities just by the kind of physics of it.
But again, in terms of, you know, how do we answer that question: for years it was more theoretical, and you hypothesized, you know, how much of certain things you need. Now that we have a thing that works and we can evaluate it, we can, you know, bring data to the table. We can, more empirically, you know, answer that question and say, what happens to the Waymo driver if you take away something, right? Maybe you add noise to the system, maybe you degrade it. Maybe you take away a full sensor modality. Maybe you take lidar away and radar away completely and you just drive with cameras.
You know, will it drive?
Yeah.
You know, like a human can drive with one eye or, you know, with blurry vision.
Yeah.
Is that acceptable performance?
Will you get a license?
No.
Same, same for us.
We can, you know, take something away and we can answer that question.
Like, what is the performance?
Can you still drive?
Of course.
Is the performance good enough for full autonomy?
And is it good enough for our bar of readiness and safety? And the answer is no.
But again, this is in the context of full autonomy,
it is in a context of scale,
and it is in the context of the responsible deployment
and the high bar for safety and readiness
that we set for ourselves.
If you change some of those inputs,
let's say you talk about small scale
or you talk about not full autonomy,
you talk about a driver assist system,
then the answer changes.
You might have a different configuration. If you still have a human in the loop and they're responsible for safety, you know, maybe you get rid of some sensor modalities altogether, or, you know, maybe you still use all three modalities but you pick a different operating point. Because, you know, not all cameras are the same, right? There's resolution, dynamic range, cleaning; you know, same for lidars. The answer of where your operating point is would change based on kind of where you set the bar and what the product actually is.
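A minimal sketch of the ablation experiment described here: remove or degrade a sensing modality, re-run the same evaluation, and check the result against a fixed readiness bar. The driver, evaluator, and metric names are hypothetical stand-ins, not Waymo tooling.

```python
# Hypothetical sketch of a sensor ablation study; all interfaces are stand-ins.

def ablate(sensor_log, drop=(), degrade=None):
    """Return a copy of one logged drive with modalities removed or degraded."""
    kept = {name: data for name, data in sensor_log.items() if name not in drop}
    return {name: (degrade(data) if degrade else data) for name, data in kept.items()}

def ablation_study(driver, evaluator, logs, readiness_bar):
    results = {}
    # Full suite first, then progressively stripped-down configurations.
    for drop in [(), ("lidar",), ("radar",), ("lidar", "radar")]:
        metrics = evaluator.run(driver, [ablate(log, drop) for log in logs])
        results[drop or ("full_suite",)] = {
            "metrics": metrics,
            "meets_bar": all(metrics[k] >= readiness_bar[k] for k in readiness_bar),
        }
    return results
```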
If you look forward a year or two, like now, you know, having cracked the nut, Waymo is looking at scale and probably thinking more about the business. Like, do you think of robotaxis at scale as the near-term business plan? Are there other modalities or, like, deployments? Like, if I think about this as just like a CapEx problem, are there other avenues that are important for you guys to explore?
That's the main one. Ride-hailing is the main one we're focusing on. So we're, you know, focused on the technology. We're focused on the product. We are learning from our users. Every day, we're earning trust. And we're setting up the ecosystem of, you know, partnerships in that space. So that's our primary focus. We are very excited about the commercial opportunity there. But it's not the only one. I always think of Waymo as a technology company. We're building
a generalizable Waymo driver with the mission to build the world's most trusted driver. And we want
to deploy that driver, not just in ride-hailing. There's more than three trillion miles in the
U.S. There's more than 10 trillion miles driven worldwide. So the vision and the mission is to deploy
the Waymo driver in different commercial products and different applications and maybe in different
you know, modalities across all of that spectrum.
So that includes, you know, things like deliveries, things like, you know, long-haul trucking. It includes, you know, things like personally owned vehicles. But right now, we're very focused on ride-hailing.
Do you think you'd license that out,
or you think you'd actually build the vehicles for these different use cases?
We're not, you know, we've never been in the business of building vehicles.
We partner, right?
And I guess this is because, like, the mission is so important.
And the opportunity is so massive that we don't want to go it alone.
So we think about partnering and weaving together the ecosystem to pursue all of those
different commercial applications and all of those products.
And you see us doing that in the ride-hailing business.
We don't build our own vehicles, right?
We partner with tier ones.
We partner with OEMs.
We partner with other companies who help us on our operational side.
We partner with companies who help us on the network side, so forth and so on.
How do you think car ownership will change over the next decade or two?
So if, you know, it's very exciting to see this ramp in terms of, you know, driverless rides that are happening.
And one could argue at some point, some proportion of the population, just like people flip to Uber to, you know, ferry them around in major cities versus driving or taking taxis or other things, there could be a flip here to autonomous systems versus owning cars, right?
You should just be able to order something on demand, have it show up at the right time, and you just get in and it takes you wherever you need to go.
do you view that as a 10% use case, a 30%?
I'm just sort of curious, like,
what proportion of miles do you think we'll convert over time?
I think over time we'll see more and more.
As the technology matures, as it gets deployed
on more of these different products and modalities,
I think you're starting to see some of that
even today in ride-hailing, right?
And it's not uniform, right?
But even, you know, kind of the densest cities,
if you look at, you know, people who live in San Francisco
or, you know, people who live in New York,
even before, you know, autonomy, right?
There was a shift: fewer people in those areas wanted to own cars,
especially the younger generations, right?
It's not what they're excited about, right?
And I think with autonomy in those areas and kind of the densest cores, you'll see more of that evolution and continued trend, right? And then, you know, over time it will expand, and so on and so forth.
But, you know, the thing that I'm most excited about is that all of those modalities will be bringing the safety benefits of this technology to the ecosystem.
One of the reasons I ask is I remember talking with people in the self-driving world,
I don't know how long ago, eight years ago, nine years, or whatever it was.
And at the time, everybody thought this wave was coming.
I think a lot of people were off in terms of the time frame in terms of what actually
happened.
And there was sort of a flurry of startups all getting up and running at the time.
And a lot of the conversations were around how urban environments would change over time.
Where do you actually put a parking lot?
Or where do you park your car, versus having, like, a lot outside of the city from which the cars would come in and pick you up at the right moment, versus everything needing to be centralized?
And so one of the reasons I was asking that question was to try and get a sense of your view of the
order of magnitude and also time frame by which some of these transformations may occur?
You know, I don't want to speculate on specific timeframes.
I think the vision is absolutely the correct one, right?
And so we talked about the safety benefits of it.
That's the primary one.
Then there's accessibility.
You get those benefits once you can deploy at scale, right?
And once you are deployed at scale, you can do things like, you know, use land in a better way,
right?
So if you're taking up so much space for parking lots, having, you know, personally owned vehicles that just sit around 90% of the time, you can do better. As a society, we can do better.
But the thing with all of those benefits is that they come from scale. And again, safety is the primary one that we're very focused on. We've talked about it as kind of the North Star, as the mission, as the vision for many years, but it was always kind of, once we get to scale.
So I think today we're starting to actually earn the right
to talk about starting to realize that mission, right?
You know, with tens of millions of miles behind us, more than a million per week, and safety benchmarks that are, you know, statistically significant and, you know, empirically unambiguous, we can talk about actually, you know, real, tangible safety benefits and, like, reducing, you know, the harm and injuries that are happening on the road today. So that's our primary focus, and, you know, I think that's the primary benefit that we'll see. And then beyond that, I think there are going to be ancillary ones like the ones we mentioned.
What do you think is the role of traditional OEMs in this world, when, you know, I'd say functionally, like, a car takes you from point to point, and ride-hailing takes you from point to point with, like, you know, different tiers of comfort level?
But, you know, a very large industry has been built around people buying passenger vehicles for, like,
you know, industrial design or brand or all these other things that, you know, you really need a Ford Raptor to run around the Bay Area, right?
So how do you think that evolves, when, you know, perhaps one of the more primary drivers of value is now AI and the ability to do this autonomously?
Right, but I guess, you know, we think of what we're doing as building the driver.
Mm-hmm.
And the driver could still drive different cars.
Exactly. You put the driver in the car. And different form factors, you know, whether it's a car that's good for ride-hailing in certain, you know, urban environments, whether it's, you know, a different vehicle that you need for goods transport or a truck, or, you know, something that you want to take on longer trips with your family.
Yeah, we'll need different cars.
We'll need different form factors.
And I think it's very, very complementary, what we're doing and what, you know, the car industry is building.
Perhaps to Elad's point about how cities will change: given the efforts to change drivers, cars, and the environment, the answer has clearly been to just change the driver, right, which is what you guys have done. Do you think there are arguments still to change the infrastructure? Like, for example, in the public transport space, right? There are other form factors that require the participation of the public sector in order to deploy.
Absolutely.
Sustainability is very important for us.
Safety is the primary thing.
But I think all of those modalities can coexist.
In fact, just in the last couple of days, we announced something that we're doing where we are incentivizing people to take Waymos in the cities where we operate to public
transit hubs and then everybody benefits.
Sure.
How do you think about the form factor of the car itself? I know that there were companies like Zoox, which Amazon bought, where they kind of hollowed out the inside of the car because you no longer needed the steering column and everything else, and they put seats facing each other, almost like a London cab.
Do you have any thoughts on what that experience will look like in the future as more and more things move to autonomous, self-driving, ride-hailing systems?
Yeah, so designing a car around the passengers makes total sense.
In the past, we've designed cars around primarily the driver.
If it's the Waymo driver, it's all about the rider experience.
So we have done quite a bit of work on the sixth generation of the Waymo driver and the car.
And the car is designed with the passenger in mind.
So it is more spacious.
It is all about the user experience.
You have flat floors, you have lower floor
for entry, you have doors that slide to the side.
So it's all about getting in.
So absolutely, you know, there's different aspects of it.
Like, you know, we don't have seats facing each other.
I think there's, you know, it's an open question.
Like some people get, you know, nauseous when you do that.
Like, you kind of want to, you know, there's benefits on facing forward.
But, you know, all of that I think will be for us as an industry to figure out as we move forward.
But I think the key point, it becomes, you know, much more like the design is around the rider, not around the driver.
It seems like you could do very interesting use cases, too, where, you know, I've always wanted a car with like a Peloton in the back
or something.
So as you're commuting, you just kind of connect.
I like that.
I like it.
I want to be on a bike, but I want my bike to be inside of a car.
There might be, like, let's see if you can. Wow, why do you want to be outside?
Yeah.
You're in a contained environment.
Yeah.
Especially in California.
God forbid you actually, yeah, take a bike.
I want a much less exotic form factor.
I just want to be able to take a Zoom with stable internet and have it not look weird.
On a bike.
But you want to be on your bike, sure.
Not in a bike lane.
It's very exciting.
We do, we do increasingly have, you know, team members calling into meetings from Waymos.
But it is, you know, actually, you know, we joke about it, but it gets to the point of privacy. It just becomes, you know, your own space if you don't have another human in the car, right? You can, you know, do a work meeting, you can do a, you know, call, you can, like, listen to your favorite music on full volume and not worry about, you know, that interaction of, like, having another human that you're sharing this space with. That was one of the, you know, hypotheses of the benefits of our product. And we are, you know, seeing very positive feedback from our riders today, you know, along that specific dimension as well.
You know, as somebody excited to see this technology expand coverage range: is the blocking factor to, let's say, you know, a billion miles a week:
Is it, like, putting more cars on the road from a capital perspective? Is it just operationally this can only happen so fast? Is it your view of what you want to see from a safety and trust perspective, like a consumer trust perspective? What's the bottleneck?
Primarily, it's the latter.
Okay.
So we've always, you know, our playbook has been to go about it responsibly
and gradually and earn trust every step of the way
and have this transparent dialogue.
Again, this is a very new thing, new technology, new product,
very different from what people are used to.
I think it has to be that sort of process.
And again, you know, trust is the thing that's hard to, you know, earn.
but very easy to lose.
So that's the main thing, right?
And we see that.
You see that in places where, you know, we operate and we've engaged with communities and there are riders who have, you know, used Waymos. There's a lot of trust. People, you know, use the word magic a lot about the experience. And then you go to a different place where people have not experienced it, and there's more anxiety and less trust.
So you can't just get there in one step.
You have to do it kind of responsibly and iteratively.
So that's the main thing.
Yeah, my sense was back when they had elevator operators,
getting rid of the operator was a big deal, right?
Because you used to have somebody in the elevator who would close the door for you and push the buttons and control that experience.
That's interesting to see that evolution of different types of technology over time
or people's interpretation of it.
How do you think about generalizability?
You mentioned you're building a general purpose driver
that could potentially port into other types of vehicles.
Do you think there's other extensions into other forms of robotics with what you're building?
Or do you think those are all more specialized models?
or how do you think about where this could go from that perspective?
On the driving part, we've kind of designed it to be generalizable, and we're very happy with what we're seeing with the fifth-generation driver; like, the AI generalizes really well. And again, we've been using data from a very broad ODD to build it, even if we're deploying, you know, responsibly and gradually, once we believe that we've achieved the level of performance that we require for a certain ODD, which maps to, you know, certain areas and zip codes.
And then to the other part of the question, of, you know,
going beyond autonomous vehicles.
Some of the stuff by the nature of the problem
and the complexity,
I think some of the research that we do is pretty foundational.
When we talk about perception, you know,
you can be in a car, you can be in a different modality
and like operating in the physical world.
A lot of the research that we published,
a lot of work we've done, I think can benefit those communities as well.
I mean, when we talk about, you know, AI being deployed
in a kind of real-time system,
in a safety critical system,
a lot of the work that we have done,
I think, can translate to others.
And so forth: when we talk about the evaluation of the system, many robotics applications beyond autonomous vehicles need a good, realistic, scalable simulator.
That fundamental work translates.
That said, you know, we are very focused
on the trillions of miles where we can have the positive benefits.
So for us, I think focus is very, very important.
So we are being very laser focused on driving.
Can I go back and ask, like, perhaps a more technical question here?
Like a while back, you said, you wanted to, you know, focus on the full autonomy problem.
There were, there are many other teams who actually, you know, have some lineage in the, like, you know, Waymo, chauffeur, Google programs that chose a use case that looks like it was going to be easier.
Trucking, long-haul trucking, deliveries. It's not clear that's much easier.
Do you think there's a lesson to be learned here?
Or at least, you know, there are more miles being driven autonomously on the road
in passenger vehicles by Waymo than in these other applications today.
Yeah.
What lesson is there?
I think that's a great question. You know, kind of the big differentiation that I would draw, and this is like orders of magnitude, is the difference between full autonomy at scale versus, you know, a driver assist system. That's the big one. Then there's the question of different vehicle platforms, different operating domains.
You can have slower speed applications
where you do local deliveries,
or you can have a trucking application on freeways.
And they're a little bit different,
but if we're talking about full autonomy,
maybe there's second order differences,
but the first order complexity is still there.
If you think about the core, the heart of the problem
of building a generalizable and safe driver
and being able to evaluate it,
and the incredibly high bar of safety.
The complexity of the noisy, messy, physical environment
and the long tail of people doing all kinds of weird things,
and the necessity of making real-time decisions
where milliseconds matter and how hard that AI problem is,
the distribution, the contours change a little bit if you're talking about freeways or low speeds,
but the fundamentals are like there's no silver bullets,
you don't get to skip the core complexity.
For example, freeways. In the nominal case, they're a bit more structured, but you still encounter all kinds of things, with lower frequency but at higher speeds, where the severity is higher.
You encounter, you know, construction zones.
You encounter, you know, grills and mattresses and all kinds of stuff falling off of the cars in front of you.
You encounter cars, you know, having getting into accidents and kind of spinning out in front of you.
You encounter, you know, people driving recklessly, you know, in cars or on motorcycles.
You encounter, you know, pedestrians jaywalking.
You encounter, you know, all kinds of things, right?
And it happens much less frequently.
So this is where, you know, it might be unintuitive. If it happens, you know, once per million miles, none of us have seen, you know, examples like that in our own driving. So it can lead to this, you know, early-stage optimism of, like, okay, there's a simplification. But if you want to do it fully autonomously and you want to do it at scale, that complexity is still there. It's just, you know, the flavor changes.
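A quick calculation shows why once-per-million-miles events feel invisible to an individual driver but routine to a large fleet. The 13,000 miles per year for a typical U.S. driver is an outside assumption, not a figure from the conversation.

```python
# Rough arithmetic on the long tail. The 13,000 miles/year figure for an
# average US driver is an outside assumption, not from the conversation.
event_rate = 1 / 1_000_000        # one event per million miles
driver_miles_per_year = 13_000    # assumed typical annual mileage
fleet_miles_per_week = 1_000_000  # "more than a million miles per week"

years_per_event_for_one_driver = 1 / (event_rate * driver_miles_per_year)
events_per_week_for_fleet = event_rate * fleet_miles_per_week

print(round(years_per_event_for_one_driver))  # ~77 years between events
print(events_per_week_for_fleet)              # ~1 event every week
```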
And what is the gap between, you know, let's say advanced driver assistance, which seems to work in more and more scenarios, versus, let's say, full autonomy? What's the delta?
Yeah.
It's the number of nines.
And it's the nature of this problem, right?
If you think about where we started in 2009,
one of our first milestones,
one goal that we set for ourselves was to drive, you know,
10 routes.
Each one was 100 miles long, all over the Bay Area.
You know, freeways, downtown San Francisco,
around Lake Tahoe, you know, everything.
And you had to do 100 miles with no intervention.
So the car had to drive autonomously from beginning to end.
That's the goal that we created for ourselves.
You know, about a dozen of us, and it took us maybe 18 months, which we did. 2009: no ImageNet, no convnets, no transformers, no big models, tiny computers, you know, all these things.
Right?
Very easy to get started.
That's always been the property of this problem. And with every wave of technology, it's been, you know, very easy to get started.
But the hard part: it's kind of like the early part of the curve has been getting even steeper and steeper,
but that's not where the complexity is.
The complexity is in the long tail of the many, many, many nines.
And you don't see that if you go for a prototype,
if you go for a driver assist system.
And this is where we've been spending all of our effort; that's the really hard part of the problem.
And I guess it's been getting easier with every technology cycle. So nowadays, with all of the advances in AI, and especially in the generative AI world with the LLMs and VLMs, you can take kind of an almost off-the-shelf model. You know, transformers are amazing. VLMs are amazing. You can take kind of a VLM that can accept images or video and, you know, has a decoder where you can give it text prompts and a lot of text. And you can fine-tune it, you know, with just a little bit of, you know, data to go from, let's say, camera data on a car to, instead of words, trajectories or, you know, whatever decisions you might want. Just, you know, take the thing as a black box, take whatever's been trained for an LLM, and you fine-tune it a little bit. And I think if you ask any good grad student in computer science to build an AV today, this is what they would do.
Yeah.
And out of the box, you get something that, it's amazing, right? The power of transformers, the power of VLMs is mind-blowing, right? So with just a little bit of effort, you get something on the road, and it works. You can, you know, drive it tens, hundreds of miles, and it will blow your mind. But then, is that enough? Is that enough to remove the driver and drive millions of miles and have a safety record, you know, that is demonstrably better than humans? No, right? And I guess with every, you know, technology evolution and breakthrough in AI, we've seen people underappreciate that.
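A rough PyTorch sketch of the "fine-tune an off-the-shelf VLM to emit trajectories instead of words" recipe described above. The pretrained-VLM interface is a placeholder, and this is a schematic of the general idea, not anyone's production system.

```python
# Schematic: freeze a pretrained VLM backbone (placeholder interface) and
# train a small head that maps its scene embedding to future (x, y) waypoints.
import torch
import torch.nn as nn

class TrajectoryHead(nn.Module):
    def __init__(self, embed_dim, horizon=20):
        super().__init__()
        # Map the pooled scene embedding to `horizon` future (x, y) points.
        self.mlp = nn.Sequential(nn.Linear(embed_dim, 256), nn.ReLU(),
                                 nn.Linear(256, horizon * 2))
        self.horizon = horizon

    def forward(self, scene_embedding):
        return self.mlp(scene_embedding).view(-1, self.horizon, 2)

def finetune(vlm, head, batches, steps=1000, lr=1e-4):
    # vlm.encode(images) -> (batch, embed_dim) is an assumed interface.
    # Backbone stays frozen; only the head is trained ("fine-tune a little bit").
    opt = torch.optim.AdamW(head.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _, (images, expert_trajectory) in zip(range(steps), batches):
        with torch.no_grad():
            emb = vlm.encode(images)
        loss = loss_fn(head(emb), expert_trajectory)  # imitate logged driving
        opt.zero_grad()
        loss.backward()
        opt.step()
    return head
```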
Is it the right way to think of the iteration cycle for Waymo now still like many other
AI companies where in eval, some set of cases comes up that you don't handle as well as you
want, and then you collect more data and you put it into the pipeline, you retrain and you deploy?
Or are there still architectural changes that are happening even past this point of cracking the nut?
Both. So the first thing you mentioned, where it's, you know, the data collection
and, you know, understanding where performance is not where you want it, kind of building the whole, you know, data flywheel and evaluation flywheel, that's at the heart of it. Right. But I think
this is where it gets, you know, a bit nuanced. You know, what do you do? What is the architecture and what is the training methodology? Right. In particular, kind of the simplest thing you can do is, you know, an end-to-end model that is trained just on imitation, you know, of imitating human drivers. So, you know, very easy. Pixels go in, driving behavior that you have examples of comes out, and you just train it to imitate human drivers. And you can run this, you know, flywheel and kind of run the circle that you described while operating, you know, under that paradigm.
Yeah.
And you'll make progress, right?
It's a very well understood kind of approach, right?
You balance your data, you find some examples, you know, where you're not performing as well as you would have liked, given you've figured out, you know, how to evaluate it. You simulate more examples that look like that.
Well, okay, now you're getting simulation, right?
This is exactly how it starts to get interesting, right?
Yeah.
You know, for what you described, you don't need a simulator. You can turn the crank on that whole machinery even without having a simulator; you just do open-loop imitation. You find more examples of, you know, things you want to imitate, and you reduce the number of examples where, you know, humans did something you don't like. So, you know, this is your dataset balancing.
And you will, you know, continue to improve.
You might plateau in the wrong place.
You might plateau in the right place for a driver assist system.
You will plateau at the wrong place for a fully autonomous system.
So then you need to kind of, you know, build that machinery. At a high level, that principle of new data and, you know, augmentation still holds.
But to, you know, really go
the distance to full autonomy, you need to do other things, right?
You need to do synthetic data.
You need to do closed-loop simulation.
Maybe sensor simulation is not enough, because when you do it at scale, it's just highly, you know, inefficient, it's not practical, right? Just, you know, simulating sensor data and piping it through the whole thing. So then you get into, like, intermediate representations. Can you, you know, simulate in that space? And at a very high level, you're still doing, you know, the flywheel. The loop holds, right?
But what's in the loop, I think it depends on...
Exactly, exactly.
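One turn of the data and evaluation flywheel discussed here, sketched at a high level; every component named below is a hypothetical stand-in rather than Waymo's actual pipeline.

```python
# One turn of the mine-hard-cases -> rebalance -> retrain -> re-evaluate loop.
# The evaluator can be open-loop (imitation metrics) or closed-loop (simulator);
# all objects here are hypothetical stand-ins.

def flywheel_iteration(model, train_set, evaluator, miner, trainer):
    # 1. Evaluate the current model.
    report = evaluator.run(model)

    # 2. Mine scenarios where performance falls short of the target,
    #    plus synthetic variations of them (e.g. from simulation).
    hard_cases = miner.find_failures(report)
    augmented = miner.synthesize_variants(hard_cases)

    # 3. Rebalance: upweight the long tail so it isn't drowned out by
    #    millions of uneventful miles.
    new_train_set = train_set.rebalanced_with(hard_cases + augmented)

    # 4. Retrain and hand back the new model for the next turn of the loop.
    return trainer.fit(model, new_train_set), report
```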
So I guess where are we today and where do you think we're going?
Reflecting on kind of this journey that's been quite a few years, I find myself more excited than ever about where we are, the momentum, and the future.
I've been doing this for close to two decades.
The vision was always there, but we had these big existential questions.
Can we build the thing?
Can we figure out what's good enough and how to evaluate it?
Will people want to use it?
Can we do it in a way that's commercially viable, so forth and so on?
And can we go the distance?
Now, where we are today, operating at the scale we are and scaling, we've demonstrated that we can build the thing, we're proud of our safety record, and we've figured out how to evaluate it. We see that people want to use it; we get very positive feedback and people are excited about it. We see that we can do it in a way that's cost-efficient and commercially viable. So I am super excited about what the future holds, and we're starting to talk about actually realizing the mission, you know, realizing those safety benefits. So now it's all about, you know,
optimization, scaling and bringing this technology to more people and more
places. Amazing. Very exciting. Thank you again for joining us today.
Thank you for having me.
And congratulations on, you know, the breakthrough progress over the last small amount of time.
Thank you. Thank you.
Find us on Twitter at NoPriarsPod.
Subscribe to our YouTube channel if you want to see our faces.
Follow the show on Apple Podcasts, Spotify, or wherever you listen.
That way you get a new episode every week.
And sign up for emails or find transcripts for every episode at no-priors.com.