Moonshots with Peter Diamandis - AI Roundtable: What Everyone Missed About Gemini 3 w/ Salim Ismail, Dave Blundin & Alexander Wissner-Gross | EP #209

Episode Date: November 20, 2025

If you want us to build a MOONSHOT Summit, email my team: moonshots@diamandis.com
Get access to metatrends 10+ years before anyone else: https://qr.diamandis.com/metatrends
Salim Ismail is the founder of OpenExO. Dave Blundin is the founder & GP of Link Ventures. Dr. Alexander Wissner-Gross is a computer scientist and founder of Reified.
My companies: Apply to Dave's and my new fund: https://qr.diamandis.com/linkventureslanding
Go to Blitzy to book a free demo and start building today: https://qr.diamandis.com/blitzy
Grab dinner with MOONSHOT listeners: https://moonshots.dnnr.io/
Connect with Peter: X, Instagram. Connect with Dave: X, LinkedIn. Connect with Salim: X. Join Salim's Workshop to build your ExO. Connect with Alex: Website, LinkedIn, X, Email.
Listen to MOONSHOTS: Apple, YouTube
*Recorded on November 19, 2025
*The views expressed by me and all guests are personal opinions and do not constitute Financial, Medical, or Legal advice. Learn more about your ad choices. Visit megaphone.fm/adchoices

Transcript
Starting point is 00:00:00 People who are already in the ecosystem now have a superintelligence at their beck and call. That's probably the least interesting thing. When they're on the cusp of the singularity, they'll start soft-selling it. Gemini 3.0, which in just one day has climbed all the third-party AI rankings. Let's break down, though, what this so-called Gemini leap means. This will change the game completely for everything everywhere. Why is this just not another, you know, little-faster, little-better capability? We have a way of measuring progress in our
Starting point is 00:00:30 civilization. AI is eminently, I think, well positioned now that these benchmarks are saturating to start solving the hardest problems on Earth in math, science, engineering, medicine. All of a sudden, you can build software by talking to the machine. This is like a different world starting today from the day that we lived in yesterday. Now that's the moonshot, ladies and gentlemen. You know, the hardest thing for me when I'm going over the slides is what to cut out. I mean, it's all so good. All right. Every one of them could be like an entire hour conversation. The question of how we group it and how we actually make it such that it's a fun conversation is so challenging. I mean,
Starting point is 00:01:20 so much going on. We all just want to have an episode on robotics, an episode on energy, an episode on AI. Yeah, but then if we do that, you know, we're publishing more than one a week, which is a lot, and sometimes we do. And then if you've gone like three weeks without covering one of the fields, it's like, you know, disruptive shock therapy. The world is over. Well, it'll accelerate. So people, the audience has a limited amount of time, too. So we've got to try and help them as much as possible twice a week, basically. And that's all you can do. I mean, I hope you guys have as much fun as I do on this. Oh, yes. It's awesome scanning
Starting point is 00:02:03 And then trying to figure out, okay, what does this really mean? Okay, besides yet another benchmark or besides yet another, you know, this number is greater than that number. Okay, so like what does it mean for everybody? Well, for me, you know, I get so buried in the day to day, you know, there's just so much going on. And if it weren't for the podcast pulling me out of the weeds, I would miss all kinds of things.
Starting point is 00:02:26 And I tell you, I get really frustrated when people don't know what's going on and they're not reacting to it. I'm like, well, the only reason I know what's going on is because we do the podcast. And that prep time for it is what pulls me up out of the weeds, so I love this time. For sure. And I told my son, hey, you know, Gemini 3 is out. It's got amazing benchmarks. And he goes, yeah, insert name of model here, insert number here. Like every week, you tell me that. It's like, yeah, you're right. Yeah. Well, we're the antidote for that, because, you know, as we're always saying, people get inured to things so quickly
Starting point is 00:03:00 and they miss the implications. And that's true even at MIT where I've been for the last three days. But it's just not true. This is like step function, life-changing stuff. Week by week. I think we should just jump in
Starting point is 00:03:14 because, like, there's a lot. If you guys are ready. All right, so I'm here with DB2, AWG, Mr. EXO. It's our new call signs. Let's jump. We're all airports.
Starting point is 00:03:28 They're all three-letter airport signifiers. Okay, let's get going here. So welcome to Moonshots, everybody. This is another episode of WTF Just Happened in Tech. The real news, and for us, like the only news, and the implications, and what does it mean? And hopefully we're going to go deeper into what it means for you, your family, your business, your company, your country, all of those things. We're going to open up with the hyperscalers: Google, xAI, OpenAI. And the TLDR for this episode is Google is winning.
Starting point is 00:04:01 A lot going on at Google. First, we just saw the release of Gemini 3 yesterday, which is why we're recording today, trying to be right here, right now. All right, let's jump in. I'm going to share a video from Josh Woodward. Josh is a friend. I had him on the Abundance stage a year ago.
Starting point is 00:04:19 He now heads Gemini and Google Labs, and he's a brilliant presenter. We're going to have him on this podcast right in the new year. Excited for that. All right, let's jump in. Hey, everyone. My name's Josh, and I lead the Gemini app, Google Labs, and AI Studio. And today is the day. Gemini 3 is here, and it's in the app. You can try it right now. It's our smartest model ever. We have this new feature called Agent. And you can actually go in now to Gemini, describe a task, and it'll get to work for you. So you can plan a trip, you can research products, all these things. It acts on your
Starting point is 00:04:53 behalf, takes multi-step actions, tool calls, all of it. The other thing I'm really excited about: we're entering into a new era where you can create UI dynamically. The model creates these generative UIs. So you can go in, and when you ask a question, Gemini will not just respond with a wall of text. It'll actually pull in images, different interactive widgets, gives you a much more customized experience based on what you're looking for. All of this gives you a more helpful response. And so I hope you go out, try both of those features and more today. We look forward to your feedback. All right, one more video here from Gemini, then we'll discuss it.
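For readers curious what "multi-step actions, tool calls" means mechanically, here is a toy sketch of the loop an agent runs: a model proposes a tool call, a runtime executes it, and the result feeds back into context. The tool names and the stand-in "model" below are invented for illustration; this is not Gemini's actual API.

```python
# Toy sketch of an agentic tool-call loop. Everything here is hypothetical:
# a real system would have an LLM in place of fake_model and real tools.

def fake_model(task, history):
    """Stand-in for the LLM: returns the next tool call, or None when done."""
    if not history:
        return ("search_flights", {"dest": "Tokyo"})
    if len(history) == 1:
        return ("book_hotel", {"city": "Tokyo", "nights": 3})
    return None  # task complete

# Registry of tools the runtime is allowed to execute on the model's behalf.
TOOLS = {
    "search_flights": lambda dest: f"3 flights found to {dest}",
    "book_hotel": lambda city, nights: f"hotel held in {city} for {nights} nights",
}

def run_agent(task):
    history = []
    while True:
        call = fake_model(task, history)
        if call is None:
            return history
        name, args = call
        result = TOOLS[name](**args)    # runtime executes the tool call
        history.append((name, result))  # result feeds back into context

steps = run_agent("plan a trip to Tokyo")
for name, result in steps:
    print(name, "->", result)
```

The key design point is that the model never touches the outside world directly; it only emits structured requests, and the runtime decides what actually executes.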
Starting point is 00:05:28 This is their official introducing Gemini video. And again, congratulations to Josh for taking the lead there and crushing it, crushing it. We'll talk about the benchmarks with, of course, AWG in a little bit, but before then. Gemini 3 is the strongest model in the world for multimodality and reasoning. It's our most intelligent model that helps you bring any idea to life. In Google Search, Gemini 3 enables new kinds of generative user interfaces. It codes interactive simulations like this one, custom-built for your search.
Starting point is 00:06:02 In the Gemini app, you can supercharge how you learn, create, plan, take action, analyze complex videos, and more. We're even introducing a new platform, Google Antigravity, our vision of software development at the frontier of model intelligence. It lets you use Gemini 3's agentic coding capabilities to accelerate how you build. This is just the beginning of our Gemini 3 series. Okay. Who wants to dive in first? Dave, you want to jump in? What's this mean to you? Why is this just not another, you know, little-faster, little-better capability? Dave is in full kid-in-the-candy-store mode here. This is great. Well, I can't wait to hear
Starting point is 00:06:48 Alex's take on this too. It's almost 50% on Humanity's Last Exam. It's such a step-function change in history. And I was over at MIT last night talking to a bunch of undergrads, and I'm trying to tell them, like, look, you don't know this, but, you know, 40 years ago, we started writing code as a species. And we started with COBOL and, you know, PL/I, and APL, and then... We started with ones and zeros and hexadecimal, that's where we started. That's true. We started with assembly. And I swear to God, if you look at what happens today when you write code versus 40 years ago, it's identical. It's like a higher-level language. Nothing's really changed. All of a sudden, you can build software by talking to the machine. It is such a
Starting point is 00:07:29 different world starting today and moving forward. And I'm hoping they can then generalize and say, well, it's coding today. It's gene sequencing tomorrow. It's all white-collar automation the day after that. Then all industrial design of robotics is done by voice. This is like a different world starting today from the day that we lived in yesterday. And it's really hard
Starting point is 00:07:56 to get people to fully understand the implications. So it's just such, well, anyway, we'll get into it. All right. I can't tell you how big this is. Alex, what's your takeaway, buddy? I've said in the past here, I think the singularity is probably an optical illusion
Starting point is 00:08:12 when you're in the midst of it, spacetime feels flat. And every time I hear the question, well, what else is new, the benchmarks going up and to the right don't feel really transformative, that to me is a sign that when you're in the midst of a singularity, spacetime feels flat and breakthroughs that are happening essentially every week or every day feel prosaic. There are so many transformative aspects of Gemini 3. Just walking through those two videos, starting from maybe the least transformative aspects: the Gemini app itself, which is how many people are likely to first encounter it. Gemini 3 is now integrated with all of the other Google properties. So there's been a lot
Starting point is 00:08:53 of bellyaching over the past year. Like, why can't I agentically have Gemini write my Gmail for me, or have it organize my calendar for me, or interact with YouTube movies? I've been playing with Gemini Agent, the agent-mode part of Gemini 3. And that's seamless at this point. And it's literally a single click to get Gemini 3 to order your entire Google-platform-based existence, or Google-Workspace-based existence. That's probably the least interesting thing. But a powerful driver for people to switch to Google as an all-in platform, right? I mean, isn't that really the situation they're striving for? Google has billions of users across all of its products already.
Starting point is 00:09:36 So I'm not sure at the margin the greatest impact on humanity is getting people to switch to Google. I think it's more that people who are already in the ecosystem now have a superintelligence at their beck and call. And again, that's like the least interesting thing. A couple of more interesting things in interacting with the model itself. And again, this is not focusing yet on the benchmarks. This is just on interacting with the client. It smells. People in the community sometimes refer to something as big model smell: a model that has certain types of capabilities that can't be arrived at through extended reasoning or through other sorts of smaller-
Starting point is 00:10:11 footprint attempts to extend the capabilities of a model. Gemini 3 has what I think can be fairly termed big model smell. You can ask it to do cross-modal or multimodal tasks that are very challenging to do elsewhere. One of my first tasks was I fed it a photo of the MIT campus, and I asked it, generate a 3D voxel block-world-type rendering that I can interact with, and one-shot, basically zero-shot, it produced an interactive 3D rendering of the MIT campus. There's also, I don't want to let this point drop, Antigravity, the code development environment, the integrated development environment that was focused on Gemini 3.
Starting point is 00:10:57 My understanding is that the Windsurf team, we've talked about Windsurf in the past, the Cursor competitor, many of the core members of the team joined Google DeepMind, and Antigravity is a result. I was interacting with Antigravity. It was a very impressive Visual Studio Code-derived experience for code development. So there are so many pieces here. And that's before we get to the truly interesting stuff in my mind, which is the benchmarks. Yeah, yeah. You know, one of the things we said a while ago is when they're on the cusp of the singularity, they'll start soft-selling it.
Starting point is 00:11:32 And you noticed, you know, Google put out all these benchmarks that are mind-blowing, and the only thing they put out in terms of content is that Josh Woodward clip. Contrast that to, you know, the GPT-5 release, right? Which was a special hour-long presentation by Sam and so forth. This was, like you said, a very soft sell. You know, one thing I found fascinating is the speed at which we're sort of up-leveling the models, right? Gemini 2 was December of last year, 11 months ago, and now we've got Gemini 3 coming out.
Starting point is 00:12:09 So increasing speed at which we're deploying; we're seeing that across the board with the hyperscalers. Maybe just to comment narrowly on that, from my perspective, Gemini 3 is the biggest model release since OpenAI's o3 in April, all of seven-ish months ago. GPT-5, to the extent GPT-5 may have felt slightly underwhelming, I would argue it's because almost all of its raw capability jumps actually happened a bit before, in the form of o3. And then maybe think of o3 as actually o2, because o2 was trademarked, so it had to be called o3. GPT-5 was actually like o2.1. So I think we can't take credit away from OpenAI on the achievement that was o3 and then partially repackaged as GPT-5.
Starting point is 00:12:54 Every week, my team and I study the top 10 technology metatrends that will transform industries over the decade ahead. I cover trends ranging from humanoid robotics, AGI, and quantum computing to transport, energy, longevity, and more. There's no fluff. Only the most important stuff that matters, that impacts our lives, our companies, and our careers. If you want me to share these metatrends with you, I write a newsletter twice a week, sending it out as a short two-minute read via email. And if you want to discover the most important metatrends 10 years before anyone else, this report is for you. Readers include founders and CEOs from the world's most disruptive
Starting point is 00:13:29 companies and entrepreneurs building the world's most disruptive tech. It's not for you if you don't want to be informed about what's coming, why it matters, and how you can benefit from it. To subscribe for free, go to diamandis.com/metatrends to gain access to the trends 10 years before anyone else. All right, now back to this episode. This for me is seeing Google go from reactive assistant, right, where you're asking it for something, to autonomous agent handling, you know, complex real-world data. And we're going to see that in the next slide. Let's go there. So let's go to Gemini 3 delivers breakthrough profitability in AI-run mini-economy. This is the Vending-Bench benchmark, which I love.
Starting point is 00:14:17 And Gemini 3 outperforms Grok, Claude, and ChatGPT in long-term business management tasks. To explain to us what this means, the king of benchmarks: Alex, let's go to you. I love benchmarks. I love this benchmark in particular. So this is a benchmark, Vending-Bench Arena, that's maintained by a company named Andon Labs. It's derivative of another benchmark that they maintain, Vending-Bench 2. The basic premise is AI agents are given simulated $500 to start. They're put in charge of a simulated vending machine.
Starting point is 00:14:54 They're given tools that they can use. So they have the ability to send and read emails, like real, full, natural-language emails. They're given the ability to search a simulated internet. They have a simulated bank balance. They can send money. They can receive money. They can stock and restock the vending machine. They can set prices, check inventory, collect cash, et cetera.
Starting point is 00:15:15 So this really is performing the role almost of a middle manager in charge of a vending machine. And if the simulated agents maintaining the vending machine fail to pay a $2 daily fee for 10 consecutive days, they go bankrupt. And the goal of the game is to maximize the return on investment for that initial simulated $500. And I think this is just such a lovely self-contained proxy for AI agents as first-class economic actors. If AIs can do a spectacular job of managing this pretty rich simulated vending machine world, then I think they're halfway to autonomously running their own real-world businesses and becoming AI entrepreneurs, at which point we get zero-human startups. Well, it's amazing.
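The rules Alex describes are simple enough to sketch in a few lines. Here is a minimal toy version of the economy, using only the numbers given in the discussion ($500 starting cash, $2 daily fee, bankruptcy after 10 consecutive missed payments, scored on ROI); the real benchmark's accounting and tooling are of course far richer.

```python
# Toy sketch of the Vending-Bench rules as described in the discussion.
# Only the three numbers below come from the episode; the rest is a
# deliberate simplification for illustration.

START_CASH = 500.0
DAILY_FEE = 2.0
BANKRUPT_AFTER = 10  # consecutive missed fee payments

def simulate(daily_profits):
    """Run the economy day by day; return (final_cash, went_bankrupt)."""
    cash, missed = START_CASH, 0
    for profit in daily_profits:
        cash += profit  # net of restocking costs, sales revenue, etc.
        if cash >= DAILY_FEE:
            cash -= DAILY_FEE
            missed = 0  # paying the fee resets the bankruptcy counter
        else:
            missed += 1
            if missed >= BANKRUPT_AFTER:
                return cash, True
    return cash, False

def roi(final_cash):
    """The benchmark's objective: return on the initial simulated $500."""
    return (final_cash - START_CASH) / START_CASH

# A steady $10/day vending profit over 30 days nets $8/day after fees:
final, bankrupt = simulate([10.0] * 30)
print(f"final=${final:.2f} bankrupt={bankrupt} ROI={roi(final):.1%}")
```

Even this toy version makes the game's tension visible: an agent that over-restocks and dips below the fee threshold for ten straight days is eliminated regardless of inventory value.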
Starting point is 00:16:03 We talked about this. Gemini 3 is delivering almost 3,000% more profit than GPT-5 or Claude Sonnet. And you're right. We've talked about stablecoins and agents together spinning up new businesses faster than you possibly can. Now, the one thing this doesn't do is it doesn't account for the messiness of employees, and this would have to be a non-human business that it's running in order for it to really maximize profitability without dealing with that. Yeah, go ahead. I would actually point to the email functionality built into the benchmark.
Starting point is 00:16:43 So when it sends and receives emails, there's a large language model counterparty at the other end writing full natural-language emails. So I could imagine a generalization, maybe a future version three or four of Vending-Bench, that does take into account, say, like, performance reviews and interacting with employees. All of that, I think, is not technically that much more difficult. Drug testing. If you can manage vendors and suppliers, then email communication with employees is not that much harder. Interesting. Dave. Well, the internet advertising business is $300 billion a year, completely non-human. The whole thing is automated
Starting point is 00:17:20 bidding, automated placement. I'd be surprised if the non-human economy is anything less than a trillion dollars already. So the parts of the economy where you can just deploy this are going to grow very rapidly now, which I think... But did you notice how Alex has a lot more emotion in his voice right as the AI is getting more sophisticated?
Starting point is 00:17:40 So is that improvements in the algorithm or is that just enthusiasm? His true identity is being revealed. I think he's proud of it. When personhood is granted and I get to be a real person, a real boy, as it were, then I get to run my own business too, I guess.
Starting point is 00:17:55 Well, on this topic, I completely agree. Like, we need many, many, many more benchmarks, and the more real and practical they are and the less technical they are, the more it opens up people's eyes to what's possible. And I think we desperately need more benchmarks in the medical area. And Peter, you're the top guy on the planet in this. But we're getting so close to being able to cure, first extend people's health span, delay cancer, delay heart disease, and then cure it. And if we do that quickly, I think we can save 30 million lives. You know, there's 10 million a year. And this is very, very important to me personally, just because of some friends that I have in this situation.
Starting point is 00:18:36 And I swear to God, this step-function improvement today puts that right in front of us. And I think it's almost criminal for people not to remap their... Dave, imagine this. In the future, instead of AI agents managing vending machines, you're going to be part of a population and the agent's going to manage you. It's like, go outside, take a walk right now, drink another glass of water, right? Go take these pills. That's the promise of the Jarvis thing you keep talking about here.
Starting point is 00:19:04 It's coming, buddy. So, Salim, I know you got a pesky, you know, leaf blower outside. I tell you, I keep on saying to Elon, would you please make electric leaf blowers? Just make them quieter. I think Nat Friedman has a $100,000 prize for anyone who can create a silent electric leaf-collecting machine. Oh, crazy, right? And we're going to elevate it to an XPRIZE and put $10 million behind it.
Starting point is 00:19:29 Let's do it. That's a great idea. The noise pollution. You were going to say. I've got a couple of thoughts. One is the entire stack of society can now be AI mediated, right? Which is kind of an incredible thing to be able to say. And the second part of this is there's a really important point that Alex made, which is you can now build a company with literally zero employees.
Starting point is 00:19:50 We were talking about three employees a few weeks, months ago, Peter, and a year ago, right? Now it's down to zero, and this is going to change the game. And absolutely will happen, as Dave says, there's already a trillion dollar or so economy out there. And this is going to get automated very quickly. All right. So keep your eyes on this. I mean, it is, you know, as an entrepreneur, I think about this. When can I start spinning up companies?
Starting point is 00:20:14 Can I give, you know, $10,000 in stablecoins to my AI agents and say, go make me some more money? And now the question is, is that available for everybody? Can anyone and everyone, you know, spin up an agent that is going out there and generating revenue for them? Because if it isn't, then we're beginning to have a, you know, a widening wealth gap. All right, let's go to our next story here. And this is a story about a one-shot cyberpunk first-person shooter that I think you made, Alex. That's right. I see the comments sometimes. I remarked in the past that one of my favorite evals for a fresh model is to ask it to generate a cyberpunk first-person shooter, and some folks in the past have
Starting point is 00:21:07 suggested it's nonsense. So I thought it might be instructive, given the strength of Gemini 3, to ask it to one-shot the generation of a cyberpunk first-person shooter. The prompt that I gave it, the only prompt, was: create a visually stunning cyberpunk FPS that I can play. It should have nice music and rich visuals. All right. Let's play the video. If you're watching on YouTube, enjoy this. If not, go to YouTube. So, Neon Protocol. I do like the music. Actually, I immediately copied Alex's prompt and extended it. And my music came out absolutely nauseating.
Starting point is 00:21:48 I said make it even faster action and give it a deeper pumping bass, and my version was just nauseating beyond the... Okay. So, I mean, listen, I keep on telling my kids, instead of playing video games, at least design them and build them.
Starting point is 00:22:04 And so, this is just making it so much easier. And the prompt is short. Everybody listening, you can do this. This is not something you need special access for. You can do exactly what Alex did in less than five minutes. So go ahead and try it and then modify it.
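To make "you can do exactly what Alex did" concrete, here is the quoted prompt as a ready-to-send string, plus a hedged sketch of sending it through Google's Python SDK. The SDK call is commented out because it needs an API key, and the model name shown is a guess for illustration, not something confirmed in the episode.

```python
# The exact prompt quoted above, ready to paste into the Gemini app or send
# programmatically. Everything after the prompt is an assumption-laden sketch.

prompt = ("create a visually stunning cyberpunk FPS that I can play. "
          "It should have nice music and rich visuals.")

# It really is a short prompt - it fits in an old 140-character tweet:
print(len(prompt), "characters")

# Hedged sketch of the API call (uncomment with a valid key; the model
# name below is a placeholder guess, not a confirmed identifier):
# import google.generativeai as genai
# genai.configure(api_key="YOUR_KEY")
# model = genai.GenerativeModel("gemini-3-pro-preview")
# html_game = model.generate_content(prompt).text  # save as .html and open
```

The point the hosts are making holds either way: the entire specification is two sentences, and iterating (faster action, different music) is just appending another sentence.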
Starting point is 00:22:19 It's super fun. Also, you know, Google has a limited amount of compute, and everybody can do this for free, but after you hammer it for a few hours, it'll throttle you. So take advantage of your first few free hours and have some serious fun and learn a lot. I was with Jack Hidary at FII, and one of the conversations I had with Jack, and I respect this very much: he says, instead of waking up in the morning and consuming, like just, you know, scrolling through everything, get up in the morning and create something, build something.
Starting point is 00:22:48 So, and you can. Go on, Alex. And to that point, it's never been easier. That was probably 140 characters or fewer. If you can post on X or post a short social media message, you can create a game on demand, which means that I think we should expect to see billions of games created in the next year, because it's now so easy. It's the most competent one-shotting of
Starting point is 00:23:11 gaming slop I've ever seen. Just to echo the conversation from last week with 140 characters and flying cars, it'll be amazing when the inner loop gets to a point where you can just use 140 characters to say, build me a flying car. Correct. Yeah, it goes and does it. You can do that right now. You can, with 140 characters, create a simulated flying car with Gemini 3.
Starting point is 00:23:30 Yeah, you know, there are 6 million people in America whose full-time job is influencer. And that was enabled by the camera phone. Prior to that, you needed a production crew and heavy cameras. Like, you couldn't be an influencer. All of a sudden, because there's a 4K camera on every iPhone and there's great editing, 6 million people shifted to influencer as a career. This is at least as big a shift. You know, if you say video games are generic right now, let me make something custom to my community, custom to people.
Starting point is 00:24:02 You can actually create it, even if you couldn't code yesterday, today you can create something just using your thoughts and your voice. And so it opens up career opportunities. Let's take a listen to this. The next article here is Gemini Live, a more natural voice. Is there any fish on this menu? Yes, there's a sea bass. Yeah, I love sea bass. Can you help me order that in Spanish?
Starting point is 00:24:24 Of course. Try: Me gustaría la lubina, por favor. How's this? Me gustaría la lubina, por favor. That sounds great. Yeah, so, you know, I think they made a nice move forward here. I used to love my GPT-5 voice.
Starting point is 00:24:42 I use Ember when I'm talking to it. And Gemini felt stilted and not natural. So they really did a great job moving us forward. So super excited about that. You know, interesting on the translation side: we talked in one of the previous pods about Duolingo being disrupted. Well, it's down almost 50% over the last year. So a lot of challenges there; they're going to have to reinvent their business model, which I'm sure they will. Dave, what are your thoughts on this?
Starting point is 00:25:13 I'd like you to remember what Peter just said for later in the pod because I had the exact same experience where the OpenAI version of the voice was much more engaging. I can talk to it while I'm driving. It's great. And then the Google version was stilted and robotic and just no fun. So now Google has leapfrogged and it's actually better. But they did it under competitive pressure from OpenAI. And I think you're going to see that theme throughout everything that we see on this pod.
Starting point is 00:25:40 that OpenAI hopefully will catch up and leapfrog again, but that's the only reason Google moves, is because of that pressure. Otherwise, things just stall. I mean, Dave, we had that conversation, and you noted it in our chat. You know, a lot of the AI capability, a large amount of the large language models, were developed in Google, but until OpenAI released them onto the open web,
Starting point is 00:26:07 Google was holding back. It was the responsible thing to do: don't allow it to code itself, don't put it on the open web. That was the basic thesis of the last decade. And when OpenAI moved, Google had no other option but to move as well. It's just big-company shit, you know, and I get it, because I've run companies with hundreds or thousands of employees. It's hard to make your company move, but then you get competitive pressure from a little nimble company. And it's much easier as a CEO to say, guys, get your asses in gear.
Starting point is 00:26:29 It's hard to make your company move, but then you get competitive pressure from a little nimble company. And it's much easier as a CEO to say, guys, get your asses in gear. there's a threat here. And it's kind of the dynamic that makes America and the global economy move forward at all. But all of this technology, like you said, Peter, was invented originally. The transformer algorithm was invented inside Google, and it was just sitting there. Like literally not coming out the door at all.
Starting point is 00:26:57 And we could go through all the reasons. We've talked about them before. Sorry, Alex, you were going to say? I would perhaps go even further and argue that many of these underlying capabilities are not just available, but they're available in the underlying data distribution that these models are being trained from and that exposing, for example, different accents is probably more of an unhobbling, as they would say,
Starting point is 00:27:18 than anything else. It's not so much that capabilities are being added as that restrictions are being removed. And frontier models in particular, when we see live-audio-type engagement, are moving from what they've been in the recent past, which is audio to text to text to audio, to directly audio to audio, which enables much, much richer audio interactions, including accents.
Starting point is 00:27:41 Yeah, I mean, where we're going here with the next generation of AR glasses everyone's developing, basically plugging into your auditory and visual input, is simultaneous translation. It is going to change how we communicate with people around the world in an extraordinary fashion. This was a fun one. Again, continuing on the Google theme, the TLDR: they really have won hands down. I know, Dave, you and I are looking at the prediction markets, and Google has literally
Starting point is 00:28:15 skyrocketed to be the winner by the end of the year, and I think they've got that mantle. Google AI helps users shop, compare, and call stores for the holidays. So new agentic features can call your nearby stores and check stock and pricing, and the Gemini apps add built-in shopping tools. I mean, this is like, hey, call 20 stores within 10 miles of me, find out who's got the cheapest price, and put it on hold, or better yet, purchase it for me and have it delivered tomorrow. Holy cow, a lot to unpack there. So I had to check on this one, Peter. It was all of seven years ago that Google launched Duplex, their AI store-calling
Starting point is 00:28:58 functionality at I/O. Seven years ago, 2018, the year after Attention Is All You Need. It's taken seven years for this to make it into some fully realized format. But I think this is finally the beginning of AI starting to autonomously index the physical world. If you can have AI call stores autonomously, you can send AI-powered robots out into the physical world to index everything that's going on as well. I'm curious what the consumer behavior is going to be like, right? Actually, what I'm really interested in is what it's like on the other end: when you're in the store, you're getting all of these calls inbound.
Starting point is 00:29:42 And, like, at what point are more than 50% of the calls AI calls? And, I mean, you have AI answer the AI calls, obviously. Sure. I mean, is it going to be that you have to identify yourself as an AI, probably? That is what Duplex has historically done. It announces itself as an AI assistant. Yeah, actually, it's going to be state by state, but so far the AIs are not announcing themselves. And about half, we do a lot of this inside our lab here. So about half the time people are like,
Starting point is 00:30:05 am I talking to an AI? And the other half they have no idea. And so do you have to answer it? If it asks, if you ask. You don't have to in most states you don't have to. But, you know, again, regulatory consideration is moving so slowly. It's just completely ambiguous. But as of right now, you don't have to.
Starting point is 00:30:22 But it doesn't hurt to say, yeah, I'm an AI. Or even to clear it up front. It's not hurting the call performance rates at all. So you might as well just say, hey, I'm an AI, but I'm so much more helpful than the guy you were going to talk to. My new business idea, then, is a little button on your phone. When an AI calls you, you flip it over to your AI. Because when I'm calling a store, I want to speak to a human.
Starting point is 00:30:42 But, you know, the human at the store, what do you think about that product? You know, how good is it? Are people returning it? And that interaction is, you know, a pro-human-to-human interaction, but I'm not going to have that tolerance within AI. Wait, I want a challenge, you're up here. All right. If you call a store, why do you want to talk to a human? An AI is going to know way more about the inventory, the situation, than the human way.
Starting point is 00:31:07 Yeah, exactly right, Salim. And not just that, the AI can pull up images in real time and show you the product and spin it around and stuff. So it's nothing like talking to a human in a store. It's actually far, far more engaging. I'll tell you what else, the voice run guys here in the lab are doing open table, doing, you know, restaurant bookings and stuff. And you wouldn't believe the fraction of restaurant bookings that are, non-English speaking person or, you know, or going the other way if you're traveling internationally. It's a lifesaver to be able to talk in a different language and do your full
Starting point is 00:31:40 booking and then the AI just translates it. Fascinating. I do think this is how we get to APIs for everything. There's now the need for an escape valve for surfaces for business interactions that don't support APIs with an AI that can make voice calls and have arbitrary on structure and direction. We get APIs for everything. Yeah, we do. Okay, one more article on the Gemini front, Gemini three benchmarks. We should probably skip this.
Starting point is 00:32:08 I don't think anybody's interested in it, but okay. Oh, you're, you can. You're a teasing. Good one, right? Alex has just sent a drone to your house there. Watch your roof. To take me out. I'm going to send my duplex AI to give you a phone call.
Starting point is 00:32:25 All right, Alex. Clue us in here. Gemini three benchmarks, how good are they? And at the end of the day, what do they really mean? I mean, just to represent people watching this listening and watching our moonshots program, okay, Alex, I hear you talking about benchmarks every time, right? We're going to talk about some more benchmarks in a little bit, but what does it really mean? What does it mean to me? Sure. So I guess there's the headline. The numbers are going up into the right. So who cares? who cares is we have a way of measuring progress in our civilization.
Starting point is 00:33:00 And this is a precious moment when with raw numbers day by day at this point, we can track progress towards solving some of the hardest problems that our civilization faces. Humanity's last exam, say what you like about it, some like it less. But it's an attempt, as are all of these benchmarks, to encapsulate in a measurable quantitative way, progress by AI towards solving hard problems. In humanity's last exam's case, it's an attempt to measure the ability for AI to solve PhD-level problems. In the case of ARC-AGI2, it's an attempt to model human-level ability to visually reason. The so-what is these benchmarks are all saturating, which means that AI is, at this point, has the ability to perform PhD-level
Starting point is 00:33:48 research. When we think about the so-what for the so-called average person, it's going to be that AI is imminently, I think, well-positioned now that these benchmarks are saturating to start solving the hardest problems on earth in math, science, engineering, medicine. That's the so-what. We spoke about that last episode, last pod with Sam Altman speaking about, you know, science breakthroughs coming on GPT-6. That's his expectation. And here, the numbers are impressive, right? If we're looking at GPT 5.1, Gemini 3 is basically doubling the ARC-AGI2 benchmark. It is effectively doubling Claude 4.5 on humanity's last exam. I mean, these are not incremental moves.
Starting point is 00:34:36 There are significant step-ups. And critically, it's not benchmarking that we're seeing. There are some labs that have been accused of just optimizing their AIs to do well at one or two of the benchmarks. And then when you ask them something out of distribution, they fall over. That doesn't appear to be the case here. It feels like the team behind Gemini III really did a professional job, not over-optimizing towards narrow, spiky intelligence on any of these benchmarks to do well in a press release. This feels like a well-rounded generalist AI model.
Starting point is 00:35:07 And given the trajectory towards saturating these benchmarks, I'd be very surprised if by the end of, say, next year, we're not seeing hard research problems succumb to AI models like this one. Do you remember two podcasts ago, Alex, we had that paper that came out on how to measure AGI, like defining it in terms of, I don't know, it was 10 or 12 different quadrants. I wonder how Gemini 3 does on that. I'm sure we'll know soon enough, but I would expect it to do generically well on the spikes where models historically were doing well. As I recall, one of the spikes where one of those dimensions where models historically did poorly was on continuous learning with ultra-large context. Off the cuff, I wouldn't expect Gemini 3 Pro to do amazingly better on ultra-long context, but it does really well on retrieval scores. I don't think it's shown in this slide, but there are other needle-in-a-hastak-type benchmarks that attempt to measure how well models are able to retrieve tiny facts of information buried in their context window.
Starting point is 00:36:09 Gemini 3 does amazing, or 3 Pro does amazingly well at retrieval as well. So I think almost everything is going up into the right at this point. Yeah. There was one observation I had, and I wanted to check with you guys what you think of this. When you have coherence at this scale, it implies we have systems level thinking inside these models. Is that accurate? Could you say a little bit more, Salima, about what that means? Well, because you've got essentially, you know, systemic thinking is one of the holy grails of deep, deep,
Starting point is 00:36:39 reasoning, right? Because you can look at the entire patterns of things shifting, and it looks like, feels to me like we're at this level of AI competency, you can get to that kind of systems level thinking. That means you can do world modeling at a really powerful way using almost, you shift the whole thing to symbolic reasoning almost when you can think in those concepts. So don't we get to that level very quickly now? I have so many thoughts, but the first thought that immediately jumps out at me is, of course these are world models. And of course, they're able to symbolically reason. They're solving math problems and they're writing source code. I would argue that in past, you've seen some commentators
Starting point is 00:37:18 argue that there's some sort of nebulous, neuro-symbolic type advancement that's waiting to drop. I think that's utter nonsense. Of course, they're able to reason symbolically. The tokens are in some discrete space. And of course, there are systems-level thinkers. They're able to solve PhD-level problems across dozens of disciplines that requires understanding. the world as a system. So, yes. I agree with that, and it turns into a philosophical debate, and nothing great usually comes out of it.
Starting point is 00:37:47 But I will say that this is a $7 trillion parameter class model. And last year, all the naysayers were saying, well, there's evidence that things will slow down. Because last year we were at a trillion parameters, and they were clearly wrong. You know, when you went from 1 to 7, we know next year is at least a 10x and up to a 40x step up in raw horsepower.
Starting point is 00:38:08 And the naysayers are saying, well, things are going to level off unless we crack through some other level of system two level thinking. But they're clearly not leveling off. And I would challenge the technical audience out there looking at these benchmarks to, you're almost obligated to think about two things if you're in all inclined. One of them is where on these benchmarks does it become self-improving? Read all of Ray Kurzweil and really have an opinion on that. because that's tied heavily to benchmarks 1, 4, and 6 on this slide
Starting point is 00:38:39 and to just have an opinion. I have my opinion, but have an opinion about where you need to be on 1, 4, and 6 in order for this thing to improve its own algorithm. That's a critical point. And then the other one is, where do you need to be on the benchmarks to start proposing cures to diseases and being right? And if you work anywhere in health tech and you have no opinion on that topic,
Starting point is 00:39:02 you're doing a disservice that's bordering on, in my opinion, bordering on negligent homicide because this can save lives if you work on it, if you apply it to whatever you're doing in health tech. And you're obligated to get your head out of the sand, look at this podcast, study the numbers, and at least have an opinion. And even if that opinion is, no, it's not going to work, that's fine. I'm okay with that. But to say, I don't know or I didn't listen to the pod, that is absolute negligence.
Starting point is 00:39:29 Can I ask a question to you, Dave, and Alex. You know, Jan Lacoon comes out saying we have gone down the LLM rabbit hole, and that's the wrong direction. We're optimizing on that. We need to go through a different evolutionary tree to really get to AGI. What are your thoughts? All the old people say that, and all the young people don't. And that tells you something out of the gate, you know, you're sorting yourself into an age bucket just by just by. saying it. There's definitely a philosophical divide in there. But the question I would ask,
Starting point is 00:40:05 is there another innovation that we need? It's whether a human will have that innovation or this exact AI scale will have that innovation. I would bet that's that. I would bet on the AI either way. Either way, we have so much more, so much to absorb just from where we are now, forget everything else that may come long later. Yeah. I think there are also many paths to AGI, and I know and respect Jan's work, and I know he favors an approach toward AGI that's more focused on actions in an embedded space rather than in terms of auto-regressive models. That may be a perfectly legitimate approach as well, but when I see the scaling laws continue to hold and capabilities continue to go up into the right without any new paradigms,
Starting point is 00:40:50 it makes me think maybe really we can just continue scaling and don't need to worry as much about yet another paradigm shift. And let AI do that. All right, let's go on here. Let's turn to... Insert my normal rant about AGI here, and we can move on. Okay. Yeah, so noted and approved.
Starting point is 00:41:09 This episode is brought to you by Blitzy, autonomous software development with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise scale code bases with millions of lines of code. Engineers start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% or more of the development work autonomously, while providing a guide for the final 20% of human development work required to complete the sprint.
Starting point is 00:41:49 Enterprises are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-eatement. as their pre-IDE development tool, pairing it with their coding co-pilot of choice to bring an AI-native SDLC into their org. Ready to 5X your engineering velocity, visit blitzie.com to schedule a demo and start building with Blitzy today. All right, next story, OpenAI,
Starting point is 00:42:15 introduces GPT5.1 for developers. So, again, this is a benchmark question. First of all, this was announced before Gemini 3 came out. So I am curious, AWG, whether this is still the case. And why, again, why does this matter? Yeah, I think the economics of this, the microeconomics are maybe even more interesting
Starting point is 00:42:41 than the technical side. So we're starting to see, and this is somewhat visualized in the chart you're showing, the beginning of inference time compute start to conform to the economic productivity of queries. So you know how, like, in Google search, for example, if you search for mesothelioma litigation, you're going to see a bunch of very expensive AdWords. Yes, for sure. It's a very economically valuable query.
Starting point is 00:43:06 On the other hand, if you search for the lawyers, for the lawyers. For the lawyers. If you search for like an arithmetic query, you'll see none or almost no ads because it's not that economically valuable. We're starting to see, I think, that same dynamic here, emerge here where certain queries require. lots of inference time compute. And so what we're seeing at the routing layer with GPD 5.1 is even more compute being allocated to queries, to prompts that really require a lot of compute. And then for the lighter, easier queries or prompts, we're seeing less compute get allocated.
Starting point is 00:43:43 And I think this is actually pretty profound. It's not just a matter of moving around the deck chairs in some sort of zero-sum game. I think this is actually almost a premonition for what the economics of post-superintelligence will look like. One of the things I think the most about is who's going to pay at the end of the day for the trillions of dollars of CAPEX in data center build out? Who's going to pay for it? Is it going to be the consumer? Will the consumers, on average, be spending hundreds of dollars per month on core subscriptions for AI? Or will it be enterprises that are spending billions of dollars in some?
Starting point is 00:44:20 cases for enterprise level tasks. And I think what we're starting to see here is that modally, probably it's going to be the enterprises, paying lots of money for the most valuable tasks in the same way we're seeing right now in microcosm. Some of these harder tasks, harder prompts, get allocated a lot more inference time compute at the expense of easier queries. I would totally bet on that direction just because if you're say Target, you can manage merchandising and get 20% extra margin on something, then it's worth the extra compute on the back end. And we'll see a lot of that. But there are places where consumers will spend hundreds of dollars a month on their iPhone, on their plan, because it enables them in an extraordinary fashion.
Starting point is 00:45:03 But remember that the money to be made here is on the margin from persuading people to switch their behavior from what they otherwise would have done. If they were going to spend the money anyway, that money doesn't go to the AI. It goes to the entire value chain underneath the phone manufacturer. All right. Well, I can tell you, in my experience, you have to operate at the margin, at the extreme end of what these are capable of. And I've tried to either save money or to get more speed by dumbing it down by a half step,
Starting point is 00:45:33 and it just isn't the same. And so it just feels like everybody wants to be at the forefront. And this is the weirdest product that's ever been launched on humanity in that it's talking to you as it's selling to you. And so, you know, you start with a subscription, they give you this incredible experience, and then it tells you, well, you want more of that. You need to upgrade. But it's actually telling you, it's talking to you about upgrading. No product, you know, no cable company, no iPhone has ever done that before. So it's a salesman baked into its own capabilities. It's kind of creepy, actually. It's kind of, it's very weird. All right. Let's stay on the OpenAI theme. And this is a fast. fascinating story. It's an important one. Open AI-backed startup aiming to block AI-enabled bio. So this is a startup called Red Queen Bio, and they received a $15 million investment
Starting point is 00:46:31 from Open AI, which, by the way, just sounds really small compared to all the $100 billion and trillion dollars investments being made. But Red Queen is using advanced AI plus lab testing to spot vulnerabilities and biological systems. They're basically saying, hey, we want to stop people from using these AI models to create bio weapons. Super important. Who wants to jump in first? I'd love to speak to this one. Maybe for starting with the literary reference. So for those not tracking, Red Queen in this case, is a reference to a scene in through the looking glass where Alice and the Queen are constantly running just to stay in the same place. So the Red Queen's race in general is used as a metaphor to cases where a lot of effort is required.
Starting point is 00:47:17 basically to maintain a standstill. And in this case, I think the other key concept that I think is ultimately quite profound out of what Red Queen bio has announced and the reason why they're taking funding is we've just spent quite a bit of time talking about how, as you pour more compute onto these models, the capabilities keep increasing. Inevitably, you have to worry about alignment and safety as well. in society, if you're growing a city and you double the population, you're going to approximately want to double the police force or the safety force. Wouldn't it be wonderful if as the capabilities
Starting point is 00:47:55 of AI keep scaling, keep increasing, the safety measures, the alignment and other properties that make them safe for humanity, if those also benefit from scaling with more compute? So seeing scaling laws, Red Queen Bios announced that they've uncovered scaling. laws for biological safety measures. I think this is the way we achieve alignment. Just like, again, the scaling law for police forces in a city, a little bit sublinear relative to population. Same idea here, but nonetheless close. As capabilities increase, we want to live in a world where we achieve so-called defensive co-scaling, where the resources and capabilities of safety measures scale close to proportionally with the resources.
Starting point is 00:48:43 capabilities of the underlying models. Yeah, let me add some data to that. So today, or at least last year in 2024, the biosecurity, biodefense market was $34 billion. And it's expected to double by a decade from now, 2034, 2035. But here's the quote that really hits me, right? So an extreme bio-attack scenario could have a multi-trillion dollar global loss. And the notion is, could you create such a bio weapon for a thousand bucks, right? It's the asymmetric situation where a small amount of money using complex models could do a lot of damage.
Starting point is 00:49:24 And so there's got to be this layer of defense. I mean, it's critical. When I talked to Eric Schmidt, I remember a couple years ago at FII, you know, the number one scenario that is of greatest concern are bioweapons, something that can be you take an existing virus you change its viral payload you make it much more infectious and release it you know selim you and i have had this conversation that one of the most important things is going to be to set up these biosensing capabilities at train stations airports bus stations that are filtering the air and looking and doing rapid sequencing of everything they come across in the majority of the bioweapons that are concerning are airborne right
Starting point is 00:50:08 So a person coughs or sneezes and it's there. And one thing that is in our favor is that these viruses, these bio-weapons can only move at the speed of an airplane. That's the fastest it can go, right? And it travels. We saw that with the release of COVID. So if you can detect it at an airport, sequence it on the spot, develop a, develop a, antiviral and then transmite that at the speed of light, not the speed of 600, you know, nautical air miles per hour, then you have a chance of battling it. And this is also exactly why
Starting point is 00:50:54 open source AI is dead in America. You know, meta decided, okay, we're not open sourcing. So now none of the U.S. labs are open sourcing anymore. So the only open source models are coming from China. But, you know, if you're a U.S. company, you know, usually a terrorist in a basement in some, you know, jurisdiction somewhere in the world, isn't the sharpest tool in the shed, and you're counting on them not knowing how to build the weapon. But when you give them genius-level AI as a sidekick, you know, suddenly they're empowered to build virtually anything in that basement, and that's the risk. No U.S. company wants to be responsible for that. So they're trying to cut it off at the query level, saying, well, as soon as you ask the AI to
Starting point is 00:51:33 help you would create a bio weapon, it stops. And so the open source, you know, would be a huge leak in that. So the U.S. labs don't do the open source anymore. The Chinese still do. So Alex, how do you deal on that? How do you deal with that? If it's a model running on my laptop and somehow it contains enough knowledge to do this and I can query my laptop, no one ever knows the query I've made, it's just resident there. How do we deal with that? Yeah, I think ultimately it all reduces to co-scaling. So if you imagine having a fully self-contained facility, hypothetically, in your basement, and the ultimate societal protection will be having lots of sensors, and more importantly, having lots of AI screening, super intelligent AI screening, that can spot
Starting point is 00:52:23 hidden agents. I have this dictum that I think is super important on so many different levels. In the software engineering world, there is Linus Torvalds who created Linux, has this so-called Torvalds law that, and I'm going to butcher this slightly, that with enough eyeballs, all bugs become shallow. And I would propose sort of a generalization to that, that with enough superintelligence, all hidden agents become shallow. So what happens? To the extent that we have hidden agents in their basement building super weapons, I would expect with enough superintelligence, defensively co-scaled, they become.
Starting point is 00:53:02 shallow. So I made the comment, right? If I made the comment before that, you know, that privacy is an illusion. And this is just going to shatter even that illusion. Because if you want safety, you're going to want agents, you know, listening and watching everything all the time. Salim? This is an arms race. I think what we've seen throughout history when we tried to, we thought, oh my God, email's going to crash because of all the scams. And then we think, thought. We have fishing. We can't solve for that. And we've used AI consistently in that sense because people forget the bad actors may use AI and they will, but the good actors can also use AI. And therefore, you just have to be one step ahead. The question is if that gap gets too big.
Starting point is 00:53:47 One of the challenges with what you were saying earlier, Peter, is you may not know what to look for in some of these. And that's the danger point. Yeah. Well, you do with a catalog. It's an interesting little case study, too, because, you know, if you rewind the clock before Gmail took over, you know, Microsoft had Outlook and Hotmail, and Google launched Gmail. And the two promises were very different. Microsoft said, we will never read your email. And Google said, we will read every word of every email that you receive, but it's going to be read by an AI and not by a human. So we won't let the human eyes look at your email. But we're going to do all kinds of things based on the information in your email read by the AI. And people didn't care. and so everybody moved to Gmail. So you have an interesting case study in how this plays out, you know, just the human behavior. So here I think the equivalent is, hey, I'm talking to AI about my
Starting point is 00:54:36 most personal things in the world. And Peter, I think you're right. The AI is going to listen to every single word. And if you're designing a bioterror weapon or a cyber attack, it's going to flag it and escalate it. And if you're talking about your virtual girlfriend or whatever, it's going to be fine.
Starting point is 00:54:52 I'm just going to kind of hide that. Yeah. I remember talking to the head of one of the major intelligence agencies, and they had a very clever thing. They said, look, when there's known things that nuclear weapons or whatever, we put eyes on it, we try and watch it. When you have something like this that could be
Starting point is 00:55:08 developed in secret, they've been actively opening up these communities and actually funding the biohacking movements, because then you can see things earlier. But this, if you can do open source bio weapon development in a lab in a bunker, that really causes
Starting point is 00:55:24 a huge issue. We're going to have to rethink a approach, something to along the lines to what Alex said. Do you remember when you gave the Ein Rand Award to Mike Saylor? Yeah. I don't know if you did the keynote. Mike did this incredible speech, but those people are probably vomiting right now based on how this is evolving. Yeah.
Starting point is 00:55:40 Well, listen, the bio weapon, I mean, you're not going to create a novel virus that has zero history involved. And there are extensive registries of every virus that's ever been, ever been mapped. And so when at an airport, if you identify, if you sequence something and it's not on that registry, you can then look at it. And LMs or the future bio-LMs will be able to look at, okay, this is an infectious agent, this is something that's able to be airborne or water-soluble. You know, when you look at the proteins, you can tell what kind of a virus or protein it's generating. So you're going to be able to learn instantly when you sequence it.
Starting point is 00:56:26 And rapid sequencing is here. But we're going to need this. And I think giving up privacy to a large degree, which you've talked about, Salim, right? When you're in an airport, you basically have given up your privacy right there. Yeah, you know you're being surveilled and you know your rights can be taken away at any time. And the one framing of our arrears that we're living essentially in a global airport, I think that continues to some extent. I don't see a way of coming back from that. Well, good luck to the red-green folks.
Starting point is 00:56:52 And the EFF folks, they say there is a way of doing it. You don't have to compromise privacy for security. There's lots of mechanisms for solving this in other ways. That's their complaint that the governments kind of go after the surveillance side just because, oh, this is great we can surveil people under the excuse of security, but many times you don't have to. Yeah. Maybe just to if I might close the discussion on this, I want to make sure we don't over-index on
Starting point is 00:57:20 safety concerns or so-called safetyism. I think these are very important concerns. But I also think that AI can be skilled to combat the concerns just like one might naively expect far to the development of like modern cities, that crime would be overwhelming and that humanity would not be able to support itself in urban environments at scale. It turns out that we are able to. I would also, we're not doing a book corner this episode, encourage everyone to read Werner Vinji's Rainbow's End, which does a glorious job of depicting what the future of AI-enabled biosafety looks like. Amazing. Well, I'm the eternal optimist here, and I'm absolutely clear we're going to be able to overcome this. Let's move on to one more benchmark here. This is XAI releases GROC 4.1, ranks number one in major leaderboards for reasoning and writing. Back to our resident leaderboard expert. My comment on this one is short. This lead in the text arena benchmark lasted approximately one week and was over. So my short comment here is the race for the frontier is so intense that even if Frontier Lab is perhaps even benchmarking towards a single benchmark, generalist models seem to be able to push the frontier at this point on a weekly basis.
Starting point is 00:58:39 I can only imagine as timelines progress what this is going to look like when these benchmarks are being toppled on a daily basis. Well, I'm sure Grock 4.5 and 5 is around the corner. Let's move on to Cursor. So Cursor triples its valuation in just a few months from June through November, going from roughly $10 billion to roughly $30 billion in six months' time, raised $2.3 billion. There's Michael, the CEO of Cursor. Who wants to jump in here? I mean, this is a hot race between a whole slew of different coding tools out there. This seems to be in Dave's wheelhouse.
Starting point is 00:59:17 Dave, yeah. Well, I'll tell you, I think this team is phenomenal, and most of the people around here think that they'll rise to the occasion and succeed. But I also think that anti-gravity looks exactly like cursor. I mean, I actually have both open on my laptop side by side, and other than a little cosmetic here and there, you don't even know which one you're in. And so then you look under the covers, and it's like, well, I can access all the models through cursor, and I can only access to. Gemini 3 through anti-gravity. So there's a difference right there. But then the bet at cursor is that the Anthropic and the other models will be worth having, and Gemini 3 doesn't just run away with it anyway. So it's really, it's an interesting horse race right now. And I'm not going to make
Starting point is 01:00:04 any prediction on it because you can't make a prediction on it because their core positioning is incredibly vulnerable, but the team is brilliant. Let's back up. Back up. For those who don't know what cursor is or what it does. Fair. Let's do that basic 101 right now. Dave or Alex? Yeah, so cursor, I think everyone around here that I know uses it every day. It's the best, or has been the best coding assistant that uses AI.
Starting point is 01:00:31 It's fully agentic now, so you can just type in a prompt. You can talk to it now, too, and it'll just build things for you. And under the covers, though, they don't own their own foundation model. It's going out to either, you know, open AI. or GROC, it has all of them in there. Anthropic is what I usually use, Cloud 4.5. And it organizes everything. It cranks out the product.
Starting point is 01:00:54 It, you know, configures your laptop for you. It just makes coding trivially simple. Anyone can do it. And it's pretty universally used. And it was early to market. When I think about the value in this world, where does value aggregate, right? My list is it's data, scaffolding, user experience.
Starting point is 01:01:14 and integration and customization, right? And then the models themselves. So where would you put cursor in those categories? The scaffolding. Everything other than the model. Yeah, it's all the above other than the models themselves. And compared to Replit, we've been talked, we've talked about Replit a bunch and lovable.
Starting point is 01:01:35 How do they compare to Cursor? So Replit and Lovable are much more for your mom and pop who want to build, like, a video game quickly, or an invitation to a birthday party with moving graphics, or whatever. You can build something while you're flying your plane, Peter, like you did. Super, super easy to onboard. Cursor is more for hardcore engineers that are moving to AI and trying to get 10x more performance out of their engineering.
Starting point is 01:02:01 I would just note, for what it's worth, all of these, or almost all of these, integrated development environment companies, including Cursor, are rolling out their own first-party models. It's almost inevitable that they want to climb down the stack to own more of their software supply chain. And I think the success that we're seeing from Cursor, which is, of course, very exciting,
Starting point is 01:02:22 is a reflection that software engineering is probably the first high-productivity labor category that's being automated by AI. It won't be the last, but it's the first big one that we're seeing. All right, keep your eyes out. Surely AI-driven software development is now the default, right? I mean, you couldn't do it without it now,
Starting point is 01:02:41 already, in a few months. To the point where, I mean, this is crazy, but I see companies that are almost sorting potential software engineering hires by vintage. Did they get their degree and their experience prior to agentic code or not? Are they spoiled? Yes. Basically, yes. Did they get their skills?
Starting point is 01:03:03 Did they learn? Do they have lots of experience prior to the atrophying that comes, perhaps, with agentic coding? All right. I'm going to move us forward to another incredible article. This is a new startup funded by Jeff Bezos called Prometheus. Jeff put in $6.2 billion. And by the way, can I just, like, call out: the ability to start a company with $6 billion
Starting point is 01:03:26 on your balance sheet has got to be just frightening for a number of startups, right? And it's got to be incredibly accelerating. We've never seen this kind of, you know, starting with billions, multiple billions of dollars on day zero. So what is Project Prometheus? It's an AI-enabled engineering and manufacturing company. It's basically learning real-world experience so that it can manufacture efficiently
Starting point is 01:03:52 and focus on physical testing and simulations. And I love this other bullet point here. Prometheus has hired nearly 100 researchers from OpenAI, Google, Meta, and other labs. They're just feasting on each other. They're stealing each other's, you know, well-trained talent. If the going rate is a billion dollars per researcher, then this is really underfunded.
Starting point is 01:04:12 They've got six researchers on this. But the two things I found fascinating off the top, and we'll talk about the meat of what Project Prometheus is in a second, are starting with that much money, and that they're basically stealing from each other. Dave, what do you think? Well, I mean, it's funny. I've had probably 12 meetings with different MIT teams in the last week, you know, 30, 40, 50 at a time.
Starting point is 01:04:41 And about half of them are computer science; the other half are not. The half that are not are saying, how do I get involved? What do I do? What's my AI role? Like, you know, when MicroStrategy started, Mike Saylor was an AeroAstro guy. All the rest of the guys were computer science. The company took off under Mike's leadership. It didn't matter what he studied. AI is like that.
Starting point is 01:05:00 There's nothing in the computer science curriculum that teaches you much of anything anyway. No, school is learning how to learn. Don't be intimidated. Yeah. And so what you're pointing out here on this slide, Peter, is, okay, they stole another 100 people. Okay, clearly the industry wants 100,000 more people to come in. Why are you letting this guy get a billion-dollar signing bonus? Why don't you get into the market, learn this stuff, and be there for 100 million? You know, I mean, just get in the game.
Starting point is 01:05:29 But, you know, it's funny, because people get intimidated away from it, because they feel like it's all geniuses, you know, and I'm going to get crushed. It's just not true. Just get in the hunt, get into the game; this is the thing happening in the world now. Yeah. And there's usually only one thing driving all change in the world. This is that thing. So just get into the middle of it, and then Jeff will... The other thing I'll point out in this is that there's a tendency to be intimidated by, you know, Elon Musk spent $6 or $7 billion building a massive data center in record time.
Starting point is 01:06:03 How am I going to compete with that? But the foundation models that will do parts creation or robotics simulation or whatever are different enough from a large language model that you can build a great foundation model company in parallel with OpenAI and Grok and Meta and Gemini. It's okay. You shouldn't be intimidated by that either, and that's, I think, what Jeff is saying here. Just one final point. Jeff bought all the robotics companies, put them into warehouses, and just ran away with
Starting point is 01:06:34 warehouse automation, which created a whole litany of new startups working for Walmart and Target and everyone else, like Symbotic, you know, where Daniela Rus is on the board, which does the robots now for Walmart's warehouses. Here, Jeff is saying, okay, Amazon is big enough that I'm actually going to be able to build a multi-billion-dollar company within our own universe, our own channel. But that creates opportunity for somebody to be outside of the Bezos universe, doing it for everybody else. And so all that mechanical design is wide open. We're going to have Jeff Wilke on stage
Starting point is 01:07:09 at the Abundance Summit this year. Jeff was the CEO of Amazon Worldwide Consumer. There were two divisions: one was AWS, and one was everything else. And Jeff Wilke ran everything else. And he's actually, you know, super excited about this, because this is what he's doing. He's got a company called Re:Build Manufacturing,
Starting point is 01:07:27 which is working in this area too. So, Alex, let's get into the nitty-gritty here, right? So he's building, Prometheus is building, physical AI. It's world models, again, like Fei-Fei Li's work, and a little bit like Genie 3. These are world models understanding the laws of physics and chemistry and engineering, so you can actually do real optimization. What are your thoughts here? Yeah, I think we're starting to see the pivot of the capital markets from funding superintelligence
Starting point is 01:07:55 to funding that which comes after superintelligence, which is, as I've argued in the past, solving math, science, engineering, and medicine. And I think it's a 10x to 100x larger market opportunity, a larger addressable market, solving basically everything else after superintelligence, than solving superintelligence itself. $6.2 billion is a drop in the bucket. I would expect it's going to cost many, many trillions of dollars in funding to solve all outstanding problems in math, science, engineering, and medicine.
Starting point is 01:08:27 There's been relatively thin reporting on what Project Prometheus is particularly focusing on. I have taken note that it seems to be absorbing a lot of old biology friends of mine. So it's possible maybe it ends up focusing a little bit more on biology, a little bit less on manufacturing. But I think this is where the action is after superintelligence. Yeah, I have three points I want to make here. One, this is kind of a shift from chatbots to industrial agents, right?
Starting point is 01:08:55 So AI for the office is what we've had. This is AI for the factory floor, where there are physical consequences, where the systems are able to operate the factories because they understand the physical constraints and situations and logistics. The second thing is, you know, I met Jeff in college. I was the chairman of SEDS worldwide at one point, and Jeff was the president of SEDS at Princeton University when I was at MIT. And so space has always been his passion. You know, congrats to Blue Origin for its recent launch and landing. We talked about that last time. But this kind of a physical AI system is exactly what you need to operate heavy industry in space, right? To build factories in orbit, to build factories on the moon, and to have them fully autonomous and capable.
Starting point is 01:09:48 And then the final thing I would say is that this is going to change invention. We've seen companies like Lila and other companies out there that are going to go from invention that happened by, you know, serendipitous human creation, to invention coming from a computational one. And that's when it gets super interesting. And that's what, you know, you've been talking about, my friend, Alex. That's right. What hit me with this is it felt to me like he's creating a backbone AI for everything in his world.
Starting point is 01:10:23 Amazon, space, logistics, etc. This will service all of those. And Elon will do the same, of course. It's like electricity; it's going to run through everything. Yeah, yeah. Well, also, you know, the foundation model that I built early in my career, that was five years from the day I started writing the code until it was done. I can recreate it now in about two months, which I just did.
Starting point is 01:10:48 And so if you look forward a year, that'll come down another, you know, 5 to 10x. So you can use AI to build the next AI, which is essentially what I just did. The same applies in mechanical design. So if you said, wow, building an entire AI platform that designs rockets or designs robots is really hard. Well, it would have been, but now you can use the current AI to build that AI. And it cuts the time down tremendously. So if you just look forward a year to where the existing AIs will be, you know, that time is actually not intimidating at all. And so it's a good reason to get into the game and, you know, build these parallel AIs that work on very specific problems, whether it's biotech, whether it's
Starting point is 01:11:31 mechanical design, whether it's, you know, futures trading, whatever it is; build it from the old AI to the new AI. So the last time, I asked our subscribers... and by the way, we're almost at 400,000 subscribers. So if you haven't subscribed yet, push us over the top. Would appreciate it. Our march is towards a million. Not that it really matters, other than that it'll make my kids really proud of me. So that's my goal. So, I asked our subscribers to post questions. You're on your way to Mr. Beast.
Starting point is 01:12:01 Yeah, well, hey. Yeah, in about a thousand years. I asked our subscribers to post questions. And I took all the comments, put them into ChatGPT, and asked it to summarize the most important questions. And there was a critical question that was asked. And I just want to take a second and read it, because I want to have an AMA about it.
Starting point is 01:12:24 It said: what concrete milestones should people expect to see that prove abundance is coming? In other words, lower costs, new industries, accessible AI tools. And how do we ensure these benefits reach everyone, rather than concentrating wealth among a small AI-augmented elite? So I want to play a video that was posted on X today, and then we're going to talk about this question. But AI and humanoid robots will actually eliminate poverty. And Tesla won't be the only one that makes them. I think Tesla will pioneer this, but there will be many other companies that make humanoid robots. But there is only basically one way to make everyone wealthy, and that is AI and robotics.
Starting point is 01:13:07 All right. So that's Elon's thesis. I posted the question here again. And it's a real concern. You know, are we going to have runaway wealth concentration? And honestly, if you want me to believe in this future of abundance you guys keep talking about, you know, what are the concrete milestones, and how do we ensure these benefits reach everyone?
Starting point is 01:13:28 How do I know it's actually coming? Can I? Yeah, let's jump into this. Can I throw out a couple of points? Yeah. You know, there's an important framing here where we, let's not talk about the wealth gap, right? The reason is that the richest people in the world are always going to keep getting richer and the poorest people are going to have nothing.
Starting point is 01:13:47 The issue is more, can you lift the bottom? If you lift the bottom, who cares? Yeah. Right? You make this point all the time. All the time. It's my next book, right? I mean, a thousand years ago, the king and the queen on the hilltop lived below the poverty line of
Starting point is 01:14:00 today, by the way, and there were thousands of serfs that supported them. And what we've done is... And they died of a tooth infection at age 22. Or they were bled by leeches, you know, as the king and the queen up there. And what we've done is, yes, we're heading towards a world where there are trillionaires living on Mars. But if every man, woman, and child got access to all the food, water, energy, health care, and education they could possibly want, we've lifted the bottom of humanity to a point where mothers can believe their children have access to everything they need. That's the world I want to live in. That's the world we want
Starting point is 01:14:37 to create. So let me speak just to that for a second, right? We forget, because we see all this, we see people getting richer, et cetera, et cetera. But we have to remember that unbelievable benefits are accruing at every level. I'll give you a concrete example. When the tsunami hit Indonesia in 2004, all the ship-to-shore communications were wiped out. And so the government gave cell phones to all the fishermen, saying, hey, if you're out fishing and you see another tsunami, text it in, et cetera, et cetera. And they found, to their surprise, that the fishermen's incomes had increased by 30% over the next two months.
Starting point is 01:15:09 So they looked into it. And all they were doing was texting in to see what the market price was of the fish: should they stay fishing? Should they come in and sell? Or which port should they go to? Who's paying more? Yeah. So now just that little hint of what Alex would call the inner loop allows you to increase income pretty radically, by having democratized and demonetized access to cell phones, smartphones, and now AI. And this will change the game completely for everything everywhere. I'll touch two areas. One is education. You can now sit a child down with a smartphone and
Starting point is 01:15:41 say, create a lesson plan for algebra, and they're going to learn 10 times faster than all the kids stuck in elementary schools in the West that, by law, have to go to these things, right? The second is health care, where any single medical condition can now be diagnosed instantly. And when you catch something early, the cost of treating it drops by, like, 100x. So those two are very concrete areas where AI will make a massive difference, in two areas that were traditionally inaccessible, hard to get, and expensive. Let me read the numbers here.
Starting point is 01:16:14 So the U.S. average expenditure for a family, and this is 2023, was $77,000. So the number one cost was housing: 33% goes to housing, 17% to transport, 13% to food, 12% to insurance and pensions, 8% to health, 5% to entertainment, and about 2.5% to education. So let's knock these down. Housing, right? So number one, now you can live outside of the city, where it's cheaper, and be able to
Starting point is 01:16:53 telecommute in, right, and reduce your housing cost. There is a future, it's not here yet, where we're 3D printing houses, reducing the cost. And what we saw on stage a couple years ago, if you remember, Salim, was that 3D-printed houses were, per square meter, the cheapest, but also the most beautiful and most luxurious, because you could get the greatest designers to create a standardized print file for people to use. Transportation, 17%. Well, guess what? An autonomous electric Cybercab is four to five times cheaper than owning a car. It's going to be cheaper than an Uber
Starting point is 01:17:28 X, cheaper than a bus. So we're going to solve that. Food? We've got to do food better. We need, you know, basically vertical farms and stem-cell-grown meats. Let me give you the stat on vertical farms. Yeah. We, you know, we've been doing horizontal farming since the beginning of time, right? Vertical farming is just crossing over now into economic viability. You can drip-feed water to the plants. You know what nutrients the plants need, because the sensors know it.
Starting point is 01:17:56 You get about seven times the yield of horizontal farming by doing things vertically, because you have the right frequency of light hitting it. You save 99% of the fresh water. By the way, we use 70% of our fresh water globally for agriculture. So just that. The best calculation we've seen is that if you took 35 skyscrapers in Manhattan and turned them into vertical farms, that would feed the entire city sustainably. Just think about that from a logistics, food security, pesticides, fertilizer standpoint. There are massive changes coming down the pike. And this is before we apply AI to the whole mix. So the radical changes coming are going to be so huge that the cost of everything
Starting point is 01:18:36 should drop to near zero. The amount of energy you need to feed one person for a year is the amount of sunlight hitting one square meter. So all we have to do is get a better loop: figure out how to convert that energy into consumable foods. And we've got a long way to go. Yeah. You know, health care: 8% of our cost. In health care, you said it already: we know that an AI physician diagnostician is significantly better than any, than even the best, physicians. And an autonomous robot eventually will be the best surgeon, and the cost of that will be capex and electricity. I mean, it's hard for people to believe this stuff now because it's on the bleeding edge, literally. But we're going to get there.
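As a back-of-envelope sanity check on that one-square-meter claim: using rough assumed values that are not from the episode (about 200 W/m² average insolation, a 2,000-kcal-per-day diet), the raw sunlight landing on a square meter over a year carries roughly twice one person's annual food energy, before any conversion losses:

```python
# Back-of-envelope check: sunlight on 1 m^2 vs. one person's yearly food energy.
# Both inputs are illustrative assumptions, not figures from the episode.
AVG_INSOLATION_W_PER_M2 = 200      # rough global average, day/night and weather included
HOURS_PER_YEAR = 24 * 365

sunlight_kwh = AVG_INSOLATION_W_PER_M2 * HOURS_PER_YEAR / 1000   # ~1,752 kWh/year

KCAL_PER_DAY = 2000
KWH_PER_KCAL = 4184 / 3.6e6        # 1 kcal = 4184 J; 1 kWh = 3.6 MJ
diet_kwh = KCAL_PER_DAY * KWH_PER_KCAL * 365                     # ~848 kWh/year

print(f"sunlight on 1 m^2: {sunlight_kwh:.0f} kWh/yr; food energy: {diet_kwh:.0f} kWh/yr")
```

The catch, of course, is the conversion loop he mentions: photosynthesis and food production capture only a small fraction of that raw energy.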
Starting point is 01:19:25 Entertainment, 5% of cost. Well, guess what? I mean, YouTube, you know, what else can you want? Education, you mentioned before, AI, YouTube, all these things. So we're demonetizing and democratizing this stuff. It's just hard for people to realize it. I think the challenge is we compare ourselves to the Kardashians, right? We compare ourselves to people that we see on TV and on the Internet all the time
Starting point is 01:19:52 versus comparing ourselves to what it was like for our parents or grandparents. Yeah, I think that last point is the key one, because we've had dirt-cheap food for a long time, but everybody still wants a $14 Starbucks latte, which you don't need to pay for, but there it is. And why do I feel that need? So the metric I'd be tracking is actually depression rates. Because I think AI, properly deployed, can hit that much more quickly than robotic automation can create new homes for everybody,
Starting point is 01:20:27 you know, that are 10 times larger. That's a great point. And so I'd be looking at that one as an early indicator that we're on the right path. And it's not a no-brainer. You've got to really think it through, because, you know, you mentioned rent is at the top. You know, 33% of household income gets spent on housing on average. But when you look below the poverty line, I think of the spend on drugs, alcohol, and gambling. Pain relief.
Starting point is 01:20:47 The opioid addiction alone is a trillion-dollar area, I guess. And it's about five times more, collectively, than rent. Actually, right. Go ahead. Go ahead, Dave. Well, no, so that's what I'd be attacking. If you want to come bottom-up and say, look, we want to create universal happiness with AI, we've never had a tool that could attack it before.
Starting point is 01:21:12 You can attack manufacturing automation. You can make food cheaper. You can have harvesters that mow down half the Midwest to create wheat. But all that does is create more of stuff that's already abundant. Now AI is the trigger for a massively more thoughtful way to create universal happiness. And I would start with depression rates and work up from the the bottom because you can do that very, very quickly, much more quickly than you can. We've already looked at the robotics.
Starting point is 01:21:38 bottom, because you can do that very, very quickly, much more quickly than you can with robotics. We've already looked at the robotics. We know that it's going to build a mansion for everybody in the world, but we're not going to have the robots for about 15 years, because we have to scale them up on this exponential curve. So, you know, some people will have them next year. You'll have yours this year. But we won't have enough of them to attack the global problem for about 15 years, because of, you know, just the manufacturing ramp rate. Alex, this is all about benchmarks.
Starting point is 01:22:00 We've talked about this. You and I've been working on a paper on this subject. Can you speak to that? I think it's so simple. I think what's upstream of all of these other milestones is the dollar cost per unit of intelligence. And as we've discussed previously, right now that's hyper-deflating by something like 40x year over year. So to keep the party going, and to make sure that all of these downstream considerations, cost of living, health care, housing, etc., all hyper-deflate ultimately alongside the cost of intelligence, I think it's largely a regulatory and social concern. We've spoken previously about, for example, the difficulties of
Starting point is 01:22:38 getting Waymos in Boston. That's a regulatory consideration. The cost of intelligence needed to autonomously drive cars around, that's making excellent progress, but ultimately, in order to, say, provide essentially free autonomous on-demand transit to everyone, there's a regulatory bottleneck. And in order to ensure that the benefits of intelligence too cheap to meter become evenly distributed, I think it's going to require some revision of social coherence and the social safety net, to make everyone comfortable with the downstream consequences of intelligence too cheap to meter, including health care and housing and energy and utilities too cheap to meter. Yeah. I did a calculation. Okay. So if you wanted to have a reasonable
Starting point is 01:23:28 life, you could do it for $20 a day in Bali. Okay. Housing costs about $10 a day, and your meals are literally about $2 a day, and then a bit of extra. Okay. So for about $20 a day, you could do it. If you had half an Ethereum, which is about $2,000, you could put it into DeFi trading pools and earn about a percent a day, which is about $20. Okay. So half an Ethereum of capital allows you to live crudely, but allows you to live in a very
Starting point is 01:23:58 lovely spot in the world for very low cost. And think about just the feedback loop on that, because as you double that, as you triple that, as you 10x that, all of a sudden you get into a really great place. You can survive today on a very small budget anywhere. My feet are in the sand. My feet are in the sand already. Yeah. I'm ready. And of course, Salim, that Ethereum comment was not investment advice, just to let everybody know. But it is interesting that Harvard has doubled down on Bitcoin. And now that we're in the Bitcoin doldrums, it's nice to see the institutions. I mean, I remember when we went from, like, you know, wacky individuals buying crypto to now institutions and financial institutions and sovereign funds and so forth. But not investment advice. All right.
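Salim's Bali arithmetic, taken at face value, can be sketched like this. Every input below is his stated assumption from the episode, not a verified figure: the 1%-a-day DeFi yield in particular is a very optimistic claim, and none of this is investment advice:

```python
# Salim's back-of-envelope from the episode; all inputs are his stated assumptions.
eth_price_usd = 4_000             # implied by "half an Ethereum, which is about $2,000"
capital = 0.5 * eth_price_usd     # $2,000 of capital
daily_yield = 0.01                # "about a percent a day" -- a very optimistic claim
daily_income = capital * daily_yield

daily_budget = 10 + 2 + 8         # housing + meals + extras, ~$20/day in Bali per the episode
print(f"income ${daily_income:.0f}/day vs. budget ${daily_budget:.0f}/day")
```

On those assumptions the income exactly covers the budget, which is the whole point of his "half an Ethereum" framing.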
Starting point is 01:24:46 What an amazing episode. And we've actually just gone through half of our stories. But I think to make this consumable, because the feedback we've gotten from folks is please try and keep the episodes under an hour and a half, so we're listening to all your comments. We're trying, we're trying hard. So we'll have to spin up another conversation on everything going on in data centers and energy and space and so much more. I mean, it's hard during the singularity to keep up with everything going on. Just the mind-blowing stuff from Gemini 3 was worth covering properly.
Starting point is 01:25:18 Oh, sure. You know, just a reminder: last summer, not that long ago, Polymarket said everybody, you know, the top five, had an equal shot at being the best AI model by the end of the year. Now it's 91% Google, but by next summer, that's down to 60%. So it's kind of like 50-50 that someone else will take the lead by next summer. Well, that's what we should hope for, because Alex said the key point, as usual: 40x is what you should expect next year. 40x. People really struggle to imagine 40x in anything. So if the cost per unit of intelligence comes down by 40x, or just raw intelligence goes up by 40x next year, you should expect that. It's very hard to visualize all that that means.
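To make that concrete: compounding Alex's roughly-40x-per-year figure for the falling cost per unit of intelligence (the rate itself is the episode's claim, not a guarantee) produces numbers that are genuinely hard to visualize after just a few years:

```python
# Compounding the episode's claimed ~40x/year deflation in cost per unit of intelligence.
DEFLATION_PER_YEAR = 40
start_cost = 1.0                   # normalized: today's cost of one unit of intelligence

for year in range(1, 4):
    cost = start_cost / DEFLATION_PER_YEAR ** year
    print(f"year {year}: {cost:.2e} of today's cost")
```

After one year a unit of intelligence costs 2.5% of what it does today; after three years, about one sixty-four-thousandth.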
Starting point is 01:26:01 So we do everything we can on the podcast to try and make that tangible for people. But really try and digest that, you know, coming out of this incredible Gemini 3 breakthrough. Yeah, and just hats off to Josh Woodward, to Sundar Pichai, to Demis Hassabis for an extraordinary job on Gemini 3. Just so proud of what they've been able to create.
Starting point is 01:26:25 and, of course, a lot more, a lot more coming. I have one announcement. Yeah, please. Sometime in December, we're going to do a meaning-of-life session online. I've had enough clamoring from my community and other people, and people who come to Abundance, that people want to do it. So stay tuned. We'll get more details next time. Well, we'll do it also at the Abundance Summit on the Wednesday night.
Starting point is 01:26:50 This is Salim waxing poetic and philosophical for about five hours straight. It's like a late-night French salon-type discussion, alcohol or equivalent mandatory, on metaphysics, philosophy, and what it means to be alive today. Starts at 10 p.m. What time does it end? Dawn? It depends on the audience, but the crazy ones, we've gone to the dawn. Oh, my God. Because we never get a structured conversation on the meaning of life. We never get that.
Starting point is 01:27:18 So let's have that conversation. Well, we'll do it. You'll do it. And I'll join you, at least until my bedtime at 9 o'clock. And then I'm exiting the building. But I'm going to do it online in about a month. So we'll get details up there. Okay. Well, we'll do it earlier. And last time we talked about the potential for a moonshot gathering, we've had 500 of you email us. So if we get to 1,000... If you're interested in a moonshot gathering next fall, you can send an email to moonshots@diamandis.com and let us know you're interested in having these conversations and gathering with other Moonshot listeners. And once again, we put our call out for outro music, and this is a piece by John Novotny, and it's called Moonshots Metal Version. But here's the key.
Starting point is 01:28:08 You need to see this. This is not just music. This is a fun video. Salim, you look so sexy. Dave, I love your ponytail. Alex, AWG's got a ponytail in this, and he's rocking it. All right, on our outro, let's go ahead and watch and listen to this. This is heavy metal moonshot music. Oh my God, I haven't seen this.
Starting point is 01:28:36 Whoa. Oh my god. No. Oh. That's a good one. Very Jensen. Oh, not to miss. Oh my god, frightening.
Starting point is 01:29:10 This is amazing. My soul, we're going to cause me on control. Oh, bobbling in the lightning. I've got to get the ponytail. You know, Peter and Salim, your look in that video was really good. You should just do that. I love Salim with the, with the sunglasses move. Dave, you on the guitar, and AWG, you on the keyboards.
Starting point is 01:29:45 And that ponytail was you, buddy. You've got to grow that ponytail. Apparently so. All right. Well, thank you, John, for that. That was amazing. Yes, DB2, AWG, and Mr. EXO. Have a fantastic week.
Starting point is 01:30:01 I love doing this. And thank you to all of our listeners. Great episode. All right. Take care, Peter. Take care, guys. Every week, my team and I study the top 10 technology metatrends that will transform industries over the decade ahead.
Starting point is 01:30:14 I cover trends ranging from humanoid robotics, AGI, and quantum computing to transport, energy, longevity, and more. There's no fluff. Only the most important stuff that matters, that impacts our lives, our companies, and our careers. If you want me to share these metatrends with you, I write a newsletter twice a week, sending it out as a short two-minute read via email. And if you want to discover the most important metatrends 10 years before anyone else, this report's for you. Readers include founders and CEOs from the world's most disruptive companies, and entrepreneurs building the world's most disruptive tech. It's not for you if you don't want to be informed about what's coming, why it matters, and how you can benefit from it. To subscribe for
Starting point is 01:30:54 free, go to Diamandis.com slash Metatrends to gain access to the trends 10 years before anyone else. All right, now back to this episode. Thank you.
