The Startup Ideas Podcast - Google's Biggest AI Announcements (I Was There)

Starting point is 00:00:00 We are here. It's Google I/O. Logan Kilpatrick from the DeepMind team, friend of the pod, been on the pod a few times. Logan, by the end of this episode, what are people going to learn? You're going to hear about all of the new releases from Google that just happened at Google I/O. And also, like, specifically, we should deep dive on, like, how to build new AI agent-native products, which was sort of the thread of Google I/O this year: agents, agents, agents. So we should talk in depth about everything that we watched and what it means for builders and developers. Okay, cool. So, I mean, let's start off by, okay, what was launched and why does it matter? Yeah, there's so much, so much new stuff. And I want to get your

Starting point is 00:00:47 reactions to this too, because I think you have a grounded perspective of sort of the technology. I think one of the highlights was Gemini 3.5 Flash. It's the best model we've ever shipped and made available. Really sort of, if you look at the history of Flash, I think Flash started as sort of this smaller workhorse model that was really great for chat and was sort of, you know, very cheap to use, very cheap to run. And I think Flash continues to evolve to meet the era of what people are actually trying to use the models for. And I think the era that we're in right now is people are trying to use the models to do

Starting point is 00:01:23 agentic sort of long-running tasks. And I think we want Flash to sort of be the workhorse model for the agent era, for agentic long-running tasks, for coding, for all that stuff. So you see a model, a Flash model that's actually really great at coding, that's sort of competing with a bunch of our sort of ecosystem competitors with very large models. The Flash model is sort of pulling its weight. Yeah, how should people think about, you know, Flash 3.5 versus the competition? Yeah, I think it's probably like a more like Sonnet-level model. I think OpenAI with just mainline GPT and then GPT mini.

Starting point is 00:02:04 I feel like Sonnet is not like a small model, if you will. Obviously it packs a punch. And I think it's definitely smarter than the mini models. So I feel like, I think we're anchoring more on the Sonnet-level intelligence. So if folks are using that model and want to try Flash, please let us know and send us the feedback of how it stacks up. Yeah. And the reasoning, all the agentic tool use stuff is all incredible.

Starting point is 00:02:25 So I think that was one of the launches. That model's available, actually for the first time, like, available to all the users in Search, available across 900 million users in the Gemini app, available to developers in the API, so many other places. So I think it's like the most widely distributed model launch on day one that we've ever done, which has a fun set of challenges that we could talk in depth about. The other thread, which I think folks are very excited about, is Gemini Omni. So sort of this new model that we've created, actually somewhat of a world model, I think, is how Demis framed it when we announced it on stage yesterday, being able to take in any type of input

Starting point is 00:02:59 and create any type of output. And I think to give context for folks, like Google, you know, we had Veo, and Veo was state of the art and sort of pushed the frontier for video generation. We had Nano Banana, which could do sort of image generation and editing. We have all these audio models that do TTS. We have a new Lyria music model that's actually really, really capable. And the idea is how do you fuse all of those models into a single thing so that, A, developers' lives are easier.

Starting point is 00:03:26 B, we don't need to train nine different models, and actually get this really interesting cross-pollination of capabilities so that the same model that can actually benefit from Gemini's world understanding and the ability to generate text can also make a video, can also edit a video. And you see these really interesting things. And we saw this with Nano Banana: what happens when you give world knowledge to an image generation and editing model is really interesting use cases. And we get feedback all the time about what that's unlocked for customers.

Starting point is 00:03:54 And so I'm really excited to see Omni starting out in the Gemini app and YouTube and in Flow. And then very soon, already we're kicking off a bunch of early access tests. Hopefully, as soon as I get out of I/O and get back to the office and get feedback from developers and sort of continue the iteration and bring it to developers in the API so that folks can actually build products on top of Omni. Yeah, I mean, it's products, building products,

Starting point is 00:04:19 but also like building content that gets seen. One of the hardest parts about, you know, vibe coding or building any business is getting distribution to the thing. Yep. Right? So when I was watching the Omni demo and just seeing like how amazing it actually is, like what was going through my mind is like,

Starting point is 00:04:39 okay, how can I make a commercial? How can I make an ad? You know, how could I, you know, create an Instagram account using, you know, this model? And I think what you're going to see is a lot of people generate millions of followers, generate millions of views every single month if they know how to

Starting point is 00:04:57 storytell well and use the model. I actually also think it, and what I would love to see, I don't know, we'll give some work to your editor team. Make the intro of this video, have a bunch of, have a bunch of like crazy stuff happening,

Starting point is 00:05:12 and you can actually like change the intro of the video. We did this for some of the podcasts that I was doing, actually with the Omni team. And that, hopefully it will create a bunch of new creators. I also think it is a fundamental accelerator for existing folks who are producing content. Like editing video is hard.

Starting point is 00:05:28 Like, yeah, I'm sure you have an amazing team that's doing really hard work. And like, there's not enough hours in the day. There's not enough editor time and hours and storage on SSDs in order to like do all the stuff that I think could be done. I think Omni will sort of fill this really interesting gap for like maybe the creator who, you know, wants to tell an interesting story but didn't have the means to go have a whole team sort of support them in telling that story. And I'm really excited to see that happen.

Starting point is 00:05:55 And I've already been seeing a bunch of examples on X and other places of folks, like, starting to be able to tell that story and bring content to life in new ways and actually, like, repurpose existing content. So it's going to be crazy. And this is only, like, the first iteration of the Omni model. This is the Flash variant. Like, it only gets better from here, which is really exciting. So I think we're going to see some really cool stuff throughout the rest of this year.

Starting point is 00:06:18 And from an API perspective, like, how do you see, you think people are going to, like, integrate with Omni in terms of the products they build? Yeah, yeah, I think there's so many like interesting creative suites. I think a lot of people will also just be doing what you describe, which is like doing content with it and using the Gemini app or Flow or YouTube in order to do so. But I really do think it's an example of like this video remixing editing capability is like not something people have built into products thus far. So it is going to open up a new category.

Starting point is 00:06:46 And I think if you're building businesses and like want to help, you know, actually if you want to build a business and help the next thousand, hundred thousand, million creators go and tell their story, I think there's a bunch of really unique opportunities because this is a fundamentally different way of interacting with video. Totally. Yeah. I mean, it reminds me, like, I remember when social media was coming up and social media agencies were just popping up, right? Yeah. There's sort of a similar opportunity right now where, you know, you can build an Omni agency and sort of deploy this for small businesses as an opportunity, right? Yeah. And there's so many ways to like make the, my takeaway from seeing this like on my own content was it makes the content more engaging. Like you could like do all

Starting point is 00:07:28 types of like goofy things that like don't actually dilute. You know, people are like, the tongue-in-cheek thing is like the Subway Surfers, you know, engagement farming in videos because people don't have attention. I think, as a consumer of content, I'm actually really hopeful that there'll be more people doing like the much more thoughtful version of that in a way that like you could actually only do with Omni or like a really, really advanced, like, VFX studio that was able to pull this stuff off. So I think we'll actually see that. And I think it will, like, move us away from this like slop sort of era into something that's like a little bit more tasteful. And I think that that's maybe contrary to, I think, some of the

Starting point is 00:08:09 narrative of like AI tools sort of just generating more bad stuff in the world. I think Omni actually might be a really cool tool to help people tell a story with a higher degree of taste than they've been able to. What wasn't launched today that you wish was launched today? I know it's a question that you probably don't like getting, but one thing I like about you is you're not afraid of criticisms. And whenever I see criticisms of, you know, any of the products, you're the first person in the replies to be like, we're working on it, we're taking feedback.

Starting point is 00:08:47 So anything that you wish to see over the next coming months. Yeah. Well, first of all, I do love the feedback. I think actually, I think folks look at sort of criticism in such an interesting way. I see it as like, it actually makes my job easy. Like if you come say, you know, hey, we wish the models were able to do this thing or we wish the API was able to do this thing. Very easy to like go and take that feedback. The hard thing is actually trying to identify these things when people don't tell you what they're at, what they actually want. So I love the feedback. I appreciate it. Please keep it coming. I think what I would love to see and, but what the team's actively working on is the other set of Gemini models. So we obviously were launching 3.5 Flash. 3.5 Pro is in the works. It's cooking. There's many iterations and runs happening behind the scenes right now.

Starting point is 00:09:37 I think we announced yesterday it would be available early next month, or hopefully sometime next month, maybe not early next month. And so, yeah, I'm excited. It would have been great to sort of bring the whole 3.5 model family out at the same time. But it is also fun because I think 3.5 Flash sets the bar high. And I think we continue to, I describe it as like pulling the rabbit out of the hat. We continue to take the Pro-level intelligence and then some and stick it in the Flash model.

Starting point is 00:10:05 And I was talking to Oriol and Jeff yesterday, who are some of the leads for Gemini, who actually invented distillation, which is the technique that allows us to take the Pro-level intelligence model and stick it into the Flash model sort of every single time. And it's just crazy that it keeps working. And it brings the cost of intelligence down, which is really interesting. Even though, like, I,

Starting point is 00:10:29 lots of feedback about, like, the cost of Flash so far. And the important thing actually is that the cost of intelligence goes down. The model is, like, one implementation of the, like, SKU capturing the cost of intelligence. And I think, folks, it's a lossy way of expressing the cost of intelligence, but I think net-net, cost of intelligence has gone down with Flash, which is really exciting. What else was launched that, you know, a founder, someone who's trying to be more productive

Starting point is 00:10:58 or make money on the internet should know about? Yeah, I think 20, I mean, obviously the OpenClaw revolution that sort of like took the world by storm is exciting and it's an opportunity. And actually, you know, I love the folks who are working on OpenClaw. It's an incredible open source project. Peter's awesome. But actually for folks, if you've tried that product experience before, it's really tough. Like you really do need, like, you need to be a confident person who's like willing to take

Starting point is 00:11:23 risk and let things. I think Garry Tan sort of described this as like, you know, it's a Ferrari, but you have to also then be your own personal Ferrari mechanic and sort of make the tools work. And so, I mean, I studied computer science at school and like, same, you know, and I didn't feel great, you know, yeah, love what the team has done and stuff like that. But I was kind of nervous. Like my heart rate was going up when I was like installing and doing things. Yeah, yeah. It's, I mean, and you have to, like, take the leap of faith. And so, I think for people building stuff, like, that is in itself an opportunity. And so I think there's sort of two angles to this. One, inside the Gemini app, we sort of launched our sort of flavor of this,

Starting point is 00:12:03 like, always-on 24/7 agent to sort of help you run your business or, you know, come up with your next idea and sort of, it's rolling out to trusted testers this week. And then next week, it'll start rolling out to Gemini Ultra customers, which is exciting. So I think, you know, sending work, throwing work over the fence and having an agent do it is the best thing. And if you want to build experiences like that, we just launched managed agents in the Gemini API. So using the same harness and the same model that's powering Gemini Spark in the Gemini app, we also have that ability for developers to go and build those experiences

Starting point is 00:12:41 themselves. And I'm really excited about this, like the friction to build agents historically and choosing a framework and all of this sort of like iteration loop that you have to do in order to like get quality good enough, et cetera, et cetera, trying to shortcut that for people who want to build and just like not have to deal with the infrastructure, not have to deal with any of the problems, send a single API call. I demoed this on stage yesterday doing like an AI radio show. And I think we're calling like seven different models and doing all of this. I didn't write any orchestration code. I literally just like wrote skills in Markdown and was able to have it sort of go and orchestrate this whole show. So I think it also like

Starting point is 00:13:20 even if you're not the most technical developer and you didn't study computer science and you're just using these tools off the shelf, like managed agents in the Gemini API, I actually think, hopefully, will lower the barrier to entry for people who want to build agents. I think MCPs are coming soon to that product. I think we will have MCP support for it to do tool calling. I think right now it is just like skills in order to sort. So you can like describe and it can do some of these things. But yeah, there's a whole, uh, this I/O is sort of like step one of the managed agent story and there's like 50 other things that are on the roadmap that we need to land in order to make the experience rich and featureful and all that's coming. I mean, that was my big, one big takeaway from

Starting point is 00:14:03 being at I/O. I always thought of this time as the AI era and you guys have been saying it's the agentic era. Yes. And a lot of the new products that I've been seeing that you guys have launched over the last 24 hours have been agentic this, agentic that. Can you just, yeah, I'm curious your thoughts on this agentic era and, you know,

Starting point is 00:14:24 the person who's listening to this is an idea person, right? So they're listening to this and I'm like, okay, how can I use this in my business or create a new business? Do you have any requests for startups or any ideas given that we're now in the agentic era that we can be using Google products

Starting point is 00:14:41 go build. Yeah, no, it's a great question. I think two things come to mind. I think historically, there was this like one-to-one correlation of you spending your time and actual work happening. And I think the exciting thing for somebody who has ideas and wants to build stuff is like asynchronous agents, agents running in the background fundamentally changes that dichotomy of like, it doesn't require you actively in the driver's seat every moment that there's actually useful work happening. And the important thing is, like, this didn't used to work. I think we're all sort of, I was having a separate conversation about sort of every three to six months, this like expectation reset that you have to do as somebody building in the AI era. I think a lot of people

Starting point is 00:15:25 like tried. I'm trying to think of AutoGPT as one example, like three years ago. And that team, I think at the time, like did something really interesting. But like, it didn't really work to do anything useful. It was like a really, really cool and interesting demo. And so I think a lot of us, like, and myself included, I was like, okay, we're, you know, agents are exciting. It's the future. The future was not there at that time. And it feels like we've really crossed the chasm and that future is right now.

Starting point is 00:15:53 So if you've sort of written off that there's a bunch of opportunities, reset your priors, there are opportunities. And folks should be actually building those products because the customer base, this is the interesting thing, is like that customer base, yeah, does not yet actually know that they need an agentic product. And so I think there's some alpha in like how you storytell this. Like it's not clear that like the average person, some of them maybe are looking for an agent. I think there's a lot of people who are like, they have a problem and they just want that problem to be solved. And you can solve it in a way that like really brings

Starting point is 00:16:30 time back into people's day instead of like requiring them to be actively in the driver's seat the whole time. So, and I'll add one more comment, which is doing that in the form factors that people are already familiar with still feels like there's so much alpha in this. Like trying to convince the small business owner to go and adopt some completely new thing that they've never heard of before. That's really good. They're going to have to teach all their employees about and their family about. It's a high bar to pull that off. Teaching the small business owner how to text with an AI assistant or how to send an email, you don't have to do that. They already know how to do that. And so I think you can really use this like existing

Starting point is 00:17:11 technology to meet people where they are at the same time that you like fundamentally introduce new technology to them. Antigravity got a huge overhaul. Yes. We should talk about that. Yeah, yeah. So I think the whole Antigravity suite actually is coming together. I think we introduced Antigravity, I think like six months ago, something like that, sort of as an AI-powered IDE. And I think if you look at where it is now, it's actually an entire ecosystem. So you have Antigravity, sort of the agent manager. If you don't even want to like touch the code itself, like personally, and you want the agent to do all that for you, there's the Antigravity agent manager on web and desktop that you can install. There's the IDE, the same Antigravity product that you were using before, continues to work as an IDE.

Starting point is 00:17:55 You can do that. There's Antigravity, the CLI product. So if you're a developer and you're like, I love the CLI. It's the best thing in the world. They've got that for you. If you want the SDK so that you can actually build agents on your own infrastructure, the SDK exists. If you want Antigravity sort of powering experiences in the API,

Starting point is 00:18:14 because you don't want to manage the infrastructure, we have that in the Gemini API. And so I think the story is like Antigravity sort of as this agentic coding layer, meeting you wherever you are, however you want to build products. Actually, Antigravity is sort of the agent harness powering the always-on Gemini Spark experience for consumers in the Gemini app. So everywhere you go, actually even going to Search, you sort of access Antigravity now.

Starting point is 00:18:43 It's another layer sort of bringing the Google ecosystem together, just like Gemini is as well, which is really exciting. How should people think about Google AI Studio and Antigravity? Yeah, you're making my job easy. I mean, that's a question I have. I honestly, like, you know, it's like, I can tell you how I see it. Please. You know, to me it's like AI Studio just

Starting point is 00:19:10 feels way more comfortable for non-technical people. But Antigravity just feels like more of a fully featured developer-focused product. But that's also changing a little bit. So, yeah, I'm curious how you see it. Yeah, I think it's changing in both directions, which is interesting. I think AI Studio does a lot of different things. We have sort of the playground experience. If you want to test our latest models and agents, you can get your API key to build

Starting point is 00:19:38 with the Gemini API, if you're a human or an agent, which is exciting. And we're now also, we started building this vibe coding experience probably, I think actually 12 months ago, which is crazy. And to see the progress of where we are today, you can now natively build Android apps and sort of share them with users and download them onto your phone. And you can natively integrate with Google Workspace without ever leaving AI Studio,

Starting point is 00:20:02 which is also really exciting. The way that I frame this, and actually, Andrej Karpathy has the best framing of this, which is, it's actually two different things. On one end of the spectrum is vibe coding. On the other end of the spectrum is agentic engineering. AI Studio is going after vibe coding. We want to make it so that you can bring your idea to life,

Starting point is 00:20:21 go from prompt to profitable company without ever actually having to see any code. You could never look at a single line of code. And we should help you do that. You should be able to deploy, get a database, in the future add payments, get a mobile app, do all those things. Never look at a single line of code. On the Antigravity side, I think the problem they're trying to solve is like very much like production-quality code. You want to work in a

Starting point is 00:20:44 million-line code base. You want to build Google. We're using Antigravity to build Google. You want to build Google. The bar is quite high. We're, you know, we are using AI Studio in certain ways to build Google, but like not contributing to like the Google Search code base as an example. And I think there's like a flexibility and control tradeoff versus batteries included. I think Antigravity can do anything. You can use, you know, other models. You can use, you know, you could be using the OpenAI API if you want as part of Antigravity. AI Studio is really focused on the Google ecosystem. And so we're trying to make a bunch of opinionated decisions for you, make it so that you don't have to think about those things actually. And it just all works.

Starting point is 00:21:25 You can sign in with your Google account. It works out of the box. You can build an Android app natively for free, install it on your phone. Which I just found out yesterday, by the way, that you can do that. Yeah. It's crazy. I mean, we just launched it yesterday. So it's a brand new feature that I'm excited about. And Paige and I were talking offstage before yesterday with a bunch of folks. And like, neither of us had ever actually built an Android app before. And it's like opened up this whole new ecosystem that we probably would have. I mean, I love Android and Android's awesome. Like I was probably never going to build an Android app. And so to see, you know, literally in the first day, like tens of thousands of people go and build their first Android app in AI Studio literally for free, I think it's really cool. I mean, so much alpha in just that alone, right? If you think about it, you know, Android is the number one operating system. There's like billions of people using Android. It's crazy.

Starting point is 00:22:15 And you don't need to be technical and you can ship your Android app. And, you know, you need a good idea. You need a good niche. You have to understand distribution. But there's no reason that if you want to build a mobile experience, that you're not just trying it out. Yeah. And one of the intentional decisions on Android that we actually

Starting point is 00:22:35 made was to make it so that these are native Android apps. So this isn't like, you're not using one of the frameworks that sort of lets you do like all of the different ecosystems with a single app. It's intentionally a native Android app. And the reason for that is because we want to make sure you can actually build for all the different form factors. So you can actually, and this doesn't work perfectly well today, but we have this sort of groundwork in place. And this is where we're incrementally getting towards. But you can build for our Android XR wearables. You can build, so like glasses, in the future, in the fall when the glasses land. You're going to be able to build in AI Studio,

Starting point is 00:23:10 build a native app for XR and build XR apps without ever actually writing any code. You're going to be able to do the same thing for the other wearables like the watches, Android Auto. You can build like apps for people in the car when they're on the go. Like all of those things using the same sort of setup and the same agent in AI Studio is really exciting.

Starting point is 00:23:30 And those form factors, you actually don't get using any of those other like compatibility frameworks. And so I think it'll be cool to sort of build these frontier type applications in AI Studio with native Android apps. For people who listen to this, we'll end with this. For people who listen to this, what are your hopes with how, you know, there's hundreds of thousands of developers and builders who are listening to this? What are your hopes for these people who are watching now to go and build? Yeah, that's a great question.

Starting point is 00:24:01 I think the most exciting thing is like what YouTube did for creators, I think, is what's happening now for software. I think there's going to be this entire generation of like software creators where you can just like go by yourself or with a small team, like go and build a business using all of these AI coding tools. And I'm very excited to see similar to what YouTube has done where like you don't need this. Like historically, to build software, you needed to like either get really lucky or raise a bunch of money from people.

Starting point is 00:24:34 And in order to raise a bunch of money from people, the market had to be massive. And so there was all these like really, really interesting problems that people have just decided they're not going to solve because the market's not big enough. And I think those markets are now in play. And I think it's just like it's alpha for the observant person who's willing to do the diligence to like go try to find some of those opportunities. And I think why I love the content that you do is because I think you're actually helping people find those opportunities.

Starting point is 00:25:02 And like that is the next generation of like value that's going to happen, is people going after these things that you wouldn't have obviously gone after five years ago when the cost to build software was so high and you needed a 40-person team to build something. Now it's like one person or a small group of friends and some of these tools and anything is possible. You're off to the races. I love it. All right. Thanks a lot, Logan. This was great. Thank you for having me. And first I/O. First I/O. Hopefully we'll have you for next year as well. And thanks for braving the heat of Northern California.

Starting point is 00:25:39 I appreciate you. Yeah. Take care.
