Limitless: An AI Podcast - Google's Lyria 3: The Music AI that Changes Everything

Starting point is 00:00:00 Just in the last week, Google unveiled some pretty unbelievable new AI software, one of my favorites being this new thing called Lyria 3. So on screen here, I have a music generator that I'm actually going to prompt to make a song for us. What I've loaded up here is "Make me a 2010s Kanye-style stadium anthem about the Limitless Podcast hosted by us." And what I kind of want to do is have it do this in real time while we discuss the product, the software. So Lyria 3, what you noticed on screen there, which was super cool, is you choose the genre you want and you feed a prompt to the model. And after you've fed the prompt to the model, it will go and generate you lyrics, the full production, and a full 30-second song about whatever it is you want. Yeah, I saw a hilarious example of a lady asking her husband to do the dishes, but instead she wrote it as a message and created a song and sent it to Dan, her husband.

Starting point is 00:00:54 And it played just, like, hilarious music. All jokes aside, I think this is more than just a gimmick or a novelty. You're about to see the quality of this thing through the jingle it just generated, but this could replace music studios entirely, at least for 30-second excerpts for TikTok or whatever that might be. Are you ready to listen to it? We have our output ready. Let's go. Yo, turn the volume up.

Starting point is 00:01:15 It's mandatory listening. Josh and Ejaaz on the mic, a brand new vision. 25 minutes of that next-level AI talk. They breaking down the future every single time they drop. This ain't no podcast. It's a movement. Feel the energy. The number one spot

Starting point is 00:01:26 That's the only place we're gonna be. We're living in a limitless world. Yeah, living in a limitless world. The world's best podcast. Going to everybody that is Josh and Ejaaz. They got the future on the track. We're living in a limitless world. Yeah, living in a limitless world.

Starting point is 00:01:41 Whoa, even an outro too. Wow, okay, that's pretty good. Just to be clear, that was generated in about under two minutes. I'd say less than one. Fast, and the production is good. Like, the lyrics sound good, it's like a proper rap song. And I think that's one of the novel breakthroughs with this

Starting point is 00:02:01 technology, is that a lot of these tools that Google in particular has been releasing recently are really enabling creators to have this powerhouse suite of tools that let you do things like create music. If we want to have an intro song, we can actually use that, or you can kind of workshop it and change a few of the lyrics and change the style. But I think what's really fun is that you can actually choose the types of songs that you want to create and you can embed them into whatever content you're making. So there's a second feature to Lyria 3, which is around generating songs from photos. So what I have here is a photo of cold and snowy New York.

Starting point is 00:02:35 We just got a blizzard last night. And I'm going to say, generate me a song that reflects the mood in this very snowy picture. And yeah, I mean, we were under a blizzard warning, locked down last night. Things are really bad here in New York. And maybe it will. Yeah, well, that's right. Ejaaz went to Florida. But maybe what we can do here is kind of synthesize that feeling through music.

Starting point is 00:02:58 So we'll see what it does. This is something that Google is really well known for, which is multimodality when it comes to AI. It's not just a great LLM or chatbot. It understands visuals such as videos and images. It understands sounds. Remember, this isn't the only, I guess, sound model that they've created. They've had live translation with their AI that translates, I think, 40-plus languages into whatever language you want.

Starting point is 00:03:25 They've had experience in this, and the fact that their models now have this intuition and this judgment, this capability to be able to see a snowy day and say, hmm, maybe I'll generate a vibey song for that. It's pretty impressive. It looks like we have it already. Wow, that was under 30 seconds.

Starting point is 00:03:41 Yeah, you're ready to give it a listen? Yeah, play it. Okay. Josh, how do you feel after listening to that? Are your spirits lifted? I feel like I'm listening to, like, a kind of knockoff Demi Lovato, like main character energy type thing. And I like it.

Starting point is 00:04:21 And to your point, I think this is something unique to Google in the sense that because they have these world models, they have that understanding of what they're seeing in a way that I think is more intuitive than a standard text-based language model would have. And another interesting thing that I learned while kind of playing around with this is they have this tool called SynthID, and what it is is a digital watermarking service that's baked into these files that we're listening to, so that if you have an algorithm that can detect it, it can actually know with high levels of certainty that AI was used to generate the music, which is good for copyright and for kind of trademark issues.
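To make the watermarking idea above concrete, here is a toy sketch in Python. It is not how SynthID works (that method is not described in this episode); it swaps in a classic spread-spectrum scheme purely as an illustration. A faint pseudo-random pattern derived from a secret seed is added to the audio, and a detector that knows the seed checks for it by correlation. All names here (`tone`, `make_key`, `embed`, `is_watermarked`) and the `strength` value are hypothetical.

```python
import math
import random


def tone(num_samples, freq_hz=440.0, rate_hz=16000.0):
    """Stand-in 'song': a plain sine wave with amplitude 0.5."""
    return [0.5 * math.sin(2 * math.pi * freq_hz * n / rate_hz)
            for n in range(num_samples)]


def make_key(length, seed):
    """Secret pattern of +1/-1 values, reproducible from the seed."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(samples, key, strength=0.02):
    """Add the key to the audio at a low level (the toy watermark)."""
    return [s + strength * k for s, k in zip(samples, key)]


def score(samples, key):
    """Average correlation between the audio and the key.

    Roughly `strength` if the key was embedded, roughly 0 otherwise.
    """
    return sum(s * k for s, k in zip(samples, key)) / len(samples)


def is_watermarked(samples, key, strength=0.02):
    """Flag the audio when the correlation clears half the embed strength."""
    return score(samples, key) > strength / 2


if __name__ == "__main__":
    audio = tone(100_000)
    key = make_key(len(audio), seed=42)
    marked = embed(audio, key)
    print("original flagged:", is_watermarked(audio, key))
    print("marked flagged:  ", is_watermarked(marked, key))
```

The design point this toy shares with the description in the episode is that detection is a statistical test run by software, not something a listener hears. A production watermark would also need to be far quieter and to survive compression and editing, which this sketch does not attempt.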

Starting point is 00:04:58 And the way it works is it injects this inaudible waveform within the music that we just heard, one that we can't detect as human beings, but should you feed it to an AI, it'll be very easy for it to detect that and let you know that this was AI generated. So pretty cool stuff. I really enjoyed Lyria 3. And I think a lot of people would have fun playing with this. It's free to use, available in Gemini. And yeah, super fun for people who wanted to be producers

Starting point is 00:05:22 or just want to send their friends a funny birthday song. Well, listen, my take on this is that it's actually, I'm going to go back on what I said earlier. It's probably not going to take over music labels anytime soon. But for like V1 or V3, you can imagine what this probably looks like in about six months' time. And if it improves at anything like the rate that any of the LLMs have, this is going to be a pretty insane thing to do.

Starting point is 00:05:43 On the SynthID thing, Josh, I was wondering when you were explaining it why it sounded familiar to me. They do this with AI images as well, where they kind of like mix up a few certain pixels in an image that is generated to tell you that this is AI generated. They do this with text as well, where they subtly kind of like change certain word choices

Starting point is 00:06:04 to do the same thing. My question to you is, or rather to both of us, is can you create models now that just don't do that? Like this is a baked-in watermark that Google kind of put up, right? But presumably you could create a model that just, I guess, illegally copyrights a bunch of this stuff. Well, my assumption is that if you can detect it, then you can reverse what it's detecting. Right.

Starting point is 00:06:26 By using the same tool. So like if an AI can be aware of the watermark, it can probably reverse engineer those few pixels or that waveform that exists and remove it entirely. So I suspect it probably goes both ways, but it is nice to know and nice to have. And I think it's funny because we are following this trend of, again, reverse CAPTCHAs, where we're building things for AIs to recognize and not humans to recognize. And this is another step in that direction. Well, why I like it as well is if you are a music creator that's listening to this or even like an artist that creates images,

Starting point is 00:06:58 you might be thinking, well, they might be stealing my work and I get nothing for it. With watermarks like this, it'll be traceable back to you or your brand, and maybe you end up getting paid out for this in some kind of future system that doesn't really exist right now, and it definitely beats any kind of archaic royalty system that existed before. So I don't know, I just like the technology. I think it's more than just a watermark. It's pretty cool. Yes.

Starting point is 00:07:19 And that is not the only cool new Google Labs thing, because there was another really fun creation that they published through the Google Labs team called Pomelli. Do you want to walk us through what this post says? Yeah. So, I mean, I hate to say it, but it's true: from a single image of your product, you can now generate a photo shoot worth several thousand dollars, which includes people that feature your product, different types of lighting and textures, backgrounds,

Starting point is 00:07:47 all in a matter of seconds. And the reason why that's a crazy thing to say is there's a lot of photographers that spend a lot of their time and expertise creating these photo shoots that now are pretty much out of a job. And I'd also take that a step further and say that there are a lot of fashion models that also may not be able to benefit from this.

Starting point is 00:08:06 But I have to say the feature is so cool. The fact that you can go from a single person sitting at home that's ordered or generated their own product to a full-on advertised kind of website that features these slick different images is just so cool to me. Yeah, and we have some examples of this. I mean, one of them I used for the Bankless website that we use

Starting point is 00:08:26 and posted about it. And it got like a million views because it's really awesome. What it does is you feed it a URL. So if you have a personal brand, if you have a website, if you have any sort of content on you, you feed it the URL, and then it populates this business DNA sheet where it shows you your logo, the fonts, the colors, your tagline, all the values. And it extracts this kind of identity that you can then use to create this marketing material with. And another fun example that we generated with the Bankless stuff was a hat. We had a hat in the merch store. And it's just this very basic, bland hat. But what I did is I fed it through Pomelli and I asked it to do a product shoot. And what it generated on the other side was really impressive. It shows, yeah, we could see here. It has a model wearing the hat that looks like it was shot professionally. The lighting is very cool.

Starting point is 00:09:12 Everything looks real. I can't really, like, if you were to show me this in a magazine or just scrolling on X, I would think that it's a real person. I'm like zooming in here, Josh. Yeah, there's really not many. This guy looks incredibly real. Wow. And this took 30 seconds to generate, all by feeding it the hat. And then it created this really epic product shot.

Starting point is 00:09:30 That was just this fun kind of hero image showcasing the hat, showcasing the logo in the middle of it. And I think to your point, I mean, this is something that a lot of creators would see as a huge barrier, where it costs a lot of money to get these professional-looking shots, to hire models, to have the lighting and the photo shoots. And the reality is that it's really not that difficult to do

Starting point is 00:09:51 if you use this tool. So what I figured we could do is we can try to use Pomelli here live and do a demo for Limitless. So I actually have it loaded up right here. We can run through exactly how it works and create our own business DNA. So when you click "Let's go," it asks you for your website. We kind of have a website, limitless.bankless.com, if anyone wants to go check it out.

Starting point is 00:10:11 And I'll feed it in there. And what you'll see is it takes a few minutes. It'll analyze everything. It clicks through and it starts to understand what colors you use, what images you use, what type of copy you use. And while that's thinking, maybe we could kind of describe a little bit more about how it works and what it's used for. Yeah.

Starting point is 00:10:27 So the rough kind of steps would be: you enter your website URL, and Pomelli basically just scans your entire business and extracts, as you mentioned earlier, something that would be referred to as, I guess, a business DNA. So it kind of judges your vibe. It gets your vibe. And it's a theme of all these different products. Google's models are very good at kind of gauging sentiment.

Starting point is 00:10:48 So it'll understand kind of like the tone of your voice, the color palette, the fonts, the brand identity, stuff like that. And it generates a completely kind of novel marketing campaign that hasn't existed for any other product before, based on the type of vibe that you generally like. And with all these things, I think that, well, for example, in previous jobs that I've worked, Josh, I've worked with massive marketing departments, and a large chunk of their time and money is spent on creating these types of photo shoots, gauging the sentiment. Now, what I will say,

Starting point is 00:11:17 okay, I have to argue the other side here, right? I've seen a bunch of demos of these product shoots. And after a while, you kind of see a similar sense and style. Some of the product shoots and angles are kind of repeated. So if you want to get something unique, you do need to be very descriptive and kind of know and see what you want. And that argues in favor of the product shoot photographers, who already have the ideas and experience, kind of like a movie director. They're not going out of a job because of video models. They kind of know the artistic shots and they just need to kind of upgrade their toolkit, per se. So I think we're going to see something similar to that rolling out. I don't think it's at its final form just yet. I think you're right in the sense that

Starting point is 00:11:57 like, it can get you 80% of the way there, but it's not going to give you that professional look. So for the people who don't have these budgets, this is awesome. For the people who do, I mean, there's clearly a gap still that exists. But you have to assume that gap is going to be filled fairly quickly. So it generated our DNA. And you could see it's already working on generating these automatic images that we could use, perhaps to post on Twitter to get people excited. I mean, it shows mass the LLM power shift.

Starting point is 00:12:22 It understands the topics that we're talking about, the trends, opus, the deep dive context, mastery. This just jogged my memory. This reminds me of the live demo we went through with Nano Banana. And I believe this is using Nano Banana on the backend, right? Google's image generation model. I believe so. So we're using Nano Banana Pro. And we're going to test that by using it for a photo shoot. So what I did is I stole a little hat, and I'm going to drop the hat in here and we can have it do a photo shoot in real time that allows us to generate our own Limitless-branded imagery. So we'll load that up.

Starting point is 00:13:02 We'll change the format to be, let's say, square. And then we could choose the photo shoot template. So maybe we'll choose flat lay, in use, and then we'll do two of these product shots here. And we'll say it looks good and generate. And now while that generates, we can talk about, yeah, maybe the things that are powering this. So one is Nano Banana Pro, but two is,

Starting point is 00:13:19 I mean, this is using Gemini 3.1 Pro as well, which is the new model. Brand new. Brand new model. A lot of new launches. I was also looking at the cost of a typical product photo shoot, Josh. It's between 500 bucks and 5,000 bucks on the high-end scale. And a lot of the images that we're seeing here are on that high-end scale.

Starting point is 00:13:39 It takes a while to kind of like edit in post-production and make things like really crystal clear. So that's super impressive. What I'm also impressed by is this is available in quite a few different regions. So typically when Google launches AI products it's just for the US only, but this is available in the US, Canada, Australia, New Zealand. So think of the entire swathes of people that now get access to a tool that costs them cents instead of these thousands of dollars. It's just so cool. Yeah. It's amazing. And I think it's a testament to the direction that they're going towards, which is just kind of complete and total value creation for anyone out there.

Starting point is 00:14:14 Yeah. Like, they have their coding agent. They have their IDE. They have their video generation with Veo 3. They have audio generation now. They have brands and brand DNA, I guess, marketing and understanding of that. And there really is no industry that's safe from these tools. And as these tools continue to get better, they're just going to become way more proficient and way more optimal as it relates to kind of not replacing these jobs, but enhancing and augmenting people's abilities that work in these spaces. But let me ask you this question, Josh.

Starting point is 00:14:44 Do you feel that way about all these products in unison today? Because my take is it's great individually, but I want an entire suite that can manage all of this for me, and they're working towards that, right, with Google AI Suite? Yeah, there's going to be a single comprehensive tool that has all of these things. And it's exciting. So we have the outputs here, which I didn't load the logo onto and I didn't ask to load the logo onto. So perhaps that's why our logo isn't actually on there. But I mean, to your point, you can see the Bankless one that I showed you earlier looks very similar to this one.

Starting point is 00:15:17 And the model actually looks kind of similar to the last one. And you can see there are some, I guess, kind of restraints and limitations as it relates to the quality of the outputs. But to have something this good, this quickly, this easily, it's really impressive and it's a fun tool to use. And like we mentioned earlier, under the hood, Gemini 3.1, it just came out last week. There's a lot of good and bad news about it. So maybe we should talk about that next. Yes. Okay, let's get back to being suited and booted. Gentlemen, we have to get into the LLMs. Google dropped a brand new model. And on paper, it's pretty damn impressive.

Starting point is 00:15:58 What I've got showing on the screen right here is the ARC-AGI-2 benchmark. Now, just for context here, there was an ARC-AGI-1, but the models became so good that they needed to create a brand new test to make sure that these models were actually getting more intelligent. So ARC-AGI-2 is a benchmark that was set. And I just want to focus on a little difference here. And by little, I mean a lot. That's more than double.

Starting point is 00:16:24 That's more than double. It's a 46 percentage point increase from Gemini 3 Pro. And I want to point out that this is a 0.1 version update. We've gone from 3 to 3.1. This isn't even Gemini 4 yet. And we've seen such a major leap in intelligence and reasoning. And the reason... I just looked this up, actually.
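A quick aside on the arithmetic: a gain in percentage points and "more than double" are different measures, and the two claims together pin down the starting score. The sketch below uses only the 46-point figure quoted here; the baseline scores are hypothetical, chosen to show where the doubling threshold falls, and `relative_gain` is an illustrative helper rather than anything from the benchmark itself.

```python
def relative_gain(baseline_pct, gain_points):
    """Ratio of new score to old score for a gain given in percentage points."""
    return (baseline_pct + gain_points) / baseline_pct


GAIN_POINTS = 46.0  # the percentage-point increase quoted in the episode

# Hypothetical baseline scores, for illustration only.
for baseline in (10.0, 30.0, 46.0, 50.0):
    ratio = relative_gain(baseline, GAIN_POINTS)
    verdict = "more than double" if ratio > 2 else "not more than double"
    print(f"{baseline:.0f}% -> {baseline + GAIN_POINTS:.0f}%: "
          f"{ratio:.2f}x ({verdict})")
```

In other words, a 46-point gain more than doubles the score only if the earlier score was below 46%, so the two statements are consistent as long as Gemini 3 Pro started under that mark.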

Starting point is 00:16:43 Just real quick. It came out on November 18th of 2025. So this is less than four months. The iteration cycles for these model updates are getting astounding, to be honest. We've spoken about Claude Opus 4.6 and GPT-5.3 Codex. These are the new coding models from Anthropic and OpenAI respectively. Those released within a week, sorry, within an hour of each other. And their previous model version updates were three weeks prior to that. So the fact that these models are improving at a rapid rate doesn't surprise me. The secret sauce behind that is these models are most likely working on themselves, i.e., they're reviewing their own code and updating themselves, which is a scary topic, which we'll get into another time. But back to Gemini 3.1 Pro. It excels in two major ways.

Starting point is 00:17:32 Number one, what you're seeing on the screen here, ARC-AGI-2: reasoning. So it understands what you're saying. It goes deep into the weeds of your prompt and gives you a really well-thought-out answer. Again, on paper. The other thing that it's really good at is coding. It's insanely good at generating visuals specifically using code and understanding the physics of simulations that you create.

Starting point is 00:17:57 But I have to point out that there's been a bunch of bad feedback from people actually testing these things. So what I've just explained to you has been on paper, but in practice, people observe that the reasoning actually kind of defeats itself. It starts getting into these thinking loops where it starts doubting itself, questioning its own answers, and you end up waiting 10 minutes for what would otherwise be a very simple answer. And then other people's experience is that when they use the web app version to access Gemini 3.1, it's actually a much more reduced model that just agrees with them. So it kind of sounds a lot like GPT-4o, which is known for being very agreeable, and which OpenAI actually decommissioned last week. So there's a lot of mixed reviews for this. There are a few cool examples from the big man himself, Jeff Dean. He's showing us an example here of Gemini 3 Pro on the left and Gemini 3.1 Pro on the right.

Starting point is 00:18:53 As you can see, the SVG generations are just a stark difference, right? The physics is super cool. The animations, look at the backgrounds on these things. It's so cool. This is a really cool demo of someone using it to do urban planning. I'm going to mute it so I can show you

Starting point is 00:19:09 the start over here. It's just super cool how it can map, track, and figure out spatial awareness. It's a really spatially astute model, which is just super cool for creating kind of games or any kind of simulation-related demos. But yeah, the actual practical feedback from people hasn't been that great. It's funny to see the evolution of these models in between major releases. Because, I mean,

Starting point is 00:19:30 what happens during these incremental releases is they're not doing this huge pre-training loop where they're building an entirely new model. They're taking the existing core model and they're kind of building on top of it. And what you find is that these incremental models become spiky in some different areas. So some are much better at some things, but the tradeoff is that they get worse at others. I mean, a good example we had is GPT-4.5, I think it was, which was trained to be an excellent writer, but it was actually so bad at everything else and so expensive that they had to deprecate it. And I think what you see throughout these kind of incremental evolutions, 3.1 being one of them, is that they're trying to improve them, and they will improve them in some areas,

Starting point is 00:20:08 but there are some unexpected areas in which the net quality actually declines. So each one has its own personality, it has its own skill set. Net net, they're better because they've been refined and they've been distilled down into just a highly refined model. But there are going to be those downsides that we see. And I think that's just the natural part of these incremental releases, kind of seeing how these models evolve over time from that huge core base model.

Starting point is 00:20:34 My major lesson from this, I think which is one that we've learned a while ago, is just don't trust the benchmarks, just trust your own experience. And obviously, different people use these models for different things. Do I think this is the end of Google? No, but they did go on a gargantuan winning streak, and this may have subdued them just slightly. The good news is I hear that Gemini 4 is cooking up an absolute beast. And what I like is that they're integrating,

Starting point is 00:20:59 like, what I like is that Google's integrating a lot of these models into a singular interface. I mentioned this earlier. The one thing that annoys me is I need to go one place for Nano Banana. I need to go another place for Veo 3, their video generation model, and then I need to go another place to access Gemini 3.1. Sometimes you can access it through the API. Sometimes it's through the web app.

Starting point is 00:21:17 Sometimes randomly it's for free through Google search. And so I'm like, I just want one place to go to. I'll give you my money. Just let me combine all these things into one experience. And that's all I want for like a single price, a single Netflix subscription. And they're moving towards that phase, which I think is great. And so my guess is maybe they kind of dropped the ball here because they're focusing on all that other stuff. The other part I want to mention is Google's distribution is,

Starting point is 00:21:44 just insane. Just through the other two features, Lyria and Pomelli, they've been able to distribute these to tens of millions of people on day one. And that feedback loop cannot be overstated. So whereas Google may have this initial reaction, they have been the quickest to bounce back. Don't forget what Google was two years ago, when they had the worst AI model. And now they're like fighting for the throne, pretty much. So I wouldn't count them out. It's cool. Yeah. It feels like the Adobe Creative Cloud suite on steroids for Google and for creators. And I think that's something I can get really excited about. And what you saw with these demos today is it's real technology that's here today and is improving very, very quickly.

Starting point is 00:22:23 So I'm excited for 3.2, 3.3 and then 4.0 whenever that comes, which I'm sure is going to shock the world. But that wraps everything up for our fun little demo day. I would encourage you to go and try these things. I mean, a lot of them are just available for free, which is pretty cool. You can just go and test them out, create a song, send it to your friend, roast them, whatever it may be. It's a fun way to kind of help creators, help marketers. These tools are awesome and they're coming so quickly. So if you enjoyed, please don't forget, share it with a friend. Subscribe to our newsletter. It comes out twice a week, every single week. Last week we had our biggest week ever. We never had more views than last week. So the mission is working, and it's thanks to you guys for sharing it

Starting point is 00:23:01 with your friends and liking it and commenting, rating five stars on your favorite podcast platform. Ejaaz, any final notes before we sign off here? I have one small piece of, sort of, feedback. I have a small piece of homework for those of you who are up to it. I'm desperate to hear your music creations or even your product photo shoots. So try and find Josh and me on Twitter. We'll link

Starting point is 00:23:24 to our handles or profile pages below. And DM us. I want to hear some of your creations because I'm genuinely interested in how creative people can get. And maybe it's a breakup text or maybe it's telling your girlfriend to, I don't know, make something for you by

Starting point is 00:23:40 the time you get home. I don't know. I'm going to try it out later and see what it's like. But yeah, let us know your questions. I want to see what you guys are up to. Awesome. Well, thanks so much for watching, and yeah, we'll see you guys in the next one.
