Everyday AI Podcast – An AI and ChatGPT Podcast - Ep 530: Google I/O AI Updates: 15 new features and how they can grow your business (Pt 1 of 2)

Starting point is 00:00:00 This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live in Adobe Firefly, the All In One Creative AI Studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. Google has come a long way in a very short period, which seems weird saying that about one of the biggest companies in the world.

Starting point is 00:00:54 But when it comes to the AI race, let's be honest, about 15 months ago, I don't even think Google was in the top three. When you look at Microsoft and OpenAI and Anthropic, I think about 15 months ago, Google was actually in fourth place. But now, without a doubt, Google is the absolute leader in the generative AI and large language model landscape. And what they just announced at their I/O conference is nutty. And I think, if nothing else, it really just cements Google's place, at least right now, as the leader of the pack.

Starting point is 00:01:39 We'll see how and when everyone else responds, but at least for right now, Google is just cooking when it comes to AI. And they released dozens of notable AI updates. And on today's show, well, on today's and tomorrow's shows, we're going to be breaking down what I think are the top 15 most useful. So yeah, we're going to have a part one, which is today, and a part two, which is tomorrow. But we're going to be going over the top 15 most useful AI updates out of Google I/O for everyday business leaders such as yourself. All right.

Starting point is 00:02:15 So I'm excited to dive in. I hope you are too. If you're new here, what's going on, y'all? My name's Jordan Wilson. I'm the host of Everyday AI. And this thing,

Starting point is 00:02:24 it's for you. This is your daily live stream podcast and free daily newsletter, helping us not just keep up with AI, which is very hard, but how we can actually use it to grow our careers, grow our companies. So is that you? Did that hit home?

Starting point is 00:02:39 If so, well, you're in the right place. This is your home. It starts here on the unedited, unscripted, live stream podcast. This is where you learn, but where you're actually going to leverage this and put this to use is on our website at youreverydayai.com. Because once you're there, you can sign up for our free daily newsletter. We're going to be recapping today's show. But we also keep you up to date with everything else happening in the world of AI. And yeah, even though Google is sweeping the headlines, there's still a lot more happening.

Starting point is 00:03:06 And then also on our website, you can go and listen to for free, sorted by category, more than 500 past episodes. Whatever you're trying to learn, we've already spoken to the experts. It's all there already. All right. So normally we start out each livestream with the daily news, but let's be honest,

Starting point is 00:03:25 Google is the AI news today. All right. So I'm excited for today's show. What's up? Live stream, fam. It's good to see you. Yeah. If you listen on the podcast,

Starting point is 00:03:35 maybe sometime drop by at 7:30 a.m. Central Standard Time. You know, when we have guests on, what other place can you go and ask questions live to the smartest people in the world on AI? Today is just me. Sorry. But what's up, live stream, fam? So Christian, joining on YouTube.

Starting point is 00:03:53 Good to see you. Brian and Michelle, Dr. Harvey Castro, big bogey, everyone else. Daddy, good to see everyone. Let's just not tease you anymore. Here's at least the first half of our top 15 AI updates from the Google I/O conference for everyday business leaders, such as yourself. Here we go with 15 through 8. Number 15, Imagen 4.

Starting point is 00:04:17 14, Chrome with Gemini integration. 13, personalization in email. 12, NotebookLM updates. I can't believe that didn't make the top 10. 11, Gemini Diffusion. A whole new type of large language model. 10, real-time translation in Google Meet. 9, Gemini app updates.

Starting point is 00:04:38 And 8, Gemma 3. And yeah, that's a lot. And y'all, I didn't miss anything. We still have our, you know, number seven through one. But here's something that didn't even make the list, all right? And if you've been following the AI news over the past, you know, I don't know, 12 to 20 hours, these are big advancements that didn't even make our top 15 list. All right.

Starting point is 00:05:03 Gemini Code Assist, SynthID Detector, Lyria 2, the virtual try-on in shopping, Google Beam, which is enormous news in and of itself, formerly called Project Starline. Jules, the new autonomous coding agent, the A2A (agent-to-agent) enhancement. So yeah, when I say that there were dozens, I literally had to scratch my head and look at my list of like 50 and say, what are the top 15? Right? So very hard to do. Very hard to do. All right.

Starting point is 00:05:35 So there's probably some big things. You're like, wait, where are some of these big, uh, ones? Well, those are tomorrow, right? You'll notice I didn't even say the word Gemini 2.5 there. A lot of updates there. Or Veo 3. Yes, Veo 3, which is shocking. All right. So we're going to be going over those and a lot more tomorrow. All right. But let's stick on our top 15 for today. Hopefully a concise show, or more concise show than normal, for you all, instead of doing an hour and a half show or something like that. We'll try to keep this one short. All right, first,

Starting point is 00:06:13 Imagen 4. So this is Google's updated text-to-photo platform in Imagen 4. It's really good. For our live stream audience, you can probably see. If you're listening to the podcast, nothing overly visual or overly instructive today. But, you know, maybe you want to check out what's on the screen. You can always do that by checking out your show notes

Starting point is 00:06:37 and, you know, go on our website and watch the video, but look at this image. This looks beyond real, right? So this is a young girl here, a young woman, looks like, in a dorm room, maybe with pink hair and earrings and, you know, kind of a grungy T-shirt with light filtering in, you know, through the window. It looks like an amazing photo that was captured with a high-end DSLR. This does not look AI generated in the least bit. Let's just start there. It is as somewhat.

Starting point is 00:07:12 And I don't talk too much about my background here. I just realize, y'all, I just realize, I don't even have my mic plugged in. This is how much work I was doing and maybe how sleep deprived I am. So live stream audience, give me a second. Let me know. Let me know if you can hear me now. Hopefully you can. Can I get a thumbs up from the live stream audience?

Starting point is 00:07:44 I didn't have my mic plugged in, but it must have been picking up somewhere else on my computer. All right. So thanks to my computer for still delivering some type of audio, even though my mic wasn't plugged in. All right. Hopefully, hopefully y'all can hear me. All right, let's keep it going.

Starting point is 00:08:02 So this is good. So Imagen 4. Let's talk a little bit about what's new. Thank you. Thank you, Marie and Laura, for letting me know you can hear me. Appreciate that. Okay. So here's what's new in Imagen 4 and what it is if you haven't heard of it.

Starting point is 00:08:23 So maybe you've heard of Midjourney. You know, there's the new kind of viral GPT-4o ImageGen inside OpenAI. You know, there's a lot of these AI photo generators, you know, Stable Diffusion, Flux, there's, you know, a good five to 10 pretty good ones. I'm going to be interested to see where Imagen 4 lands on the benchmarks. So in the same way we talk about the LM Arena, which is kind of a blind taste test for large language models, they have that for image and video models as well.

Starting point is 00:08:53 So I'll be interested to see where Imagen 4 lands on the list. But from early eye tests, and as someone, I was a photographer, you know, kind of before in my earlier life, I've probably taken more than a million, yes, more than a million photos with a DSLR. So I would say my eye is a little more trained than the average eye when it comes to looking at things like photorealism or even being able to decipher what's real and what's not. And I will tell you, Imagen 4 images are otherworldly good.

Starting point is 00:09:30 You know, in the same way, you know, Midjourney V7, very good. But geez, these Imagen 4 photos. So good. So good. All right. A little bit about what Imagen 4 is, what's new, when it's rolling out, all that good stuff. So this is Google's latest and most capable image generation model with improved detail and text rendering within images. That's a big thing. Midjourney can't render text. And they kind of said, yeah, we don't really care about that. This is good. The ability to render text. Yes, GPT-4o ImageGen does great at rendering text, for whatever reason you may want. Right? So maybe you want this person wearing a shirt to have a T-shirt that says, you know, the name of, you know, University of Illinois or something like that, or Chicago, right? Some AI image generators struggle with that.

Starting point is 00:10:15 Imagen 4 so far does a really good job, like GPT-4o ImageGen does. But in terms of photorealism quality, Imagen 4 is pretty good. And by pretty good, it might be the best out there. Time will tell. So it's rolling out now in the Gemini app. Also, this is pretty interesting. It's coming to all of Google's different products. So Google Docs, Slides, and other Workspace apps. So yeah, I don't really use Google Slides, but now I'm like, okay, there might be some use cases where, you know, I might want to or maybe might need to in some instances, right? So this is going to be included in the new Google AI Pro and Ultra subscriptions.

Starting point is 00:10:59 We're going to be talking a little bit more about that tomorrow. But for these things to make sense, you have to know, previously, you know, Google had a couple of tiers, right? There was a free tier, and then there was Gemini Advanced. And in typical Google fashion, they're confusing as all. So now there's still obviously a free tier. The new $20 a month plan is called Gemini, or sorry, Google AI Pro. I'm already getting confused. Google AI Pro is the base $20 a month plan.

Starting point is 00:11:26 And now you have the Ultra, which is ultra expensive at $250 a month. Technically, $249.99. And I think for the first three months, it's like half off. But, you know, the base plan is going to be $250. So this is already rolling out to those people who have either of those two subscriptions. So like I said, some of the standout features here: significantly better text rendering in images, enhanced photorealism, improved handling of complex prompts, so prompt adherence.

Starting point is 00:11:56 There's inpainting and outpainting capabilities. So if you want to change something inside the photo, you can do that very easily. If you want to extend a photo, right? So whether it's a photo you start with or a photo that you create inside Imagen 4, you can outpaint or extend it to bring in more of the scene that was actually never captured originally. And also this supports a range of aspect ratios up to a 2K resolution. So there is a coming soon for this, a faster version. I don't know if they're going to call it Turbo, but

Starting point is 00:12:29 apparently, it's going to get 10 times faster fairly soon. So what the heck could you use this for to grow your business? Well, first, get rid of those ugly stock photos on your website. They look horrible, right? Also, starting from here, if you're creating any videos for social media, anything like that, start with an Imagen 4 image, right? Yes, start with images. If you're doing AI video, it turns out better.

Starting point is 00:12:56 But there's no shortage of ways that companies can just use visuals. Chances are everything you're using, whether it's for internal or external purposes, is either extremely old, extremely boring, or a combination of both. All right. Number 14, Chrome with Gemini integration. All right. So what this is, well, the Chrome browser is finally going to get a little smarter.

Starting point is 00:13:21 All right. So I can't pretend that this is some groundbreaking new feature. It's more of like, oh, it's about time, because let's call a spade a spade here, card players. Microsoft, in their Edge browser, which is actually freaking fantastic, it's based on Chromium, right? So all your Chrome extensions, everything like that, will sync over. Microsoft Edge has had this for like a year. Not all the capabilities, but they've had a built-in Copilot for like more than a year. And that's why I use Edge a ton. But about time, we're going to be getting Chrome with Gemini

Starting point is 00:13:59 integration, more than just being able to summarize web pages and things like that. But it helps you also with web browser tasks. So this is also, you're going to have to be on a paid plan, and you can summarize web pages, it can help explain complex information, answer questions about page content. And eventually, here's the eventually, and why maybe it's for paid subscribers and not available for everyone for free. Eventually, it will be able to help you navigate websites autonomously, which is

Starting point is 00:14:29 pretty big. That's been a big shift over the last even month or two. A lot of companies, the Dia browser from The Browser Company, Perplexity coming out with a Comet browser, even Microsoft Edge with their Vision feature, you know, built in, you can see web pages. So the ability for browsers by default to perform tasks is not some future, you know, sci-fi. This is, it's already available. But it's been like wildly popular the past like three months. So this Chrome with Gemini integration will eventually be able to do that, at least Google says. So what are some business use cases for this? Well, pretty straightforward. Number one, it's going to help you summarize web content faster, right? Which if you haven't already

Starting point is 00:15:16 just been doing that in Microsoft Edge, I told you about it like, I don't know, a year and a half ago. And I'm like, start doing this. So nothing's super new there. But obviously, the ability for Chrome to perform actions on your behalf without having to launch a separate agent, pretty big in terms of time savings, winning back time, all that good stuff. McDonald said this is very impressive. Oh, talking about Imagen 4. So he says, art director for 20 years, and things like Imagen 4, very impressive. Yeah, I agree.

Starting point is 00:15:51 Like I said, I've taken more than a million photos with a DSLR, getting paid to do so, and it's really, really good. Adobe just introduced an entirely new way to create, bringing the power and precision of its creative suite into one conversational experience. Meet Firefly AI Assistant, now live in the Adobe Firefly app, the All In One Creative AI Studio. Powered by Adobe's creative agent,

Starting point is 00:16:23 Firefly AI Assistant lets you start with your vision, just describe what you want, and shape the outcome as it takes form with the Assistant. The Assistant orchestrates multi-step workflows, drawing on 60-plus pro-grade tools across Adobe Creative Cloud apps, including Photoshop, Illustrator, Premiere, Lightroom Express, and more to help bring your ideas to life. You can also get started with creative skills, a growing library of pre-built workflows for common creative tasks, like batch editing photos, creating mood boards, portrait retouching, and creating social variations.

Starting point is 00:16:57 Every step the assistant takes is visible so you can refine, redirect, or take over at any time. You stay in the driver's seat as the creative director. Adobe Firefly AI assistant now in public beta. See it today at firefly.adobe.com. All right, that's number 14. Let's go to number 13, personalization in email. So this is an actual, not what's on my screen, but this personalization in email was one of the things that actually Google CEO,

Starting point is 00:17:31 Sundar Pichai, actually talked about during his keynote, which I found interesting because when there's literally dozens of updates that are huge. Personalization in email at first, I'm like, okay, this is no big deal. But when you look at some of the,

Starting point is 00:17:48 the marketing materials, again, there's obviously a huge gap between what's being marketed, what's being promised, and what actually happens, right? And Google is getting much better, although their original track record on this,

Starting point is 00:18:00 a year and a half ago, two years ago, no, no, no, now they're just shipping, right? So I actually do have a high degree of confidence, A lot of these things are going to be shipped on time.

Starting point is 00:18:10 But the personalization in email, something, again, Sundar Pichai mentioned in his keynote address with his limited time on stage. So for our live stream audience, you kind of see an example here. You know, so there's kind of this blue area that's shaded, a green area that's shaded, and then a yellow area that's shaded. And it's showing you how Google and Gemini are going to be able to use personalization based on your context. Right? So it's not just those auto replies, right? Which had been in Google Gemini for a long time, and I don't really use them because I don't think they're good. This, when and if it gets released, will be actually really good. So as an example, you know, the things in blue, it's basing part of an email reply on your own writing style. So it goes and it sees how you respond to email.

Starting point is 00:19:00 So the type of words that you use, the format, is it long, is it short, etc., right? So it bases it, number one, on your writing style. Number two, pulling in context from your past emails, which is obviously important, right? We want AI to be smarter. And then also, in the yellow portion there for our live stream audience, it's based on files in Google Drive. That's the part that I'm like, holy freak, this is really good. So in this example, right, someone's asking about a package or a service this company offers, and it says, our pampering packages range from $90 to $230 depending on your dog's size and the specific services you were looking for. So, you know, it's pulling in that information based on a Google Drive file, according to, you know, what Google released here. So that right there,

Starting point is 00:19:55 extremely impressive. Personalizing emails based on your writing style, based on past emails, based on files in your Google Drive, when and if this happens, I'm going to love it. You know, I won't have, I'm embarrassed to do this live, but I'm going to tell you guys the truth, all right? I get just bombarded with emails. You know, somehow people find my personal emails, the emails for the podcast. Mainly it's just a bunch of people wanting to push, you know, their sometimes-garbage, you know, AI products and services to you all.

Starting point is 00:20:36 And I say no to many of them, but there's some great people that land in the email. But, you know, already today, I have dozens of emails. And most of them are unread because right now the Google Gemini, you know, abilities are not good, you know, to reply to emails. So when this happens, oh yeah, I was going to look. So I have 2,328 unread emails. I hate email.

Starting point is 00:21:01 I hate it, right? I get too many emails. It takes too long to respond because, number one, I have to do these three things, right? I have to write it in my own style, right? I don't want people to think I'm using AI, even though I will end up using AI, right? You know, I need to pull in context from past emails. And, you know, in many instances, people are asking, hey, I want to sponsor the podcast. I want to do this and this.

Starting point is 00:21:23 Will you come speak at our event, right? I have all that information in different Google Drive files, but sometimes I forget it. So it takes a lot of time to go and do those three things. So this personalization piece will be huge. So this is launching in Google Labs. So you have to sign up for Google Labs. It's a free program. That's essentially where you get beta access to certain tools and features.

Starting point is 00:21:47 So right now it's saying it's launching via Google Labs in July of this year. Initially, it's going to be on the web only. You know, so you can't use this inside different apps. And it's going to be English only at first. So I'm excited for that. And the business use cases for that are obviously off the charts. Cecilia, I 100% agree with what Cecilia says.

Starting point is 00:22:15 Cecilia says email is the bane of every professional. So anything that helps is more than welcome. Absolutely. Absolutely. And I do know, you know, spending a little time on Twitter last night looking at all the new releases and everything else. Logan Kilpatrick, who I've had on the show a couple of times. He leads product for Google AI Studio. And he did mention that the email priority is extremely high, because someone's like, yo, is this actually

Starting point is 00:22:48 going to happen? He's like, yes, it is going to happen. So, you know, vote of confidence is there from Chicago's own Logan, who's been on the show a couple of times. So yeah, I'm really looking forward to this one. Hopefully it does come out in July. Heck, Google, I'll even take 2025. Please give us a working version of this in 2025 and the business world will be crying. Tears of joy. Next, hey, tears of joy. If you're a NotebookLM user, you're going to like these updates. It's actually crazy that this didn't make our top, you know, seven for tomorrow's show. But here's what's new in NotebookLM. If you don't know NotebookLM, it won our 2024 AI tool or model of the year award. And it wasn't even close.

Starting point is 00:23:37 NotebookLM is an amazing piece of technology. It is powered by Gemini 2.5 now, whereas previously it wasn't. So that just rolled out at Google Cloud Next about six weeks ago. So if you haven't used NotebookLM recently, you should go use it now because it uses a hybrid thinking model. So it's even better than it was before, but it is grounded in your data. So as an example, let's say I load it up, which I literally did for this show, I loaded it up with a bunch of information about Google I/O updates.

Starting point is 00:24:08 And I ask it about deep dish pizza. It's going to be like, can't respond, don't know. So it is grounded in your data. It only works with what you give it, which is huge for trust, transparency, and being able to use something with accuracy, knowing that there's likely not going to be any hallucinations. So some of the cool things is video is going to be coming out, which is going to be wildly fun. All right.

Starting point is 00:24:36 So not a ton of updates yet, but there's kind of these multimedia features. One is the audio overview, which is essentially a deep dive podcast. It makes a podcast between two hosts that sound very real, right? And many of you probably feel like you even know those two AI hosts, right? Because you listen to them all the time if you're like me. So you are going to be able to change the default time to either five minutes, 10 minutes, or 20 minutes. So the default is 10 minutes.

Starting point is 00:25:05 If you click shorter when you go to customize audio overview, that's about five minutes. If you click longer, it's about 20 minutes. So that's great. I was able to already kind of do this with some simple, you know, quote unquote prompt engineering, which is just, you know, instructing it over and over for a time or giving it more complex request when asking it to customize to get it longer anyways.

Starting point is 00:25:28 So yeah, there's going to be some simple video generation based on your files, which I'm excited to see what that looks like. And then like I said, the ability for 5, 10, and 15, or sorry, 5, 10, or 20 minutes for the audio overview. So also, you know, they did update this to 50 languages a couple of weeks ago. So I don't think the video overviews, FYI, they're not going to be like Veo 3 quality, right? Something that you would, you know, produce and, you know, go say, okay, this is going to be our new explainer video for our business. I don't think that's what we're looking at here. What we are looking at is more

Starting point is 00:26:14 of a fun and kind of cutesy way, at least the kind of the examples that they showed. Those were more, kind of, I would say animated, right? Like more retro-esque graphics, which is fine, but great for explaining more complex topics, which is something I use NotebookLM for anyways. So yes, when you're talking about business use cases, this probably isn't something that you're going to export and go put on the front page of your website.

Starting point is 00:26:42 But I don't know, maybe it will be, or at least something that you might put on social media. I could see that as well. So yeah, a couple new updates there. Also, there's obviously higher, much higher limits for Google AI Pro and Ultra subscribers, although I think even the free limits for most people on a free plan are more than enough. All right. Our next one, this one's interesting. A Gemini Diffusion model.

Starting point is 00:27:19 Okay. This is pretty big. This is pretty big. So, uh, this is not a transformer large language model. So, uh, diffusion. How do I explain this? It's almost like a live denoising process. All right.

Starting point is 00:27:42 So Gemini and most large language models are quote unquote traditional transformer models, right? A very advanced next token predictor, right? Kind of, you could say, in theory working from left to right, whereas a diffusion model, it kind of just starts with noise and then it updates the whole thing. This is a very non-technical description, right? But this is an experimental text model using diffusion techniques, which like I said, diffusion models are inspired by image generation methods. And this is to refine answers with exceptional speed. So I have the example up here. And what Google's kind of going to be releasing this for initially is for things that are more finite, right?

Starting point is 00:28:33 Things like math and coding, because that's what I think diffusion models might be better at. You might be saying like, okay, like why? Why do we need a diffusion model? Well, how about for speed? So Google says their early testing shows four to five X faster, four to five times faster on math and coding tasks compared to comparable, you know, non-diffusion models. So this is a completely new technology, but if you do use large language models for coding, STEM, you know, specifically math tasks, I think it's going to be great. So right now it's in limited preview and there's a wait list.
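Editor's sketch: the contrast described above, one token at a time from left to right versus starting from noise and updating the whole draft in a few passes, can be illustrated with a toy Python example. This is not how Gemini Diffusion works internally. Both functions are hypothetical, and both "cheat" by copying from a known target sequence instead of calling a model, so the only thing the sketch shows is how the number of passes differs.

```python
import random

MASK = "_"  # placeholder for a position that is still "noise"


def autoregressive_decode(target):
    """Toy left-to-right decoder: one pass per token."""
    out = []
    steps = 0
    for token in target:
        out.append(token)  # stand-in for "predict the next token"
        steps += 1
    return out, steps


def diffusion_style_decode(target, num_steps=4, seed=0):
    """Toy parallel refiner: start fully masked, fill in the whole draft over a few passes."""
    rng = random.Random(seed)
    draft = [MASK] * len(target)
    hidden = list(range(len(target)))
    rng.shuffle(hidden)  # positions are resolved in no particular order
    steps = 0
    for step in range(num_steps):
        remaining = num_steps - step
        share = -(-len(hidden) // remaining)  # ceiling division: even share per pass
        for _ in range(share):
            pos = hidden.pop()
            draft[pos] = target[pos]  # stand-in for "denoise this position"
        steps += 1
        if not hidden:
            break
    return draft, steps


if __name__ == "__main__":
    tokens = "def add ( a , b ) : return a + b".split()
    print(autoregressive_decode(tokens)[1], "passes, left to right")
    print(diffusion_style_decode(tokens, num_steps=4)[1], "passes, whole draft at once")
```

In this toy, 12 tokens take 12 passes left to right but only 4 refinement passes. Whether that turns into real-world speed depends on how expensive each pass is, which the sketch does not model.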

Starting point is 00:29:18 So like I said, this is a very novel approach to applying diffusion-based methods to language models, which have not been used before. And it's really just focused on solving complex reasoning problems. So this is less about, you know, creating long form blog posts and more about working in areas that usually have more of a right or wrong answer and less about using them in areas where there's a ton of gray, if that makes sense. So like I said, some business use cases, if you're in anything with coding math, and if you're already finding a ton of utility by using, you know, Google Gemini or other large language models,

Starting point is 00:30:00 but, you know, maybe you need more speed. This could be it, right? So it's a completely new technology, diffusion for text-based large language models. Diffusion technology has been out there and been wildly popular for image models, right? So it's essentially denoising. So if you ever watch an AI image be generated in real time, which many of us do, because you go and, you know, whether you're using GPT-4o ImageGen or, you know, you're using Imagen or Midjourney, right? It starts.

Starting point is 00:30:31 You can watch it go live, right? So whether it's five seconds or a minute. And you see it kind of transform. It starts with this blurry, noisy outline. It's like a bunch of blobs. And then slowly it comes into focus. So that's kind of like what a diffusion model does versus kind of going left to right, next token prediction on steroids. So pretty interesting here with the

Starting point is 00:30:53 Google, or Gemini, diffusion model. All right. We have three, a couple, couple more here in our part one of our top 15 features. Okay. So real-time translation in Google Meet. So this is really cool. And like I said, this is technically nothing groundbreaking. Microsoft Copilot has already had this

Starting point is 00:31:22 for certain users, right? So Microsoft Copilot has had a version of this for their Teams meetings, but you did have to have a certain Copilot+ PC. So you had to be able to do this locally on your device. So Google is bringing this to the cloud. So what is this? Well, it's very limited right now, but very cool. So it is live speech translation

Starting point is 00:31:52 during video calls that works like having a human interpreter present. So in terms of availability, initially, it's only going to be available for people on the $20 a month Pro or $250 a month Ultra plan. And at least right now, it's only going to be Spanish and English. But Google did say there's more languages coming soon. So essentially, it translates this in near real time with natural voice synthesis. So if I was talking to, you know, some of my wife's family in, you know, Bolivia or Chile, we could talk to each other, right? And I would speak in English and it would

Starting point is 00:32:37 use a voice that kind of sounds like mine in real time, translate what I'm saying to Spanish, and then translate what they're saying from Spanish to English. And at least from the demos they showed, there's not a huge delay, right? It literally sounds like a world-class human translator or interpreter, right? Like, you can't really tell any lag, right? So it's not like you say a full sentence and then, you know, 10 seconds later, you know, the translated version comes. It is milliseconds.

Starting point is 00:33:08 It is almost instantaneous, right? So again, that's the demo. We'll see what actually happens when this rolls out and specifically how it rolls out. Because, you know, one of the things I'm wondering and I am going to be following up with my Google contacts to get a lot of answers to questions. So if you do have questions on this, let me know in the comments because I will track down the answers. But one thing I'm wondering is like, okay, do both users need to have a pro plan? Right. Or can just one person, you know, be on that $20 a month? Because if both people have to have a pro plan, I think that

Starting point is 00:33:48 really limits the, you know, kind of the talking that you can do and having this be great. But think of what this does for business. This is absolutely nutty, right? Once this does roll out to more countries and more languages, and I do assume that Google will be trying to update this, and I'm guessing that this would be in the latter part of 2025, to the 50 languages that NotebookLM supports. That would be my guess. I don't have that on authority, but Google did say they're working on more languages, and it probably makes sense to work on the 50 languages that they've already incorporated into NotebookLM, which are the most widely spoken languages in the world. So this is huge, even if just for right now, right, think if you have business in Latin

Starting point is 00:34:35 America, South America, the language barrier is gone. Yeah, you might have to, you know, if both users need to have a $20-a-month plan, like, who cares, right? Imagine being able to talk to your colleagues from another country without a language barrier. This is huge. This opens up so many new business possibilities, especially when you look beyond where we're at now. Right. And like I said, this has already been out with Microsoft for many more languages, but the downside is you had to have that running on your local device.

Starting point is 00:35:10 So you had to have a newer Copilot+ PC that essentially was running a language model locally on your own device. So if Google can pull this off and expand it to 50 languages, this completely changes how you can do business, right? Maybe you've only been a domestic business for now and maybe the language barrier is one of the biggest reasons why, right? This is huge. This is huge.

Starting point is 00:35:41 All right. One or two more here as we wrap up. So number nine, Gemini app updates, a ton here. So there's been a lot of enhancements to both the Gemini mobile app and the Gemini app. And we'll probably, in the coming weeks, we'll probably have a lot of dedicated episodes covering this. And we're going to be covering this a little bit more tomorrow when we talk about Gemini Live. So a lot of new updates there. But some of the Gemini app updates are rolling out now

Starting point is 00:36:16 to both iOS and Android users. You get a lot of the core features free with some of the more premium capabilities for people on those subscription plans. So some of the ones that I think are worth noting, specifically within Gemini, deep research, right? Anyone out there using deep research like every single day like I am? I'm excited for that. But you can start deep research by uploading

Starting point is 00:36:48 PDFs or images, which is huge in terms of personalizing your deep research. And like I said, I think a month ago, OpenAI was in a league of their own with their deep research. But now I think Google Gemini is probably slightly ahead because they did change how their deep research worked because they upgraded it to their Gemini 2.5 model. So it used more thinking and reasoning and planning, right? But if you don't know anything about deep research, essentially you give it a query. And it'll go off and spend anywhere from, you know, two to 20 minutes researching anywhere

Starting point is 00:37:28 from a dozen to hundreds of websites. But now what makes it better inside Google Gemini with 2.5 is it uses this thinking model. It plans it step by step. And a lot of times it will make a turn, right? It'll start going down one path. And then in its research, it finds out like, oh, I was wrong about that. So I should probably not go look at another 100 web pages if I found out I was wrong about my original plan. So then it'll deviate and pivot, right, which is what OpenAI's version of deep research has always done.

Starting point is 00:37:59 But now Google Gemini's version does that as well. But the new thing here, at least with deep research, is being able to start with uploading a PDF or an image, which is huge. A lot of new updates for Canvas, which we're probably going to have multiple shows on in the very near future, just looking at Gemini 2.5 Canvas and all of these new updates. You know, you can create now infographics, interactive quizzes, and then everything with Gemini Live that we're going to be going over a little bit tomorrow. So, I mean, just improved response quality through personal context, more natural voice interactions with emotion detection in the voice features.

Starting point is 00:38:42 And when you talk about business use cases, I mean, there's a ton, right? This is really where I think a lot of knowledge workers should be starting their day, right? Whether it's ChatGPT, Google Gemini, Copilot, right? You should be starting so many of your tasks in a large language model, not in the middle, not at the end, but start with idea, strategy, research, et cetera. So a lot of these app updates, they're more than quality of life. They're changing what's possible. And then speaking of changing what's possible, and this is last on today's list, but not least, Gemma 3n.

Starting point is 00:39:17 So this is Google's latest fast and efficient, open-source, multimodal model designed for on-device AI applications. So what the heck does this mean, Gemma 3n? Well, first of all, it's scary good. This is a small language model, four billion parameters. So what does that mean? Well, without getting too technical, a small language model, a four-billion-parameter model, can fit on a phone, can fit on today's smartphones.

Starting point is 00:39:55 Right. So edge AI and small language models, I've been saying this for years. This is the future of large language models, because what's one of the number one reasons that most enterprise companies or even individuals don't work with large language models? Well, they're like, okay, well, data security, you know, all those things. Okay, sure, makes sense. I don't want to send my stuff to the cloud, even though you already have everything in the cloud.

Starting point is 00:40:20 And it doesn't matter. It's the same thing. Regardless, for those that aren't smart enough to make that connection and, you know, figure out that, you know, one plus one equals two. I don't know in the new, the new math, the core math, if one plus one still equals two. But one plus one still equals two here. Because when you talk about edge AI, that takes out all those data security things because you're not sending any information to the cloud.

Starting point is 00:40:44 You can shut off your internet and use Gemma 3n on a local device, right? And the performance is absolutely nutty. Okay. Claude 3.7 Sonnet is one of the world's most powerful proprietary models. Obviously, you have to use it in the cloud, right? Because it is enormous. You know, we don't know how big, but chances are it's a couple trillion parameters, or at the very least hundreds of billions of parameters, which just means size, right? Think of like a gigabyte of storage or something like that. Gemma 3n is a fraction. I would say it is less than 5% of the size of Claude 3.7 Sonnet. Yet for Chatbot Arena Elo scores, so side-by-side comparisons, it is essentially the same, right? There's only a four-point difference. So that means humans don't know the difference. And, you know,

Starting point is 00:41:45 everyone, I don't, I'm not a huge Claude fan, FYI, but Claude's latest model, although there's rumors that they might be releasing, you know, a Claude 4 Sonnet or Claude 4 Opus any day now, but at least their most powerful proprietary model. This itty bitty model that you can download, you can fork it, you can do whatever you want, is just as powerful. Just as powerful. So the availability: the preview of this is available now via Google AI Studio. Also Google AI Edge. It's free for developers.

Starting point is 00:42:21 You can download it, you know, fork it, fine-tune it with your company's data, et cetera. So it's engineered to run smoothly on phones, laptops, and tablets with minimal resource requirements. And it is multimodal as well. It can handle audio, text, image, and video inputs. That is amazing. So it's optimized for resource-constrained environments while maintaining strong capabilities and modalities.

Starting point is 00:42:47 Also, speed fast, right? Not having to send something to the cloud and wait for the inference for it to go do its thing on the cloud. It's happening on device. So it's faster. It's more secure. And I've been saying this for a long time ever since we saw the first version of Gemma 3 a couple of months ago.

Starting point is 00:43:04 I said, don't sleep on Gemma 3. It is wildly powerful, y'all. This completely changes how we're going to work in the future. Because what this signals is this is going to force the other big companies, OpenAI, Anthropic, etc., those companies that don't have an open model yet, this is going to force them to go open because if you have a Gemma model, right?

Starting point is 00:43:35 And also, you know, there's good open-source models from Mistral, from Meta, their Llama models as well. But I mean, this right now, Gemma 3n is benching off the charts for how small it is, right? This is going to force big companies that are only doing proprietary models to offer open models, right? OpenAI CEO Sam Altman did say that they're going to be releasing something, but this is huge because this means that probably within, I don't know, a year or two, most new computers, I mean, well, I won't speak for Apple since they're still operating in the 1990s when it comes to artificial intelligence, but you would have to think even Apple is going to have to catch up. Most computers are going to come with a state-of-the-art-level large language model that can run everything locally. So you won't even have to worry about data security because nothing's leaving your hard drive. It's the same thing as saving a file to your local device, working with a model like Gemma 3n. So this is huge.

Starting point is 00:44:39 So that's our quick recap of what's new, at least on the first half here. If you want some related episodes, y'all, I've had some recent ones. So I was at Google Cloud Next a couple of weeks ago and covered what was new there with Logan Kilpatrick. Already mentioned that. So if you want to go listen to that, that's in episode 501. Also, there's been a lot of new updates just with Gemini 2.5 Pro, and we're going to be talking about some of those even newer updates tomorrow. So if you want to go get caught up, go listen to episodes 494 and 495 as we do a two-part series on Gemini 2.5. You don't have to wait for anything. It's live. It's there. It's free on our website. Go listen to it.

Starting point is 00:45:22 So a very quick rundown as we wrap things up. Our part one, here we go. Number 15, Imagen 4. 14, Chrome with Gemini integration. 13, personalization and email. 12, NotebookLM updates, which I'm extremely excited about. 11, Gemini Diffusion, a brand new type of large language model. 10, real-time translation in Google Meet, only English and Spanish now, but more coming soon. Nine, Gemini app updates. And eight, Gemma 3n, the world's most powerful small language model. It is insanely good. I can't wait for tomorrow.

Starting point is 00:45:57 Make sure you tune in for part two. I'm telling you, some of these things that we saw, mind-boggling. I don't even know how I'm going to verbalize it with words, even though that's all I do. So thank you for tuning in. If you haven't already, please go to youreverydayai.com. Sign up for the free daily newsletter. Please make sure you join us tomorrow and every day for more Everyday AI. Thanks y'all. Meet Firefly AI Assistant, now live in Adobe Firefly, the All In One Creative AI Studio. Just describe what you want to create in your own words and the assistant handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud apps,

Starting point is 00:46:40 including Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome while the assistant accelerates execution. Stay in control with the ability to step in and refine at any time. See it today at firefly.adobe.com. And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going.

Starting point is 00:47:12 For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
