Everyday AI Podcast – An AI and ChatGPT Podcast - EP 508: OpenAI’s impressive new thinking models, Google gives free AI to millions and more AI News That Matters

Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live in Adobe Firefly, the All In One Creative AI Studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. One of the largest companies in the world in Microsoft just released an autonomous AI agent that can use a computer and use apps and browse websites just like a human can.

Starting point is 00:01:03 Yet somehow, that probably wasn't even a top five AI news story of the week. That's because we got five new large language models from OpenAI. Google released a ton, and they're giving away their Gemini AI products for free to tens of millions of people. And even Claude, or sorry, Anthropic, the company behind Claude, came out

Starting point is 00:01:32 and finally made some sizable updates that make you now have to consider Claude. Yeah, a lot going on, as in any week in the world of AI news. And we're going to be covering all of it today on Everyday AI with our weekly installment of the AI News That Matters. What's going on, y'all? My name's Jordan Wilson. And I'm the host of Everyday AI. And this thing, it is for you. This is our daily live stream podcast and free daily newsletter, helping us all not just keep up with what's happening in the AI world, but how we can use all this information to get ahead, grow our companies,

Starting point is 00:02:10 and grow our careers. So if you are trying to be the smartest person in your company or department when it comes to generative AI, this is your new home. Your second home is our website at youreverydayai.com. So there on our website, you can sign up for our free daily newsletter. So each and every day, yeah, we have the live stream and the podcast at 7:30 a.m. Central Standard Time. But then we recap all of the most important insights in our free daily newsletter, as well as keeping you up to date with everything else that's happening in the world of AI.

Starting point is 00:02:42 Speaking of everything else that's happening in the world of AI, there's literally too much to keep up with. And I get it. So that's why almost every single Monday, we do our little special installment of the AI news that matters. So we cut through all of the most important updates of the week, all of the fluff, all of the good stuff, and we just give it to you straight.

Starting point is 00:03:06 No bias. We just give you the bullet points and our take. So that's what we're going to do right now by going over the AI news that matters for the week of April 21st. All right. I'm excited, y'all. Live stream audience, how are y'all doing? It's good to see some of you.

Starting point is 00:03:25 Aidan in the house, rooting for the Hoosiers. Good to see you, Aidan. Gene and Douglas, everyone else. Christopher, Fred, holding it down for my people here in Chicago. Rolando, Kyle, Sandra, everyone else. Thanks for joining.

Starting point is 00:03:44 Love doing this live, right? People are always like, oh, Jordan, you should, you know, maybe pre-record this thing and edit it so you don't say the word "um" so many times. I love doing this live because then I get to hang out with you all and learn together. So let's get straight into it with our first big AI news story of the week. An acquisition, a multibillion-dollar acquisition from OpenAI? Okay, maybe. So according to reports, OpenAI is in advanced talks to acquire AI coding company Windsurf for $3 billion as the company

Starting point is 00:04:24 seeks to secure a major stake in the rapidly growing code generation market. So despite OpenAI's prior investments in Anysphere, the creator of the popular coding assistant Cursor, the company's acquisition discussions with Anysphere reportedly failed twice, as confirmed by CNBC. So Cursor is currently generating around $200 million in annualized recurring revenue, while Windsurf brings in about $40 million in ARR, highlighting the intense competition among AI coding startups. So after OpenAI's acquisition talks with Cursor's parent company collapsed, you know, that's when the talks to acquire Windsurf picked up. So OpenAI's pursuit of Windsurf, despite its own recent launch of the Codex CLI coding tool, signals a sense of

Starting point is 00:05:24 urgency to capture market share and not wait for in-house products to gain adoption. This move underscores how competitive the AI-powered coding assistant sector has become, with several startups, including Codeium, vying for leadership as developers increasingly rely on generative AI to speed up software creation. So yeah, this one is pretty interesting. Wasn't shocked when I saw it, but I was like, huh, right? Because, yeah, as we just said there, OpenAI has invested in Anysphere, the parent company of Cursor, and apparently was trying to acquire Cursor before those talks fell through. Yeah, apparently Cursor, you know, got too popular too quickly, already bringing in $200 million in annualized recurring revenue.

Starting point is 00:06:14 So now it looks like OpenAI's kind of aim, or their focus, has turned to Windsurf. So yeah, I would say that that's the 1A and the 1B kind of in the AI IDE or the coding AI realm right now, with Cursor in the lead and then Windsurf right behind. And then you have more specialized tools like Lovable and Bolt. And obviously, I don't know why more people aren't using Microsoft's GitHub Copilot. It's an amazing tool. Very similar to some of these others. But it looks like, you know, for whatever reason, you know, Cursor and Windsurf really just took off, you know, online pretty quickly.

Starting point is 00:06:56 And I think a lot of that kind of online buzz led to just users in the hundreds of thousands flocking to these new IDE tools. It was pretty, pretty interesting as well. I got to talk to Windsurf's leadership team at the Google Next conference. Yeah, I know. Crazy, right? Like I talk to people that don't end up on this show. But, you know, some pretty cool things that they were cooking up there.

Starting point is 00:07:21 I got to talk to them about. All right. Our next piece of AI news: Google has launched Gemini 2.5 Flash, a new AI model that allows developers to set a thinking budget, controlling how much computational reasoning the model uses. So pricing also reflects this flexibility, with output costs ranging from 60 cents per million output tokens with reasoning off to $3.50 with reasoning on. So yeah, if you keep that kind of thinking mode enabled on the new Gemini 2.5 Flash, you're looking at more than a 5x increase in price, almost a 6x increase in price. So the new model from Google, the Gemini 2.5 Flash,

Starting point is 00:08:16 automatically adjusts its reasoning budget based on the task complexity, aiming to help businesses save money on simple queries and invest more for complex problem solving. So early benchmarks show that Gemini 2.5 Flash, even though it is the smaller, much smaller brother of the world-leading Gemini 2.5 Pro, is already outperforming key competitors like Anthropic's Claude 3.7 Sonnet and DeepSeek R1, while even coming close to OpenAI's new o4-mini in reasoning tasks. So right now it's available in Google AI Studio to play around with for free, right?

Starting point is 00:09:05 But obviously that training data inside Google's free AI Studio goes to Google. But you can also pay for it and start using it on the back end in your product in Vertex AI. You can also use it in the Gemini app. And this release is part of Google's broader AI strategy, including some things that we're going to be talking about here in a minute. So pretty impressive benchmarks, obviously, for Gemini 2.5 Flash. I mean, you just saw this. This is their small model, right?

Starting point is 00:09:38 Not a small language model, but their small version of their large language model, Gemini 2.5 Pro, and it already, the small one, is getting better marks on benchmarks than Anthropic's Claude 3.7, which is also a thinking model. So obviously pretty impressive results here from Google. I would not want to be Anthropic in this position, right? When your two biggest competitors in OpenAI and Google, both in the same week, come out with smaller versions of their models that cost a fraction of what it costs to use Anthropic's Claude on the back end. And it is blowing their big boy out of the water.
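To make the "almost 6x" figure concrete, here is a minimal Python sketch of the arithmetic. The two per-million-token output prices are the ones quoted in the episode; the 2,000,000-token monthly volume and the function names are assumptions made up for illustration, not anything from Google's documentation.

```python
# Output prices per million tokens as quoted in the episode (USD).
PRICE_REASONING_OFF = 0.60
PRICE_REASONING_ON = 3.50


def output_cost(tokens: int, price_per_million: float) -> float:
    """Return the USD cost of generating `tokens` output tokens."""
    return tokens / 1_000_000 * price_per_million


def price_multiplier() -> float:
    """Return how many times more expensive output is with reasoning on."""
    return PRICE_REASONING_ON / PRICE_REASONING_OFF


if __name__ == "__main__":
    tokens = 2_000_000  # hypothetical monthly output volume (assumption)
    print(f"reasoning off: ${output_cost(tokens, PRICE_REASONING_OFF):.2f}")
    print(f"reasoning on:  ${output_cost(tokens, PRICE_REASONING_ON):.2f}")
    print(f"multiplier:    {price_multiplier():.2f}x")
```

The multiplier works out to roughly 5.83, which matches the "more than 5x, almost 6x" description.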

Starting point is 00:10:26 All right. So what do you all think? You know, I'm going to get some comments and some, some thoughts here from our live stream audience as I sip on my very strong coffee. Yeah, Trevor, what's up, Trevor? Trevor is great with the live stream stuff on LinkedIn. Trevor says hard keeping all these versions straight. Yeah, absolutely is, right?

Starting point is 00:10:51 I think some people have asked for this. I think I'm going to create a graph essentially that has the latest models and what they're good at because, you know, even OpenAI, I said they just released five new models, right? So it's like, okay, models that you were probably using last week, like o3-mini-high, are gone, right? GPT-4.5 is leaving, at least in the

Starting point is 00:11:17 API. And now you have this alphabet soup of all these other new models including Gemini 2.5 Flash, but it is extremely impressive. So yeah, you would probably not want to use Gemini 2.5 Flash on the front end, right? So if you're using it inside the paid, you know, if you're on that

Starting point is 00:11:33 $20 a month Google AI plan, there's really no need to use that model, right? Because you get full access to the Gemini 2.5 Pro. This is really for developers who are building with this on the back end, right? So if you're using Google's API to build your own product or to create a version of Google Gemini 2.5 Pro, a smaller version with their Flash model, impressive results so far. All right. Speaking of impressive, it's not even close. Google's AI video generator, Veo 2, is by far

Starting point is 00:12:11 the best and most capable AI video model. And now it is rolling out to Gemini Advanced subscribers. So yes, Google has finally unveiled Veo 2, their industry-leading text-to-video AI model, for Gemini Advanced subscribers, letting users generate 8-second 720p videos from just prompts. So users can now create videos that can be shared directly to social media, though with monthly limits on how many can be made. Also, right now inside Google Gemini, you know, their front end chatbot, that's

Starting point is 00:12:52 where you can go use Veo 2. Now, it is a slow rollout. I didn't have it on any of my Gemini Advanced accounts right now, but you will just have to go back and look. But right now, it is just text prompts. So if you do want the full power of Veo 2, you still might have to use either their own Vertex AI platform, or it is available as well inside Google AI Studio. But right now, Google 2, or sorry, Veo 2 is touted for its improved realism and understanding of

Starting point is 00:13:23 physics and human motion, producing more lifelike content. All videos, though, do feature the SynthID digital watermark for transparency about their AI origin. Google also introduced Whisk Animate, a tool that turns images into short videos, available, uh, I'm tongue-tied this morning. I was trying to combine "available" and "globally," which doesn't make sense. Available globally to Google One AI Premium subscribers. Yeah, so more on the Google One AI Premium, but what that means is, you know, essentially you're paying $20 a month for a little bit of everything. You get some of, you know, Google's normal

Starting point is 00:14:06 non-AI tools and features as well as all of their AI offerings. So, I don't know. Live stream audience, has anyone seen this pop up in their Gemini Advanced plan so far? I think I have three or four different accounts with that $20 a month Gemini Advanced plan. I didn't see it pop up in any over the weekend yet, but I would assume probably in the next week or so. But right now I've been using Google Veo 2 on the back end inside Google AI Studio.

Starting point is 00:14:39 It is a little bit more flexible, and you get some of these new features that aren't yet available when you're using the Gemini chatbot. So Kimberly says, tried Veo 2, it's nice. Yeah, it is extremely nice. I still think I'll probably use OpenAI's Sora for some instances. I think there's some, you know, UI/UX features inside Sora that I really like: the ability to kind of string together multiple clips at once and create more of a short with multiple of these AI-generated clips. But if you're just looking for one clip, or if you're just looking for overall quality, I still think Veo 2 cannot be matched, at least not right now. Obviously, these AI video tools are being updated just like large language models,

Starting point is 00:15:28 almost on the daily. But you have to, I mean, if you haven't already, you got to go check out Veo 2. It's nice. It is really, really nice. Yeah, Kyle said it didn't pop up in his either, but he's liking Whisk. Yeah, Whisk is just a super fun tool to use. All right. Our next piece of AI news. The Trump administration here in the U.S.

Starting point is 00:15:50 is considering new restrictions on Chinese AI lab DeepSeek, potentially limiting its access to Nvidia's AI chips and also barring Americans from using it. So this move follows the White House's recent tightening of rules restricting Nvidia's AI chip sales to China, expanding on measures first introduced by the Biden administration. So DeepSeek has rapidly gained popularity among U.S. developers due to its competitive pricing,

Starting point is 00:16:25 prompting Silicon Valley to lower the cost on its own advanced AI models. The Trump administration's actions are part of a broader U.S. effort to slow China's progress in artificial intelligence and also protect American technology and consumer markets. There are ongoing concerns about DeepSeek's business practices, as OpenAI has accused the Chinese lab of distilling its models in ways that may violate intellectual property rights and OpenAI's terms of use. For individuals and companies, these restrictions could mean fewer low-cost AI options and increased pressure on U.S. firms to innovate and protect their own intellectual property. So according to the New York Times, the decisions made could reshape the competitive landscape for AI development and access, especially for startups and smaller

Starting point is 00:17:20 businesses relying on affordable cutting-edge AI tools. All right. So I've covered this plenty, right? I'm going to try not to accidentally go into a hot take, you know, Hot Take Tuesday, here on this piece of news. I will say this: I covered the DeepSeek saga a couple of months ago. So if you want the truth with receipts, go read that. But there's a reason why the U.S. government is potentially looking to ban DeepSeek. It's because, whether you know it or not, if you are using DeepSeek's API directly, if you are using DeepSeek's chat on the front end, all of your data goes directly to the Chinese government. So I know people don't like to, you know, talk about geopolitics a lot.

Starting point is 00:18:18 And I'm not going to dive in too deeply, right? But here's the reality. I'm from the U.S. Right. So artificial intelligence, whether you want to admit it or not, is about so much more than technology. It is about global power, right? Let's just call it what it is. Right now, compute, AI chips and large language models are the new oil. They are the new gold. They are the new currency, right? Essentially. So when it comes to geopolitical tensions,

Starting point is 00:18:55 I think it's important to call that out. This is about so much more than just, just, oh, large language models or, you know, chip exports. No, right? I think we've already seen it for the past year and a half. I think we're going to continue to see it even more. Just tighter restrictions around the most powerful technology. But y'all, you have to be smart. And this is why I told you all.

Starting point is 00:19:19 I told you all this, right? I didn't jump on the bandwagon. Like, you know, it's funny. You have these quote-unquote AI influencers on social media. And when DeepSeek came out, almost every single one of them is like, go use DeepSeek. It's so cheap. Okay. That's because you were sending your data to China, right?

Starting point is 00:19:42 Whether you want to send your company's proprietary confidential data to China is ultimately up to you. Right. But DeepSeek does not work in the same way as if you go and log on to, you know, ChatGPT or Google Gemini or Anthropic's Claude, right? There's built-in data protections with those companies being based here in the U.S. So, yeah, if you have been using DeepSeek's API directly, not through third-party service providers who essentially go through and they make this safer and they take out some of the built-in biases.

Starting point is 00:20:17 But if you've been using DeepSeek's API directly, if you've been using DeepSeek directly on the web, anything that you've uploaded has been sent to and is being used by the Chinese government. So maybe you're fine with that. That's okay. But it's just important to call that out, and it's why we are probably going to see these talks on a DeepSeek ban continuing. All right. Our next piece of AI news. Yeah, I started the show off on this. This isn't even a top five AI news story of the week, which is silly, right?

Starting point is 00:20:47 Because this is huge. So Microsoft has launched a new computer use agent inside Copilot Studio, enabling AI agents to automate actions on websites and apps as if they were actual human users. Right. And you can get this set up essentially no code, low code. So you can go into Microsoft's Copilot Studio if your organization has given you full

Starting point is 00:21:13 access to Microsoft 365 Copilot and Copilot Studio. And you can go right now inside Copilot Studio and get a computer-using agent that just works. So this is significant because it lets AI agents handle tasks even when there are no APIs or built-in integrations, which could dramatically expand automation possibilities for businesses. So the feature allows agents to click, type, and navigate, essentially performing any activity a person could do online, such as filling out reports, logging into secure sites, or even managing customer service requests. So Microsoft executives have emphasized that if a person can use an app, so can the AI agent, making automation accessible for a wider range of business processes.

Starting point is 00:22:06 So the update builds on Microsoft's earlier actions features, but is designed for more advanced and business-scale automation rather than just personal use. So the technology is also able to adapt to changing websites and app layouts, making it more reliable for ongoing real-world automation needs like invoice processing, data entry, or just even more complex research that maybe some of these research tools can't get to. So this development follows similar efforts by OpenAI's Operator and reflects a broader industry push to streamline repetitive tasks and free up time for more valuable work. Yeah, it could run your LinkedIn activity 24-7, sure. You know, that's actually one thing I tried to get Operator to do, OpenAI's Operator.

Starting point is 00:23:01 And it didn't work very well. One of the main reasons is not actually Operator. Like, so yeah, if you're wanting, you know, one of these AI agents to go use LinkedIn, one of the reasons it actually doesn't work very well is because of the LinkedIn interface, right? So what I was trying to have Operator do is to go through my DMs, not reply to them, right, but just mark anything that's important because, I don't know, 50% of what I get on LinkedIn is spam. And it's hard for me because I get obviously legitimate people like you all reaching out, right? Like if you all see a story that breaks, you know, when people are building new products

Starting point is 00:23:43 and, you know, they want OpenAI, or sorry, they want Everyday AI to cover them, right? I get a lot of important DMs, but it's hard for me to go through them all because I have probably, I don't know, a couple thousand unread over the years. Right? So I was trying to train Operator to go through and read them. And it did okay, but it was more of an interface bug, because when you're doing this infinite scroll thing

Starting point is 00:24:06 on the LinkedIn inbox, one little pixel difference is all it takes. Right. So yeah, maybe I'll have to try out the new computer use agent from Copilot Studio. See if that does any better. All right. Let's keep it going.

Starting point is 00:24:23 Google just gave away its most powerful AI for free to like 20 million people. So Google has announced that all U.S. college students with a valid .edu email address can now get a full year of free access to Gemini Advanced. So this move is part of CEO Sundar Pichai's strategy to reach 500 million Gemini users by the end of 2025. So right now, eligible students can sign up for the Google One AI Premium plan, which is normally $20 a month.

Starting point is 00:25:09 And that includes the advanced Gemini Pro models, unlimited Deep Research tool usage, the Veo 2 video generator, NotebookLM Plus, Gemini Live, as well as two terabytes of Google Drive storage. Yeah, if you are a college student and you have a valid .edu email address for a university here in the U.S., yes, you are going to get a free year of Google's best AI offering. So the offer is immediately available and runs until June 30th. So yeah, you do have to sign up by this June 30th, but the free access actually lasts through the spring semester of 2026. So yeah, if you sign up, as an example, today, you could get up to like, what is that?

Starting point is 00:26:01 Like 14 months of free Google Gemini. So that's, you know, almost $300 of free value. So Google's definition, though, of student is broad. So it says anyone with a .edu email address qualifies, even if they are not currently enrolled in classes. So this strategy could obviously help students and recent graduates. So, you know, students are preparing for finals right now. There you go. You can prepare with NotebookLM Plus, which I would highly encourage you to do, my gosh,

Starting point is 00:26:37 as well as recent graduates, right? If you're looking to land a job, you know, there are some great tools inside Google's Gemini that can help you do that. For Google, though, the promotion represents a very calculated effort to build user loyalty among younger adults and future professionals, even at the cost of short-term revenue. You know what? This is where it's like, again, one of those things, I don't want to be Anthropic, right? Because OpenAI has said that they're losing money on their more premium tiers, such as the $200 a month Pro tier, right? It's been widely reported that OpenAI is losing

Starting point is 00:27:21 billions of dollars a year. Here we go. Now, Google is following suit, just being like, we don't really care about short-term revenue. We care about users, right? So I think it's an extremely smart move from Google. And this follows after OpenAI essentially announced two months of free access to its ChatGPT Plus $20 a month plan to students. So Google essentially said, yeah, OpenAI, we'll see that two months and we'll raise you an entire year. So yes, this is a race for users. It's a race for eyeballs. And again, I don't want to be Anthropic in this one. Speaking of Anthropic, ah, finally, they upped their relevancy just a little bit. All right. So Anthropic has rolled out Google Workspace integration for its Claude AI chatbot,

Starting point is 00:28:15 letting users pull information directly from Gmail, Docs, and Calendar, according to the company. So the integration is available to all paid Anthropic Claude users. However, if you are on a Team or Enterprise plan, administrators must first enable access before individual users can connect their Google Gmail, Docs, or Calendar accounts. So, yeah, I'm curious, has anyone tried this? I have.

Starting point is 00:28:47 I have some mixed thoughts on it. But one other new update here from Claude: they did start rolling out their new Research tool, which automatically searches the web and workplace documents to answer questions. But right now, that is only available on that super pricey Max plan, which is $100 or $200 a month, or on Team or Enterprise plans in select countries. So the new Research tool is very similar to all these other deep research tools that we have from OpenAI, from Google, Perplexity, Grok, everyone else. So again, you know, Claude here is a little late to the party, but the difference is it can also look at your Workspace information as well when pulling these deep research reports together, or as they call it, just their new Research tool. So I did test this out. I tested out the Gmail integration because that's something that I think is unique.

Starting point is 00:30:00 So OpenAI already rolled this out a couple of weeks ago to their Team users, which I think works very well. But one thing they don't have is the ability to go through Gmail. So I was testing it out a little bit over the weekend. I was actually in line for something for like an hour. So I was just on my phone, testing this out. And it was okay, right? Like my use case,

Starting point is 00:30:24 you know, speaking of my, my LinkedIn DMs being full of spam. My email is probably worse, right? So a lot of times I get companies reaching out and they want to advertise with everyday AI or maybe they want to hire me to speak at their conference, you know, train their employees,

Starting point is 00:30:46 et cetera. My email inbox is disastrous because I also get pitched dozens of times a week by people that want to come on the show, and just a bunch of spam, right? So I'm trying to use this new feature from Claude, and I tried it with thinking. So with 3.7 Sonnet, with thinking enabled, with it disabled. And it's okay. It's not great. It did an okay job.

Starting point is 00:31:10 But I don't think it's something that I would necessarily be like, okay, this is a game-changing feature or even a reason to stay on a paid plan from Anthropic, right? Funny enough, I did a show last week, I believe it was on Tuesday, where I'm like, y'all, Anthropic's in trouble unless they come out with some meaningful updates. Coincidentally, these updates dropped hours after that very live stream, right? Funny enough.

Starting point is 00:31:37 I don't know if this is enough, right? I am going to give this a little more of a look right now from Anthropic, but I don't know. First impressions: it didn't do a good job of going through my email, right? Like, at least judging by the questions that I asked, right? I'm like, hey, go through, find people that have reached out about sponsorship or hiring me to, you know, train their teams or to speak at their events. It really looks like it only went through the first couple of pages, even when I encouraged it to go deeper or if I said, okay, you know, pick up where you left off. So I did a lot of tinkering around, and it still looked like this feature was initially only able to go through the first

Starting point is 00:32:22 pages of my emails, right? When it's like I have, I don't know, I don't delete emails. I feel people are either like inbox zero or inbox a trillion. That's me. I'm part of the latter. I just let the emails stay in the inbox, you know, so I have literally tens of thousands. I think that email inbox probably has, I don't know, 50,000 emails in it. Right.

Starting point is 00:32:44 So it didn't do a very good job of going past maybe page five or six, because I knew, right, I'm kind of doing these needle-in-the-haystack tests. It's like, oh, I know this company reached out two months ago. I forgot to get back to them. My bad. So I was kind of seeing if Claude could pick up on it. And it didn't do a good job. So, you know, you might be wondering, okay, well, couldn't you just go and search, you know, your email and type in the word partnership or sponsorship or advertisement?

Starting point is 00:33:08 Yes. Right. But that's the whole point of large language models, kind of this natural language processing, right? Because people might not always use those same keywords, right? They might use a different set of words. So that's the whole point of having a large language model that can connect to your live data. But at least in my early testing of this, not super impressed. Live stream audience, if anyone else did this, let me know if you found better results than I did.
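The keyword point can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration: the sample subject lines are made up, and the plain substring filter is just a stand-in for an ordinary inbox search, not how Gmail or Claude actually works.

```python
# Made-up sample subject lines, purely for illustration.
EMAILS = [
    "Sponsorship inquiry for your podcast",
    "We'd love to get our product in front of your listeners",
    "Can you keynote our conference in June?",
    "Limited time offer: cheap followers",
]

# The three search terms mentioned in the episode.
KEYWORDS = ("partnership", "sponsorship", "advertisement")


def keyword_search(emails, keywords):
    """Return the emails containing at least one keyword (case-insensitive)."""
    return [e for e in emails if any(k in e.lower() for k in keywords)]


if __name__ == "__main__":
    for subject in keyword_search(EMAILS, KEYWORDS):
        print(subject)
```

Only the first subject line matches. The second one is also a sponsorship pitch, but it never uses any of the three keywords, which is the gap a natural-language search is supposed to close.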

Starting point is 00:34:56 All right. Speaking of the big AI companies trying to compete in a new space, OpenAI is quietly testing a social media platform that could reshape how AI and online communities interact, according to reports. So OpenAI's reported new social media network mimics X, formerly Twitter, and centers on ChatGPT's new and extremely viral image generation feature. So CEO Sam Altman has reportedly been gathering private feedback on the social media project, hinting at serious interest in actually launching it. So this move follows Twitter or X's successful integration of Grok AI, which competitors reportedly envy for fueling

Starting point is 00:35:54 viral posts. Meta as well has tried adding AI features to Facebook and Instagram, but faced scrutiny and mixed results. So yeah, the big social media companies have already been tinkering on how they can better integrate AI into their offering. So here we have the reverse approach: the most popular AI company in the world when it comes to monthly active users thinking about taking the reverse approach by saying, ah, we have all the AI users.

Starting point is 00:36:30 Maybe we should start rolling out a social media network. Uh, well, actually, live stream audience, what do you think about this? I have my thoughts. Uh, but right now it is unclear if the new platform would be a part of ChatGPT or a standalone app. And there's obviously no official word on if this will become an actual product or not, or if it'll even get launched. But for businesses and creators, this could mean new tools for audience engagement,

Starting point is 00:37:00 but also challenges around data use and content authenticity. Here's what it comes down to. Data, right? Data. So that's a huge, one of the biggest, I guess, I wouldn't call it features. One of the biggest advantages that companies like xAI, with their Grok and the X slash Twitter social media network, as well as Meta's Llama with their integration into Facebook, Instagram, WhatsApp, etc.,

Starting point is 00:37:37 is that they can train their model on all of that data, right? Which can be a good or a bad thing, right? Personally, if I'm a business, right, I would not want to touch xAI and Grok because, you know, from reports, what we've heard, a large percentage of its training data is just X posts. So I think there's a good side of this and a potential bad side of this. But what OpenAI wants is they want more training data, right? They obviously want users more engaged, right? Because the more engaged they are, obviously the more ways that they can monetize this in the long run, aside from just a, you know, $20 or $200 a month subscription to their

Starting point is 00:38:23 more premium services. So Joe says AI plus social media, not a good idea. All right. So yeah, some people are not fans. Sandra says, how do you control misinformation? That's a big one, right? Yeah, because like I said, with xAI and Grok, you know, I can say this. Studies have shown that X, the X platform, is by far

Starting point is 00:38:52 the platform in the U.S., the social media platform with the most misinformation and disinformation. So yeah, you hope these companies that are, you know, using this live, real-time data for their AI models can decipher between the real information and the misinformation and disinformation. But the reality is, it's probably hard to keep up and it's hard and difficult to do that. So I don't know. I'm not a big social media person. Obviously, this live stream goes out to LinkedIn. You know, I'll look at Twitter for AI news, but you know, everything else, I'm not, you know, following people on social media.

Starting point is 00:39:36 I'm not posting things, right, you know, at least about my personal life. So I guess I use social media a little bit professionally for Everyday AI. But that's about it. But I would assume that OpenAI has bigger plans if they do release a social media network for it to be much more than just sharing your, you know, latest AI image generation with their impressive new 4o image generation tool. All right. Our next piece of AI news, GPT-4.1.

Starting point is 00:40:11 Yeah, we have a new model, but it's not available for everyone. So OpenAI has launched GPT-4.1, touting a 1 million token context window and dramatic API price cuts. So this new, technically much smaller version of their new model is not yet available on the front end. And it may not be. So if you are going to chatgpt.com and you want to use this new upgraded GPT-4.1, you will not find it because right now it is only a developer model.

Starting point is 00:40:51 So API pricing for GPT-4.1 is now much lower than competitors, with input at $2 per million tokens and output at $8 per million tokens, plus a 75% caching discount that rewards prompt reuse. So pretty impressive, especially given that this new GPT-4.1 model beats Anthropic's Claude 3.7 Sonnet in both coding benchmarks and real-world GitHub code reviews, making it a strong contender for coding applications. So yeah. And actually, the 4.1 mini model has gotten rave reviews for both power benchmarks and price as well. So competitors like Anthropic and Google are now facing increased pressure as their pricing is higher. And Gemini's complex tiers and lack of billing safeguards have drawn criticism.
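
To make those per-token prices concrete, here is a rough Python sketch of the arithmetic. It uses only the figures quoted in the episode ($2 input, $8 output, 75% caching discount). The helper name estimate_cost is hypothetical, and the sketch assumes the caching discount applies only to input tokens served from cache, so treat it as an illustration rather than OpenAI's actual billing logic.

```python
# Prices quoted in the episode for GPT-4.1, in US dollars per million tokens.
INPUT_PRICE_PER_M = 2.00
OUTPUT_PRICE_PER_M = 8.00
# Assumption: the 75% discount applies only to input tokens served from cache.
CACHE_DISCOUNT = 0.75


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the dollar cost of one request (hypothetical helper)."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_input_tokens
    fresh_cost = fresh_tokens * INPUT_PRICE_PER_M / 1_000_000
    cached_cost = (cached_input_tokens * INPUT_PRICE_PER_M
                   * (1 - CACHE_DISCOUNT) / 1_000_000)
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return fresh_cost + cached_cost + output_cost


if __name__ == "__main__":
    # 100k input tokens and 10k output tokens, nothing cached.
    print(f"${estimate_cost(100_000, 10_000):.2f}")  # $0.28
    # Same request, but 80k of the input tokens are a reused, cached prompt.
    print(f"${estimate_cost(100_000, 10_000, cached_input_tokens=80_000):.2f}")  # $0.16
```

Under those assumptions, reusing a long prompt cuts the example request from about 28 cents to about 16 cents, which is why the caching discount matters for apps that send the same instructions over and over.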

Starting point is 00:42:01 But obviously, Google and Gemini responded to this days later with their Gemini 2.5 Flash. So again, who we have on the outside is poor Anthropic, right? It has been kind of the coding sweetheart over the years. But gosh, I'm not wanting to be Anthropic's Claude right now with these new updates in GPT-4.1, a developer-only model, extremely adept on the software development side in coding, as well as being extremely affordable. And then like we said, Gemini 2.5 Flash, a hybrid model. So pretty interesting.

Starting point is 00:42:48 But yeah, at least for right now, GPT-4.1 will not be available inside of ChatGPT. So at least right now, it is not a front-end model. Also, OpenAI did announce that they will be getting rid of the GPT-4.5 model on the back end. So they're not going to be supporting it, I believe, past the summer on the API side. They did not announce if it was going to be going away from, you know, chatgpt.com. I'm assuming it might. Still looking for a little clarity from OpenAI on that one. All right.

Starting point is 00:43:25 Our last piece of AI news, and it's probably the biggest. So OpenAI has launched new models, a full version of o3, which is an extremely impressive thinking model, and o4-mini. So OpenAI has released its most advanced AI models yet in o3 full and o4-mini, giving users faster, smarter, and more flexible tools for tackling complex questions. So these new models can search the web, analyze images and files, write code, and generate charts all in one conversation.

Starting point is 00:44:11 Yeah, that is the biggest new feature. So previously, if you were using a model like OpenAI's o3-mini-high, which was actually my workhorse model, now it's gone. But the difference is now, o3 and o4-mini, yes, they are reasoning models. However, now they can still use all of these other tools under the hood, right, which is impressive because previously they could not all do that. So as an example, uh, you know, o3 full is probably the most capable model, which I know is super confusing. Because if you are on a Pro plan like myself, now you have three different generations of these o models, right? These models that use this kind of chain of thought or reasoning under the hood, right?

Starting point is 00:45:05 They can think and plan ahead, uh, and adapt, and, you know, sometimes they'll think for, you know, three, five, 10, 15 minutes before providing you a response. But right now, right, if you're on a Pro plan, you have o1 pro, which is a model I still use for a ton of things. But now you also have this new o3 full, right, which is different because previously we had o3-mini and o3-mini-high. So now we have o3 full. And then we have o4-mini and o4-mini-high.

Starting point is 00:45:38 So we don't have a full version of o4; we have a full version of o3. And y'all, using the full version of o3, it feels criminal. It is so, so good. So there's instances, right? And I'm still exploring this myself. It just came out a couple of days ago. So, you know, I'm still trying to find the time to fully investigate this. I've probably only spent about maybe three hours so far using these new o3 and o4-mini models, which is not a lot for me. Normally, I'm like, you know, eight hours the minute it comes out. I haven't had a ton of time yet.

Starting point is 00:46:16 But o3 is so good. It feels criminal to use it. And I think one of the reasons why is you get access to all of the tools. So not only can you have this, you know, model that thinks step by step. It can reason. It can plan ahead. But in doing so, it can also

Starting point is 00:46:38 use multiple tools and it can go back and forth, right? So as an example, it can use Canvas. It can use ChatGPT search. It can use Python, right? That right there, the combination of using all of these different tools, it's pretty amazing, right? So one way I tested it, I had a screenshot. And I just did this all on my phone.

Starting point is 00:47:03 I had a screenshot of some of the top AI tools. And there's probably like 30 or 40 of them. So, you know, on my phone, you know, because again, I was in a line for something for an hour this weekend. So I just uploaded that screenshot to o3. I said, hey, give me pricing for all of these tools. Because I knew them all, right? But I didn't know pricing for all of them because there's like 30 or 40 of them.

Starting point is 00:47:28 And I was subscribed to most of them, but not all of them. So not only could OpenAI, or sorry, this o3 model use computer vision to see them all, it went and it used the web, and I could see it as it went. It was using the web and using Python interchangeably, right? Because what it was ultimately doing, it was putting together a graphic for me and a table on all of these different AI tools. It was sorting them and categorizing them and it was going and doing research and going back and forth multiple times. So I just kind of watched it in awe. And I'm like, this changes things completely because essentially you can chain together multiple commands that use these tools, right?

Starting point is 00:48:10 And one of the biggest things that separates kind of a general large language model from AI workflows from agentic AI is the ability for agentic tool use, right? So what that means is a large language model or an AI workflow can decide on its own, hey, I need to go query the web for this. Hey, I need to use computer vision for this. Hey, I need to put this in a table. Hey, I need to run some Python code for this, right? And it can make that choice on its own, right?
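
As a toy illustration of that idea, here is a minimal Python sketch of a tool-choosing loop. Everything in it is hypothetical: the tool functions are stubs that only return placeholder strings, and choose_tool uses simple keyword matching as a stand-in for the decision a real model would make from context. It is not how o3 or any OpenAI API works internally; it only shows the shape of "pick a tool per step without the user naming it."

```python
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for real tools; each one just returns a placeholder.
def web_search(step: str) -> str:
    return f"[web results for: {step}]"


def run_python(step: str) -> str:
    return f"[python output for: {step}]"


def read_image(step: str) -> str:
    return f"[items seen in image for: {step}]"


TOOLS: Dict[str, Callable[[str], str]] = {
    "web_search": web_search,
    "run_python": run_python,
    "read_image": read_image,
}


def choose_tool(step: str) -> str:
    """Stand-in for the model's own choice; keyword matching is illustrative only."""
    lowered = step.lower()
    if "screenshot" in lowered or "image" in lowered:
        return "read_image"
    if "table" in lowered or "chart" in lowered or "sort" in lowered:
        return "run_python"
    return "web_search"


def run_agent(steps: List[str]) -> List[Tuple[str, str]]:
    """Run each step with whichever tool choose_tool picks, and log the result."""
    log: List[Tuple[str, str]] = []
    for step in steps:
        tool_name = choose_tool(step)
        log.append((tool_name, TOOLS[tool_name](step)))
    return log


if __name__ == "__main__":
    plan = [
        "Read the tool names from the screenshot",
        "Look up pricing for each tool",
        "Sort the results into a table",
    ]
    for tool_name, result in run_agent(plan):
        print(f"{tool_name}: {result}")
```

The plan mirrors the screenshot example from the episode: vision first, then web research, then Python to build the table, with the tool picked per step rather than spelled out by the user.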

Starting point is 00:48:41 And the user does not have to tell it to. So to be able to have a model as powerful as o3 be able to string those things together. Y'all, I was just like jaw on the floor the first couple of times I used this. So a couple more stats on this. OpenAI says o3 is now its top-performing model for reasoning, coding, math, science, and understanding visuals, making 20% fewer major mistakes than the previous version. Yeah, the hallucination rate is still pretty high on this. So you do always have to start any chat with more data and keep your expertise in the loop. However, it does build on the previous

Starting point is 00:49:26 generations of these thinking models. A little bit on the o4-mini model. Well, it's designed for speed and cost effectiveness, hitting a 99.5% pass rate on a major math competition when given code access. So both o3 and o4-mini can figure out when and how to use different tools to answer tough, multi-step questions, adapting as they go. And for the first time, users can upload images, like photos of whiteboards or diagrams, and the models can think with those visuals to help solve problems. So yeah, that is, you know, kind of my example. I had a screenshot, uh, right, that I could kick off that whole flow with, uh, which was, uh, a little mind-boggling, if I'm being honest. So OpenAI has reportedly rebuilt its safety features for these models to better

Starting point is 00:50:21 handle sensitive topics and reduce risks. So the new models are available right now for ChatGPT Plus and business users, with o4-mini also open to free users if you click the Think option. So if you are on the $20 a month Plus account, there are limits on these new models. If you are on the Pro plan, I believe it is essentially unlimited, is what I read. So keep that in mind.

Starting point is 00:50:50 But even if you're on the free plan on ChatGPT, you can click that Think option. And you can use o4-mini, although it is very limited. So developers and companies will benefit from smarter, faster responses that could help them save time and money on everyday tasks. Live stream audience, have any of y'all used this, the new o3 or o4-mini? I was personally a little flabbergasted, right?

Starting point is 00:51:21 I use AI way more than the average person, even way more than the average power user. And I was like, wow, this can really change workflows, right? And also, you know, starting to bridge the gap between traditional large language model use and AI workflows, right? Which is extremely important to keep in mind as well. Yeah, Michael here from YouTube says, can't believe how much is happening every week. Still, I don't think it will ever stop. Yes. So Sandra asking, what are the best applications for it?

Starting point is 00:51:58 Great question, Sandra. I might have a dedicated show on this new o3 model, if you all would like. I think the best application for it is an example of kind of what I did, right? When you have to work between multiple modalities, you need research and you also need maybe an output that requires more than text, right? So maybe it's starting, yeah, like a simple example, starting with a whiteboard, right? And maybe your team is just ideating or brainstorming things for an upcoming product launch.

Starting point is 00:52:27 You know, take a picture of that whiteboard, right? Combine it with some of your data. And then the new o3 model can go agentically, essentially, research the web. It can go use Python and other built-in tools. When you're done with it, right, you can use the Canvas mode to kind of work iteratively with o3. So, yeah, like, I think the best applications for it are when you need to do research, maybe when you're starting with an image as an input, as well as if you need a little bit of code. If you need, you know, ChatGPT to, you know, essentially go through and categorize something,

Starting point is 00:53:05 organize or do a bunch of research, and then categorize that research as well. So a ton, a ton of potential use cases for this new model. All right. That's a wrap, y'all. Let me know in the comments, live stream audience. I know Mondays are always long shows. But let me know what you want to hear more of. It should be a very interesting week in AI.

Starting point is 00:53:30 But let's do the very, very quick recap. Here's the AI news that matters for the week of April 21st. So first, we started with OpenAI pursuing a $3 billion acquisition of Windsurf, which, hey, I didn't even mention this, but that is probably, or might have been, the reason why when we saw this announcement of GPT-4.1, you got free usage for Windsurf users. So there's a little nugget. All right, our next piece of AI news, Google launched Gemini 2.5 Flash, an extremely capable and impressive model. They also unveiled Veo 2, their AI video tool, for Gemini Advanced subscribers. The Trump administration is reportedly weighing a ban on DeepSeek, which I personally think is a good idea.

Starting point is 00:54:19 Microsoft has launched a computer-using agent inside of Copilot Studio. Google is giving away a year-plus of Gemini Advanced to all US college students with a valid .edu email address. Anthropic released Google Workspace integration as well as a new research tool that everyone else already has, but it's only available on its higher-priced plans. OpenAI is reportedly eyeing a social media platform, potentially competing in a different way with X slash Twitter and Meta. OpenAI released an API-only model in GPT-4.1 with a 1 million token context window and cheaper pricing. And then last but not least, OpenAI also released o3, o4-mini, and o4-mini-high,

Starting point is 00:55:21 some extremely capable reasoning models with agentic tool use. My gosh, it was a lot. What do you guys want to hear more of this week? I might have an open slot or two on the show this week. So if you want a dedicated show on any of these, let me know. I'll probably put a poll out in the newsletter as well. So if you haven't already,

Starting point is 00:55:42 please make sure you go to youreverydayai.com. Sign up for that free daily newsletter. Also, if this was helpful, y'all, please, I'd super appreciate this. Number one, if you're listening on the podcast, please subscribe, follow the podcast, leave us a rating if you could. I'd really appreciate that. As well as if you're listening on social media, don't keep Everyday AI your little secret. That's rude.

Starting point is 00:56:05 Share it with the world. Share it with your coworkers. Share it with your neighbor's best friend's mother's babysitter's dog walker, right? Everyone needs to learn AI. And it is extremely hard to keep up. That's why we do this AI News That Matters almost every single Monday to cut through the marketing, cut through the BS, cut through the noise, and tell you what really matters. I hope this was helpful. Thank you for tuning in. Hope to see you back tomorrow and every day for more

Starting point is 00:56:29 Everyday AI. Thanks, y'all. Meet Firefly AI Assistant, now live in Adobe Firefly, the all-in-one creative AI studio. Just describe what you want to create in your own words and the assistant handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud apps, including Photoshop, Premiere, Express, and more in one conversational interface. You direct the outcome while the assistant accelerates execution. Stay in control with the ability to step in and refine at any time. See it today at firefly.adobe.com. And that's a wrap for today's edition of Everyday AI.

Starting point is 00:57:17 Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
