Everyday AI Podcast – An AI and ChatGPT Podcast - EP 526: LLM May Updates: What’s new in ChatGPT, Gemini, Claude and more

Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live in Adobe Firefly, the all-in-one creative AI studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere, Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. Just in the last 10 days, there's been more than a dozen major updates to the large language

Starting point is 00:00:53 models that most of us use and rely on each and every day. And I'll tell you what, even as someone that does this every day on the Everyday AI podcast, I myself find it hard to keep up. So I decided, let's do a little recap and get everyone caught up to speed on what's new in large language models in the last month or two, talking about new updates in ChatGPT, Gemini, Claude, and a lot more. All right. So if that's what you're trying to do, I think this episode is going to be for you. What's going on, y'all? My name's Jordan Wilson and welcome to Everyday AI.

Starting point is 00:01:36 This is your daily live stream podcast and free daily newsletter, helping us all not just learn what's happening in the world of AI, but how we can leverage it to grow our companies and our careers. So it starts here with the daily live stream podcast. It's unedited, unscripted, just bringing it to you real. But if you really want to grow and be the smartest person in your company when it comes to AI, you got to go to our website. That's where you leverage it.

Starting point is 00:02:01 You learn here. You leverage at youreverydayai.com. So there, you can sign up for the free daily newsletter. Each day we bring you some exclusive insights, recapping the podcast, what you're listening to right now, or watching. But also on our website, you can find more than 520 videos, podcasts, write-ups, right, recaps, all sorted by category. So no matter what you're trying to learn, we've already talked to the smartest people in the world. And we already have the answers for you, for free on our website. So make sure you go

Starting point is 00:02:32 check that out. All right. So I'm excited today to talk about some large language model updates. So yeah, there's been a ton. Even in the last 24 hours, there's been some pretty sizable ones that we're going to talk about. But before we get started, let's start off as we do most days by going over the AI news. So first, Saudi Arabia is making a big $600 billion AI push. So Nvidia will sell hundreds of thousands of its new Blackwell AI chips to Saudi Arabia's sovereign wealth-backed startup Humain, launching with an initial batch of 18,000 chips, according to Reuters. AMD has signed a $10 billion deal with Humain to deliver 500 megawatts of AI hardware infrastructure over five years, aiming to diversify the region's AI supply chain. And Saudi firm DataVolt will invest

Starting point is 00:03:28 $20 billion in U.S. AI data centers and energy infrastructure, while companies like Google, Oracle, Salesforce, AMD, and Uber plan $80 billion in joint tech investments across both countries. So this was some pretty big news over the last few hours as the U.S. and Saudi Arabia make some pretty big billion-dollar AI deals. And Saudi Arabia's aggressive AI investments signal its ambition to become a global AI hub, potentially creating new opportunities for tech professionals and businesses seeking to grow in that sector. Our next piece of AI news, Grok's AI bot has stirred a little bit of controversy with some unprompted comments about violence in South Africa. Yeah, this one's not good.

Starting point is 00:04:22 So NBC News is reporting that Grok, the AI chatbot from Elon Musk's xAI, has been unexpectedly responding to unrelated user queries with comments about violence against white people in South Africa.

Starting point is 00:04:41 So xAI's Grok referenced racially charged violence and Musk's claims of quote unquote white genocide more than 20 times this week, according to the report. Even when

Starting point is 00:04:57 users did not mention South Africa. Yeah, that's not good. This behavior raises concerns about AI bias and moderation, especially as the topic has become politicized in the U.S., and Musk continues to amplify such unfounded claims. X says it's investigating the issue, highlighting the challenges of managing AI outputs as bots increasingly interact with users on major social platforms. All right. Our last piece of AI news, pretty exciting if you care about the science of AI.

Starting point is 00:05:31 Google DeepMind's AlphaEvolve AI has broken records by inventing new algorithms for real world impact. Yeah, all the people that say AI is nothing but, you know, advanced auto complete. Nah, it just literally created a new algorithm that's shattering the scientific community. So Google DeepMind's AlphaEvolve AI has invented a new computer algorithm that boosts Google's data center efficiency by 0.7% and speeds up AI training by 23% on key operations. So the system combines Gemini large language models with evolutionary techniques to generate, test, and refine code, outperforming previous AI coding tools by evolving entire code bases. Yeah, not just like rewriting the code, but just rewriting

Starting point is 00:06:25 the rules of coding. So AlphaEvolve surpassed a 56-year-old matrix multiplication record and solved mathematical problems that had challenged researchers for decades, including breaking the record for the quote-unquote kissing number problem in 11 dimensions. Google plans to expand access to AlphaEvolve for academic researchers and its success hints at future breakthroughs in fields like chip design, material science, and drug discovery. So yeah, this is absolutely huge news for the scientific AI community. So for more on those stories and a lot more, make sure to go to youreverydayai.com, get the free daily newsletter and we'll be recapping those there.

Starting point is 00:07:08 All right, let's talk about what you maybe started listening to this show for. So many large language model updates over the last month or so. So I said, let's have a dedicated show to go over what's new in ChatGPT, Gemini, Claude, and more. So if you are absolutely brand new here, we do also a weekly AI news story, which a lot of times has, you know, some of these large language model updates, but sometimes there's so much going on on the news side, we don't even get to cover all of these. So I've been thinking for a long time, maybe I should start doing a monthly update. So whether it's just one large language model show or maybe, you know, maybe we do ChatGPT and, you know, Google Gemini and Copilot, like once-a-month updates. So live stream audience,

Starting point is 00:07:58 it's good to see you all. Let me know, just say like monthly updates, yes or monthly updates, no. All right. Or if you're listening on the podcast, I always put my email in the show notes,

Starting point is 00:08:09 my LinkedIn, so you can just literally say monthly updates. Yes, monthly updates. I don't know if it's going to be super helpful for you all, or if it will seem redundant since we kind of cover it anyways, but I thought this might be a good place. Maybe you only listen to the show a couple times a month.

Starting point is 00:08:25 This could be a good opportunity just to keep up with everything that's happening. So let me know. I work for you all. Just let me know yes or no. All right. Let's start with ChatGPT. A lot. So yeah, in the last couple of hours, a pretty big update. So OpenAI released GPT-4.1 into the ChatGPT interface for all paid users.

Starting point is 00:08:47 That includes Plus, Pro, and Team, and Enterprise and Edu access will be coming in the, uh, will be coming in the forthcoming weeks. Uh, they also introduced GPT-4.1 mini, uh, which replaced GPT-4o mini for paid users and as a fallback for free users after they've hit their limits. So if you're confused on GPT-4.1, it has a longer context window, uh, you know, OpenAI says it's better for coding. Uh, and it's a little better at instruction following. Um, so this has been available for more than a month on the API side. And apparently OpenAI said that they heard from a lot of users that wanted this

Starting point is 00:09:30 to come to the front end of ChatGPT. So, you know, if you're using chatgpt.com like a lot of us are, you know, we haven't had access to this GPT-4.1 model. I've been using it a lot in a lot of third-party, you know, platforms that, you know, have offered this up. So the GPT-4.1 is a little confusing, right? Because they're not saying it's a complete replacement for GPT-4o, which is still kind of the default workhorse model.

Starting point is 00:09:56 So yeah, interestingly enough, it's even kind of buried. So if you go into your ChatGPT account, I'm just double checking this on mine. Yeah, you have to even go down to the more models. So it's almost kind of hidden. So yeah, this isn't a replacement for GPT-4o, which is confusing, right? When you talk about GPT-4.1 and also OpenAI did say that they're going to be getting rid of GPT-4.5. They did confirm that that is going to happen in the API. Most people are assuming it's going to be gone out of ChatGPT as well, although OpenAI did not yet confirm that.

Starting point is 00:10:33 All right, that's the latest update. But one thing I think that no one really talked about, I mean, we did because I'm a dork and I think this is important. But ChatGPT has added Microsoft SharePoint and OneDrive connectors for deep research for Plus, Pro, and Team users globally with Enterprise access coming soon. All right. So here's what this means. And also your interface might look a little different. So as most of you know, because I talk about this, we have a lot of different ChatGPT

Starting point is 00:11:08 accounts because companies hire us to teach their teams. So, you know, please reach out if you're trying to learn ChatGPT, whether you're trying to, you know, teach 5,550. We help, you know, teams of all sizes better use ChatGPT as their business operating system. So even you might see something a little different. This just popped up for me in the last couple of hours on my Pro account. So that $200 a month, now it says skills.

Starting point is 00:11:34 So, you know, the option to create an image, search the web, write with Canvas or use deep research is now coupled under a skills icon. Whereas on my, you know, normal ChatGPT Plus, the $20 a month, on my Team account, on my Enterprise accounts, I don't see that. So if you're looking for this new, this new deep research connector, you might have to first hit the skills button if you're on a Pro plan. So just keep that in mind.

Starting point is 00:12:04 All right. Here's where this gets exciting. So deep research, when you click it, you will now have this option to choose the sources. So by default, it's web search, right? Which for most cases, you'll probably want. However, now there's some use cases where you might not even want web research. So if you are new, and maybe you don't use ChatGPT a ton or maybe you don't know about deep research, it's amazing. I do think OpenAI's version of deep research and Google's new updated version of deep research are probably the biggest time savers for beginners to AI, right?

Starting point is 00:12:47 literally it will go out and look at anywhere from, you know, 50 to more than 100 or 200 different websites. It will use reasoning and chain of thought. Think like a human, plan like a human. It'll start researching something, find something online and change its path of research. Very impressive. But now you can include a whole GitHub repo. You could also include SharePoint, Microsoft SharePoint. So that is extremely exciting.

Starting point is 00:13:18 And you can also toggle off web search, right? So, you know, pretty big play and pretty big integration here with, if you are a heavy Microsoft-using team, to connect your entire SharePoint site and then use deep research just on that. So you could, you know, use the web and SharePoint. Or you could even, you know, toggle web search off and just do the deep research on SharePoint. So pretty impressive.

Starting point is 00:13:46 This is technically something that Claude offered a couple of weeks ago, but only to their highest, the higher paying tier on the Max plan. So pretty big move there from OpenAI and ChatGPT. All right. A lot more ChatGPT updates. And you know what? I actually just did the last two months, right? Because there's been, there was a lot that changed in April. And some of the stuff from May wouldn't make sense.

Starting point is 00:14:11 So again, maybe if I keep doing this every month, you know, I'll just start with, you know, middle of May to middle of June and we'll call it the June updates. All right. So GPT-4 old school, ah, it's gone now. So OpenAI removed that model and also GPT-4o became the sole flagship model. So yeah, don't get confused, even though there's a 4.1, it's still the 4o that is kind of the default workhorse model. Also, this is pretty big and I think we're going to do a dedicated show on this in the coming weeks. ChatGPT is now a shopping platform, right? Fully integrated shopping.

Starting point is 00:14:49 So OpenAI added shopping cards with images, prices, reviews, and direct links for product queries. If I'm being honest, I haven't used this a ton because I keep forgetting it's there. This is another one of those things, right? When I tell people, hey, if you want to succeed in AI, you need to like unlearn, right? Like this is one of those things I'm trying to unlearn, right? Because a lot of times when I'm shopping for something, I just go to Google Shopping or I go to, you know, straight to Amazon or Target. I've been loving the Target app lately, you know, getting, getting like pickup like so, so good.

Starting point is 00:15:28 I did that today. You know, I don't have a ton of time. So to be able to go, you know, do an entire, you know, day's worth of grocery, or not a day's worth, but to do two hours of grocery shopping and just pull up, amazing, by the way. But I'm trying to unlearn and remember now that ChatGPT search is like going full Amazon, right? It is going, you know, the ability to chat with shopping products is huge, right? I do, I did read that OpenAI has plans in the future to potentially do the entire checkout process in OpenAI, which would be bonkers. But no official word on that yet. But this is pretty big. I think it's a shot at Google Shopping. I think it's a shot at

Starting point is 00:16:11 Amazon, right? Let's be honest. Amazon tried to roll out their AI shopping assistant, Rufus, and it stinks. It's not good. It can't answer the most basic questions. So, you know, in what little testing I've done with this new ChatGPT shopping, it's pretty good. Live stream audience, have you guys used this yet? The new shopping feature inside ChatGPT, it's pretty impressive. You know, the downside is, you know,

Starting point is 00:16:41 I'll end up spending a bunch of money I don't have. But the upside is I'll save so much time researching. You know, I'm one of those people that I will search for like, even if it's a $50 something, I'll spend like hours sometimes doing research, right? The old school way, you know, luckily, you know, I will use, you know, large language models for this now, you know, deep research probably for bigger purchases. But, you know, still, it's something I'm using a lot of time on.

Starting point is 00:17:12 And I think even the large language model options have not been the best. Perplexity has had this for a couple of months. But I think Perplexity is like so overzealous when you're asking about products. Even if you're not trying to shop, Perplexity has this propensity to just push shopping, even when you're not trying to shop. So like, I know Perplexity has that feature built in, but it's almost too pushy. And when I'm trying to just get more information about something that I might want to buy,

Starting point is 00:17:49 it's trying to push me to buy it right away. And it's actually trying to push me to buy something way too soon when I'm still trying to learn about it. So, you know, so far, the ChatGPT search or sorry, the ChatGPT shopping is not as pushy as the Perplexity one. And it's, again, in my limited testing, it is much better. Adobe just introduced an entirely new way to create, bringing the power and precision of its creative suite into one conversational experience. Meet Firefly AI Assistant, now live in the Adobe Firefly app, the all-in-one creative AI studio. Powered by Adobe's Creative Agent, Firefly AI Assistant lets you start with your vision, just describe what you want, and shape the outcome as it takes form with the Assistant. The assistant orchestrates multi-step workflows, drawing on 60-plus pro-grade tools across Adobe Creative Cloud apps, including Photoshop, Illustrator, Premiere, Lightroom, Express, and more to help bring your ideas to life.

Starting point is 00:18:52 You can also get started with creative skills, a growing library of pre-built workflows for common creative tasks, like batch editing photos, creating mood boards, portrait retouching, and creating social variations. Every step the assistant takes is visible, so you can refine, redirect, or take over at any time. You stay in the driver's seat as the creative director. Adobe Firefly AI Assistant, now in public beta. See it today at firefly.adobe.com. All right. Some more updates in April.

Starting point is 00:19:29 So obviously OpenAI released the o3, o4-mini, and o4-mini-high reasoning models inside ChatGPT with agentic tool use and introduced memory with search. Also in April, smaller one, but you know, kind of cool and also sets the stage for what OpenAI may be doing in the future. So they introduced, OpenAI introduced the ChatGPT image library, allowing all images to be saved and accessed via the sidebar on web, iOS, and Android. So yeah, the mega viral GPT-4o image gen. So if you didn't know, all the images that you create right now with that image gen, there's a little library icon on the left-hand side. So why is that important? Well, you know, if you've listened to the show a couple of weeks ago, we saw reports that OpenAI is working on a social network,

Starting point is 00:20:25 among other things, that will very prominently display the images that people create with this new feature. So, you know, one small little step, right? There's no social network yet, but it looks like OpenAI may be sneakily laying the groundwork for a social network of the future. All right. Last but not least, on the update side for ChatGPT, they added a reference saved memories feature, allowing ChatGPT to reference a user's entire conversation history.

Starting point is 00:20:59 A lot of people love this. I absolutely hate it. I turn it off. One of the reasons why is I use ChatGPT all day, right? It always varies. I always say I use large language models anywhere from two to 10 hours. I'd say generally about five hours a day. I use ChatGPT for everything.

Starting point is 00:21:25 And this kind of, you know, conversational history, you know, memory, like this advanced depth in your memories, to me, it's absolutely terrible. Because I use ChatGPT for so many different things, I don't want it remembering my preferences for how I'm doing research versus how I might want to, uh, you know, refine an email. You know, in my research, I might want it to be super in depth, you know, using, you know, longer sentences with, uh, you know, rich detail. And then in my emails, I might want them super short and informal.

Starting point is 00:22:01 So it's, it's sometimes picking up trends and tendencies because I'm using it for everything, for multiple businesses, multiple clients, multiple things in my personal life. It's actually making ChatGPT worse for me, but I know a lot of people like it. So if you do like it, you know, or if you are using ChatGPT for very limited purposes, I think this is a great feature. For power users, not so much. All right, so that's a wrap on what is new in ChatGPT.

Starting point is 00:22:31 Let's move over to Google Gemini. So Google Gemini woke up last week and chose violence for no reason. literally zero reason at all. They introduced a new version of their world leading Gemini 2.5 Pro model. So they already had the world's most powerful model. It wasn't necessarily close, at least according to most benchmarks. And they went ahead and released a more powerful version of it, right? Which is something that Google has not traditionally done.

Starting point is 00:23:08 You know, normally it's, you know, OpenAI and Google and, you know, sometimes Anthropic or Meta, right? Everyone's timing their releases around each other. It's like, oh, you know, OpenAI comes out with the world's most powerful model. And, you know, the next day, bam, Google responds back, right? Or Claude, right? Although Claude's been sleeping at the wheel for like six months. Google, I don't think they've done this before, right?

Starting point is 00:23:31 They literally came out and released a new version of Gemini 2.5 Pro called the I/O edition. So yeah, Google's I/O conference is coming up here in a couple of days. So a lot of upgrades. So specifically, it is better, according to Google and even in my very limited testing, with coding, particularly for interactive web apps, topping the WebDev Arena leaderboard. I literally think I might do a series of shows just on Google Gemini's Canvas. It is so good. So good. All right.

Starting point is 00:24:13 A, comments, another thing. It's sometimes hard for me to like, you know, if I'm going through and looking at comments. So if that would be interesting to you, to see,

Starting point is 00:24:24 because the Canvas mode is crazy. So yes, Gemini Canvas, it's similar to OpenAI's Canvas, but much better. It's similar to Claude's Artifacts, whereas I had always loved Artifacts even more than OpenAI's and Google's Canvas up until like this.

Starting point is 00:24:45 So even two weeks ago, I was still using Claude Artifacts for a lot of things when you're just trying to create a visualization for your data. When you're trying to create a little, you know, a little program, right? That's a huge aspect of large language models right now that I think people are not using is you can create little programs for yourself to use, dump a bunch of your data, go in and use it whenever you want, right? You can literally create a little better version of a CRM or whatever based on all your data, right? But with this new release, this Gemini 2.5 Pro I/O, the Canvas mode is scary good.

Starting point is 00:25:24 So if you want to see that, I might do literally like up to three episodes just on Gemini Canvas. So just say, you know, in the comments, say Gemini Canvas, you know, yes, or Gemini Canvas, no, right? I think, you know, you don't even have to be a technical person. The fact that you can go in there, just dump a bunch of data and say, you know, essentially create me an app that can help me make better decisions, right? You don't even have to tell it what you want, right? Sometimes I'll just dump a bunch of data and a bunch of context and I'll just say, you know,

Starting point is 00:25:58 create me an app or create me a web app or create me a data visualization tool that helps me better analyze this, better learn it, make better decisions. And it's wild, what these tools can do. You know, even OpenAI's Canvas inside ChatGPT, Claude Artifacts is great at this, but now with this new I/O

Starting point is 00:26:19 version, Google takes the cake. All right. Also, this is pretty cool. Not a lot of people talking about this because it's hard to keep up with everything that Google's shipping now. So now inside their mobile

Starting point is 00:26:34 and web apps, it's gained native image editing capabilities for both uploaded and generated pictures, whereas before you had to be on the desktop version. So like I said, the new Gemini 2.5 Pro preview is absolutely crushing everyone. So, you know, they were already in first place in the LMArena, which is essentially a blind, you know, AI test. You put in one prompt, get two outputs, choose the better one. They were already crushing everyone else and they just went ahead and released a better model. What's interesting, though, is OpenAI, even though they're not number one,

Starting point is 00:27:12 they have spots two, three, and five with their o3, 4o, and 4.5 models. So, you know, OpenAI is still, right, even though, yes, technically Gemini 2.5 Pro, a hybrid model that can reason when it needs to, is still the single most powerful model in the world. OpenAI, you know, you can't just like write them off, right? They have numbers two, three, and five. And you have to assume once they bundle this all up under the GPT-5 architecture or system, you do have to believe that they will then exponentially leapfrog everyone else. You have to believe it. But I mean, we'll see. All right. More Google Gemini updates. This was in the end of April. They launched Veo 2, the video generation tool inside of Gemini Advanced. So yeah, a lot of people

Starting point is 00:28:05 don't see that, just look at it. What's funny is it actually came up for me first on my Gemini app on my phone, even though it was the same account before it showed up on the web version inside Chrome or Edge. So that was interesting. So that allows paid Gemini users to generate high quality eight second videos from text prompts. Here's another thing.

Starting point is 00:28:30 I was talking about this with someone the other day. There's a good chance that you have Gemini Advanced and you don't even know. So if your company uses Google, Google Workspace, there's a good chance that your company just pays for that extra storage. You know, it's the Google One subscription. And that includes Gemini Advanced. It includes NotebookLM Plus.

Starting point is 00:28:47 So there's a good chance that you already have access to Veo 2, which is the world's best, and it's not even close, AI video generator, and you might not even know it. So, yeah, you might be wondering, well, what would I use it for? I don't know. Maybe your website needs, you know, updated visuals and you put in a video background, right?

Starting point is 00:29:05 You can literally just do an eight-second video background on your website, see if it brings in more conversions. If you have something old and ugly, there's a good chance it might. Also, now, Gemini 2.5 Flash is the newest Flash model. This is great for people building off of the Google Gemini API. There's really no reason to use it if you're going to like gemini.com, right? Because there, you're not paying for it. So you're always going to use the most powerful version.

Starting point is 00:29:33 which is 2.5 Pro for most use cases, right? But now the Flash version, which, for a Flash model, which is a smaller version, actually does extremely well on most benchmarks. It is punching way out of its weight class for being a quote unquote mini model. So it's optimized for fast and efficient reasoning.

Starting point is 00:29:54 So one other thing, let me see if this is actually on my list here. So Google Gemini did also update the deep research to 2.5 Pro as well. That was in April. That one's big, y'all. That one's big. Because I had been saying for a while that OpenAI's deep research was in a league of its own. But now Google Gemini with their 2.5 Pro is in that league. And here's why. A lot of the other deep research tools, it's more of this blanket approach, right? You put in a query, and essentially it'll go out in one swoosh,

Starting point is 00:30:36 and it might go out to 10, 20, 50 different websites, right? But now that it's using the 2.5 Pro model, which is a reasoning model, it's a hybrid model. So it will think and plan like a human when it needs to. So when it's doing this deep research, so think of something in your industry that you spend a lot of time on. Maybe it's a growing, you know, a growing trend in your industry, maybe it's keeping up with your competitors, whatever it is.

Starting point is 00:31:08 Maybe you're spending a lot of time doing this, you know, doing certain research and personalizing it, pulling out certain information, right? Which is what so many of us do, right? Like we go in and we read these, you know, 50 websites with something in mind, right? Our own, you know, our own business agenda, you know, maybe we're just looking for certain pieces of information from 50 websites. So, you know, to have a personalized, essentially research assistant in OpenAI's deep research and Google Gemini's deep research, it's silly good. But, you know, why the Google Gemini update is especially good is now it can reason, right? So it will start to do some research. And I have a screenshot here for the live stream audience.

Starting point is 00:31:51 It might find things according to your certain research. You might say, hey, Google Gemini, I'm looking up, you know, product or competitor A's product. B and it's going to look up, you know, competitor A, product B. And it's like, oh, wait, did you know that competitor C has product B as well? Let's research that. You know, even though you told me, right, human, you told me that only, you know, competitor A has product B. Actually, competitor C and D just released this a week ago, whether you know it or not.

Starting point is 00:32:23 So it's still going to fulfill your query. But as it goes out, it's going to find new things. And it may steer. It may create a fork in the research road, whereas the previous version of Google's deep research did not do that. All right. Here we go with Anthropic Claude, their April and May updates. So the May one, this is pretty good if it worked consistently, but it just doesn't

Starting point is 00:32:47 work consistently. Come on, Anthropic. Get it together. Fix this. I'd love it. It even works better than Google's own integrations in some use cases. So May 1st, Anthropic launched their integrations feature, which connects Claude to external applications, including Jira, Confluence, Zapier, and others, and expanded

Starting point is 00:33:08 research capabilities with an advanced research mode for thorough report preparation with up to 45 minutes of research time. That's nutty. But the one that I think a lot of people are talking about, myself included, is the new research capability and Google Workspace integration. So being able to connect to your Gmail, Calendar, and Google Docs. And this transforms how Claude can find and reason with information. So yes, if you are on a paid Claude account, uh,

Starting point is 00:33:40 first of all, this new interface, Claude, I don't know if it's just me, y'all. I hate it. I mean, Apple does the same thing.

Starting point is 00:33:50 Um, Claude's interface used to be super clean. Now, like you got to click like eight times, right, to do simple things. Whereas before it was like three clicks, right?

Starting point is 00:33:59 Um, I know that sounds like, petty for me to pick on Claude. But when they launched this new, like, you know, they're like, oh, we have a cleaner UI, better user experience. I'm like, no, you don't. I got to click way more. And it makes me not want to use your stinking tool.

Starting point is 00:34:15 That and the fact that even on a paid plan after I use Claude for 10 minutes, it's like you've hit your rate limit. Anyways, what I do like about Claude now is the ability to connect your Google Drive, your Gmail, so it can search your Gmail as well as your calendar. And actually in some situations, does this better than Google's own tools, right? So as an example, I did something. I said, what's on my calendar for tomorrow? I'm not feeling good.

Starting point is 00:34:45 Find the emails for those people and say, I've got to postpone. Right. So Claude will go through. It will, with just that prompt alone, you know, I think I have five meetings on my calendar tomorrow. It went through, found all the meetings, found the email for those people, and drafted a relevant email for all those people, according to what I told it, right?

Starting point is 00:35:06 It can't send it. When I try to do the same thing inside Google Gemini, unfortunately, it didn't work well, even though Google Gemini is directly integrated with, obviously, if you click that at button, right? So if you go to Gemini.com and if you're using Google Gemini in the chat interface, you can click the at button and pull up your calendar, your Gmail, etc. It doesn't work as well in Google Gemini, which is weird.

Starting point is 00:35:30 And which is why, if Claude could get this to work consistently, it would be amazing. The problem is, like, one thing I try to use it for is to go through large batches of emails and asking for specific things. And a lot of times, it's not really using the conversational aspect, right, the NLP aspect of a large language model. It's usually just doing more search-based. So, you know, I'll say something like, hey, tell me everyone that's reached out, you know, about sponsorship or advertising, something like that. And it literally just does a search for the word like sponsorship or advertising, right? But then I'll see emails where it's like, okay, clearly these companies reached out to sponsor or advertise or have me speak or something.

Starting point is 00:36:13 But because it doesn't match that exact word, also the Claude, this, it does, at least I tested it for a few hours a couple of weeks ago. And it really struggles going deeper into your email inbox, right? I have tens of thousands of emails. It doesn't do a good job, right? It will just generally do like the closest exact search or just like your first couple pages of emails, but that doesn't help me, right? And if you use email like I do, right, I don't delete emails. I don't archive them, right?
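
The gap described here, between matching an exact word and matching what the sender actually meant, can be shown with a small sketch. This is only an illustration under stated assumptions: the sample subjects, the `RELATED` phrase list, and both function names are hypothetical, and nothing here reflects how Claude's Gmail integration is actually implemented.

```python
import re

# Hypothetical inbox: subjects invented for illustration only.
EMAILS = [
    "Sponsorship inquiry for your podcast",
    "We'd love to partner with your show",
    "Can you speak at our conference?",
    "Your invoice for April",
]

def keyword_search(emails, keywords):
    """Return emails that contain any keyword as a literal substring."""
    return [e for e in emails if any(k.lower() in e.lower() for k in keywords)]

# Hand-written related phrasings (an assumption, not a real taxonomy).
RELATED = {
    "sponsorship": ["sponsor", "partner", "advertis", "speak at"],
}

def expanded_search(emails, topic):
    """Return emails that match the topic or any related phrasing."""
    terms = [topic] + RELATED.get(topic, [])
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    return [e for e in emails if pattern.search(e)]

exact = keyword_search(EMAILS, ["sponsorship", "advertising"])
broad = expanded_search(EMAILS, "sponsorship")
print(len(exact), "exact matches;", len(broad), "expanded matches")
```

The exact search only finds the email that literally says "sponsorship", while the expanded search also picks up the partnership and speaking requests. A language model reading each email could in principle go further than a fixed phrase list, which is the behavior being asked for above.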

Starting point is 00:36:46 I try to read them as much as I can, but I get way too many emails. Stop selling me all your, you know, your garbage random AI. I get 100 emails every day like, oh, I've, you know, we have a better large language model than ChatGPT, put us on your show. No, no, you don't. Um, anyways, you know, I can't, like, literally search and find this. So if Claude can make this better, I would love it. And maybe it's just me. Live stream audience, let me know: have you used this? Is it good? Does it work better for you? But the potential is there. So hopefully this is something that Anthropic continues to update. A couple other things: they introduced their Max plan for Claude, which is $100 a

Starting point is 00:37:28 or $200 a month that essentially lets you use it as much as like Google, you can use it for free, right? This, let's be honest, this is terrible. This is terrible. The fact that Anthropic, you have to pay $100 or $200 a month just to use their basic model, like, for more than like 10 minutes.

Starting point is 00:37:48 Yeah, I generally copy and paste a ton, right? But the limits are just ridiculous. So, you know, I guess this is one of those things that, you know, Anthropic released. There's some other, there's some other features that are available in this $100 to $200 a month plan that aren't available in the,

Starting point is 00:38:10 in the normal paid plan. But I don't know. For me, you know, what you get with ChatGPT's $200 Pro plan, the limits are insane, right? They're so good.

Starting point is 00:38:21 Even the limits on the normal $20 a month ChatGPT Plus plan, I find very few people are hitting those limits. But, you know, on the Claude plan, apparently you got to pay $100 if you want to use it. There was literally, I did a show on this. If you use Claude, so there's certain hourly increments, I think it's like every four hours or five hours. Even if you do one prompt, I did on that. So like, I did the math. I think it's like, let's just say you used Claude every five hours and did like one prompt. You could be like not use your $100 a month plan after like a couple like after like two weeks, right? If you use it like that, obviously no one's using it like that,

Starting point is 00:39:01 but their rules and their rate limits are absolutely bonkers. All right. Also, in April, Anthropic launched their Claude for Education with a specialized learning mode that guides students through Socratic. Socratic, I think that's how you pronounce that. I should know that. Socratic questioning rather than just providing direct answers. So, yeah, for those students that actually want to learn with a large,

Starting point is 00:39:28 language model and don't just want to get the model to spit out the answer, it more guides them and improves critical thinking, which I think is actually great. I think that's a great mode. Like, I wish that that was an option on all large language models. Sometimes I would just want to learn and work a little bit more versus just the large language model spitting the answer out. Obviously, with a little bit with a little bit of prompting, you know, you can have large language models do that.

Starting point is 00:39:54 But I actually like that mode, Claude for Education. All right, Microsoft Copilot. So, I mean, I could do like 10 shows just on Microsoft Copilot. I'm not really focusing here as deeply on Microsoft 365 Copilot because it is a beautiful mess, right? There's like Microsoft 365 Copilot. You can get it in like 832 Microsoft applications. Like it's everywhere. So, but a couple of things that have some new updates.

Starting point is 00:40:23 So Copilot Pages have become available worldwide for signed-in users. They've introduced a deep research feature for Pro users using advanced reasoning models, using OpenAI's. Also in April, they debuted Copilot Search on Bing. They announced memory and personalization features and Copilot Pages. They introduced Copilot Actions for task execution, although those have not been rolled out very widely yet. I'm looking forward to using those. They've expanded Copilot Vision to new platforms, which is so cool. By the way, has anyone used Copilot Vision?

Starting point is 00:41:00 So essentially, if you're using the Edge browser, if you look at the very bottom, there's this little tab, that's the Copilot Vision. The cool thing is, is, yeah, you can upload a screenshot of, you know, whatever website you're on to any large language model and talk with it, right? I think Google inside AI Studio has this stream real time, which is really good at that. It essentially takes a screenshot of, you know, whatever is on your screen. But the good thing with Copilot Vision is it looks at the entire website. So not even just what's on your screen.

Starting point is 00:41:33 Really cool. I'd say one of the more underutilized and least talked about just cool AI features. So they expanded that to new platforms. So yeah, it doesn't work on every single website. They launched with just a few dozen websites that it works on. So they've expanded that to new platforms. They've added AI podcasts. So in the same way that Google has their deep dive audio,

Starting point is 00:41:56 overviews. Copilot has that as well, and they have a new deep research feature. The same way, I have this on my screen here for our live stream audience. This is more if you're using like Copilot, like, Pro. If you have an enterprise version of Microsoft Copilot, this interface is not going to look familiar. So it's a little confusing. You have your kind of biz chat versus a personal Copilot chat. So I pay the $20 a month for Copilot Pro.

Starting point is 00:42:30 So in this situation, you have a quick response. You have Think Deeper, which uses OpenAI's reasoning model. And then you have Deep Research, which presumably uses OpenAI's reasoning model as well, but then it does the deep research. So not nearly as good as OpenAI deep research or Google's new deep research with 2.5 Pro. However, it's still, it's still serviceable, that version of deep research. I'd say it's on par with, you know, Perplexity. Grok's is actually pretty good.

Starting point is 00:43:04 The problem is, is Grok is, you know, if it uses information from X, then it's a wild card and you probably shouldn't touch it. Grok's is actually really good if you don't use information from, you know, X slash Twitter. But the deep research here from Microsoft Copilot is pretty good. All right. Last but not least, we have some updates from Meta. So Meta obviously had their conference, their LlamaCon conference where they announced their standalone Meta AI mobile app launched globally featuring hyper-personalization using social graph data, a discover feed, advanced voice capabilities, integrated image generation called Imagine and web search. And then right before that, a couple of weeks actually, they announced their new models.

Starting point is 00:43:49 So Llama 4, the, Llama 4 Scout and Llama 4 Maverick, which are already released. And the first open-weight natively multimodal models with unprecedented context support. Yeah, I think 10 million token context window, which is nutty, and the mixture of experts architecture. And any day, week, month now, we'll probably be seeing their large version. So Scout and Maverick are the small and medium versions of Llama 4. And then we should be seeing Llama 4 Behemoth, which is the large version, and then we'll separately get a reasoning model.

Starting point is 00:44:27 So thanks to the Meta crew that I was talking to at, what conference? Oh, at the IBM Think conference, I was talking with the Meta developers who cleared some of that up for me. So the reasoning model, from what I was told, the Llama 4 reasoning model will be a separate model. It won't be Scout, Maverick, Behemoth with reasoning. It'll just be a reasoning model. They might tie it to one of those. But they said at least right now that it is its own separate model.

Starting point is 00:44:58 All right. Oh, no, there's still Grok. So Grok announced a couple of new things in April. So they introduced personalized memory, so similar to what ChatGPT rolled out, and custom workspaces, which allows Grok to retain information from previous conversations and organize related content. And they also launched Grok Studio, which is a canvas-like collaborative tool for creating various forms of content, supporting code execution, and Google Drive integration.

Starting point is 00:45:27 So here we go. The workspace, you know, now it's, you know, a year ago, this was really cool and innovative, but now it's like you just got to have it. Right. So when you launch a new chat in workspace, you can have it start with custom instructions. You can have it start with attachments. So if you use projects inside ChatGPT or projects inside Claude, or kind of gems inside Google Gemini, a little different, then, you know, this would be familiar.

Starting point is 00:45:56 Oh, another thing that was on the Google Gemini update list that I forgot to put in there because they didn't announce it. It just happened. But now your gems inside Google Gemini can actually use the most powerful version, Gemini 2.5 Pro, which makes gems now all of a sudden really good and really useful. And I hope that OpenAI will soon follow suit. The problem right now with GPTs, right? So, you know, talking about gems, I got sidetracked here.

Starting point is 00:46:29 So the problem is you still have only the GPT-4o model that you can use for GPTs. Luckily, they did add 4o image generation, so you can create GPTs that can use the 4o image generation. But, you know, the most powerful models, the o3, right, you can't use that for GPTs. But inside Google Gemini gems, you can use 2.5 Pro. All right. So, and then last but not least for X slash Grok, they launched their voice assistant with action-taking

Starting point is 00:47:06 capabilities. So, oh, no, sorry. This is, sorry, wrong headline here. This is for Perplexity. Hey, I'm a human. Humans make mistakes. So Perplexity launched their voice assistant with action-taking capabilities on iOS.

Starting point is 00:47:21 This was about three, four weeks ago at the end of April. They also upgraded their GPT-powered image generation. They added Grok 3 beta for advanced reasoning capabilities. No, no, they didn't. Sorry. My last slide here is confusing. All it's supposed to say is Perplexity added their new voice assistant inside the iOS app, which has a ton of potential.

Starting point is 00:47:49 Again, it just isn't always working. So one thing it can do, it's trying to be what Siri isn't. So if you do have a paid plan, give it a try. For some things, it's pretty good. So you can say like, hey, what's on my calendar? Right. So Siri's not always good at that. So it can read your calendar.

Starting point is 00:48:10 It can draft emails. That's something Siri can't do right now. You can set iPhone reminders. Also, you can launch YouTube videos just with your voice. So that is the new, for Perplexity, the new iOS voice assistant. So a lot of promise. Like if I'm being honest, I don't use Perplexity a whole lot. You know, ever since the deep research tools came out, I don't find much utility in Perplexity. But this is one instance. I'm, you know, I still pay for Perplexity because I want to make

Starting point is 00:48:46 sure that I can inform you of the latest and greatest. So I pay for literally every single tool. It's way too much money to do that. And I've been like, why am I still paying for Perplexity? It's not very good. If I'm being honest, it's not. But this is one thing. Hey, no one right now likes Siri.

Starting point is 00:49:05 No one does, right? We're supposed to get all these AI, you know, Apple Intelligence Siri. Siri's still dumber than a box of rocks. And that's being mean to the box of rocks. It doesn't know anything. It can't do anything. It has like zero, right? There's actually multiple class action lawsuits against Apple because of all this marketing

Starting point is 00:49:24 and commercials they put out about this smarter Siri with all this AI and it doesn't do anything. Anyways, Perplexity could actually compete here. If they make this better, number one, and if they add more features, number two. So at least shout out Perplexity. I was suggesting some things on the tweeting machine and they reached out and said that they're passing some of my feedback back. So yeah, hopefully they make it better. There's a lot of potential there with that iOS assistant just because who would have thought that even a year after Apple announced all of these smart Siri things that we would still not have it. So I like this kind of small pivot that Perplexity is doing here, but they got to update it. They got to make it better, more reliable.

Starting point is 00:50:09 But hey, might give us a reason to keep Perplexity along for a little longer. All right. That's a wrap, y'all. Did you like the monthly updates? I know it was a little longer. There was a lot to cover. Technically, I put in April and May because there was so much in April as well. So, you know, it's hard to keep up with all the large language model updates.

Starting point is 00:50:28 Yes, on most Mondays, we do an AI news show, but there's so many big large language model updates we don't even cover, right? We usually cover the top like eight to 10, you know, AI news stories, but a lot of them are news. They're things impacting business, politics, the economy, right? So we're not always covering every single large language model update. So, hey, if you want these monthly updates, if these are going to be helpful, let me know. You know, maybe if you only listen to the show, you know, once or twice a month,

Starting point is 00:50:56 maybe this is a show that could really be useful for you. So let me know. Please, when I say I do this thing for you, I do this thing for you. So if this is not helpful, tell me "monthly update, no." If it is helpful, tell me "monthly update, yes," and we'll keep doing it. Reach out on the podcast. Thank you for listening.

Starting point is 00:51:14 If you haven't already, please go to youreverydayai.com. Sign up for the free daily newsletter. If this is helpful, tell someone about it. If you're listening on the podcast, please leave us a rating that would be super helpful. Follow the show. I'd appreciate that. I'd appreciate you also tuning in tomorrow and every day for more everyday AI. Thanks y'all.

Starting point is 00:51:38 Meet Firefly AI Assistant. Now live in Adobe Firefly, the all-in-one creative AI studio. Just describe what you want to create in your own words and the assistant handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud apps, including Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome while the assistant accelerates execution. Stay in control with the ability to step in and refine at any time. See it today at firefly.adobe.com.

Starting point is 00:52:08 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
