Everyday AI Podcast – An AI and ChatGPT Podcast - Ep 744: Leaks show insights on OpenAI and Anthropic’s next models, Claude Computer Use goes viral and more

Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live in Adobe Firefly, the All In One Creative AI Studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. I think right now the best AI model in the world, if you have the budget and you have the patience,

Starting point is 00:00:51 is GPT-5.4 Pro from OpenAI, which seems like it was released months ago. Crazy enough, it was announced this month. Yet, we already have new AI news that OpenAI is working on a newer model, codenamed Spud, and it could be right around the corner. And OpenAI must think it's pretty good as they say it'll reshape the economy and they're doubling headcount to prepare for it. And, well, they're not alone. As for their biggest AI competitor, Anthropic, well, recent leaks show that they're also cooking up a new model that could arrive soon. And apparently it's so powerful the company is concerned about its capabilities.

Starting point is 00:01:36 Yeah. So this week was a heavyweight week for both the big AI companies. A ton of AI news, leaks and updates, and we've got it all. Oh, and if you stick around toward the, well, the end of the live stream version of our show, you'll actually be the first in the world to hear of a new AI update from one of the big four that we'll break to you live that, well, we think is actually really good. All right. I'm excited to get to you the most important AI news of the week.

Starting point is 00:02:06 I hope you are too. Let's dive into it. If you're new here, welcome, what's going on? My name is Jordan Wilson. This is Everyday AI. It's a daily live stream podcast and free daily newsletter that are helping everyday business leaders like you and me keep up with the nonstop avalanche of AI updates

Starting point is 00:02:22 because they are literally every single day. I tell you what's important and what's not. You take that information to be the smartest person in AI to grow your company and your career. So it starts here with the unedited, unscripted live stream podcast, but to take it to the next level, make sure you go to our website at youreverydayai.com.

Starting point is 00:02:39 Each day, we recap the live stream in the newsletter. So you can quickly catch up. And we give you all of the other important AI feature updates and everything else that you need to know to stay ahead. So without further ado, let's get straight into it and give you the AI news that matters for the week of March 30th. Well, the first one, it's a short name for an apparently really big model. So OpenAI, their new model, nicknamed Spud, well, could be dropping any week or any month now. So according to reports, OpenAI has completed development of a new AI model,

Starting point is 00:03:20 codenamed Spud, a capability the company says could significantly boost productivity and help refocus its own internal strategy toward focusing more on enterprise tools and integrated products. This update comes as OpenAI has shelved a bunch of, well, kind of popular products, such as Sora, and they've delayed some of their other efforts as well to concentrate on their core business. And, well, at least it looks like for the future that may be whatever this new codenamed Spud model is actually called, right? Whether that's a GPT-5.5 or a GPT-6, we're not sure. But according to The Information, OpenAI recently finished pre-training work on Spud and plans to

Starting point is 00:04:08 reveal it within weeks, with OpenAI CEO Sam Altman reportedly saying the model can really accelerate the economy, a claim that signals a push for broad commercial impact. Spud is expected to support OpenAI's plan to build a productivity super app by combining ChatGPT, Codex, and the company's browser called Atlas, potentially making multi-tool workflows faster and more tightly integrated. So while specific technical details about Spud's architecture or capabilities were not disclosed in the report, the emphasis from leadership suggests improvements in productivity, multimodal understanding, or more advanced reasoning and tool use.

Starting point is 00:04:55 The timing of Spud's completion aligns with OpenAI's decision to redeploy resources away from consumer experiments like Sora and their adult mode, which they've delayed, and toward enterprise and robotics, implying Spud will be a cornerstone of that strategic shift. So pretty big news here from OpenAI. Yeah, I was like looking at the calendar and I'm like, wait, what has it been, like two months since GPT-5.4? And I'm like, no, we're still in March. And GPT-5.4 was released in March. So the pace of development, obviously, if you've been following AI for more than a couple of months, is straight up blistering here. Right.

Starting point is 00:05:41 And I think maybe one of the reasons we've seen this is continued pressure and great models and improved productivity gains specifically, I think, from Anthropic, where I think they've been winning 2026 so far, at least when it comes to shipping products to consumers, and then obviously from Google as well. All right. Speaking of things from Anthropic. That's our next piece of AI news. And this one was big.

Starting point is 00:06:10 Well, it was at least extremely viral. We'll see how big it ends up being. But I think this is something that could kind of shape the next agentic layer. So Anthropic has rolled out a research preview feature that lets its chatbot Claude actively use a Mac to perform tasks for users, making a step toward more autonomous agent-style AI that can operate apps, type, click, and use connectors for services like Google Calendar and Slack. So the new feature is in a limited research preview, and it's right now only available to

Starting point is 00:06:48 paid Claude users. So if you're on the Pro or the Max plan, and it is restricted to computers running macOS, making it immediately relevant only to, yeah, if you're a paid user on Apple. So here's what it is and, well, why it's pretty cool. It can perform real world actions on your computer. Right. So a lot of these capabilities were previously available, kind of in Claude Co-work, in Claude Code, but here's what's new.

Starting point is 00:07:17 Well, Claude can now perform real world actions on your computer. It can open files, use your browser and developer tools in the browser. It can type and move the cursor and even choose connected apps that it has access to inside Claude or manually control to complete tasks such as transferring files to your phone or batch resizing photos. Anthropic says Claude will always ask for permission before acting and allows users to stop the agent at any time, aiming to balance convenience with user control. Also, the company implemented automated safeguards to scan for prompt injections and other common attacks against agents, and it disables some apps by default.

Starting point is 00:08:02 But it warns the feature is new. It may contain errors. And it should not be used with highly sensitive data. So I don't know. Live stream audience, podcast people, let me know in a comment. Have you used this yet? I've used it. I think it's extremely impressive.

Starting point is 00:08:24 We may go give this the total hands-on treatment on Wednesday. So yeah, if you are new to the show, on Mondays, we go over the AI news. On Wednesdays, we go pretty deep, a hands-on demo with one tool. And then on Fridays, we kind of do a recap style of new features. And Tuesday, Thursday, you know, we rotate different shows in there. But we may go super in depth with this one on Wednesday. My quick take on this is it's buggy.

Starting point is 00:08:50 Obviously, it's the first of its kind. But it's extremely impressive. And I do think it represents the, not the final layer, right? But it represents the next layer of agentic work. You know, I think until some of the protocols improve, you know, the A2A agent-to-agent protocol, MCPs, right, all of these other things, I think there's still, even with all of these, you know, protocols that essentially help AI agents talk to other AI agents and help large language models talk to other large language models, even until these things improve, I still think the next layer is, well, it's agents using your actual computer. And this is a very, you know, OpenClaw-style update here, right? But this is different. This isn't just being able to save files, upload files, access your local directory.

Starting point is 00:09:40 Yes, that has already been available in Claude Co-work and in Claude Code. This is actually controlling your entire computer. So I've liked it so far, like I said, when it works, great, doesn't always work. But being able to open different documents on your computer, having Claude, the desktop app, control, open, and use other apps on your computer. That's kind of the new capability here, right? You know, in some of my testing, I'm having, you know, Claude open Perplexity's Comet browser, open OpenAI's Atlas browser, right?

Starting point is 00:10:17 Use different browsers, use different desktop programs that normally might not be able to talk to or access other AI. So pretty big one. If you think we should give this the Wednesday deep dive treatment, just say computer use in a comment there, so I know if we want this. So it was crazy viral though. So when OpenAI,

Starting point is 00:10:39 or sorry, when Anthropic released this on Twitter, I think the release video got like 75 million views almost instantly. So pretty big. All right. Next piece of AI news. This might seem like a smaller one, but I actually decided to highlight this

Starting point is 00:10:55 as one of our main AI news stories for the week, although there's probably even bigger AI news stories from OpenAI, but I think this one is actually really important. That's because, well, plugins are back, right? If you've been a longtime listener of this show, you know I was very, very bullish on plugins early on inside ChatGPT, super bummed that OpenAI actually disabled them, you know, discontinued support. Eventually, you know, it led to GPTs and apps and other things,

Starting point is 00:11:22 but now OpenAI has quietly updated its Codex program to let developers create custom plugins and let general users use them. So OpenAI has introduced plugin support for Codex, allowing developers to add custom skills and external integrations to their coding assistant. So the most newsworthy detail here is that plugins can include pre-packaged scripts, which are skills, plus configuration files. So Codex can run tested code snippets instead of just generating code from scratch each time, which obviously reduces the hallucination risk and cuts inference costs and response time as well.

Starting point is 00:12:04 So plugins can connect Codex to external services through MCP servers and developers can upload MCP configuration files to control sandbox, middleware, and environment behavior. So OpenAI also with this launched a plugin directory with more than a dozen pre-built integrations, including plugins that let Codex edit Google Drive files and review GitHub repo changes. So this follows a similar capability launched by Anthropic for Claude Code a couple months ago, their kind of plugin ecosystem. And Anthropic already supports subagents and Codex just rolled those out as well.

Starting point is 00:12:44 So OpenAI says plugins are designed to help software teams keep development tools and configurations synchronized across multiple engineers' OpenAI accounts, reducing code inconsistencies on collaborative projects. Here's why I think this is super important. Number one, everyone, including OpenAI, is straight up sleeping on Codex, right? I do think maybe this is one of the reasons that OpenAI is shifting away from having Codex as its own dedicated app and moving forward with the super app where it kind of rolls Codex, ChatGPT, and the Atlas browser all into one.

Starting point is 00:13:21 That's because Codex is a freaking beast. Yes, it is better. I don't care what anyone says. I use these tools more than 99.9% of the population. Codex right now with GPT-5.4, extra high, is better than Claude Code with Opus 4.6. Better than Claude Co-work with Opus 4.6. It is just better. You know, if you want something that looks nice and done quickly, you can use Claude Code,

Starting point is 00:13:49 Claude Co-work. If you want something done the right way and you have the time, I think, Codex is by far better. Yeah, it creates ugly front ends. We get it. But it just gets the code right. But here's the thing. Even if you are not using it to code, it is amazing at just doing everyday knowledge

Starting point is 00:14:09 work tasks. So in the same way that I think that Anthropic kind of not maybe pivoted, but kind of remarketed, essentially as Claude Co-work, right? Because they realize that it's great for even non-coders. I think that's what we're going to see with OpenAI in this new super app because maybe they're slowly starting to realize that Codex is actually really good for people that are not running software, not software development teams. Yes, it's good for those people, obviously, but it's good for everyday knowledge work.

Starting point is 00:14:39 And I think plugins combined with the, uh, kind of the news of the upcoming super app is really significant of that. And it's showing that I think the average knowledge worker is going to be using the codex platform a lot. Whatever that looks like in the new super app. I mean, that remains to be seen. But what this means to you, if you have not used Codex yet, start using it now, right? These plugins, I think, do make it easier.

Starting point is 00:15:05 And you can probably find some great use cases, right? Even one of the plugins I just mentioned there, Google Drive, right? Being able to edit Google Drive files, that's pretty big, right? That's pretty big. It's a, you know, a big shortcoming of a lot of the, you know, front end AI connectors, not being able to actually edit files. All right. Our next piece of AI news.

Starting point is 00:15:27 Well, Apple, right? We're getting all the big companies here. So Apple may actually get AI that works. We'll see. They've been saying that for years. Like the boy that cried wolf. We'll see if we believe them this year. But maybe it will happen.

Starting point is 00:15:41 Because according to Bloomberg, Apple will soon let third party AI chatbots, such as Google's Gemini and Anthropic's Claude, integrate with Siri, starting in iOS 27. That means iPhone users can route unanswered Siri queries to their preferred chatbot app. So also Apple announced that at their upcoming WWDC keynote, they're going to have a lot more of an AI focus and third party integrations that will work with iOS. So essentially it seems like Apple's saying, like, yeah, we've spent billions of dollars in three or four years and we can't get it right. So essentially we're going to open up the generally extremely restricted Apple ecosystem

Starting point is 00:16:27 and allow users to integrate with other AI chatbots. So if you've been like me and you're using your iPhone and you're like, this thing is a very expensive dumb phone and there's no AI that works in it. It's kind of my thought, right? Siri doesn't work. It doesn't do anything. You know, yeah, they have these, oh, these writing tools. They're useless, right?

Starting point is 00:16:49 So maybe now we can all finally get AI that actually works on our devices. So according to reports, users will choose which services Siri can access through the new extensions settings in iOS 27, iPadOS 27, and macOS 27, found in the Apple Intelligence and Siri panel of settings. So then the change ends the practical exclusivity of Apple's prior OpenAI tie-up, right? So they've had this feature not very well integrated, I would say, right, where essentially Siri can just kick things over to ChatGPT.

Starting point is 00:17:25 But OpenAI's ChatGPT will still remain supported, but will no longer be the only external bot that Siri can call. Apple also is still planning a major Siri overhaul, which we've been hearing about for years, and will ship its own Siri chatbot built on the Google Gemini models, which we've talked about pretty extensively on the show, while extensions give users the option to direct requests to other chatbots instead of just Siri. So Bloomberg reports that Apple will announce these new features and the new updated Siri that can talk to other chatbots at their WWDC keynote this summer in June, with the feature arriving in iOS 27.

Starting point is 00:18:13 All right. We'll see if that actually happens, right? Apple, right? It's kind of funny because two years ago, right, Apple was like, oh, Apple Intelligence. You know, we are AI. It's going to be the best AI ever. And then they got sued because they didn't release anything that actually worked. There was nothing actually intelligent that Apple released.

Starting point is 00:18:38 And then they kind of, quote unquote, took a year off, you know, from their big WWDC. That's the Worldwide Developers Conference, right? That's their one big time of year where they come out with all their announcements. So, you know, two years ago, they're like, oh, yeah, we are Apple Intelligence. We are the smartest AI in the world. They got sued because they couldn't deliver. So then they, quote, unquote, took a year off from AI. They didn't really, you know, announce anything of substance at last year's WWDC.

Starting point is 00:19:03 And now apparently this year, they're back to being the Apple Intelligence. So hopefully they've learned their lesson and they won't lean into trying to redefine the actual AI category. Probably not a good idea, especially if all they're ultimately doing here is allowing you to use better and smarter AI. So it should actually be pretty telling how they approach this from a branding angle. But I do think, however, the markets, right, if you care about that, I think the markets will actually like this move from Apple because they're like, yeah, Apple, we understand you can't build AI.

Starting point is 00:19:38 So you should probably start integrating with as many third party AI chatbots as possible. All right. Our next piece of AI news, well, this one is the ongoing... Adobe just introduced an entirely new way to create, bringing the power and precision of its creative suite into one conversational experience. Meet Firefly AI Assistant, now live in the Adobe Firefly app, the all-in-one creative AI studio.

Starting point is 00:20:09 Powered by Adobe's Creative Agent, Firefly AI Assistant lets you start with your vision, just describe what you want, and shape the outcome as it takes form with the Assistant. The Assistant orchestrates multi-step workflows, drawing on 60-plus, pro-grade tools across Adobe Creative Cloud apps, including Photoshop, Illustrator, Premiere, Lightroom Express, and more to help bring your ideas to life. You can also get started with creative skills, a growing library of pre-built workflows for common creative tasks, like batch editing photos, creating mood boards, portrait retouching, and creating social variations. Every step the assistant takes is visible so you can refine, redirect, or take over at any time. You stay in

Starting point is 00:20:52 the driver's seat as the creative director. Adobe Firefly AI assistant now in public beta. See it today at firefly.adobe.com. ...drama, and it's not over, but we have a new chapter in the novella. So a federal judge has temporarily stopped the Pentagon and other federal agencies from enforcing directives that would have immediately halted the government's use of Anthropic's AI tools, keeping the company's widely used Claude system available while a broader legal fight plays out. So a U.S. district judge in California issued an order late last week, preventing the

Starting point is 00:21:38 enforcement of directives from President Trump and Defense Secretary Pete Hegseth that sought to bar Anthropic's tools from government use for now. All right. So essentially, the federal government labeled Anthropic a supply chain risk for a couple of reasons, which we'll get to here in a minute. But essentially a judge said, nope, doesn't make sense, right? So the order now lets Anthropic's AI continue to be used inside the government for now and by outside contractors working with the military while the lawsuit proceeds.

Starting point is 00:22:17 So the judge wrote that the government actions looked aimed at, quote unquote, crippling Anthropic and chilling public debate. And the judge characterized statements by officials as appearing to be classic First Amendment retaliation. So the dispute began after public criticism from President Trump and Hegseth, who labeled Anthropic a supply chain risk, the first public use of that designation against a U.S. company ever, and a label typically reserved for companies tied to adversarial

Starting point is 00:22:52 nations. So yeah, it was very strange that the federal government decided to label Anthropic a supply chain risk because that's literally never happened against a U.S. company. Then Anthropic in response sued the Department of Defense and other agencies earlier this month, saying the government's designation and public attacks harmed its business and violated its free speech rights. So the judge noted that officials' public comments attacking Anthropic were more on political grounds. For example, they called the company woke and Anthropic's employees, quote unquote, left wing nut jobs rather than pointing to specific security defects.

Starting point is 00:23:38 Right. So this drama is not yet over. Right. Anthropic essentially, you know, they had two little clauses that they wanted to have in their agreement with the government for reasons they said would ultimately give them more protection over how the military would not use its AI. As an example, they said they didn't want the government to use its Claude systems for fully autonomous weapons in war without human oversight. Right. So we'll see how this continues to go on.

Starting point is 00:24:09 But it is, again, one of those stories that is going to continue to drag on. But the latest one here, pretty big update, a judge essentially saying, no, government, you were wrong. You can't do this. And this was political. And there's really not a lot of merit. So yeah, it will continue to be litigated. All right. Our next piece of AI news, a little technical one here, but it actually had pretty big ramifications, both instantly and in the long run. So Google researchers have introduced a new methodology called TurboQuant, a two-stage vector compression method that reduces transformer

Starting point is 00:24:55 key value cache memory by about six times while preserving downstream accuracy on tested workloads. So the most newsworthy part here is that TurboQuant lets models quantize key value caches down to

Starting point is 00:25:11 three bits without retraining, which could sharply lower memory requirements for large language models. So, the system requires no model retraining, which, that's huge, making it potentially compatible with existing open models and easier to adopt in current production stacks. So here's in a simple way, right, this is technical.

Starting point is 00:25:36 I had to read this one a couple of times because, you know, as much as I talk about AI, I am not super technical on the pre-training side. So it's kind of like a super shredder, right, for an AI's memory. So when you chat with an AI, it has to store a lot of data to remember what you just said. And usually that takes up a ton of, well, expensive memory. And TurboQuant from Google, so essentially a new quantization technique, right? It just kind of squishes the data.

Starting point is 00:26:06 So it shrinks the memory to about one sixth of that size without losing any of the information. And by doing that, it obviously speeds things up because the data that it has to remember, right, about your conversations is small, so the AI can quote unquote read it up to eight times faster. And the impressive thing here is, well, this new technology or technique can apparently be applied to any model. So it doesn't have to be only new models that haven't been developed yet.

Starting point is 00:26:39 So there's no extra work. So you don't have to retrain the AI. So this can be applied to open models. So as an example, it can work instantly on models, maybe like Google's open source Gemma. So also interestingly enough, so this happened late last week and it instantly triggered a pretty sharp but probably temporary sell-off on the stock market of, you know, memory chip stocks. So pretty big deal here. We'll see what Google does with this technology, how it may be used across the spectrum.

Starting point is 00:27:13 But this could ultimately, right, bring way more powerful models to way smaller devices. Right. So as an example, as of recently, there's been this big, you know, OpenClaw and OpenClaw-esque kind of surge, right? And a lot of people are buying very expensive, you know, $10,000 Mac Studios or, you know, Nvidia DGXs to run the most powerful local models that they can so they don't have to pay for cloud inference, right? They don't have to rack up API bills with Anthropic, Google, or OpenAI. But the problem is, well, you have to have a huge, very expensive computer to run these models. Well, maybe you might be able to get a six times more powerful model on the same size or just have a much smaller computer or a not as powerful GPU to be able to bring all of these things. So pretty exciting news for the future of AI development where we just might have way better local models available on phones and computers in the future. If TurboQuant ends up being what it could be, which is a game changer for how large language models work.
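
To put rough numbers on the memory claim above, here is a minimal back-of-the-envelope sketch. It is not TurboQuant itself, only the storage arithmetic for a key value cache at different bit widths. The model shape (layers, heads, head size, token count) and the 16-bit baseline are assumptions chosen purely for illustration and do not come from the episode.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, tokens, bits_per_value):
    """Approximate KV cache size: one key and one value vector per layer, head, and token."""
    values = 2 * layers * kv_heads * head_dim * tokens  # 2 = keys + values
    return values * bits_per_value / 8


# Hypothetical model shape, chosen only for illustration.
shape = dict(layers=32, kv_heads=8, head_dim=128, tokens=32_000)

baseline = kv_cache_bytes(**shape, bits_per_value=16)  # assumed 16-bit baseline
three_bit = kv_cache_bytes(**shape, bits_per_value=3)  # the 3-bit figure from the story

print(f"16-bit cache: {baseline / 2**30:.2f} GiB")
print(f" 3-bit cache: {three_bit / 2**30:.2f} GiB")
print(f"ratio: {baseline / three_bit:.2f}x")
```

Under these assumptions the bit widths alone give a ratio of 16/3, a bit over five, which is in the same ballpark as the "about six times" figure quoted in the story. The exact saving depends on the baseline precision and any overhead the real method adds.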

Starting point is 00:28:22 All right. Here's, well, our pretty big, biggest story of the week, although we are going to be able to break some news here in about five minutes from one of the big companies. So a major leak at Anthropic has exposed nearly 3,000 internal files. And, well, probably 99.9% of them weren't very important, except there was a new unpublished draft about a new powerful model from Anthropic called Claude Mythos.

Starting point is 00:28:56 I think that's how it's pronounced, right? And details of an even larger tier named Capybara. So Anthropic did confirm the reporting to Fortune that an accidental data leak exposed nearly 3,000 assets uploaded to its content management system on its website, but marked as private, making them publicly accessible in a data lake. So the leak collection included unused marketing assets, PDFs, images, all that stuff. Also employee and corporate event information. But the big thing was a draft blog post about a new AI model labeled Claude Mythos.

Starting point is 00:29:34 So according to the leaked draft, Mythos is by far the most powerful AI model that Anthropic has ever developed. And the company calls its performance a step change. The model is in trials right now with selected early access customers. So the documents revealed a planned new top tier as well, right now codenamed Capybara, I think that's how it's pronounced, which would sit above Anthropic's current highest tier, which is Opus, making Capybara the company's latest and most capable offering. So Anthropic's leaked materials warn that Claude Mythos and Capybara could significantly increase cybersecurity risks, saying the models are currently far ahead of any other AI model in cyber capabilities.

Starting point is 00:30:27 The leaked draft states Anthropic intends to study and share findings about near-term cyber risks so defenders can prepare, and it is giving early access to organizations to give them time to harden their code bases against potential AI driven exploits. But I think the potentially bigger news here might ultimately be this new model's access because reports point to a very limited early access rollout with select customers only. And those leaked materials described access expanding gradually through the Claude API rather than broad availability in a standard paid plan.

Starting point is 00:31:11 So yeah, this might be the first, well, major model we've seen from any company that might not be available to regularly paying business users, right? So if you have a, you know, Claude, you know, paid plan where you're paying monthly and you expect, oh, well, I'm going to have always the, you know, most powerful models from Anthropic, well, maybe not or maybe not right away, which I think is actually a pretty big pivot overall with how these companies work, right? We've seen reports that OpenAI and Anthropic are likely both going public this year. And this could be a new kind of tactic to bring in more revenue, maybe right before an IPO.

Starting point is 00:31:57 We'll see. But again, reports are saying that this might, even if you're a $200, you know, like myself, you know, paying $200 a month for Claude Max, you might not get access to the new Claude Mythos whenever it's released. All right. Speaking of release, well, we're safe now. The embargo has lifted and we can talk about new updates from Microsoft. So yes, if you are listening here on the live stream, you are the first in the world to hear about this. So Microsoft is now just introducing that Copilot Cowork is available through its Frontier program.

Starting point is 00:32:36 And there's some new, which I think are really, really good, updates that they're rolling out to Researcher. All right. But first, Copilot Cowork. All right. So we know that Microsoft Copilot Cowork isn't new. They announced this a couple of weeks ago, but this is now pretty big news because they're making it available via the Frontier program. So that's when a lot of companies are now going to get access to it. Also, they're introducing a Microsoft 365 Copilot feature designed for long running multi-step work instead of just the one off chat prompts. So that's with, you know, Cowork, you know, this is a kind of partnership with Anthropic. It's not really a white-labeled version of Claude Co-work, but it kind of is, right?

Starting point is 00:33:23 They're leveraging that technology. And obviously, Microsoft is a big investor in Anthropic. So initially, this was rolling out to a very small beta group. So now it's going to be rolling out to a pretty big group in the Frontier program. So the company said Copilot Cowork lets users describe the outcomes they want, then creates a plan, works across files and tools, and then shows visible progress. Right. So you kind of, the screenshot here I have for our live stream audience, you can kind of see it work through its entire plan. Very similar to if you've used, uh, Anthropic's Claude Co-work, right, but Microsoft style.

Starting point is 00:34:00 Uh, so Microsoft said Copilot Cowork includes skills from both Claude and Microsoft, right? So it's not just, uh, an exact, uh, you know, duplicate of Claude Co-work because it does include obviously a lot of specific and Microsoft exclusive capabilities, including calendar management and daily briefing. And the company said it can handle both one-time tasks and repeatable workflows, such as monthly budget review. All right, but here's actually some of the announcements that I am maybe more personally excited about.

Starting point is 00:34:34 So Microsoft has announced a new and improved Researcher agent. Here's the big part, though: it's built on multi-model intelligence. So it's aimed at more complex knowledge work by synthesizing information across sources and producing cited reasoning and analysis that users can act on. So one of the most important additions is the new Critique feature, where OpenAI's GPT model will draft a response and then Anthropic's Claude model reviews it for accuracy, completeness, and citation integrity before delivery, showing that Microsoft is now productizing

Starting point is 00:35:15 a multi-model review workflow rather than relying on a single model's first pass. Y'all, I can't believe it is essentially now quarter two of 2026 and we're now just getting this for the first time from one of the big players. Granted, it's really only Microsoft or Google that could do this, right? Essentially bringing a new model in from a different company, right? So that's completely different training. It works in a completely new way. And essentially having those two models work well with each other, but technically against

Starting point is 00:35:52 each other, right? So having OpenAI's GPT create something and then Anthropic's Claude essentially tear it apart, right? For anyone that's a power user of AI, if you're using it for high-value work, this is what you've been doing manually, right? For me, this is probably what I spend 70% of my time doing, right? I think there are some great third-party solutions, such as Perplexity's Model Council, but it's honestly very expensive, right?

Starting point is 00:36:21 It eats up a bunch of credits. There are some other, you know, third-party, less popular tools that kind of do this. But it's actually a pretty big deal that Microsoft is the first of the big four, right? So that's Microsoft, OpenAI, Anthropic, and Google. They're the first one to provide this. And like I said, anyone else can't really do this. I think technically Google could, right?

Starting point is 00:36:43 Because Google is also a big investor in Anthropic, but I think they're probably competing a little bit too closely on the model side as well. Because right now, Microsoft is, even though they are developing kind of their next generation of models, right, in their Microsoft AI, you know, Mustafa Suleyman now working on that side. Right. They don't really have today, at least, you know, frontier-level models.

Starting point is 00:37:10 So this is pretty big, both expanding Copilot Cowork to the Frontier program, which is a lot of enterprise organizations here in the U.S., but then also with these new updates in the Model Council, which is huge, right? So that's a new feature. So you have the Critique feature in Researcher, but then you also have the Model Council feature, which lets users compare responses from different models side by side so they can see where answers agree, where they diverge, and what each model contributes. So like I said, that's something that's been kind of available inside Perplexity,

Starting point is 00:37:45 but pretty cool to see this offered now from one of the big four. All right, that's it for our big stories of the week. But let's roll into our what's new and what's next. So this is the kind of bullet point roundup of all the other big stories. In other weeks, some of these might have been some of the biggest stories of the week, but there's a lot going on this week with all the new leaks, the big models, releases from Microsoft, a ton going on. So let's go over this, what's new and what's next.

Starting point is 00:38:14 So Anthropic released a new economic index, which shows a widening AI fluency gap. Google released their updated Lyria 3 Pro. You can create audio tracks up to three minutes. OpenAI reportedly closed their funding round that's reaching $120 billion. All right. Meta introduced SAM 3.1.

Starting point is 00:38:37 So that's their Segment Anything Model. So that allows you to segment any object in images or videos from simple prompts like clicks or boxes. SoftBank reportedly secured a $40 billion loan tied to their OpenAI investment. A report said that ChatGPT's new ad pilot passed $100 million in annualized revenue. So we saw some reports that even though maybe it wasn't as successful as they had hoped, well, they've already brought in $100 million in annualized revenue. My hot take is they're going to like 10x that by the end of the year.

Starting point is 00:39:10 It's going to be a cash machine. The White House announced members of the President's Council of Advisors on Science and Technology, including Mark Zuckerberg, Jensen Huang, and others. Google released Gemini 3.1 Flash Live and Search Live. We went over that on our Friday show, going over our new Friday features, our weekly show. OpenAI upgraded their ChatGPT shopping and product discovery, really emphasizing shopping less and product discovery more. Google released an updated version of Google Translate.

Starting point is 00:39:41 It's actually really good. I've been using it. We went over that Friday as well. Arm announced that it will start making its own chips. OpenAI revamped ChatGPT shopping with Agentic Commerce Protocol in the Walmart app. Luma AI launched, kind of out of nowhere, a pretty impressive new multimodal image model called Uni1, fairly impressive so far. Drug maker Eli Lilly agreed on a $2.7 billion deal with Insilico for AI-developed drugs.

Starting point is 00:40:15 Sam Altman reportedly has now shifted his own priorities and will no longer directly oversee security or safety at OpenAI. ChatGPT rolled out its library feature for central file management. Anthropic is reportedly working on computer use for mobile after their very viral computer use for macOS. In Gemini Business, Google is testing skills. Meta conducted another round of layoffs, this time around 700 employees amid their AI shift. OpenAI reportedly shelved their adult mode indefinitely alongside cutting and killing off Sora as a dedicated app for now. Meta rolled out new AI shopping experiences on Facebook and Instagram.

Starting point is 00:41:00 OpenAI officially relaunched their nonprofit arm and named leaders. And they plan to spend at least a billion dollars in doing so. Manus now offers computer control from their iPhone app. I haven't tried that one yet. I do have a Manus subscription, so I'll have to give that one a try. There's a new benchmark in town, ARC-AGI-3, since ARC-AGI-1 and 2 have essentially been saturated, but we're still

Starting point is 00:41:24 saying we haven't achieved AGI. Well, now there's ARC-AGI-3, a new benchmark that has launched with $2 million plus in prizes. Apple will reportedly, like I said, oh, we already covered that one. They are opening Siri to rival AI assistants. Google Gemini is launching chat history and memory import tools to make it easier to start with Gemini, to transfer over.

Starting point is 00:41:48 ChatGPT is getting unified Google Drive integration for business and enterprise. Suno released version 5.5. A lot of cool features in there, allowing you to also kind of use your own voice, which is cool. And then last but not least, Agile Robotics and Google DeepMind announced a strategic research partnership. Y'all, that was a ton happening this week in AI. If you missed anything, well, we just covered it. So you don't have to spend hours each and every week saying, oh, what's happening in AI? What's worth paying attention to? What's not? What should I be using in my company? That's what we do on our Monday show. So I hope this was

Starting point is 00:42:26 helpful. If so, please subscribe to the show. Leave us a rating. I'd really appreciate that. Takes like 30 seconds, whether you're listening on Apple Podcasts or Spotify. And then if you haven't already, make sure you go to YourEverydayAI.com. Sign up for the free daily newsletter where we recap not just each day's podcast, but everything else you need to know to be the smartest person in AI at your company. So thank you for tuning in. We hope to see you back tomorrow and every day for more Everyday AI.

Starting point is 00:42:50 Thanks, y'all. Meet Firefly AI Assistant. Now live in Adobe Firefly, the All In One Creative AI Studio. Just describe what you want to create in your own words, and the assistant handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud apps, including Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome while the assistant accelerates execution.

Starting point is 00:43:19 Stay in control with the ability to step in and refine at any time. See it today at firefly.adobe.com. And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
