Everyday AI Podcast – An AI and ChatGPT Podcast - OpenAI releases GPT-5.2, Trump stops states from regulating AI, big AI rivals team up and more

Episode Date: December 15, 2025

Weird week in AI. 🤪 OpenAI partners with..... Disney? Goofy. Meta might go ..... closed source? Avocado. And Adobe's bringing Photoshop to..... ChatGPT? Print that. Tons of movement. We've got you covered with this week's AI News That Matters. OpenAI releases GPT-5.2 and teams up with Disney, Trump signs order to stop states from regulating AI, big AI rivals team up and more.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Thoughts on this? Join the convo and connect with other AI leaders on LinkedIn.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
Disney Invests $1B in OpenAI Sora Partnership
Disney Grants OpenAI Access to Copyrighted Characters
Trump Executive Order Blocks State AI Laws
AI Regulation: Federal vs. State Controversy
Meta Shifts Llama Model Open Source Strategy
OpenAI CEO Teases Upcoming ChatGPT Features
Google Releases Gemini Deep Research Agent
Adobe Photoshop Integrates with ChatGPT Apps
Agentic AI Foundation Unites Top AI Rivals
OpenAI Launches GPT-5.2 Model Benchmarks
GPT-5.2 Performance: Spreadsheets & Presentations
Long-Context AI Model Retrieval Improvements
Anthropic Claude Agents and Accenture Partnership
OpenAI, Google, Anthropic Update Agent Frameworks

Timestamps:
00:00 "Everyday AI: News & Insights"
04:08 Disney's AI Integration & Legal Moves
07:15 AI Regulation: Federal vs State
10:30 "Meta Tightens Grip on AI"
14:57 "Google Dominates AI Year-End Race"
20:39 "Photoshop Meets ChatGPT Integration"
23:50 "Agentic AI Collaboration in Tech"
25:07 "Agentic AI & OpenAI GPT-5.2"
30:48 "Four Needles Benchmark Analysis"
32:07 "Improved Memory in AI Models"
35:23 "AI Tools & Industry Updates"
38:56 "Everyday AI: Subscribe & Connect"

Keywords: GPT-5.2, OpenAI, generative AI, AI model, AI regulation, Trump executive order, federal AI law, state AI laws, Disney AI partnership, Sora video generator, Disney OpenAI investment, Star Wars AI content, Marvel AI content, copyright infringement

Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)

Start Here ▶️ Not sure where to start when it comes to AI? Start with our Start Here Series. You can listen to the first drop -- Episode 691 -- or get free access to our Inner Circle community and all episodes: StartHereSeries.com Also, here's a link to the entire series on a Spotify playlist.

Transcript
Starting point is 00:00:00 This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Meet Firefly AI Assistant, now live in Adobe Firefly, the All In One Creative AI Studio. Just describe what you want to create and the assistant handles the rest, orchestrating multi-step workflows across Photoshop, Premiere Express, and more in one conversational interface. You direct the outcome. The assistant accelerates execution. What a weird week in AI.
Starting point is 00:00:48 I mean, we got arguably one of the most powerful AI models in the world, dropped in a tweet. Meta, the open source AI company, might be going closed source. President Donald Trump is trying to take away the states' powers when it comes to AI. Disney is going all in on AI. And the biggest AI competitors are all teaming up for an AI project. It's like backwards week in AI. And it might leave you scratching your head with a couple of questions. Well, we're going to be giving you a lot of answers today on Everyday AI. What's
Starting point is 00:01:25 going on, y'all? Welcome. My name is Jordan Wilson and this is Everyday AI. This is for you. If you've ever felt confused by the onslaught of AI news, well, Everyday AI is for you, but specifically our Monday segment, AI News That Matters. So if you're new here, Everyday AI is a daily live podcast and free daily newsletter helping everyday business leaders like you and me make sense of all this AI and how we can leverage the practical and actual stuff to grow our companies and our careers. So if that's what you're trying to do, awesome. It starts here with the unedited, unscripted live stream podcast, but to take it to the next level, make sure you go to our website at youreverydayai.com. In our free daily newsletter, which you've got to go grab on our website,
Starting point is 00:02:06 we're going to be recapping the highlights from today's episode as well as everything else happening in the AI world today. And just a little announcement: make sure you join us tomorrow and Wednesday. So that is December 16th and December 17th for our 2025 AI roadmap rewind. So back in January, we made 25 kind of crazy AI predictions. And here we are. The test is due. Did we fail?
Starting point is 00:02:38 Did we lead you astray? Right? Because if you go back and look at it, literally, go back and look in the comments. People are like, this is crazy. This isn't going to happen, right? I think some of it kind of did. But you got to make sure to tune in. We did it originally.
Starting point is 00:02:53 It was like five really quick shows. We're not going to do five recap shows. We're just going to jam it all into two shows. So make sure to join us for that. All right. Let's get into the AI News That Matters for the week of December 15th. And first one, yeah, this one's a big head scratcher. Disney and OpenAI shaking hands. So the Walt Disney Company announced a one billion dollar investment in OpenAI, making the entertainment giant one of the most prominent legacy media companies to formally partner with a leading AI firm. So this deal grants OpenAI access to Disney's copyrighted characters from franchises, including Star Wars, Marvel, and other properties for use in
Starting point is 00:03:40 its AI-powered short-form video generator Sora. Disney plans to allow more than 200 animated characters, along with costumes, props, and accessories to appear in Sora-generated videos that can run up to a minute long. And Disney is going to reportedly be playing these on its platforms. Weird. Something I predicted back in January, but no one believed. Not specifically this, but that you'd see AI
Starting point is 00:04:10 generations on the big screen and it would look real. Disney CEO Bob Iger said the agreement reflects a push to extend Disney's storytelling through generative AI while also protecting creators' rights and safeguarding original works. As part of the deal, Disney will become a major OpenAI customer and plans to integrate AI tools across its business operations, including making ChatGPT available to employees. So Sora and ChatGPT
Starting point is 00:04:40 are expected to begin offering licensed Disney character content early next year, marking one of the first large-scale uses of licensed Hollywood IP in generative video. This one's important because on the same day, Disney also issued a cease and desist letter to Google accusing the tech giant of massive copyright infringement through its Gen AI models. You know, I said this many years ago. Like for the most part, yeah, these big media companies, small media companies, everyone, you're going to have a couple of choices when it comes to generative AI.
Starting point is 00:05:18 You either partner, you sue, or you lose money and maybe go out of business. So Disney chose one of each here. They partnered with OpenAI. They invested a billion dollars for an equity stake in the company and also granted some exclusive access to OpenAI. Yet on the same day, they sued Google and said, hey, you're... sorry, they didn't sue Google. They sent a cease and desist letter.
Starting point is 00:05:44 They sued Midjourney, correction on that, a couple of months ago. But, you know, ultimately, this is what's going to happen in the long run, especially with copyright and IP. That's much easier to decipher, right? The written word, a little hard, right? If you see a Marvel character or Mickey Mouse, a little easier to say, maybe copyright has been infringed here. All right.
Starting point is 00:06:09 Next piece of AI news. A little about-face from the Trump administration, at least when it comes to their normal policy, but not at all when it comes to AI. So President Donald Trump signed an executive order aimed at blocking U.S. states from enforcing their own AI regulations, instead calling for one central source of approval at the federal level. So the United States currently has no comprehensive federal law governing AI, even as there's more than a thousand AI-related bills that have been introduced across state legislatures. So, yeah, essentially, U.S. President Trump saying, yeah, we should govern this at the federal level, but there's no official laws or regulations at the federal level, at least congressionally.
Starting point is 00:06:59 So White House AI advisor David Sacks said the order will give the administration authority to push back against what it sees as the most onerous state rules, while still allowing regulations related to children's safety. So tech companies have lobbied for nationwide rules instead of state rules because they said a patchwork of state laws could slow innovation and weaken the U.S.'s position against China as firms invest billions of dollars in AI development. And right now, I mean, state laws do vary widely. And critics say the executive order undermines state
Starting point is 00:07:45 authority to protect residents, with advocacy groups arguing that state-level rules fill critical gaps in the absence of meaningful federal guardrails. So California Governor Gavin Newsom accused President Trump of siding with tech allies, saying the order seeks to preempt laws designed to protect Americans from unregulated AI systems. Yeah, this one, I mean, it's really like three or four states that you would actually need to worry about how they respond to this. One would be, well, California, because that's where a lot of the big AI companies
Starting point is 00:08:14 like Anthropic, OpenAI, and Google are headquartered. You also might want to keep an eye on Washington, where Microsoft is headquartered, as well as maybe just New York as well. I'm guessing there's going to be a lot of pushback on this. It's probably, if you're looking at it just from a U.S. versus China innovation, it's probably technically the right move just from that angle, but from the safety consumer protection,
Starting point is 00:08:44 you're not going to have a lot, right? Essentially companies, for the most part, they're going to write, like if something happens with AI, there's not going to be a federal, congressionally passed law to hold this up to, right? A lot of people don't talk about this or don't know. I used to be a journalist back in the day, right?
Starting point is 00:09:04 So much of what is governing social media, right now from a federal legislation standpoint was things passed in the 90s, early internet, early internet era legislation, right? To get actual, uh, congressionally passed laws across in the U.S. takes a very long time, especially anything meaningful. Uh, so yeah, uh, there's probably going to be some, some good AI regulation that's passed with other larger bills,
Starting point is 00:09:31 but anything meaningful is probably going to take like a decade or longer. So this is just a signal for U.S. innovation. It is a signal for trying to keep up with China. In terms of consumer protections, probably not a good thing. But presumably this will be heading up to higher courts, as I would expect some of the bigger states that I mentioned to object to this executive order. Next piece of AI news, the company that's been known as the open source AI company is now maybe going to be going closed source. So Meta is reportedly making a dramatic shift in its AI strategy, moving away from open source AI and potentially going to a proprietary approach
Starting point is 00:10:18 according to reporting from CNBC. So Meta is reportedly preparing to launch a major new AI model codenamed Avocado in early 2026. And unlike previous Llama models, the Avocado variant is expected to be proprietary, meaning external developers will not have free access to the core technology. Yeah. So right now, if you have a powerful enough computer, you can go download, you know, Llama 4 and you can fine-tune it and fork it and kind of do what you want with it. But that may not be the case anymore for Meta's future models. So the change in direction follows the kind of lackluster reception of Llama 4 when it was released in April and concerns inside Meta, reportedly about rivals using their open source architecture without restrictions. So CEO Mark Zuckerberg has also been spending billions of dollars in the same
Starting point is 00:11:18 way like I drink ounces of water, just like haphazardly, right? Spent countless billions of dollars to recruit and retain leading AI talent, including the acquihire of Scale AI and its CEO, now Meta's AI leader, Alexandr Wang, to help Meta catch up with OpenAI, Google, and Anthropic. Meta increased its 2025 capital spending forecast to as much as $72 billion on AI, highlighting the scale of its AI ambitions. So we'll see if this actually plays out. Is Meta going to become a closed source proprietary AI company? I think, regardless, we found out that their approach maybe didn't work, right? If you rewind back to around February, March of this year, before Meta released Llama 4,
Starting point is 00:12:18 there was a lot of people, not myself, who said, oh, yes, this next version of Llama is going to be state of the art. It is going to be the best model. Open source is going to take over proprietary. I don't think so. Although, you know, you do have to see
Starting point is 00:12:39 their impact overall on open source. Right. If Meta had not been pushing open source large language models for the last few years, I don't think China would have made open source what it is today. Because yes, you do have open source in some fields competing really closely with proprietary models. And maybe that did even force OpenAI's hand at producing a very impressive gpt-oss open source model. Google has been pushing their Gemma 3 models, very, very good. So, you know, even if Meta's strategy of open source models didn't work in the long run,
Starting point is 00:13:16 or maybe they'll still continue it, I don't know. It has pushed the space forward, which has helped innovation in general. And it has ultimately caused consumers to win. So even if we're not going to get any more Meta Llama open source, I hope they still keep the Llama name, though, nothing against avocados. But yeah, maybe they'll continue to do the open source and have a proprietary. We'll see and we'll obviously continue to cover. All right.
Starting point is 00:13:43 Next. One little tweet that people haven't been talking about, but if you're a ChatGPT user, like 900 million of you out there, and you like new things, which is a lot of people, this tweet didn't really get a lot of notice. So right after the release of its new GPT-5.2 model, which we'll be talking about here in a couple of minutes, OpenAI CEO Sam Altman took to Twitter and teased that OpenAI is not done. So in a tweet late Thursday afternoon, Altman said that OpenAI had a, quote, few little Christmas
Starting point is 00:14:22 presents for you next week. So if you're confused what that might mean, well, it might mean that Shipmas is upon us. So last year, during what they called the 12 days of Shipmas, or shipping features, if you don't speak geeky AI talk, OpenAI released a lot of new features. So they released finally Sora after teasing it months earlier. They introduced and released their o1 reasoning model, the ChatGPT Pro tier, projects, upgraded Advanced Voice Mode, and more. However, Google kind of grinched them.
Starting point is 00:15:00 They stole the show. Although OpenAI opted for some heavy marketing and daily live streams and really leaned into it, Google just quietly woke up and chose Shipmas violence, because I think most close observers noted that Google probably won the late year AI shipping season because they rolled out their family of Gemini 2.0 models, their Veo 2 video upgrade, their Imagen 3, they upgraded NotebookLM and they announced and released deep research, which set off the entire category. So that's led many to wonder, are we going to get the same end of the year flurry this year? And what could be left from OpenAI?
Starting point is 00:15:48 So we do know that probably Google is going to be releasing, you know, maybe Gemini 3 Flash and a smaller version of their Gemini Nano Banana. But what does OpenAI have left? Well, we'll have to see and we'll obviously be covering it in this week's newsletter. We do know that they've been testing a new image model and some different versions of coding models that they haven't released yet. But not a lot is known outside of that. So could be interesting to see what they release. Speaking of releases and competition, well, Google actually updated and released their deep research and made it an agent on the same
Starting point is 00:16:34 day that OpenAI released GPT-5.2. So Google has released a reimagined and improved version of its Gemini deep research agent, a major update powered by its newest foundation model, Gemini 3 Pro, marking a shift from research reports to embeddable AI research tools for developers. Well, you might be saying like, why does this matter? Well, Google is opening up its deep research, which is extremely good, to third party apps now through this new Interactions API and the Gemini deep research agent. So what this means, if you've ever used Gemini's deep research or OpenAI's deep research, and if you haven't, by the way, you're just burning money, right?
Starting point is 00:17:25 In the nature of wasted time. If your company is not using OpenAI's deep research or Google Gemini's deep research, seriously, start using it and stop not using it. Anyways, what this means by Google releasing this as an agent and opening this up is I think we're going to get a lot of agentic research tools for specific domains, right? You know, it's actually not a saturated market, right? I thought that we'd see dozens of domain-specific deep research tools, but maybe with this new update we finally will. So essentially, Google's very impressive deep research agentic tech is now open for the masses.
Starting point is 00:18:08 So, you know, let's say you work in, I don't know, construction management or something, right? There's probably going to be a construction management deep research agent that is going to be released and likely built on Google's new tech. So Google says the updated deep research agent can handle massive amounts of context and synthesize large volumes of information and is already being used for high-stakes tasks like financial due diligence and drug toxicity safety research. The company, this part's important, also plans to integrate deep research, the new version, directly into Google Search, which is interesting, Google Finance, the Gemini app, so updating it there, and also NotebookLM. Adobe just introduced an entirely new way to create, bringing the power and precision of its creative suite into one conversational experience. Meet Firefly AI Assistant,
Starting point is 00:19:09 live in the Adobe Firefly app, the all-in-one creative AI studio. Powered by Adobe's creative agent, Firefly AI assistant lets you start with your vision, just describe what you want, and shape the outcome as it takes form with the assistant. The assistant orchestrates multi-step workflows, drawing on 60 plus pro-grade tools across Adobe Creative Cloud apps, including Photoshop, Illustrator, Premiere, Lightroom Express, and more to help bring your ideas to life. You can also get started with creative skills, a growing library of pre-built workflows for common creative tasks, like batch editing photos, creating mood boards, portrait retouching, and creating social variations. Every step the assistant takes is visible so you can refine, redirect, or take over at any time.
Starting point is 00:19:57 You stay in the driver's seat as the creative director. Adobe Firefly AI assistant now in public beta. See it today at firefly.adobe.com. All right. This one's awesome. If I'm being honest, Photoshop might be the app that I've used for the longest. I feel I've been using Photoshop for, let me do the math here, 25 years at least. And now they've teamed up with ChatGPT.
Starting point is 00:20:31 So Adobe and OpenAI have joined forces to make Photoshop, Acrobat, and Adobe Express available within ChatGPT. So this integration means now like 900 million weekly active ChatGPT users can now edit photos, design invitations, and even manage PDFs for free directly inside of the ChatGPT interface. So users can connect Adobe apps by going into their settings, going to the apps and connectors inside ChatGPT, then selecting and linking their preferred Adobe tool. So once connected, users can simply upload their images or documents, select the Adobe app and just describe in natural language the edits that they want. Right.
Starting point is 00:21:17 And this is really, really cool because it also gives you a slimmed-down version of the editor if you want. If I'm being honest, I love Photoshop. But maybe I need like a new computer because sometimes when I open Photoshop, it's like some of these like resource-heavy desktop applications, they just crush my computer. Right. So now being able to bring in kind of the best of Photoshop into ChatGPT is going to be really impressive in what creatives can do.
Starting point is 00:21:51 And this goes back to what I've been saying for multiple years about how large language models, front end large language models are AI operating systems, right? And this just goes to that point, right? Being able to work with the tools, the software that you use every day, right? When ChatGPT launched their apps, I think they launched with seven. Now they're up to more than 30, including these three from Adobe. And I do assume, even though the GPT store never really took off, I do assume that this will, just because of the way that it handles and shares data and it keeps that context active within the conversation.
Starting point is 00:22:34 So not even just being able to, you know, naturally edit with natural language with the, you know, the Adobe Photoshop app. But obviously then you can keep the conversation going and chat with a different app and you can still have that going. So imagine, I don't know, let's just say there's, you know, five different creative tools. In the future, you'll probably be able to work with all of those and keep the context going. And projects, creative projects that used to take, you know, multiple days, you might be able to get done in, I don't know, a couple of minutes. just by being able to direct it all with natural language. All right, our next piece of AI news. Some big AI rivals coming together.
Starting point is 00:23:16 So the Linux Foundation has launched the Agentic AI Foundation or the AAIF, bringing together nearly every major tech company and AI company, and setting open standards for autonomous AI agents. So this means that for the first time, direct competitors like OpenAI, Anthropic, Google, and Microsoft are working side by side to create interchangeable protocols for AI agent development. So the foundation centers on three main open source projects for now: Anthropic's MCP, or their Model Context Protocol, Block's Goose framework, and OpenAI's AGENTS.md standard. So Anthropic's MCP, probably one of the most well-known ones, was introduced last November and is now the core standard for connecting AI models with external tools, data, and applications. Block's Goose was released in early 2025, and it gives developers a structured MCP-integrated framework for building agent workflows, combining language models and extensible tools.
Starting point is 00:24:27 And then you have OpenAI's AGENTS.md, which is a Markdown-based instruction format for coding agents, and that's been adopted by more than 60,000 open source projects and is also supported by major frameworks like GitHub Copilot and Gemini CLI. So the AAIF, again, that is the Agentic AI Foundation. The member list, it's like a who's who of tech and enterprise. So aside from the companies already mentioned, OpenAI, Anthropic, Google, Microsoft, and Block, you also have Amazon Web Services, Bloomberg, Cloudflare, IBM, Oracle, Salesforce, SAP, Snowflake, Hugging Face, Uber, and a lot more.
Starting point is 00:25:11 So, yeah, essentially, everyone has decided, hey, and this goes to, you know, more the geopolitical side, I believe, right? Most of these companies based here in the U.S. and saying, well, if you want to standardize AI adoption in general and if the U.S. wants to keep pace with China, you need these different companies' frameworks to be able to talk to each other, right? And you need enterprises to be able to come in and interchangeably be able to move around these systems and to have agents from different platforms, be able to talk to each other and understand each other. So in the same way that the Linux Foundation has set up
Starting point is 00:25:49 other kind of open standards across the web, this is another big step forward. And I think a much needed one, right? Because there's so many agentic frameworks now. So I think it's good to have an overarching umbrella foundation in the Agentic AI Foundation to kind of hopefully help create a little bit more cohesion among competitors. All right. In our last big AI news story, OpenAI has released GPT-5.2. And it's pretty good. So OpenAI's new GPT-5.2 model delivers significant performance gains in writing, coding, reasoning, and more. So the launch, though, may be more noteworthy than the actual benchmarks and specs of the model. So the launch follows the internal Code Red, reportedly declared by CEO Sam Altman a few weeks ago,
Starting point is 00:26:49 which was a company-wide push to prioritize improving OpenAI's models in response to intense competition, specifically from Google and Anthropic. Mainly Google's Gemini 3. So essentially, Google has been gaining on OpenAI like crazy. So Google is now up to 650 million monthly active users compared to OpenAI's 800 million weekly active users. I did see that number get reported at 900 million. So we'll have to see when and if
Starting point is 00:27:25 OpenAI confirms that. But essentially, over the last few quarters, Google is catching up to OpenAI in terms of just number of users. So it seems like, according to reports at least, that CEO Sam Altman said that they needed to focus just on model performance, that maybe they were getting a little too diluted, focusing on things like shopping experiences and potential ads and agents and browsers and all these other things. So reportedly, OpenAI had a little bit of a code red.
Starting point is 00:27:55 You know, their last two models were not overwhelmingly positively received in GPT-5 and GPT-5.1. So with GPT-5.2, at least the sentiment so far seems good, right? The vibes overall seem good. The benchmarks are great. So GPT-5.2 is available in three versions. You have your Instant, designed for fast information retrieval, Thinking, which is optimized for coding, math, and planning,
Starting point is 00:28:29 and Pro, which is the most powerful tier for complex tasks. So here's one that I thought is really telling. Some benchmarks, I'm just going to actually focus on two. They're kind of specific. They're a little dorky, but I think we really got to talk about them. So OpenAI says that GPT-5.2 Thinking outperformed human professionals in over 70% of tasks on the GDPval benchmark and also completed them 11 times faster. So if you don't know what the GDPval benchmark is, it's actually pretty important. It is an OpenAI-created benchmark that measures model performance on economically valuable
Starting point is 00:29:09 real world tasks across 44 occupations. So essentially the work that most of us do across 44 occupations. And here's why I think it's important to talk about. You might be saying, okay, well, okay, OpenAI's own benchmark. Well, yeah, of course, they're going to have the best, you know, score on it now. But OpenAI actually released GDPval back in September. And guess what? When they announced it, they were not the top model. Anthropic actually was, right? Which I think said something about OpenAI, you know, doing good research, right? And putting
Starting point is 00:29:45 something out, it's kind of gutsy to put out a benchmark that says this is showing how well models can do economically valuable work and then to have one of your biggest competitors actually get the best score. But now GPT-5.2 Thinking wipes away the competition on this specific benchmark, and I think that's important. And I will say anecdotally, it seems it's really based on two things. GPT-5.1 was not good at creating spreadsheets, right? A lot of errors. And it was also not good at creating PowerPoints. They just weren't very good. They weren't effective. They weren't great at storytelling. And I think that's two things, in my experience so far, GPT-5.2 is actually really, really good at, right? We did a show a couple months ago on like, hey, this is like the secret cheat code of Claude. It's the best front end large language model at creating presentations and spreadsheets,
Starting point is 00:30:43 which is what so many of us do, right? That is economically valuable work. That is consultants in a nutshell, right? This is what consultants do, right? They go find data, you know, they find these trend lines, connect data, and then put it in spreadsheets and put it in PowerPoints. And Claude's been really good at that, but they've maybe been the only front end large language model that's truly excelled at that. Right. Google's good in concept, but not good at creating the actual files. And GPT-5.1, not really good at it.
Starting point is 00:31:14 GPT-5.2, really, really good, impressive. All right. One other quick benchmark to talk about with GPT-5.2. This one, I'm only going to say the acronym once because it's pretty long. It is the MRCR v2. But let's just call it the four needles benchmark. So this essentially measures how well models retrieve specific needles or kind of like hidden information.
Starting point is 00:31:39 It's kind of like a needle in a haystack, right? So it measures how well models can find these four needles across a long context window. So if you look at
Starting point is 00:31:53 GPT-5.1, right? So when you start at the lowest context, so it measures it across 8K tokens to 256K tokens. So GPT-5.1, well, you know, 100%, right? About 100% in the beginning. But as it went on, as the context window grew, at 256K it only got 40% right. That's a huge drop off. Right. And this is something that even people that are, you know, quote unquote, good at using large language models, they don't keep context
Starting point is 00:32:23 window in mind, right? I do. I watch it like a hawk. I have token counters. I'm always doing internal needle in the haystack. I created a needle in the haystack, you know, kind of internal benchmark three and a half years ago before there was an official benchmark. It's extremely important because these models, it's like they're like any of us on a Friday.
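As a rough illustration of the needle-in-a-haystack idea described above, here is a minimal Python sketch. It is not the MRCR v2 benchmark and not the host's internal test. Every name in it (`build_haystack`, `score_retrieval`, `truncating_model`) is hypothetical, and the "model" is a toy stand-in that only sees the last few words of its context, assumed here purely to mimic long-context forgetting. In practice you would swap that stand-in for a call to a real model.

```python
import random
from typing import Callable, Dict

FILLER = "The quick brown fox jumps over the lazy dog."


def build_haystack(needles: Dict[str, str], total_sentences: int, seed: int = 0) -> str:
    """Scatter one 'needle' sentence per key through repeated filler text."""
    rng = random.Random(seed)
    sentences = [FILLER] * total_sentences
    # Distinct positions, so one needle never overwrites another.
    positions = rng.sample(range(total_sentences), len(needles))
    for pos, (key, value) in zip(positions, needles.items()):
        sentences[pos] = f"The secret code for {key} is {value}."
    return " ".join(sentences)


def score_retrieval(model: Callable[[str, str], str], context: str,
                    needles: Dict[str, str]) -> float:
    """Return the fraction of needles whose value appears in the model's answer."""
    hits = 0
    for key, value in needles.items():
        answer = model(context, f"What is the secret code for {key}?")
        if value in answer:
            hits += 1
    return hits / len(needles)


def truncating_model(window_words: int) -> Callable[[str, str], str]:
    """Toy stand-in for an LLM: it only 'sees' the last window_words words."""
    def model(context: str, question: str) -> str:
        visible = " ".join(context.split()[-window_words:])
        key = question.removeprefix("What is the secret code for ").rstrip("?")
        marker = f"The secret code for {key} is "
        if marker in visible:
            # Return the text between the marker and the next period.
            return visible.split(marker, 1)[1].split(".", 1)[0]
        return "I don't know."
    return model


if __name__ == "__main__":
    needles = {"alpha": "4821", "bravo": "7350", "charlie": "1296", "delta": "6043"}
    for size in (50, 500, 5000):
        context = build_haystack(needles, total_sentences=size)
        score = score_retrieval(truncating_model(1000), context, needles)
        print(f"{size} sentences: {score:.0%} of needles retrieved")
```

The design choice is to keep the scoring separate from the model, so the same four needles can be tested at several context sizes and the retrieval rate compared across them, which is the drop-off pattern discussed above.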
Starting point is 00:32:43 On a Friday, your brain is fried. You forget things that you would normally remember on a Tuesday morning or, you know, Monday midday. Large language models are kind of the same. You know, there's a lot of things that most of us don't understand about how they work, right? But over the longer context, they start to forget things, hallucinate, and just lack in performance. So, GPT-5.2 on this new four needles benchmark, nearly 100% retrieval at 256K tokens, which, right, I was trying to read and understand because this just happened over the weekend.
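The needle-in-a-haystack idea described here can be sketched in a few lines. This is a toy illustration only, not the actual MRCR v2 methodology: the needle format, the insertion depths, the word-based lengths (standing in for tokens), and the `toy_model` stand-in that "forgets" everything outside a fixed window are all assumptions made for the example.

```python
FILLER = ("lorem", "ipsum", "dolor", "sit", "amet")
# Hypothetical needles: key -> value pairs hidden in the filler text.
NEEDLES = {"alpha": "1111", "bravo": "2222", "charlie": "3333", "delta": "4444"}
# Assumed insertion depths, as a percentage of the haystack length.
DEPTHS_PCT = (10, 35, 60, 85)


def build_haystack(n_words, needles, depths_pct=DEPTHS_PCT):
    """Return a list of filler words with one 'key=value' needle per depth."""
    words = [FILLER[i % len(FILLER)] for i in range(n_words)]
    positions = sorted((n_words * pct // 100 for pct in depths_pct), reverse=True)
    # Insert the deepest needle first so the shallower positions are unaffected.
    for pos, (key, value) in zip(positions, needles.items()):
        words.insert(pos, f"{key}={value}")
    return words


def toy_model(words, key, window=2000):
    """Stand-in for a real model: it only 'remembers' the last `window` words."""
    for word in reversed(words[-window:]):
        if word.startswith(key + "="):
            return word.split("=", 1)[1]
    return None


def retrieval_score(answer_fn, words, needles):
    """Fraction of needles whose value the answer function returns correctly."""
    hits = sum(1 for key, value in needles.items() if answer_fn(words, key) == value)
    return hits / len(needles)


if __name__ == "__main__":
    for n_words in (8_000, 64_000, 256_000):
        haystack = build_haystack(n_words, NEEDLES)
        score = retrieval_score(toy_model, haystack, NEEDLES)
        print(f"{n_words:>7} words: {score:.0%} of needles retrieved")
```

To test a real model, `toy_model` would be replaced by a function that sends the haystack and a question to that model and parses its answer. With the toy stand-in's 2,000-word window, only the needle at 85% depth is still in range at 8,000 words, so the score there is 25%, and it drops to 0% at the longer lengths.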
Starting point is 00:33:20 You know, essentially, I saw an OpenAI employee just say like, oh, they multiplied something in a different way. So I have to like go and like do more research or see if I can get someone at OpenAI to be like, what the heck do you mean? Right. But presumably they figured something out internally that allows models to retain, um, essentially their power to, uh, retain their knowledge and retention over a longer period, over a longer context window, which is huge, right?
Starting point is 00:33:47 not just for OpenAI, but presumably this is something that will get figured out by the other labs as well. So yeah, it's something that a lot of people don't pay attention to, right? If you run the same kind of test early on, but you extend the context window and then you run it later, it's usually going to be a lot worse later on. So pretty big update there. Small dorky thing from GPT-5.2. All right, that is all of our big news stories for the day, but let's wrap with our what's new and what's next. So some of these are rumors, some of these are little updates, some of these maybe are bigger updates that just didn't make our,
Starting point is 00:34:25 you know, top eight or top 10 stories for the week. So here we go in bullet point fashion, what's new and what's next for this week. All right, here we go. IBM will acquire data streaming specialist Confluent for $11 billion. Next, ChatGPT will soon support apps in custom GPTs. Not yet, but it is coming. OpenAI quietly started supporting their own version of Anthropic's Skills feature. So some tipsters online found that out.
Starting point is 00:34:56 We'll share that in today's newsletter. Google released an ultra version of NotebookLM for Gemini Ultra subscribers. Just way better, way better rates. And it may be the version of NotebookLM that gets Gemini 3 first because Gemini 2.5 Pro is still powering NotebookLM. Instacart and OpenAI partnered on grocery checkout in ChatGPT. Mistral released their Devstral 2 coding model. ChatGPT added an extended thinking feature on its Pro model, which didn't exist before. So now when you go to ChatGPT Pro, there's actually an extended, so you can stack more thinking on the biggest thinking
Starting point is 00:35:36 model. So it's like thinking XXXXL. Then you have OpenAI released its State of Enterprise report, which we, sorry, State of Enterprise AI report, which we shared about in our newsletter last week. Google teased GenTabs, a feature that builds apps out of your open tabs. This one looks awesome. So it's on a sign-up right now, unfortunately. But yeah, if you have like 10 tabs open, you know, researching a vacation or something like that, GenTabs, part of Disco, will just create an app for you based on that. Really cool. Next, we have Thinking Machines released Tinker, an LLM fine-tuning API. So we finally know one thing Thinking Machines is working on.
Starting point is 00:36:21 Claude hinted at full agent mode aside from a chat mode. So maybe we'll see that. Anthropic is partnering with Accenture to train 30,000 consultants on Claude. This comes right after Accenture had a similar announcement with OpenAI. Like I alluded to earlier, we might see a Gemini 3 Flash or a Nano Banana 2 Flash around the corner from Google. OpenAI is testing a new image model themselves, which looks much better than their current image model.
Starting point is 00:36:52 Time named the "Architects of AI" as Person of the Year. OpenAI named former Slack CEO as its chief revenue officer. Google added Veo-powered video abilities to its Pomelli marketing tool. Pomelli is a great, like, I think it was just made for small businesses, but it's amazing, by the way. OpenAI launched their OpenAI Certifications courses with select partners. But yeah, if you're part of the general public, you can't go get OpenAI certified just yet. And last but not least, Cursor released a new visual editor.
Starting point is 00:37:27 All right, that is it for the AI news that matters this week. So remember, we do this almost every single Monday. We cut through all the marketing, all the hype, all the confusion. And hopefully for our busy leaders out there, we say, Here's what matters. Here's what it means. And here's how you can leverage it. So I hope this was helpful.
Starting point is 00:37:48 And as a reminder, make sure, please tune in tomorrow and Wednesday for our 2025 AI roadmap rewind. So we did our 25 AI predictions back in January of this year, the culmination of hundreds of hours of research from the year before. So we're going to grade ourselves. We're going to go over it and think of this almost as like a state of AI for 2025 and end of the year recap, but also we got to hold ourselves accountable. Did we lead you astray, right?
Starting point is 00:38:19 A lot of people come up with these crazy AI predictions and none of them come true. I think we're going to do okay. But, I mean, we're going to ramp it up even more in 2026. So make sure to tune in for that special two part series. All right. Thank you for listening. I hope this was helpful. If so, tell someone about it.
Starting point is 00:38:39 Make sure to share this online. If you're listening on the live stream, if you're on the podcast, please make sure click that follow button. If you could, leave us a rating. We'd appreciate that. Then go to our website, youreverydayai.com. Sign up for the free daily newsletter. Thanks for tuning in. We'll see you back tomorrow and every day for more Everyday AI.
Starting point is 00:38:58 Thanks y'all. Meet Firefly AI Assistant. Now live in Adobe Firefly, the all-in-one creative AI studio. Just describe what you want to create in your own words and the assistant handles the rest, orchestrating multi-step workflows across Adobe Creative Cloud apps, including Photoshop, Premiere, Express, and more in one conversational interface. You direct the outcome while the assistant accelerates execution. Stay in control with the ability to step in and refine at any time.
Starting point is 00:39:30 See it today at firefly.adobe.com. And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
