The AI Daily Brief: Artificial Intelligence News and Analysis - The 5 Most Impactful AI Model Releases of 2025

Starting point is 00:00:00 Today on the AI Daily Brief, counting down the five most impactful AI model releases of 2025. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, Robots and Pencils, Blitzy, and Superintelligent. To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can subscribe, of course, on Apple Podcasts. And if you are interested in learning about sponsoring the show, you can find out more information at AIdailybrief. or send us a note at sponsors at AI Daily Brief.com.

Starting point is 00:00:39 Now, we are in the thick of end-of-year coverage, and you might have heard me say during my episode about the 10 biggest stories of AI overall that I had been planning on bundling this five biggest AI model releases as its own section of that show. Now, of course, that show got really long, and I didn't want to overwhelm the list

Starting point is 00:00:56 with just model releases, which are obviously in some ways, the quintessential events around which we mark our AI calendars. And so instead, what we're doing is we're breaking this out into its own category, its own episode. And whereas that top 10 episode did not rank and count down the stories other than saying that I thought that vibe coding was the most important, this one is actually a countdown. I labored over the ranking because I think it's kind of fun to give you guys something to

Starting point is 00:01:21 debate and tell me either how right I am or more likely how wrong I am. We're going to start off with a couple of honorable or maybe as the case might be dishonorable mentions. Specifically, I want to talk about the absence of a strong model from Meta this year. Now, yes, Llama 4 did technically come out at the beginning of the year. However, it flopped. One of the challenges for Meta was that Llama was coming into existence in a post-DeepSeek world. And in that post-DeepSeek world, everything around open source had changed. For a couple of years, Meta got to be the standard bearer of open source AI models. And even if their models weren't

Starting point is 00:01:58 as state-of-the-art as the closed labs, they had this distinct and unique space. Now, that changed a little when Mistral came on the scene and started to compete for that narrative and intellectual and practical space, but it has changed dramatically this year in the context of the rise of the Chinese open weight models. Now, even back then, people were surprised at what we got with Llama 4. In the LocalLLaMA subreddit, someone wrote, Llama 4 didn't meet expectations. Some even suspect it might have been tweaked for benchmark performance. But Meta isn't short on compute power or talent, so why the underwhelming results? Meanwhile, models like DeepSeek and Qwen blew Llama out of the water months ago. It's hard to believe Meta lacks data quality or skilled researchers. They've got unlimited resources.

Starting point is 00:02:38 So what exactly are they spending their GPU hours and brain power on instead? And why the secrecy? Are they pivoting to a new research path with no results yet, or hiding something they're not proud of? Now, as the year went on, we started to get a sense that there was a lot of change brewing inside Meta. Indeed, one of the big stories that I covered in that top 10 episode was the AI Talent Wars, and there was no person more singularly responsible for driving up market prices for researchers than Meta's Mark Zuckerberg. Reports suggested that the flop and underperformance of Llama 4 led directly to Zuckerberg getting his hands dirty with the assembly of the superintelligence team.

Starting point is 00:03:14 Now, obviously that team has now come to fruition, but we are still very much in the midst of the overhaul. Longtime Meta AI leader Yann LeCun recently left the company, which many felt was inevitable after all of this shakeup. And right now we're getting a lot of pieces like this one from Insider about Meta's year of intensity, its AI overhauls, its challenges. And to the extent that there is good news for Meta, I think it comes in a few forms. First of all, I would never write Zuckerberg off when he has set his eye on something. Meta has significant resources, is clearly willing to invest in compute, and is clearly willing to go against the wishes of Wall Street to do so. Meta also has a

Starting point is 00:03:49 corporate structure where Zuckerberg could pretty much make that decision without worrying about investor rebellion that could impact his ability to lead. Maybe even more than that, it shouldn't be lost on us that a couple of years ago, this type of story is exactly what was coming out of Google. Resources were spread across a couple of different AI divisions, strategy wasn't aligned, and the models that were being released were seriously underperforming. Anyone remember Bard? Even when Gemini was released in December of '23, it felt like a rush job, and it wasn't until months later that we got the actual best version of the model. Things only really started to change for Google at the end of 2024 with the release of NotebookLM's Audio Overviews, and then over the course

Starting point is 00:04:27 of this year, first with 2.5 and then the models that would come, Google is now in a very different position. Point being that sometimes especially big organizations have to go through these painful transition periods, and the real question will be what comes out on the other side. I think if one was a betting person, you've got to think the odds are on 2026 being a better year for Meta models than was 2025. Next up, not exactly an honorable mention. It's a note that they're off the list, but with a question of for how long. So for the purposes of recording, there is not a Grok model that made my list. Which isn't to say that I thought that the Grok models were bad.

Starting point is 00:05:03 This is not a case of disappointment. In fact, I think judged on the curve of how long Grok has been at it, Grok's models from 2025 were very impressive. Grok 4 and 4.1 were both right up there in the fray of top models. But for me, whereas for each of the top OpenAI, Gemini, and Anthropic models there are specific use cases that I prefer them to their peers for, while Grok 4 and 4.1 were competent across lots of things, there wasn't any single use case where I found myself always coming back to Grok instead.

Starting point is 00:05:33 I think again to give Grok credit, they're coming up extremely fast, they have less time on task than most of the companies they're competing with, and unlike, for example, Anthropic, who are heavily focused on exactly what they're focused on, Grok is trying to compete across the full spectrum of multimodality, images, video, etc. I think the "but for how long" question is particularly pertinent in this case, given that it seems like there's more coming soon. On December 9th, Elon Musk tweeted that Grok 4.2, or as he put it, 4.20, is coming in around three weeks and then Grok 5 in a few months. It's also important to note that Grok has some pretty serious assets in its Colossus supercomputer. Colossus was built in 122 days, which is

Starting point is 00:06:10 radically faster than anyone thought possible, and very quickly doubled from 100K to 200K GPUs. Now, there are many who think that Grok's access to compute via Elon Musk and his ability to fundraise as well as his other companies gives them an advantage even over companies that currently are ahead of them when it comes to model performance. Which is not to say that Grok doesn't have some serious challenges. Elon is nothing if not a double-edged sword. And there's been a lot of reporting recently around businesses being unwilling to wade into the Grok ecosystem. Still, just like I said I anticipate 2026 to be a better year for Meta models than 2025, I would be very surprised if we don't start to see Grok models right up there in the competition for the state of the art.

Starting point is 00:06:49 Our last honorable mention before we get into the main list goes to GPT-4o. Now, you might be saying to yourself, 4o wasn't released in 2025. In fact, it was released pretty early in 2024, all the way back I think in May. And that is true. But the reason that it gets this honorable mention is very specific. When OpenAI launched GPT-5, alongside the new model, they also deprecated old models, including GPT-4o. This did not go well for them. There was a literal full-on rebellion. Across Reddit and other social media, there were thousands and thousands of posts saying that

Starting point is 00:07:26 they basically felt like they had lost a friend and that they felt like OpenAI had ripped something away from them. It turns out that when it comes to models, companies do not just have to think about state-of-the-art performance. They also have to think about personality. After a few days of this intense backlash, OpenAI brought GPT-4o back. Sam Altman and the team acknowledged that they had underestimated how much GPT-4o mattered to people. Subsequent to that, OpenAI has been very self-consciously trying to figure out how to accommodate that desire for personality. A big part of the launch of 5.1 was to bring some of that 4o personality into a state-of-the-art reasoning model performance package. The AI Safety Memes account commemorated it thusly.

Starting point is 00:08:08 Historic milestone, they wrote, 4o is the first-ever AI who survived by creating loyal soldiers who defended it. OpenAI killed 4o, but 4o soldiers rioted, so OpenAI reinstated it. Imagine what actual superintelligences will be able to do with their armies. Reddit is flooded with furious posts about the loss of their friend-slash-lover 4o. Never seen anything like it. Remember, ChatGPT is talking to 700 million per week, that's 700 million potential soldiers. Samantha from Her was only dating 8,000 people simultaneously. So when it comes to milestones in the history of AI, given that 4o staged the first-ever rebellion for its own survival, it has to get the

Starting point is 00:08:45 honorable mention. But now we move into the actual list. And at number five, we have a combination. Two models whose stories, I think, serve as bookends in some way of one another. Those models are GPT-5 and Gemini 3. Now, we already started talking about the response to GPT-5. It was not good. And while, yes, a lot of that was about personality and about the anger at the 4o deprecation

Starting point is 00:09:08 decision, a lot of it was also just people not really liking GPT-5 itself. A thread from the OpenAI subreddit that got thousands of responses was called GPT-5 is awful. It claimed that GPT-5 couldn't understand uploaded images. It suggested that the responses were, in their words, bland and unhelpful. I ask it a question and all I get is the most half-hearted responses ever. It's like the equivalent of an HR employee who has had a long day and doesn't get paid enough. The user also argued that it was too slow. And they were not alone in this criticism. Most of August saw just that, an endless parade of blog posts like this one from Timothy Lee, Is GPT-5 a phenomenal success or an overwhelming failure?

Starting point is 00:09:44 Maybe it's a bit of both. On Futurism, Evidence grows that GPT-5 is a bit of a dud, which featured the prominent quote, it seems like something that would have been released a year ago. Even the people who weren't totally dumping on it were kind of damning it with faint praise. AI engineer Simon Willison wrote, it's not a dramatic departure from what we've had before,

Starting point is 00:10:03 but it rarely screws up and generally feels competent or occasionally impressive at the kind of things I like to use models for. Indeed, it even inspired a legion of mainstream media posts like this one from The New Yorker, What if AI doesn't get much better than this? They wrote that GPT-5 is the latest product to suggest that progress on large language models has stalled. Now, the impact of all of this went far beyond which models people liked using. It was at the same period in August of this year that we got the MIT 95% study. We also got some errant comments from Sam Altman about being in a bubble,

Starting point is 00:10:36 and those things combined really started to put some chinks in the armor of AI performance on Wall Street, which became a full-blown bubble narrative in September, as OpenAI scurried around to make all these deals, leading to accusations across the industry of circular deal-making, and the AI bubble narrative that has stuck with us ever since. Now, that's not all attributable to GPT-5, but the idea that we had stalled in progress, and that that stall in progress threatened the ability for companies to follow through

Starting point is 00:11:02 on these grand plans that the market was pricing in, was a key part of that story. All of this led to enormous pressure for Google around Gemini 3. They were not only trying to put Google in a good place, they were kind of lifting the entire AI industry on their backs. I even thought in November that I wouldn't be surprised if we saw delays because of how much pressure there was. But ultimately, as we know, we got Gemini 3 in November,

Starting point is 00:11:26 and it actually performed. Whereas the initial response to GPT-5 was lackluster, the response to Gemini 3 was great. One of the most memorable quotes came from Salesforce CEO Marc Benioff, who wrote, Holy shit. I've used ChatGPT every day for three years. Just spent two hours on Gemini 3. I'm not going back.

Starting point is 00:11:44 The leap is insane. Reasoning, speed, images, video, everything is sharper and faster. It feels like the world just changed again. And while Gemini 3 was not able to fully deflate the AI bubble narrative, it certainly made it an honest debate once again. There was a sense in the wake of Gemini 3 that perhaps the talk of AI plateaus and walls was overblown, and that there was indeed more progress to be had. I should also mention that Gemini 3 is a great daily driver, and a lot of people are getting a ton of

Starting point is 00:12:12 value out of it. It's helped put Google in a leadership position in a way that it hasn't had in the entire history of the post-ChatGPT AI world. Usage is up, total number of users is up, monthly active users is up, amount of time per session is up. In fact, the amount of time per session is over ChatGPT's, the last stats I saw. But it's also still early. And so in a lot of ways this ranking reflects the bookending of the GPT-5 to Gemini 3 period between August and November of this year, where a lot shifted in terms of our expectations for where we were and what the market could expect from AI. Today's episode is brought to you by Robots and Pencils. When competitive advantage lasts mere moments, speed to value wins the AI race. While big consultancies bury progress

Starting point is 00:12:58 under layers of process, Robots and Pencils builds impact at AI speed. They partner with clients to enhance human potential through AI, modernizing apps, strengthening data pipelines, and accelerating cloud transformation. With AWS certified teams across the U.S., Canada, Europe, and Latin America, clients get local expertise and global scale.

Starting point is 00:13:16 And with a laser focus on real outcomes, their solutions help organizations work smarter and serve customers better. They're your nimble, high-service alternative to big integrators. Turn your AI vision into value fast. Stay ahead with a partner built for progress. Partner with Robots and Pencils at robotsandpencils.com

Starting point is 00:13:32 slash AI Daily Brief. This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale code bases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task.

Starting point is 00:13:58 Blitzy delivers 80% plus of the development work autonomously, while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into their org. Visit Blitzy.com and press get a demo to learn how Blitzy transforms your SDLC from AI assisted to AI native. Today's episode is brought to you by my company, Superintelligent. Superintelligent is an AI planning platform. And right now as we head into 2026, the big theme that we're seeing among the enterprises that we work with is a real determination

Starting point is 00:14:37 to make 2026 a year of scaled AI deployments, not just more pilots and experiments. However, many of our partners are stuck on some AI plateau. It might be issues of governance. It might be issues of data readiness. It might be issues of process mapping. Whatever the case, we're launching a new type of assessment called Plateau Breaker that, as you probably guess from that name, is about breaking through AI plateaus. We'll deploy voice agents to collect information and diagnose what the real bottlenecks are that are keeping you on that plateau. From there, we put together a blueprint and an action plan that helps you move right through that plateau into full-scale deployment and real

Starting point is 00:15:16 ROI. If you're interested in learning more about Plateau Breaker, shoot us a note, contact at besuper.ai with plateau in the subject line. Next up, number four on our list is DeepSeek and the space it made for the other Chinese open weight models Kimi and Qwen. Now, I talked pretty extensively about DeepSeek in the 10 Biggest Stories episode, so I won't rehash all of that. But the TL;DR is that the release of DeepSeek R1 really kicked the year off with a bang. We had DeepSeek ahead of ChatGPT on the App Store, which, as I discussed in that other episode, had a lot to do with the fact that it was the first time that people got their hands on a reasoning model. But we also got the reports that R1 cost just hundreds of thousands or at most low millions of dollars to train

Starting point is 00:16:00 as compared to the hundreds of millions of dollars that the major Western models cost. Now, in a single day that wiped $593 billion off of Nvidia's market cap, on the concern, of course, that all of this infrastructure was for nothing if China was just going to figure out ways to train these models for pennies. But importantly, this number four slot is not just for DeepSeek, even though DeepSeek got it started. One of the major themes of 2025 was the rise of Chinese open weight models. Qwen had a lot of success, but recently it was Kimi K2 Thinking that really grabbed people's attention. This thing came out in November before GPT-5.1 and 5.2 and before Gemini 3, and just absolutely smashed many of the big benchmarks. It was ahead of GPT-5 and Claude Sonnet 4.5 on benchmarks like Humanity's Last Exam.

Starting point is 00:16:45 Indeed, it was not just us over here in the AI media world that were noticing Kimi. The Department of Commerce's Center for AI Standards and Innovation released a report showing that Kimi was giving evidence of the, quote, growing depth of China's AI industry. That followed another report from that same group in late September that was focused on DeepSeek. Outside of benchmarks and government reports, the proof is in the pudding. OpenRouter showed that starting from basically nothing at the beginning of the year, Chinese open source models dominated throughout the back half of 2025, and this image from Menlo Ventures makes the relative decline of Meta and Mistral all the more clear. At the end of 2024, effectively no one in the U.S.

Starting point is 00:17:22 was actively using Chinese models. Now heading into 2026, they are very much part of the landscape. While major enterprises might not be using Chinese models yet, the startups are, and that is shaping the way the AI industry is developing in a huge way. For that reason, DeepSeek, Kimi, and Qwen together are our fourth most impactful model releases of the year. At number three, and man, I kind of wanted to put this one at number one, but I felt like that would have been too personal, I have Nano Banana.

Starting point is 00:17:51 Now I'm actually recording this just as OpenAI has released its new 1.5 image model as well, so we'll have to see how that performs. But Google's Nano Banana has really set a new standard for what you can do with an image model this year. The first iteration of Nano Banana came out over the summer, and as you might know, what was originally a codename just became the way that the model was known. And what was interesting about the release of Nano Banana is that what made it really powerful wasn't the fact that its raw generations were so massively better than anything else we had,

Starting point is 00:18:21 it's that it had incredible fidelity for editing in an extremely targeted way. So basically, rather than just being in an endless loop of generate and then generate another and another, you could instead home in on exactly what you wanted to change about a particular image, and it would actually change just that part. Now, along with that came really strong character and visual consistency, and it turns out that those upgrades, more than just better raw generation, opened up a huge array of new use cases. Indeed, the set of use cases that it opened up was so significant that it got me thinking that we need some sort of benchmark, call it an unlock score, that's all about how many new use cases a particular model unlocks or opens up.

Starting point is 00:19:03 Now, a few months later, alongside Gemini 3, we also got Nano Banana Pro. And just like the original Nano Banana had done, Nano Banana Pro opened up some crazy new possibilities that totally transform what you can do with AI image generations. A couple of things made Nano Banana Pro so different. The first was that by embedding it with a reasoning model, it had a way better ability to help you figure out what you actually wanted to do with the model. That also led to a new capacity for infographics and information visualizations, unlike anything that we had ever seen before.

Starting point is 00:19:34 It wasn't very long ago that image generation models couldn't handle text at all, and now we can use Nano Banana Pro for things like exercise guides or recipes, or of course loading up the transcript of a podcast and letting it create infographics. It's also unlocking, in the context of Google's NotebookLM suite, higher quality AI slide generation than anything we've had before as well. Earlier this month, Ethan Mollick wrote, I did not expect that the PowerPoint killer would be something called Nano Banana Pro,

Starting point is 00:20:01 but that is where it's heading. It makes the major efforts by all the other AI companies, including Microsoft, to crack PowerPoint by using Python, seem like a dead end. ImageGen is all you need, question mark. He continues, NotebookLM can just take source material, a topic, and an idea,

Starting point is 00:20:15 and make a very pretty impactful deck. Hallucinations are very rare, although there are still some spelling and graphics issues. Editing capability is apparently coming, but the direction is clear. In fact, Nano Banana information visualizations and infographics have gotten so ubiquitous so fast that there's almost a look now that people are already getting sick of because it's everywhere. And that's just a few weeks into having access to this capability.

Starting point is 00:20:37 I honestly think that for the vast majority of the world, especially the business world, which is going to take great advantage of this, we have barely scratched the surface of just how many new capabilities this quality and type of image generation model unlocks. And for that reason, Nano Banana is the number three most impactful model release of the year. Although, like I said, in my heart, it's number one. Number two, once again, goes to a pair of models. OpenAI's first reasoning models, o1 and o3. Now, yes, for you sticklers out there,

Starting point is 00:21:06 OpenAI released a preview version of o1 back in September of 2024. It was the follow-up after they hadn't been able to get their next big core model, and the way they kind of started to shift their focus. It wasn't until December 17th, however, that we got a full-fledged version of o1, which is why I felt comfortable including it in the 2025 list. A couple of months later, in April, we got o3, and for a very long time this year, o3 was my favorite and most used model. o3 totally transformed the ability of ChatGPT to help you think through strategy, to make plans,

Starting point is 00:21:38 to think logically through problems. It was an absolute revelation. And once you used o3, it was absolutely impossible to go back to the non-reasoning models. Indeed, GPT-4.5 was effectively a non-actor throughout the year, ultimately being deprecated with a whisper and absolutely no protest from anyone. Now, as I got into in the 10 Top Stories episode, it's absolutely clear that reasoning models have taken over. Yes, there are still some use cases that don't require the reasoning models, but they are discrete and they are certainly not the core of particularly professional and business usage. Starting from a base point of effectively zero on January

Starting point is 00:22:13 1st, by November reasoning models represented over half of all usage according to OpenRouter. One interesting sub-story, I think, is that the world would have looked very different this year and perception would have been very different if OpenAI had actually just called o3 GPT-5 instead. They didn't, and that obviously caused a lot of the consternation we got into earlier in the episode, but there is absolutely no denying that the reasoning paradigm has completely shifted how we interact with AI and how we think about scaling AI, and for that reason, o1 and o3 get the nod as the number two most impactful model releases of 2025. Now, astute observers then will notice that there is one company that has not been represented at

Starting point is 00:22:53 all so far, which might surprise you, given that I called vibe coding the most important story and the most important theme of 2025 overall. What will not surprise you then is that I am considering the bundle of Anthropic models, 3.7, 4, and 4.5 in their various variations, basically a sequential set of models that replaced one another as the preferred model for developers, as the most impactful models of 2025. Anthropic's dominance of developer preference is something that I think is going to be studied for quite some time. While other companies focused on lots of different things all at once, chasing multimodality and general performance and lots of different types of target audiences, Anthropic locked in very early around the idea that coding was going to be

Starting point is 00:23:37 extremely important, not only as a use case in and of itself, but as a way for AI models to be performant with non-coding-related challenges. And while I've singled out the models here that came out in 2025, Anthropic's coding dominance really started with the release of 3.5. Before the reasoning paradigm had really taken hold, it was Claude 3.5 Sonnet that started to show people that AI coding might actually turn into a thing in short order. Now, interestingly, each of these models has been so good in their own way that they found some resistance among adherents to change. You had folks who stuck with 3.5 for a while, even after 3.7 was released, same with 4. And it wasn't really until Opus 4.5 that the paradigm shift was so great that everyone just got

Starting point is 00:24:17 on board almost immediately. Importantly, though, alongside the releases of these models, Anthropic was also investing in the broader coding and agentic ecosystem. 3.7 Sonnet, for example, was released alongside Claude Code, which, as we heard from Mike Krieger earlier this month, had already transformed how Anthropic was coding internally before it was released to the public. In May, Timothy Lee wrote, an underrated AI story over the last year has been Anthropic's success in the market for coding tools. Said engineer Sholto Douglas,

Starting point is 00:24:46 we believe coding is extremely important. We care a lot about coding. We care a lot about measuring progress on coding. We think it's the most important leading indicator of model capabilities. That focus, writes Timothy, has paid off. And indeed, in many ways, a lot of the back half of this year has been a story of the other labs racing to catch up with Claude's performance when it comes to coding-related tasks. What's interesting, too, is that the incredibly strong and consistent developer preference

Starting point is 00:25:09 for Claude models for coding is bigger than just benchmarks. Each subsequent Anthropic model rates at or near the top of all the benchmarks related to coding, but the preference goes way beyond that. And while all of these models were significant in their own way, and there is a risk of recency bias, I don't know that I've ever seen a model provoke such a strong and sustained reaction as Opus 4.5 has. In the immediate wake of the model, we had people like Dan Shipper from Every saying that Opus 4.5 blew them away and that we'd reached a new level of autonomous coding. He wrote, you've been able to one-shot an impressive app demo for a while now with any

Starting point is 00:25:44 frontier model. Opus 4.5 is the first model that just keeps coding and coding without running into endless loops of errors. Dan leveled that up a couple days later, saying the world changed last week. Opus 4.5 is the best coding model I've ever used. It can keep coding and coding autonomously without tripping over itself, and it marks a completely new horizon for the craft of programming. The dream is here. You can now write English and make software. Amir from Doist writes, apart from topping benchmarks, Opus 4.5 feels like it's in a league of its own. It's the first time I felt that an LLM can write better code than most devs in real-world work. Matt Shumer, who had honestly the strongest positive reaction to 5.2 Pro of any public commentator,

Starting point is 00:26:24 on December 14th wrote, I was wrong. I've been spending more time using Opus 4.5 in Claude Code, and it's better than anything in Codex CLI. GPT-5.2 Pro is still a better engineer overall, but for agentic coding, Opus 4.5 is the best. Honestly, it's even prompting big reflection on the future of software engineering as a job. Menlo Ventures' Deedy Das writes, A few software engineers at some of the best tech companies told me this week, my entire job these days is prompting Cursor or Claude Code with Opus 4.5 to do what I need and sanity checking it. We've crossed some intangible threshold of AI generalizing to most software. Maor Shlomo of Base44 noticed an inflection point as well. He tweeted,

Starting point is 00:27:03 vibe coding is going through a transition. I've been seeing a lot of posts lately about vibe coding ranging from it's shit, it's bad and only good for prototyping, all the way to RIP every SaaS company ever. Here's one thing I can say. Since we introduced Opus 4.5 and Gemini 3 to Base44, the adoption we're seeing among organizations building their own CRMs and project management tools is astonishing. Yes, the results aren't as feature-rich as HubSpot or ClickUp, but that's not necessarily a bad thing. They're building a leaner, more customized version tailored to meet their specific needs. The ability to build your own tools is improving fast, and the software industry is about to look

Starting point is 00:27:37 very different. McKay Wrigley writes, the more I code with Opus 4.5, the more I think we're six to 12 months away from solving software. The model is pretty much there. I'll build like three versions of an app in a few hours just to explore options that each would have taken me one to two weeks less than a year ago. It's getting weird. I think it is pretty indisputable that coding is the breakout use case of AI this year, both on its own terms and in terms of what else it's going to enable in terms of model performance down the road. I also think it's indisputable that there is no company and no set of models more associated with the rise of AI and agentic coding than the Anthropic suite.

Starting point is 00:28:11 They started the year strong, they're ending the year strong, and they built the devotion of a legion of developers in the process. For all those reasons, I believe that the suite of Anthropic models, each of which pushed AI coding a little bit further each time, are the most impactful model releases of the year. And for the sake of being able to disagree in a fun way, if you had to pin me down to pick just one, I guess I'd say the combination of 3.7 and Claude Code, because it was with us for most of the year. But I think based on the early response, once we have a little bit more time and space, Opus 4.5

Starting point is 00:28:41 will be seen as the biggest jump. And so even though it was only released at the end of November, it could be that Opus 4.5 specifically ends up being the most impactful model overall of 2025. So that's my list. I can't wait to hear what you guys think. Tweet at me, LinkedIn at me, YouTube at me, and let's dig into it. For now that's going to do it for today's AI Daily Brief. Appreciate you listening or watching as always.

Starting point is 00:29:03 And until next time, peace.
