The AI Daily Brief: Artificial Intelligence News and Analysis - Anthropic Accidentally Revealed Their Most Powerful Model Ever

Starting point is 00:00:00 Today on the AI Daily Brief, are we entering the era of vertical AI models? Before that in the headlines, a big leak with Anthropic confirming the existence of Claude Mythos, what they call by far the most powerful AI model we've ever developed. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, KPMG, Blitzy, AssemblyAI, and Robots and Pencils. To get an ad-free version of the show, go to Patreon.com slash AI Daily Brief. And if you are interested in sponsoring the show, send us a note at sponsors

Starting point is 00:00:38 at aidailybrief.ai. Late breaking one last night, a data leak revealed that Anthropic is testing a new model referred to as Claude Mythos. Anthropic has confirmed the existence of this model, with a spokesperson saying that it was a step change, their words, in performance, and quote, the most capable we've built to date. They said the model is currently being trialed by early access customers. So here's what happened. On Thursday evening, a draft blog post describing the model was left in an unsecured publicly searchable database. The blog post says we've finished training a new AI model, Claude Mythos. It's by far the most powerful AI model we've ever developed.

Starting point is 00:01:15 Mythos, they write, is a new name for a new tier of model, larger and more intelligent than our Opus models, which were, until now, our most powerful. We chose the name to evoke the deep connective tissue that links together knowledge and ideas. Compared to our previous best model, Claude Opus 4.6, Mythos gets dramatically higher scores on tests of software coding, academic reasoning, and cybersecurity among others. In preparing to release Claude Mythos, however, they say, we want to act with extra caution and understand the risks it poses, even beyond what we learn in our own testing. In particular, we want to understand the model's potential near-term risks

Starting point is 00:01:45 in the realm of cybersecurity and share the results to help cyber defenders prepare. Mythos is also a large compute-intensive model. It's very expensive for us to serve and will be very expensive for our customers to use. We're working to make the model much more efficient before any general release. For those reasons, we're taking a slower, more gradual approach to releasing Mythos than we have with our other models. We're beginning with a small number of early access customers who will explore the model's cybersecurity applications and report back what they find. Now, this blog post is very undercooked. It ends not too long after that. Now, if you hear the term Capybara thrown around, apparently the model was also referred to as that. I'm not sure

Starting point is 00:02:20 if Capybara was the codename and Mythos is the intended launch name. But regardless, this draft blog post was in a cache of unsecured documents. In total, Fortune reports, there appear to be close to 3,000 assets linked to Anthropic's blog that had not previously been published. Now, there is a lot of chatter about this one, not least of which is the choice of name, which many people associate with the Cthulhu mythos, which, given how much the AI safety folks use those sorts of literary reference points to describe their concerns about AI, may not be the most advised name. People also compared it to the recently revealed Spud from OpenAI, with Jason Botterill writing, I like how Anthropic's mysterious spooky new model is codenamed

Starting point is 00:02:57 Mythos, while OpenAI named theirs after a freaking potato. Still, the broader sentiment was captured by Gavin Purcell, who says, it will only go faster from here. Obviously, there will be a lot to watch with this one. Unfortunately, for those of us who want to get our hands on the most powerful models at any given time, it kind of looks like the blog post was not even an announcement of the release of the model, just an advance warning about it, so who knows how long it'll take before we actually see it in practice.

Starting point is 00:03:22 Now, one model that is available now, Google has dropped a small voice model that could have big implications. The model is Gemini 3.1 Flash Live, which brings real-time dialogue to voice models. Up until now, most voice models have been turn-based, causing awkward stumbles and terrible interruption handling. Flash Live is designed to work more like a human conversation, with a continuous back-and-forth rather than a jarring stilted experience. The model apparently shows a step-change improvement on multiple audio benchmarks, including one designed to measure multi-step function calling. That's the feature that converts voice commands into complex agentic actions. Some customers like Home Depot have already deployed the model, and Google noted a big improvement in handling

Starting point is 00:04:01 complex details like alphanumeric product codes and noisy environments. So the obvious implication is for the quality of personal voice agents on mobile devices, and especially given that Apple is looking to Gemini to power the new version of Siri, the long winter of our discontent of Siri not understanding a single damn word we say may finally be coming to an end. Next, one small product announcement from Shopify that I actually think could be fairly significant. One of my weirder or more out-there predictions for 2026 was that Shopify has kind of an outsized role to play in the positive normalization of AI. The reason for that is that Shopify is where a ton of small business entrepreneurship lives.

Starting point is 00:04:38 Shopify's tools have already, even in the pre-AI era, given people who felt overwhelmed by what they needed to do to start a business, enough help to get over the hump. Although, as you well know, I am not a jobs doomer, I do think that we're going to see a lot of shifts in the average way that people get employed and make money. One piece of that, I believe, will be an increase in small business entrepreneurship. If Shopify is the home of where a lot of that new energy goes, the way that they use AI to provide value for their people could make a big difference in people's perceptions of it. It's one thing when the only thing you hear about AI is that it's going to take your job and it uses all the water.

Starting point is 00:05:12 It's another thing when you see your income rise 30% from the month before because of the tools you were able to use through your store's hosting platform. So what Tinker is is a free mobile app with more than 100 AI tools for e-commerce. Merchants can generate logos, product photos, advertising videos, and much more. It's an iterative, experimental, playful canvas where you can try out all sorts of different brand identities, product placements, and more. The entire concept is about flattening the learning curve. Apps are arranged by outcome, so merchants only need to select what they want to create.

Starting point is 00:05:44 Once inside an app, they can see a range of examples demonstrating what it can do and how to use it. They can then describe a desired outcome in natural language, drop in a reference image, and Tinker automatically turns those inputs into high-quality prompts on the back end. Shopify's director of product for so, Kasi said, if you want more artists, lower the cost of paint. And cost isn't just money. It's the time spent keeping up, the friction of signing up for everything separately,

Starting point is 00:06:06 and the learning curve of figuring it all out. We wanted to lower all of it. So like I said, it may seem small, but I really do believe that Shopify potentially has an outsized role to play in the positive integration of AI into the broader economy, and I think Tinker, from my first glances, looks awesome. By the way, hopefully this goes without saying, but this is a completely unsponsored opinion. Over in OpenAI land, Codex gets a big upgrade with the integration of plugins.

Starting point is 00:06:31 The OpenAI Devs account writes, with plugins, Codex can now support more real work, including the planning, research, and coordination that happens before you write code and the workflows that follow. The team at OpenAI also used the occasion of the plugins launch to go for Anthropic's throat around some controversy over recent changes from Claude. Thariq from the Claude team writes, to manage growing demand for Claude, we're adjusting our five-hour session limits for Free, Pro, and Max subs during peak hours. During weekdays between 5 a.m. and 11 a.m. Pacific time,

Starting point is 00:06:56 you'll move through your five-hour session limits faster than before. People were not happy about that. And OpenAI took full advantage. Tibo from the Codex team writes, Hello, we have reset Codex usage limits across all plans to let everyone experiment with the magnificent plugins we just launched. You can just build unlimited things with Codex. Have fun.

Starting point is 00:07:15 Speaking of OpenAI, the company has made a decision which I think is extremely the right one, putting their erotica plans on hold. The Financial Times reports that OpenAI has decided to shelve plans for adult mode indefinitely as they consolidate resources around coding and enterprise sales. This is, to put it mildly, not all that surprising. Earlier this month, the Wall Street Journal reported that OpenAI's independent advisory council was unanimously against the feature.

Starting point is 00:07:39 Reportedly, their age detection system had a 12% failure rate, and the experts on the council weren't even satisfied that adult mode would be safe for adults, warning it could encourage an unhealthy emotional dependence on ChatGPT. The feature was also controversial among staff, with some departing the company over the issue. Speaking with the Financial Times, sources said that OpenAI wanted to have more long-term research on the effects of sexually explicit chatbots and emotional attachment to AI before they released the product.

Starting point is 00:08:04 Now, my feeling about this, as I said last fall, is that on the one hand, I have a very socially libertarian bent that basically thinks that adults should be able to do whatever they want as long as it's not hurting other people. That said, viewing this question from an entrepreneur's lens, it did not make sense to me for OpenAI to be the one to offer this. There is going to be, I promise you, no shortage of adult AI experiences that are available to any adults who want them. And I just think that all of the costs of going down this route were so obviously going to be higher than the upside for OpenAI. So one other thing that I did want to note about OpenAI's recent moves, there is a lot of chatter

Starting point is 00:08:39 right now about how many products are being killed by OpenAI, Instant Checkout, Sora, the erotic chatbot, with people seeming to suggest that it's the company flailing. I actually think in many ways it's the opposite. It would be the worst business decision that OpenAI could make to stick with something that wasn't the right move, even if it looked like the right move just a couple of months ago. Nothing will kill a business faster than sunk cost fallacy, and OpenAI being willing to scrap efforts, even where a lot of effort went in, is net-net a good thing for that company. And it couldn't come at a better time because boy, oh boy, is the competition going to do nothing but heat up. Latest rumors suggest that Anthropic is discussing going public as soon as the

Starting point is 00:09:14 fourth quarter, with follow-up Bloomberg reporting, saying that they might be looking to IPO as soon as October. That, of course, puts OpenAI on the clock, as Sam Altman has reportedly said he would prefer to go first. Meaning all in all, I think my prediction that we actually don't get IPOs this year might be one that is wrong. Noel Moldvey writes, according to the Zodiac, 2026 is the year of the mega IPO. Indeed. For now, that is going to do it for the headlines. Next up, the main episode. All right, folks, quick pause. Here's the uncomfortable truth. If your enterprise AI strategy is we bought some tools, you don't actually have a strategy. KPMG took the harder route and became their own client zero.

Starting point is 00:09:56 They embedded AI and agents across the enterprise, how work gets done, how teams collaborate, how decisions move, not as a tech initiative but as a total operating model shift. And here's the real unlock. That shift raised the ceiling on what people could do. Humans stayed firmly at the center while AI reduced friction, surfaced insight, and accelerated momentum. The outcome was a more capable, more empowered workforce.

Starting point is 00:10:18 If you want to understand what that actually looks like in the real world, go to www.kpmg.us slash AI. That's www.kpmg.us slash AI. Blitzy is driving over 5x engineering velocity for large-scale enterprises. A publicly traded insurance provider leveraged Blitzy to build a bespoke payments processing application, an estimated 13-month project, and with Blitzy, the application was completed and live in production in six weeks. A publicly traded vertical SaaS provider used Blitzy to extract services from a 500,000 line monolith, without disrupting production, 21 times faster than their pre-Blitzy estimates. These aren't experiments. This is how the world's most innovative

Starting point is 00:10:58 enterprises are shipping software in 2026. You can hear directly about Blitzy from other Fortune 500 CTOs on the Modern CTO or CIO Classified podcasts. To learn more about how Blitzy can impact your SDLC, book a meeting with an AI Solutions consultant at blitzy.com. That's B-L-I-T-Z-Y.com. You've heard me talk about AssemblyAI and their insanely accurate voice AI models, but they just shipped something big. Universal 3 Pro is a first-of-its-kind class of speech language model that lets you prompt speech recognition with your own domain context and vocabulary, instead of fixing transcripts in post-processing.

Starting point is 00:11:33 It's more flexible than traditional ASR and more deterministic than LLMs, so you get accurate output at the source and can capture the emotion behind human speech that transcripts often miss, all without custom models or post-processing hacks. And to celebrate the launch, they're making it free to try for all of February. If you're building anything with voice, this one's worth a look. Head to assemblyai.com slash free offer to check it out. Most companies don't struggle with ideas. They struggle with turning them into real AI systems that deliver value.

Starting point is 00:12:02 Robots and Pencils is a company built to close that gap. They design and deliver intelligent cloud-native systems powered by generative and agentic AI, with focus, speed, and clear outcomes. Robots and Pencils works in small, high-impact pods. Engineers, strategists, designers, and applied AI specialists working together to move from idea to production without unnecessary friction. Powered by RoboWorks, their agentic acceleration platform, teams deliver meaningful results including initial launches in as little as 45 days depending on scope.

Starting point is 00:12:30 If your organization is ready to move faster, reduce complexity, and turn AI ambition into real results, Robots and Pencils is built for that moment. Start the conversation at robotsandpencils.com slash AI Daily Brief. That's robotsandpencils.com slash AI Daily Brief. Robots and Pencils, Impact at Velocity. Welcome back to the AI Daily Brief. I noticed this really interesting story yesterday, where Intercom announced that their new dedicated customer service-focused model, Fin,

Starting point is 00:13:01 had achieved something very significant. CEO Eoghan McCabe called it objectively the highest-performing, fastest, and cheapest model for customer service, beating the very best models in the industry, including GPT 5.4 and Opus 4.5. Now, it has been a persistent question in AI how much custom models would matter. You might remember way back in the immediate post-ChatGPT fever, a number of companies figured, well, since we have such unique proprietary data, training our own model on that data surely will outperform. Maybe the best known of those efforts was BloombergGPT, which they called a 50 billion parameter

Starting point is 00:13:40 large language model purpose built from scratch for finance. Now, it turned out that in practice, that model got absolutely smoked by the general models, reminding everyone once again of the bitter lesson. The bitter lesson is a very famous essay from computer scientist Rich Sutton from back in 2019. He writes, The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective and by a large margin. He gave as one of his first examples, computer chess.

Starting point is 00:14:10 He says in computer chess, the methods that defeated the world chess champion Kasparov in 1997 were based on massive deep search. At the time, this was looked upon with dismay by the majority of computer chess researchers who had pursued methods that leveraged human understanding of the special structure of chess. When a simpler search-based approach, with special hardware and software, proved vastly more effective, these human knowledge-based chess researchers were not good losers. They said that brute force search may have won this time, but it was not a general strategy. And anyway, it was not how people played chess. These researchers wanted methods based on human input to win, and were disappointed when they did not.

Starting point is 00:14:48 So basically what this essay is arguing is that throughout AI history, and as a reminder, because this often surprises people, AI as a field, at least as a named field, is older than computer science. If you go back to the 50s and look at the laboratories at places like MIT, there were already back then artificial intelligence labs, but the idea of computer science as a field wouldn't come until a little bit later. In any case, what the bitter lesson is arguing is that throughout AI history, researchers have tried two basic approaches. The first is encoding human knowledge and clever tricks into systems, essentially trying to teach computers how humans think. The second is giving computers massive amounts of data and compute and letting them figure

Starting point is 00:15:27 things out on their own through search and learning. The bitter lesson is that the second approach wins every single time. It's bitter because it's a blow to human ego. Researchers spend years crafting elegant, domain-specific solutions, encoding chess strategy or linguistic rules or visual perception models. And then a brute force method powered by more compute just steamrolls all that careful work. Now we have that example from chess, but that example repeated across Go, speech recognition, computer vision, and now language. The systems that scale with more computation always eventually beat the systems built on human design shortcuts. And so taking the bitter lesson and applying it to LLMs kind of explains why Bloomberg's highly specialized model ultimately got beat by much bigger

Starting point is 00:16:07 and more computationally intensive models. And yet, coming into 2026, there was an interesting question, specifically of whether a specific type of data might actually change this equation. The data that people were interested in is last mile usage data, basically user interaction data at the very edge of the experience. And the specific place where many were watching this was around AI coding. The question was whether a company like Cursor could ultimately have some advantage in their own proprietary model because they had such a tremendous amount of experiential data around the

Starting point is 00:16:42 actual interaction point. Now, it wasn't really so much a question of whether that data is valuable. Obviously, it is. But there's a difference between it being valuable for product design and being valuable for model design. Inevitably, that data is extremely useful in figuring out the right products or the right harnesses for models. That was never in question. What was in question is whether all that information could actually change the destiny of customized vertical models. Latent Space wrote about this last year in November, in their piece titled The Agent Labs Thesis. The point that swyx and Latent Space were making was that if it is the case that we are close to hitting the limits of pre-training data, that perhaps shifts the future of model

Starting point is 00:17:22 performance to post-training. The Agent Labs thesis asked, can post-training make up the gap between the best open models and the best frontier models, and how long until they start exceeding? In other words, the tweak here is that a company like Cursor isn't training a model from scratch. They are taking the best available open weights models that are out there, which are admittedly a little bit behind the state-of-the-art, and adding in this post-training process with the idea of actually performing better in a specific domain than the general state-of-the-art model can.

Starting point is 00:17:53 Now, Cursor placed a pretty high importance on this. The company had said explicitly that they needed to train state-of-the-art coding models to keep up with competitors, which some reports suggested was a financial imperative, with Cursor burning too much money reselling API access to OpenAI and Anthropic. Now, earlier this month, we got the release of their Composer 2 model. The model was in the same ballpark as GPT 5.4 and actually beat Opus 4.6 on coding benchmarks while being much cheaper to run, meaning, of course, that it fit Cursor's needs extremely well. However, an X user called Flynn triggered a controversy, revealing that Composer 2 was just, and boy, is this just

Starting point is 00:18:29 doing a lot of heavy lifting, Kimi K2.5 with some extra reinforcement learning applied. Cursor themselves did not deny this. Dev relations rep Lee Robinson commented, yep, Composer 2 started from an open source base. We will do full pre-training in the future. Only a quarter of the compute spent on the final model came from the base. The rest is from our training. This is why evals are very different. Now, some amount of the controversy was about Cursor, in the eyes of some, failing to disclose their use of an open source base model, but others seemed genuinely dismissive of the practice. As Flynn had done, they wrote off the model as quote-unquote just Kimi K2.5 without a second thought. Others thought, though, that maybe something important was going on here. Leet-LLLLM

Starting point is 00:19:10 writes, as someone who basically lives in Opus 4.6, seeing an open-weight Kimi 2.5 fine-tune actually beat it on coding benchmarks is wild. If Composer 2 could really perform that well, Cursor seems to have demonstrated that reinforcement learning on a quality dataset can actually go quite a long way, vaulting an adequate base model into the top tier. This, of course, in some ways, seems to run counter to the bitter lesson. But if it's correct, it would suggest that there's a lot of fertile ground for training models around particular verticals, which gets us to the announcement yesterday from Intercom. Intercom's chief product officer Paul Adams tweets,

Starting point is 00:19:44 We have a very significant announcement here that will change how we think about the AI landscape. We have built a brand new model for Fin called Apex, which has a higher resolution rate, fewer hallucinations, and is far cheaper than any other model provided by any other company in the world, and it isn't close. This is an incredibly hard thing to achieve and is only possible with the domain-specific proprietary evals

Starting point is 00:20:07 from our billions of human and agent customer service interaction data points. We also have a flywheel here where we will continue to get better at the edges. This is, you might recognize, exactly what we were talking about in my 2026 predictions, when we talked about the lab loop and the importance of this last mile usage data. Paul continues, So what does this mean? It means that vertical models can and will outperform general models. It means that many successful companies in the future will need to be full stack,

Starting point is 00:20:35 app layer, AI layer, and model layer. And critically, as it becomes much easier to copy and clone at the app layer, durable differentiation will move down the stack and ultimately to the model layer. Now, this got a ton of chatter. BNAFOG writes, The story isn't that Apex beat frontier models. It's that domain-specific post-training closed the gap this fast. Any vertical SaaS with enough labeled interaction data is sitting on an untapped fine-tuning asset.

Starting point is 00:21:02 The infrastructure moat is eroding faster than most realized. Abhijid, who's on the board of Intercom but does new products at OpenAI, writes, model quality depends a lot on judgment, and that judgment lives in proprietary evals, real-world usage, and fast feedback loops, being close to the work. This creates all kinds of opportunities for companies that are willing to think big and bet on themselves. Now, while he doesn't seem worried for his main employer OpenAI, the implications for them are certainly where many people's heads went. Theo Bloshae writes,

Starting point is 00:21:30 Very cool feat from Intercom, though reading this makes me wonder what value the Frontier Lab companies actually deliver long term, if every industry, Cursor for coding, now Fin for CS, can build better and cheaper specialized models from open source bases. And interestingly, this wasn't the only story around these themes. Decagon co-founder Ashwin Shrinivas writes, over 80% of model traffic at Decagon now runs on models we've trained in-house, structured as a network of specialized models handling different parts of the interaction. Now, this is a little bit different because there is actually an architectural change here. In their announcement post, they write,

Starting point is 00:22:04 Instead of relying on a single model, we built a network of specialized models each responsible for a specific part of the interaction: detection, orchestration, response generation, and evaluation. That separation lets us optimize each layer independently and drive better speed and quality across the system. Regardless, though, the point is that here you have another company that is shifting off reliance on the major closed foundation models and towards models that they've trained, at least in part, themselves. Chakar says, I think this is a trend we'll see going forward. The reliance on general-purpose frontier models will hit a wall for domain-specific tasks. Custom post-training pipelines will be the way forward. Clem Delangue from Hugging Face agrees, writing,

Starting point is 00:22:42 after Pinterest, Airbnb, Notion, Cursor, today it's Eoghan and Intercom publicly sharing that they're finding it better, cheaper, faster, to use and train open models themselves rather than use APIs for many tasks, and hundreds of other companies are doing the same without sharing. Ultimately, I believe the majority of AI workflows will be in-house based on open source versus API. It took much more time than we anticipated, but it's happening now. Now, obviously, if this is the case, there are significant business model implications. Adriana Sabato writes,

Starting point is 00:23:10 The API tax is starting to look like the cloud markup of 10 years ago. Once teams realize they can run fine-tuned open models for a fraction of the cost, the switch becomes obvious. Eoghan from Intercom agrees that this is the beginning of something bigger, writing a companion post called The Age of Vertical Models Is Here. He reinforces that the model just is better across numerous dimensions. It has a 2.8% higher resolution rate, but he writes, importantly, it's also dramatically faster, has fewer hallucinations, in fact a 65% reduction in hallucinations, and is far cheaper

Starting point is 00:23:41 than all other available models. In his post, Eoghan referenced the recent interview with Andrej Karpathy, where Karpathy said, I do think we should expect more speciation in the intelligences. The animal kingdom is extremely diverse in the brains that exist, and there's lots of different niches of nature, and I think we should be able to see more speciation. And you don't need this oracle that knows everything, you kind of speciate it, and then you put it on a specific task. And we should be seeing some of that because you should be able to have much smaller models that still have the cognitive core. From there, Eoghan picks up, the frontier labs still have the very best models, but the open weight models are not that far behind.

Starting point is 00:24:17 So it's not hard to see pre-training as a commodity of sorts. Where we think the frontier will move next is to post-training. Karpathy's prediction is exactly what we're seeing with Apex and Cursor's Composer 2, and what we're going to see significantly going forward. As such, the labs are in an interesting position where on one hand, the horizontal general-purpose models are actually over-serving the market for specific use cases, e.g., their models are more generally intelligent than is needed for customer service,

Starting point is 00:24:41 and on the other hand, the open-weight models are more than good enough where high-quality domain-specific post-training can make the resulting model superior at the special-purpose jobs and in the way that matters to that particular job. Personally, I'm still very bullish on the labs. We remain very heavy customers of Anthropic. Yet classic disruption is now at their door. The only way out is to disrupt themselves by building cheaper specialized models too.

Starting point is 00:25:03 And the only way to do that is to acquire the evals, or the companies with the evals, needed for that specific task. Which means there will be some interesting data partnerships or M&A consolidation, and you're going to see some hyper-specific model providers who go it alone and compete with the labs head-to-head. Likely, all of the above. Now, going back to the bitter lesson, it kind of feels at first glance, like this would run counter to that, right?

Starting point is 00:25:27 That in the long run, the sheer additional volume of computational data should beat out the specialized knowledge and data of the edge providers. Except the bitter lesson isn't just about the amount of data. It's about brute force data and compute as opposed to human knowledge. But we're not exactly talking about human knowledge here. Instead, we're talking about experience. The data that a Cursor has, or an Intercom has, is not the data of some human expert.

Starting point is 00:25:52 Instead, it's millions of interactions which show how things actually happen in the real world. It turns out that Richard Sutton himself actually discussed this very thing as an example of the next phase of the bitter lesson on the Dwarkesh podcast last year. Will they reach the limits of the data and be superseded by things that can get more data just from experience rather than from people? In some ways, it's a classic case of the bitter lesson. With the more human knowledge we put into the large language models, the better they can do. and so it feels good. And yet, one, well, I in particular expect there to be systems that can learn from experience.

Starting point is 00:26:33 And those could well perform much, much better and be much more scalable, in which case it will be another instance of the bitter lesson, that the things that used human knowledge were eventually superseded by things that just trained from experience and computation. Putting it simply, these new models, Apex and Composer 2, are post-trained from experience, exactly as Sutton said. Now, this might feel like an inside baseball kind of story, but I think that the implications could be massive in terms of how the whole industry evolves.

Starting point is 00:27:07 One thing I don't think that this means is that every company that has any sort of customer data is all of a sudden going to be successfully able to spin their own model. There are ultimately not that many people who are good at doing post-training, and so I don't think that we're going to see this massive fragmentation of vertical models, but you better believe that these results are encouraging enough that many, many more companies who do have this type of data, and the post-training talent or the ability to get it are going to be doing some experimenting in this area. It's something we will continue to watch and explore, but for now, that's going to do it for today's AI Daily Brief. Appreciate you listening or watching,

Starting point is 00:27:38 as always, and until next time, peace.
