The AI Daily Brief: Artificial Intelligence News and Analysis - Apps vs Models: Who Wins AI?

Episode Date: November 14, 2025

Today’s episode examines the core debate shaping the AI industry: whether application-layer companies can survive the pace and instability of the model layer. The discussion covers the arguments that apps can’t outrun rapid model shifts, the counter-case for deep vertical products, and what Cursor’s momentum reveals about where durable value might emerge. The episode also includes a fast headlines sweep on agentic cyber-espionage, major infrastructure investments, breakthrough agents, and the latest updates to GPT-5.1.

Brought to you by:
KPMG – Discover how AI is transforming possibility into reality. Tune into the new KPMG 'You Can with AI' podcast and unlock insights that will inform smarter decisions inside your enterprise. Listen now and start shaping your future with every episode. https://www.kpmg.us/AIpodcasts
Rovo - Unleash the potential of your team with AI-powered Search, Chat and Agents - https://rovo.com/
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
Blitzy.com - Go to https://blitzy.com/ to build enterprise software in days, not months
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.

The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? sponsors@aidailybrief.ai

Transcript
Starting point is 00:00:00 Today on the AI Daily Brief, a massive fundraising round for Cursor and what it says about app layer companies versus the model layer. Before that in the headlines, welcome to the agentic hacker age. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick note before we dive in. First of all, thank you to today's sponsors, KPMG, Blitzy, Robots and Pencils, and Rovo. To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can sign up
Starting point is 00:00:35 at Apple Podcasts. To learn about sponsoring the show or anything else, including job opportunities, speaking gigs, etc., visit us at aidailybrief.ai. And of course, while you're there, check out the AI ROI benchmarking study.
Starting point is 00:00:48 At this rate, we are going to put together one of the biggest collections of information about actual ROI for actual use cases. If you want to get the full version of the study, come and share which use cases are driving the most value for you. You can get all of that at roisurvey.ai.
Starting point is 00:01:03 Now, with that, let's get into some very interesting conversations to close out our week. Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. We kick off today with a story that could very easily be a main episode. Anthropic says they've thwarted the first reported use case of AI-enabled, or really agentic, cyber espionage.
Starting point is 00:01:22 In mid-September, Anthropic detected suspicious activity that was later determined to be a, quote, highly sophisticated espionage campaign. The company said that they have high confidence that the threat actor was a Chinese state-sponsored hacking group. The unprecedented part was that the group didn't just use AI for planning; Claude's agentic capabilities were used to carry out the attack. The hackers reportedly used Claude Code to automate an infiltration of 30 global targets with a small number of successes. The targets were organizations like large tech companies,
Starting point is 00:01:52 financial institutions, chemical manufacturers, and government agencies. Anthropic monitored this activity across 10 days, banned accounts as they were identified, and coordinated with authorities as appropriate. They said that Claude Code was able to perform 80 to 90 percent of the attack, with human intervention only required during a handful of key decision points. This allowed the attack to be carried out at a speed that would have been impossible for human hackers. Claude's guardrails were circumvented by breaking the attack into smaller tasks, which each seemed innocent but added up to a massive system breach. In their post-mortem, Anthropic wrote,
Starting point is 00:02:22 This campaign has substantial implications for cybersecurity in the age of AI agents, systems that can be run autonomously for long periods of time, and that complete complex tasks largely independent of human intervention. Agents are valuable for everyday work and productivity, but in the wrong hands, they can substantially increase the viability of large-scale cyber attacks. Anthropic believes this issue will grow as AI models become more capable, so they're expanding their detection capabilities.
Starting point is 00:02:47 They wrote, With the correct setup, threat actors can now use agentic AI systems for extended periods to do the work of entire teams of experienced hackers. Less experienced and resourced groups can now potentially perform large-scale attacks of this nature. They further noted that this is an escalation of the vibe hacking findings they reported over the summer, as those incidents still had, quote, humans very much still in the loop directing the operations. I'm sure this is a topic that we will be hearing a lot more about
Starting point is 00:03:11 in the months to come. But one other story from Anthropic, in a very different dimension of their work: they are joining the infrastructure buildout, announcing a $50 billion commitment for U.S. data centers. Up until now, Anthropic has been a renter of compute, getting most of their access through partnerships with Google and Amazon. On the financial side, this hasn't been a big problem, allowing Anthropic to functionally spend equity instead of cash on their largest expense during their early growth phase. But it has come with trade-offs. At certain points, Anthropic has been required to use in-house chips from Amazon and Google, when they might have preferred to be using Nvidia's GPUs. They've also been repeatedly bottlenecked by compute, leading to severe rate limits that hampered customer retention at times. With this year's rapid growth,
Starting point is 00:03:49 Anthropic has stepped up to another echelon, and consequently they're looking to own some of their own infrastructure. The announcement discussed several sites to be built across the U.S., including in Texas and New York. UK-based data center developer Fluidstack will partner on the project, with the expectation that the data centers will start coming online next year. Anthropic spoke about the project in terms of the administration's AI goals, saying it was about, quote, maintaining American AI leadership by strengthening domestic technology infrastructure. CEO Dario Amodei said in a statement,
Starting point is 00:04:16 We're getting closer to AI that can accelerate scientific discovery and help solve complex problems in ways that weren't possible before. These sites will help us build more capable AI systems that can drive those breakthroughs while creating American jobs. Now, speaking of $50 billion, that is also the reported valuation from an upcoming fundraising round for Mira Murati's Thinking Machines Lab. According to Bloomberg's sources, the deal terms haven't been finalized, and some sources said the round could close at 55 or even 60 billion. For those keeping track at home, that would be a very quick 4x from TML's $12 billion valuation from their fundraising round in July. The new valuation would catapult TML to become one of the most valuable private companies ever less than a year from launch.
Starting point is 00:04:56 For some quick comparisons, Stripe's most recent mark in secondary markets is around $106 billion, Databricks recently raised at $100 billion, and Canva reportedly marked up to $42 billion during a tender offer to employees in August. Now, it is true that TML is no longer a pre-product company with the release of their reinforcement learning platform Tinker last month, but they are still pre-revenue and haven't really established a clear business model or even a firm product niche. Sources said that Tinker is being used by several university research groups as well as some paying enterprise customers, but this valuation certainly isn't going to be based on revenue forecasts or anything like that. As with earlier rounds, it's a bet on talent, with TML boasting a stacked roster of some of the best AI researchers drawn from OpenAI, DeepMind, and other labs. Really the only comp that truly makes sense is Ilya Sutskever's Safe Superintelligence, which is also a pre-product bet on talent.
Starting point is 00:05:42 SSI established a $32 billion valuation in April. Moving over into product land, Google has added Deep Research to NotebookLM. Now, NotebookLM has already proven to be one of the most interesting and popular tools in AI, but until now, the way to get the best results was pretty manual. Google says the addition of Deep Research will allow users to automate the process of putting together source documents, allowing NotebookLM to function more like an AI research assistant. Their example videos showed a user simply typing in latest breakthroughs in quantum physics and setting the agent to work. Come back a few minutes later, and NotebookLM has an entire dossier
Starting point is 00:06:14 ready to read or transform into a podcast or video slide deck. Speaking of video slide decks, NotebookLM has also introduced the ability to prompt custom styles for video overviews. They showed a variety of different styles like 8-bit pixelated art, pop art, and turn-of-the-century art nouveau, and these are firmly in that category of app updates which aren't about some underlying model improvement, but about making a product simply more aligned with what its users need from it. Still, that wasn't Google's biggest launch of the day. DeepMind has released an agent called SIMA 2 as a research preview. SIMA, which stands for Scalable Instructable Multiworld Agent, was described by DeepMind's CEO
Starting point is 00:06:51 Demis Hassabis as a general agent that can understand and reason about complex instructions and complete tasks in simulated game worlds, even ones it has never seen before. He continued, incredible to see how it can just learn from self-play, a crucial step towards AGI. Now, the first version of SIMA was released in March of 2024 and was fairly primitive. It learned to complete some simple tasks, following instructions like turn left, climb the ladder, or open the map across a wide range of video games. It had a total of 600 different instructions it knew how to follow. The most interesting part about that result was that the agent could take what it learned from training conducted in one game and apply it to a game it had never seen before. Over DeepMind's total eval set, SIMA 1 had just a 31% success rate, and the rate plummeted to just a couple of percentage points on games it hadn't seen before.
Starting point is 00:07:36 SIMA 2 has demonstrated a dramatic improvement in task completion. It has a 65% success rate across the eval set, which is starting to get pretty close to the human level of 76%. On games the agent hadn't seen before, it achieved around a 13% success rate. The ability to generalize across different environments is one of the reasons many researchers are looking to world models as one of the keys to AGI. DeepMind even tested how SIMA 2 would perform in entirely novel games that were generated on the fly by their Genie 3 world simulation model. SIMA 2 is able to orient itself, understand instructions, and take meaningful actions
Starting point is 00:08:08 towards a goal, despite never having seen the environment before. Super interesting and firmly in this theme of alternative paths to AGI that we'll be increasingly spending time on. Lastly, a couple quick follow-up notes on GPT-5.1. It is now available via the API, and OpenAI has also published a prompting guide to help developers migrate their use cases. The guidance actually reveals a lot about the design decisions made for this model update. For example, OpenAI suggested 5.1 has a tendency to be too verbose in providing an answer. They suggested it's worthwhile giving specific instructions about how
Starting point is 00:08:38 much detail you want to be contained in the outputs. The guide also noted that the model is much more steerable than previous iterations, so developers can dial in very specific behaviors when it comes to agents. I'm continuing to have great early experiences with GPT-5.1, and I'm excited to see what you guys think of it. For now, that is going to do it for the headlines. Next up, the main episode. What if AI wasn't just a buzzword, but a business imperative?
Starting point is 00:09:07 On You Can with AI, we take you inside the boardrooms and strategy sessions of the world's most forward-thinking enterprises. Hosted by me, Nathaniel Whittemore, and powered by KPMG, this seven-part series delivers real-world insights from leaders who are scaling AI with purpose, from aligning culture and leadership to building trust, data readiness, and deploying AI agents. Whether you're a C-suite executive, strategist, or innovator, this podcast is your front row seat to the future of enterprise AI. So go check it out at www.kpmg.us slash AI podcasts or search You Can with AI on Spotify, Apple Podcasts, or wherever you get your podcasts. This episode is brought to you by Blitzy, the enterprise autonomous software development platform with infinite code context.
Starting point is 00:09:50 Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale codebases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously, while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding copilot of choice to bring an AI-native SDLC into their org.
Starting point is 00:10:27 Visit blitzy.com and press get a demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. AI isn't a one-off project. It's a partnership that has to evolve as the technology does. Robots and Pencils works side by side with clients to bring practical AI into every phase: automation, personalization, decision support, and optimization. They prove what works through applied experimentation and build systems that amplify human potential. As an AWS-certified partner with global delivery centers, Robots and Pencils combines reach with high-touch service.
Starting point is 00:10:59 Where others hand off, they stay engaged, because partnership isn't a project plan. It's a commitment. As AI advances, so will their solutions. That's long-term value. Progress starts with the right partner. Start with Robots and Pencils at robotsandpencils.com slash AI Daily Brief. Meet Rovo, your AI-powered teammate. Rovo unleashes the potential of your team with AI-powered search, chat, and agents, or build your own agent with Studio. Rovo is powered by your organization's knowledge and lives on Atlassian's trusted and secure
Starting point is 00:11:31 platform, so it's always working in the context of your work. Connect Rovo to your favorite SaaS app so no knowledge gets left behind. Rovo runs on the teamwork graph, Atlassian's intelligence layer that unifies data across all your apps and delivers personalized AI insights from day one. Rovo is already built into Jira, Confluence, and Jira Service Management Standard, Premium, and Enterprise subscriptions. Know the feeling when AI turns from tool to teammate. If you Rovo, you know. Discover Rovo, your new AI teammate powered by Atlassian. Get started at R-O-V, as in victory, O dot com. Welcome back to the AI Daily Brief. One of the big news items to end out this week
Starting point is 00:12:16 was that AI coding startup Cursor just raised a fresh $2.3 billion at a $29.3 billion valuation. Now, that sort of rarefied air is a valuation that so far has been exclusively for the model companies. And so what's interesting to me about it is not just to explore the fundraising in isolation,
Starting point is 00:12:35 but as a representative example of how people are thinking about the battle between the application layer and the model layer. You might have seen this tweet floating around this week. It comes from investor and entrepreneur Yishan Wong and got 20 million views this week for what is ultimately sort of an insider
Starting point is 00:12:51 baseball type of conversation. This is the foundation of our entire conversation, so let's read what he has to say and then break it down a little bit. Yishan writes, My AI investment thesis is that every AI application startup is likely to be crushed by rapid expansion of the foundational model providers. App functionality will be added to the foundational model's offerings because the big players aren't slow incumbents. It is wrong to apply the analogy of fast startup, slow incumbent here. They're just big. Far more so than with any other prior new technology, there is a massive and fast-moving wave
Starting point is 00:13:22 that obsoletes every new app almost as fast as it can be invented. There is almost no time to build a company and scale it. Wong continues, there are two ways AI application startup founders can make money. One, make a flash-in-the-pan app that generates a ton of cash and bank the cash. My estimate is that you have about 12 to 18 months of cash flow generation. Or two, make a good enough app that you get acquired by one of the big players for sufficient equity. The situation is highly unstable.
Starting point is 00:13:48 We don't know if it's going to crash or go to the moon, but both scenarios make it very unlikely that any AI application startup will independently become a generational super company. The best odds are finding an application niche in a highly specialized field with extremely unique and specific data barriers, ideally ones related to real atoms,
Starting point is 00:14:05 hardware or world-related data and not software and finance. So the key elements of the argument here are one, that foundation model providers will eat the app layer, basically that we have to throw out our old heuristics around slow incumbents versus fast startups, because the incumbents here are driving disruption at extreme speed. The second point, however, which he gets into in a follow-up post on his own thread, is that the foundation is too unstable to build lasting app businesses.
Starting point is 00:14:30 So, Yishan continues in a subsequent post, the entire novelty of this thesis is that unlike in the past, specific elements of the AI industry are likely to make it so that application companies cannot outrun the wave of obsolescence, which will rush along far, far more quickly than prior technology waves. The foundational technology has not stabilized in any way whatsoever, and applications require a sufficiently stable foundation for some extended period of time in order to create value and then a system for monetizing that value. The wholesale rate of change in the nature of the foundation is the reason why I think almost all application startups will not survive to achieve any significant scale, not because the current large players
Starting point is 00:15:05 are special. So this is the nuance that it would be easy to lose in this conversation. What he's really talking about is a speed of change argument, and he's effectively arguing that app startups will get overtaken by sea changes before they can become real businesses, and that it's not that the big labs are quote-unquote better in any specific way, but that only they have enough internal stability and resources to survive the chaos that they themselves are creating. He concludes in his second post, sea changes are now happening on a nine to 12-month cycle. Very few startups can turn into a mature business in that time frame, and by mature, I mean having all the boring stuff like sales relationships and brand recognition. Yes, your engineers can make
Starting point is 00:15:42 the change, but human hiring cycles and team solidification and market relations are incompressible, e.g., if you hire 100 people a month, your organization will implode. Thus, application companies never quite make it to a full business threshold before the sea change happens out from under them. When I say the incumbents will take the application space, I mean that they're the only ones who can provide enough internal stability and resources to survive the sea changes they themselves will be driving, not that they're going to provide a superior product. They're just the ones who won't starve. So like I said, this had 20 million views and generated a huge amount of conversation,
Starting point is 00:16:15 both on the post and even in other channels like LinkedIn. So let's talk first about the people who thought that Yishan was wrong in some fundamental way. Many of these themes can be sort of bundled into the idea that vertical apps, workflows, or UX still matter hugely. David Roberts writes, I think you're underestimating how much unique UX, context engineering, integrations, human in the loop, and embedded workflows need to exist for any vertical business application to actually get from 70% decent to 100% outcomes with AI. Vertical applications are
Starting point is 00:16:44 going to be enormous and they will not be eaten by the foundational model providers. Now, implicit in David's argument is that the stuff that it takes to make a vertical application, specifically a B2B application for business, work is so immense and complex that it's just not in the incentive of the foundation model companies to do that. And certainly this is a point that I resonate with, seeing how much last mile integration work it takes for a very powerful AI tool to be actually useful inside the context of a business. Now, Yishan actually responded to this one saying, your reasoning here supports my thesis rather than undermining it.
Starting point is 00:17:17 What I think he means is that there's going to be so much change so fast that the app layer companies aren't going to be able to survive long enough to do that sort of complex last mile work that David is talking about, ultimately leaving it only to the foundation model companies, even if they don't prioritize it in the short term. Aaron Levie from Box, who's one of the most thoughtful thinkers when it comes to enterprise AI, says the counter dynamic to the AI model doing everything is that, at least in the enterprise, bridging the AI model's capabilities to the
Starting point is 00:17:42 customer's environment still requires a tremendous amount of long-tail work. The gap between an AI agent working for 90 or 95% of the solution and 100% is usually about 10x more work than most realize. So here you see Aaron reinforcing many of the themes from David's post. He continues, getting access to the enterprise data, connecting to the enterprise workflows, delivering the change management that employees need to adopt the technology, handling the regulatory and compliance requirements of that industry and so on, all require some degree of highly dedicated focus in a domain. Others argue that Yishan might be underestimating the new types of moats that could be formed. Investor Natasha Malpani writes, I'd say the opposite. The real white space is at the application
Starting point is 00:18:20 layer. Everyone wants to sell shovels, but the gold is in how people actually use them. The infra race is a knife fight between hyperscalers, OpenAI, Google, Anthropic, Meta, Amazon. They'll undercut each other on price, latency, context window, and token cost until margins collapse. Developer tooling looks safer, but it's crowding fast, and every improvement gets absorbed upstream by the foundation models or downstream by open source forks. Meanwhile, applications are where behavioral moats form. Data isn't the only barrier. Habits are.
Starting point is 00:18:48 Users don't live in APIs or eval dashboards. They live in experiences. Context, workflow, brand, and trust compound fast. Distribution and feedback loops create data advantages that scale locally even when models converge globally. You win if you own feedback surface to capture every edit, action, and intent, build domain depth, embed in daily workflows, collect proprietary exhaust, behavior and telemetry that the model providers will never see. Some infra will break through, security, evals, low latency, edge, compliance,
Starting point is 00:19:13 but the broader white space is still at the application layer, where people, agents, and systems actually interact. Go deep enough that a foundation model can't care and sticky enough that users won't leave even when it can. Now again, I really want to double click on this foundation model can't care piece. A huge amount of the work that is required right now for AI applications to work inside enterprises is work that foundation models do not have the luxury of caring about. It is simply too much complex, boring, repetitive, but still customized-to-the-customer work, which is why, outside of the foundation model companies, the firms that have done the best from the AI boom are the big systems integrators and consulting firms.
Starting point is 00:19:53 The fact that the foundation model companies have to compete on other vectors creates a window of opportunity for a different category of company to swoop in and do the work that it takes to actually bring these solutions to market in practice. Now, the other point from Natasha that I want to really double-click on is this idea of proprietary exhaust. For those of you who don't live in Silicon Valley jargon, that paragraph might have seemed really dense. Let's read it again.
Starting point is 00:20:17 You win if you own feedback surface to capture every edit, action, and intent, build domain depth, embed in daily workflows, collect proprietary exhaust, i.e. behavior and telemetry that the model providers will never see. Exhaust is the data that comes out of the usage of a product. And many of the folks that are most excited about the application layer when it comes to AI have a thesis that when it comes to improving model performance, this type of behavioral exhaust is the real gold
Starting point is 00:20:47 because it's the only thing that's not commoditized to everyone else. In other words, the foundation model companies all have access to the exact same training data more or less, or some version of the same training data, but a company that gets enough usage can create a feedback loop where they actually see how people are interacting with the models, and that data stream can be used to refine how the model, and also the experience that the model lives in, works. This is going to be particularly relevant to our example of Cursor, which we'll come to in a moment. Still, even with all of these arguments for why Yishan's thesis might be wrong or at least limited,
Starting point is 00:21:20 there's a big overlap in the Venn diagrams between these two camps that I think would acknowledge that many AI apps are just flimsy wrappers and that the real winners are likely to be the deep autonomous systems. Jacques Reynolds writes, most new AI apps aren't defensible. They're just UI wrappers on top of someone else's model. The moat disappears the moment OpenAI or Anthropic ships the same feature natively. The real upside isn't in building another AI app, in my opinion. I think it's in implementing AI inside existing business workflows where data, context, and customer relationships create real barriers. Chong call builds this thesis out even further. He writes, the issue isn't that foundational models will kill application startups.
Starting point is 00:21:58 It's that most AI applications today aren't really applications. They're shallow automations built to impress investors on a six-month time frame. He basically makes a comparison to early SaaS and says today the same story is repeating with AI agents, duct tape workflows, zero defensibility, no reliability at scale. But the core question hasn't changed. Who's building a system that delivers real value repeatedly, reliably, and autonomously? So the implication of this is that if you are building an application, You have to build it deep, you have to be hands-on, you have to be in a position to actually
Starting point is 00:22:29 capture that behavioral exhaust data. Now Fall writes, I think even if a new application starts on this constantly evolving base, it can endure if it embeds itself in existing workflows, writes to proprietary systems of record, builds proprietary data, and learns from usage and or captures distribution before incumbents bundle the feature. More importantly, AI wrappers that continue to swiftly ship features that solve users' needs, even as competition arrives, are difficult to compete with even for the foundation models. And so, again, I think that you're starting to see the through line here that acknowledges
Starting point is 00:23:01 the incredible speed at which things are changing and the new challenges that creates for the app layer, as well as the innovation capability of the big foundation model companies, but still sees this core path for some number of extremely high-performance application layer companies. And indeed, a lot of the responses were about what it takes to be one of these actually successful application layer companies. Sarah Catanzaro writes, My AI investment thesis is that AI application startups will need to solve research and engineering problems that the labs are not currently focused on, thereby accumulating more technical defensibility. At times, their objectives may even diverge. We already see this in creative
Starting point is 00:23:38 industries, where post-training alignment impedes the ability of models to produce diverse outputs. It will be hard to survive since the app companies will also need to define compelling workflows and user experiences, but with the right team and support, some, but not all, will make it. a16z's Anish Acharya writes about a few approaches that he thinks advantage app layer startups. The first is categories that benefit from being multi-model, basically where the experience for the end customer is better if they can access models from different providers. Then there are cornered resources, those locked proprietary datasets, and ecosystems that, quote, imply a ton of feature surface area.
Starting point is 00:24:13 He gives the example of Granola. Sure, you can replicate Granola's recorder, but is OpenAI really going to build the entire ecosystem of productivity apps implied by it? Now, regardless of what we all think about this, the reality is that money is still pouring in. The Information, for example, recently published a piece called Investors Chase Neolabs to Outflank OpenAI and Anthropic. They point out that over the last month, those investors have made or discussed $2.5 billion of investments into just five startups. The Information writes, the neolab startups' founders say they hope to exploit new approaches to developing AI models and research that they say major developers like OpenAI and Anthropic may have overlooked.
Starting point is 00:24:49 And that brings us to the Cursor part of the story. Now, Cursor is, of course, one of the big breakout leaders of the last year. When the story of 2025 is written, AI coding will be at the very top of the narratives, and one need look no further than the valuation jumps of Cursor to see just how big a deal investors, at least, are treating that whole theme as. The company has raised $2.3 billion in a new round that values them at $29.3 billion. That is close to triple their $9.9 billion valuation from their Series C in June, and a 12x compared to their valuation from the beginning of the year.
Starting point is 00:25:24 In addition to the funding, Cursor also announced that they've reached a billion dollars in ARR and that they now produce more code than any other coding agent. Yuchen Jin did the research and commented, Cursor is almost certainly the fastest company in history to reach a billion dollars in ARR, achieving this milestone in a little over two years. He added, and let's see if you can spot the connection to our broader theme today, people said Cursor would go to zero because it's just a wrapper. AI products won't be monopolized by model labs in my opinion.
Starting point is 00:25:51 One, products win by delivering real user value. Model capability alone isn't enough. Two, once they hit product market fit, companies can train their own models, often based on open source models combined with their own unique data and RL environments. Cursor's Composer 1 is an example. Now, Composer, which is Cursor's proprietary model, seems central to their business strategy moving forward. They said that they intend to use this fresh capital to invest further in developing Composer.
Starting point is 00:26:17 The Wall Street Journal framed this raise, in fact, as being a test case to see if app layer startups can transition away from relying on the foundation model companies. They noted that both OpenAI and Anthropic are now directly competing with Cursor. When asked about this, Cursor CEO and co-founder Michael Truell gave a diplomatic response, stating, we're excited to be one of the first examples of a large company built on their platforms. All of the AI labs are important partners to us. But clearly, Composer, their unique model, is top of mind.
Starting point is 00:26:43 Truell said, it does take significant resources, both specialized talent and also GPUs, to do something at Composer's scale. This funding lets us do it in a big way. Cursor also showed just how much the model environment is changing. Back in April, the most popular models on Cursor were Claude 3.7 Sonnet, Gemini 2.5 Pro, Claude 3.5 Sonnet, and then in fourth and fifth place, GPT-4.1 and GPT-4o. The fastest growing in April were o3, o4-mini, and DeepSeek V3.1. Today the most popular models are, in first place, Sonnet 4.5, in second place, Composer 1, and then after that, GPT-5, GPT-5 Codex, and Sonnet 4.
Starting point is 00:27:23 The fastest growing, however, is Composer 1. All of which brings us to an interesting point about where this Venn diagram between the app layer and the model layer overlaps, which is, at some point, do the handful of app layer companies that can break through and reach the scale to survive just become model companies themselves? That certainly seems to be part of the direction here with Cursor, and I think it will be an interesting thing to watch. Anyways, it's a fascinating discussion, and I think if you take away anything, it just shows that right now things are changing so fast that even the people whose entire job it is to watch and understand and allocate against these movements don't really
Starting point is 00:27:58 have any idea what's happening. We are all just students, with the very fast-spinning world our teacher. For now, that's going to do it for today's AI Daily Brief. Appreciate you listening or watching as always. Until next time, peace.
