Tech Brew Ride Home - Is OpenAI In Trouble?

Episode Date: November 25, 2025

Now Anthropic says it’s leapfrogged OpenAI with its new model, and is the AI horse race in play? OpenAI is still focusing on things like shopping. Nvidia answers a question people weren’t asking. And is Google soaring because they also might be able to go after Nvidia’s chip throne?

Anthropic introduces cheaper, more powerful, more efficient Opus 4.5 model (Ars Technica)
ChatGPT's new shopping research tool is fast, fun, and free - but can it out-shop me? (ZDNet)
Nvidia’s ‘I’m Not Enron’ memo has people asking a lot of questions already answered by that memo (The Verge)
Google Further Encroaches on Nvidia’s Turf With New AI Chip Push (The Information)

Transcript
Starting point is 00:00:00 It's peak pollination season, and my business is scaling fast. To keep the nectar flowing, I need a phone plan with top priority data speed. That's why I chose Google Fi Wireless. My connections stay strong even when the hive is buzzing. Plus, unlimited plans start at $35 a month. Now, that's a deal that doesn't sting. Explore Google Fi Wireless plans today. Plus taxes and government fees.
Starting point is 00:00:24 Google Fi Wireless is not subject to data traffic deprioritization during times of high network usage. Welcome to the Tech Brew Ride Home for Tuesday, November 25th, 2025. I'm Brian McCullough. Today: Anthropic now says it's leapfrogged OpenAI with its new model, and is the AI horse race in play? OpenAI is still focusing on things like shopping. Nvidia answers a question people weren't asking. And is Google soaring because they also might be able to go after Nvidia's chip throne? Here's what you missed today in the world of tech. I know it's getting a bit repetitive announcing a new model every other week, but given the recent heavy discourse over whether OpenAI may be losing the lead in terms of the AI cutting edge, we've got to make note of this. Anthropic has launched Claude Opus 4.5, saying it is the best model in the world for coding, agents, and computer use, and meaningfully better at everyday tasks. Quoting Ars Technica: perhaps the most prominent change for most users is that in the consumer app experiences, web, mobile, and desktop, Claude will be less prone to abruptly hard-stopping conversations because they have run too long.
Starting point is 00:01:39 The improvement to memory within a single conversation applies not just to Opus 4.5, but to any current Claude models in the apps. Users who experienced abrupt endings, despite having room left in their session and weekly usage budgets, were hitting a hard context window, 200,000 tokens. Whereas some large language model implementations simply start trimming earlier messages from the context when a conversation runs past the maximum in the window, Claude simply ended the conversation rather than allow the user to experience an increasingly incoherent conversation where the model would start forgetting things based on how old they are.
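The two behaviors contrasted in the passage above (ending the conversation at a hard limit versus trimming the oldest messages), along with the summarize-and-replace approach the quoted article goes on to describe, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the Message type, the word-count stand-in for a tokenizer, and the naive_summary placeholder are hypothetical and are not Anthropic's implementation or API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    text: str


def count_tokens(msg: Message) -> int:
    # Crude stand-in for a real tokenizer (assumption): one token per word.
    return len(msg.text.split())


def total_tokens(history: List[Message]) -> int:
    return sum(count_tokens(m) for m in history)


class ContextFull(Exception):
    """Raised when the conversation no longer fits in the window."""


def hard_stop(history: List[Message], limit: int) -> List[Message]:
    # Strategy 1: refuse to continue once the window is exceeded.
    if total_tokens(history) > limit:
        raise ContextFull("conversation exceeds the context window")
    return list(history)


def trim_oldest(history: List[Message], limit: int) -> List[Message]:
    # Strategy 2: drop the oldest messages until the rest fits.
    kept = list(history)
    while len(kept) > 1 and total_tokens(kept) > limit:
        kept.pop(0)
    return kept


def compact(history: List[Message], limit: int,
            summarize: Callable[[List[Message]], str],
            keep_recent: int = 2) -> List[Message]:
    # Strategy 3: replace older messages with a single summary message.
    # This sketch does not re-check the limit after summarizing.
    if total_tokens(history) <= limit or len(history) <= keep_recent:
        return list(history)
    older = history[:len(history) - keep_recent]
    recent = history[len(history) - keep_recent:]
    return [Message("system", summarize(older))] + list(recent)


def naive_summary(messages: List[Message]) -> str:
    # Hypothetical placeholder: a real system would ask a model to summarize.
    return " | ".join(" ".join(m.text.split()[:3]) for m in messages)
```

In this sketch trim_oldest silently loses early context, hard_stop loses nothing but ends the session, and compact trades some detail for the ability to keep going, which is the trade-off the article describes.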
Starting point is 00:02:14 Now, Claude will instead go through a behind-the-scenes process of summarizing the key points from the earlier parts of the conversation, attempting to discard what it deems extraneous while keeping what's important. Developers who call Anthropic's API can leverage the same principles through context management and context compaction. Opus 4.5 is the first model to surpass an accuracy score of 80%, specifically 80.9%, in the SWE-Bench Verified benchmark, narrowly beating OpenAI's recently released GPT-5.1 Codex at 77.9% and Google's Gemini 3 Pro at 76.2%. The model performs particularly well in agentic coding and agentic tool-use benchmarks, but still lags behind GPT-5.1 in visual reasoning. Anthropic also claims that Opus
Starting point is 00:03:01 4.5 is far less susceptible to prompt injection attacks than prior Claude models or than competing models like GPT-5.1 and Gemini 3 Pro. Still, none of these models has perfect performance on that front. While the improvements to performance in benchmarks are worth noting, the most meaningful improvement in Opus 4.5 is arguably that it is significantly more efficient with tokens. Anthropic's blog post offers examples. Set to a medium effort level, Opus 4.5 matches Sonnet 4.5's best score on SWE-Bench Verified, but uses 76% fewer output tokens.
Starting point is 00:03:34 At its highest effort level, Opus 4.5 exceeds Sonnet 4.5 performance by 4.3 percentage points while using 48% fewer tokens. The Opus 4.5 launch is accompanied by other new features for developers and users. For example, the developer platform now includes a new effort parameter, allowing developers to more precisely tune the balance they want between efficacy and token usage. Also, Claude Code is now available in the desktop Claude apps. Previously, it was available via command line, IDE extensions, and the web. A few places, just not in the native desktop apps.
Starting point is 00:04:09 The Claude desktop interface is now tabbed between the traditional chat experience and the Claude Code experience. And lastly, and for some, most importantly, there's a big pricing change for the API for Opus 4.5. The cost is now $5 input and $25 output per million tokens, down from $15 and $75, respectively, end quote. And quoting VentureBeat: Anthropic's internal testing revealed what the company describes as a qualitative leap in Claude Opus 4.5's reasoning capabilities. The model achieved 80.9% accuracy on SWE-Bench Verified, a benchmark measuring real-world software engineering tasks, outperforming GPT-5.1,
Starting point is 00:04:48 Anthropic's own Sonnet, and Google's Gemini 3 Pro, according to the company's data. The result marks a notable advance over OpenAI's current state-of-the-art model, which was released just five days earlier. But the technical benchmark tells only part of the story. Albert says employee testers consistently reported that the model demonstrates improved judgment and intuition across diverse tasks, a shift he described as the model developing a sense of what matters in real-world contexts. The model just kind of gets it, Albert said. It just has developed this sort of intuition and judgment on a lot of real-world things that feels qualitatively like a big jump up from past models. He pointed to his own workflow as an example.
Starting point is 00:05:26 Previously, Albert said he would ask AI models to gather information but hesitated to trust their synthesis or prioritization. With Opus 4.5, he's delegating more complete tasks, connecting it to Slack and internal documents to produce coherent summaries that match his priorities. The new model also scored higher on Anthropic's most challenging internal engineering assessment than any human job candidate in the company's history, according to materials reviewed by VentureBeat, end quote. Again, a lot of chatter about OpenAI maybe falling behind in the horse race over the last week. Maybe OpenAI has taken its eye off the ball. We were talking yesterday about how they made their models arguably worse in an effort to increase user engagement. And they've been rolling out stuff like what I'm about to tell you, which, while interesting, the argument people are making is, you know, they only have so much in the way of resources.
Starting point is 00:06:26 OpenAI has unveiled a free shopping research feature in ChatGPT that delivers a personalized buyer's guide powered by a custom version of GPT-5 mini. Now, this does sound interesting, but again, maybe they need to focus on staying cutting edge. Quoting ZDNet: similar to deep research, when prompted with a product description, ChatGPT will now sift through the internet to put together a guide for you. It will also ask you a series of clarifying questions, using the context from past conversations and considering product reviews to
Starting point is 00:07:01 develop your guide. Shopping research is designed to act as an assistant that can create a personalized shopping experience tailored to your specific criteria and needs in just a few minutes. OpenAI said research outputs can help with a variety of different tasks, including finding a product that meets specific criteria, for example, "help me find a smartphone with 18-plus hours of battery life under $1,500." Other examples include finding dupes or lookalikes of a product, comparing different products with a detailed trade-off list that is catered toward your specific needs, finding product deals, and helping you choose gifts for people on your list. The entire experience is powered by a version of GPT-5 mini that was trained specifically for shopping tasks, according to
Starting point is 00:07:42 OpenAI. The company said that it was trained to read trusted sites, cite reliable sources, and synthesize information across many sources, as well as refine its prompts in real time. When compared to other ChatGPT models such as GPT-5 Thinking or ChatGPT Search, shopping research leads in product accuracy. Yet OpenAI acknowledges that it occasionally makes mistakes about product details such as pricing and availability and recommends that users always double-check its work. I found the experience of using it to be interactive and intuitive. To get started, all logged-in ChatGPT users, including those with Free, Go, Plus, and Pro plans,
Starting point is 00:08:18 can either ask a shopping question, which will automatically activate the feature, or select the shopping research option from the menu in the text box. In your first prompt, describe what you want it to do for you, then ChatGPT will follow up with questions pertinent to your search, such as your budget or the features that are important to you. It will also use the context it knows about you, if you have those personalization toggles on, to tailor the response toward you. As it conducts the research, it will display sample products it has found. With every product, you can indicate whether you are interested or not and why you made that decision, guiding the research further. This was my favorite part of using the feature, as it felt like an engaging Tinder-like experience
Starting point is 00:08:55 where you can quickly click through to indicate whether you like or dislike. Then, after a few minutes, it will provide you with a personalized buyer's guide that includes the top products, comparisons, and links that take you directly to the retailer's website to place the order. In the future, the company plans to integrate this feature into the Instant Checkout experience, enabling you to make purchases directly on the site. OpenAI said that user chats are never shared with retailers and that the results are generated organically based on publicly available websites.
Starting point is 00:09:23 Sites that want to appear in results must allow OpenAI's crawlers to access their site, which can be done by following the instructions for the allowlisting process, end quote. Study and play come together on a Windows 11 PC. And for a limited time, college students get the best of both worlds. Get the unreal college deal, everything you need to study and play, with select Windows 11 PCs. Eligible students get a year of Microsoft 365 Premium and a year of Xbox Game Pass Ultimate with a custom color Xbox wireless controller. Learn more at Windows.com slash student offer.
Starting point is 00:10:04 While supplies last. Ends June 30th. Terms at aka.ms slash college PC. Ready to soundtrack your summer? With Red Bull Summer All Day Play, you choose a playlist that fits your summer vibe the best. Are you a festival fanatic? A deep end DJ, a road dog, or a trail mixer? Just add a song to your chosen playlist and put your summer on track. Red Bull Summer All Day Play. Red Bull gives you wings. Visit redbull.com slash bright summer ahead to learn more.
Starting point is 00:10:33 See you this summer. Ambition comes in all shapes and sizes. At First Citizens Bank, we roll with your goals because we're built for what you're building. Fit for your ambition. First Citizens Bank. This news has been raising eyebrows. Nvidia reportedly wrote an internal memo refuting accounting questions that people have been whispering about, saying, quote, unlike Enron, Nvidia does not use special purpose entities to hide debt and inflate revenue,
Starting point is 00:11:08 which has led to a sort of Streisand effect thing where, you know, some people hadn't been asking if Nvidia is like Enron, but now maybe they are. Quoting The Verge: so over the weekend, a strange Substack post from what appears to be a CEO of a pet relocation company went viral. This post, which, to be clear, is BS, alleges that Nvidia is engaged in, quote, what may become the largest accounting fraud in technology history. That's a load-bearing "may" in the sense that there's no credible reason to believe Nvidia is engaged in fraud at all. Quote, there is no neocloud that exists without Nvidia's CEO, Jensen Huang, says the Substacker. That makes neoclouds, in effect, extensions of Nvidia, he says, and none of them make money, so to expand, they must take on debt.
Starting point is 00:11:55 If we are looking at these as being, metaphorically, Nvidia's special purpose vehicles, then it doesn't really matter if the companies are any good or will survive in the long term. Their job is to boost Nvidia's sales. Even OpenAI, also an Nvidia investment, kind of falls into this category because the massive data center buildout that OpenAI wants the government to backstop sure involves an awful lot of Nvidia chips.
Starting point is 00:12:16 The post continues. If you are old enough or possessed of a certain kind of disposition, you may be thinking, wait a minute, aren't you describing Enron? And, uh, in some sense, yes. Enron's whole thing was special purpose vehicles with extremely speculative valuations that were used to take on debt, the post notes. But Enron lied about what it was doing, and that's fraud and illegal. It also got up to other illegal stuff besides.
Starting point is 00:12:38 Nvidia's relationship with CoreWeave is all happening in plain sight. So are all the relationships with other neocloud companies. It kind of seems like the tech company version of the GameStop open pump-and-dump. It's not good behavior, and it's not healthy behavior, Luria says. But it's legal. Any investor can see this. Many are just choosing not to, end quote. So, since Nvidia has clarified, and this is back to the Verge post, I'd like to add a clarification of my own. The problem is that Nvidia's behavior is perfectly legal. In its note, Nvidia says it does not use special purpose entities to hide debt and inflate revenue. This is true. Every single neocloud
Starting point is 00:13:17 Nvidia has invested in is its own company. Any debts those companies may have are on their own balance sheets. It's not Nvidia's debt. That's one of the reasons why neoclouds are so convenient for Nvidia. As the company itself informs analysts, Nvidia doesn't control those companies and doesn't provide the financing for them either. They're just very useful sin-eaters. In the case of CoreWeave, Nvidia is propping it up by investing in the company, including to make sure its IPO actually happened, and serving as a customer. CoreWeave's CEO has even bragged about the close relationship, saying, I'm not bashful about reaching out to Jensen. Personally, I think accusing Nvidia of accounting fraud is effectively taking one's eye off the ball. It doesn't have to commit
Starting point is 00:13:56 fraud to have a very cozy arrangement with a whole network of companies that juice its earnings and may be inflating an AI bubble, all while its own executives sell shares to lock in their status as millionaires and billionaires. Nvidia has created seven new billionaires, in fact. If and when the AI bubble pops, everything that inflated it will have been obvious the entire time. That's very in keeping with the times, isn't it? After all, I reported my CoreWeave story in The Verge from the company's public filings, following its risk factors section closely. Should the AI bubble burst, anything that's accelerated Nvidia's growth is likely to accelerate its losses. It will have to mark down its investments in the companies it propped up, for instance. Should those companies go under,
Starting point is 00:14:35 that will mean a glut of Nvidia chips on the market as debt holders try to recoup their money, meaning Nvidia will effectively be competing with its own used product at fire sale prices. It's all very stupid, but as far as I can tell, not actually illegal, end quote. Suddenly people are asking questions of both OpenAI and Nvidia. People seem to have come to a consensus that Google has leapfrogged OpenAI with Gemini, and maybe Anthropic has now too, though, you know, give it a couple months. But Google's stock has soared to all-time highs, not just because they suddenly seem to be leading the AI pack, but also because the whole fear was AI was an existential threat to Google. The fact that that doesn't seem to be the case has sort of led to a relief
Starting point is 00:15:22 rally for Google investors, but what if there's more upside? What if Google could also take a bite out of Nvidia? The Information is reporting that Google has been pitching customers, including Meta and big financial institutions, on using its TPUs in their data centers. Quote, for years, the search giant has rented its own AI chips, known as tensor processing units, to cloud customers who use them in its Google Cloud data centers. Now, though, Google has begun pitching some of those customers, including Meta Platforms and big financial institutions, on the idea of using TPUs in their own data centers, according to people involved in the talks. Meta, the parent company of Facebook and Instagram, is currently in talks with Google about
Starting point is 00:16:04 spending billions of dollars to use TPUs in Meta's data centers in 2027, as well as to rent Google chips from Google Cloud next year, according to a person involved in the talks. Meta currently relies on Nvidia's graphics processing units. For Google, such an arrangement could help it expand the market for its TPUs. As part of its pitch to companies about using the chips in their own data centers, Google has said customers have told it they want to do so to meet higher security and compliance standards for sensitive data, according to a person with direct knowledge of the matter. Google also said TPUs could be particularly helpful to high-frequency trading firms that run AI models in their
Starting point is 00:16:39 facilities. Such a business could also sharply boost Google's revenue. Some leaders of the Google Cloud unit have suggested it could help the company grab as much as 10% of Nvidia's annual revenue, according to a person who heard the remarks. That share would amount to billions of dollars in annual revenue for Google. One of the ways Google has attracted customers to use TPUs in Google Cloud is by pitching that they're cheaper to use than pricey Nvidia chips. The high prices for Nvidia chips have made it difficult for other cloud providers like Oracle to generate solid gross profit margins from renting out Nvidia chips, end quote.
Starting point is 00:17:22 Right as I was recording that, I got a Financial Times notification saying Nvidia stock is down because of fears of Google. By the way, I did bang out the second part of the Phil Hartman episode over on Rad History yesterday, so check the Rad History feed if you want to hear the sad second half of that story. Talk to you tomorrow.
