Tech Brew Ride Home - Wed. 08/06 – OpenAI’s Open-Weight Models

Starting point is 00:00:20 Welcome to the Tech Brew Ride Home for Wednesday, August 6, 2025. I'm Brian McCullough. Today, Disney is making big streaming moves with the new ESPN app and a revamp of Hulu. Then it's basically all AI announcements: OpenAI's new open-weight models,

Starting point is 00:00:49 Grok's new spiciness is already generating nudity, a new AI model to identify malicious software autonomously, and Nvidia wants you to know: no backdoors. Here's what you missed today in the world of tech. ESPN plans to officially launch its new flagship streaming service, ESPN, on August 21st for $29.99 per month, or $35.99 per month when bundled with Disney Plus and Hulu. Quoting CNBC, the app launches ahead of the upcoming NFL season, the highest-rated live sports content, as well as the start of college football, where ESPN has expanded its portfolio. Fox will also launch its direct-to-consumer streaming service on the same date. The service will include a boatload of content, namely all of ESPN's live games, as well as programming from its other networks like ESPN2 and the SEC Network, as well as ESPN on ABC. It'll also include fantasy products, new betting tie-ins, studio programming, documentaries, and more.

Starting point is 00:01:51 On Wednesday, ESPN said it inked a deal with WWE for the U.S. rights to the wrestling league's biggest live events, including WrestleMania, the Royal Rumble, and SummerSlam, beginning in 2026. CNBC reported it will pay an average of $325 million annually in the five-year deal. The company also announced late Tuesday that it reached a deal with the NFL, which includes the league taking a 10% equity stake in ESPN. As part of the deal, ESPN will acquire the NFL Network and other media assets from the league, end quote. Meanwhile, Disney also announced it is fully integrating Hulu into Disney Plus and plans to launch a unified Disney Plus and Hulu app in 2026,

Starting point is 00:02:33 now that it is Hulu's sole owner. Quoting Variety, a new unified Disney Plus and Hulu streaming app will be available in 2026, the company said. According to a Disney rep, customers will still be able to buy a standalone Hulu subscription, as well as a standalone Disney Plus plan. The single Disney Plus app with Hulu

Starting point is 00:02:51 will deliver an improved consumer experience, which will lower churn, Bob Iger said on the earnings call. Both services will be on one tech platform, which will result in cost synergies, according to Iger. In addition, Disney, which already sells ads for Disney Plus and Hulu together, sees new opportunities for bundling ad sales by fully combining them, he said. In their prepared remarks, the Disney execs said, by creating a truly differentiated streaming

Starting point is 00:03:17 offering, we will be providing subscribers tremendous choice, convenience, quality, and enhanced personalization. This will enhance our ability to continue to grow profitability and margins in our entertainment streaming business through expected higher engagement, lower churn, and advertising revenue potential, as well as operational efficiencies that over time may result in savings that we can reinvest back into the business. In addition, Hulu will become a global general entertainment brand starting in the fall of 2025. It will replace the Star tile on Disney Plus internationally, end quote. The rest of today is basically going to be AI releases, with the first one being OpenAI's release of gpt-oss-120b and gpt-oss-20b, its first open-weight models since GPT-2.

Starting point is 00:04:13 The smaller model can run locally on a consumer device with 16-plus gigabytes of RAM. Quoting Wired, both gpt-oss-120b and gpt-oss-20b are officially available to download for free on Hugging Face, a popular hosting platform for AI tools. The last open-weight model released by OpenAI was GPT-2 back in 2019. What sets apart an open-weight model is the fact that its weights are publicly available, meaning that anyone can peek at the internal parameters to get an idea of how it processes information. Rather than undercutting OpenAI's proprietary models with a free option, co-founder Greg Brockman sees this release as complementary to the company's paid services,

Starting point is 00:04:54 like the application programming interface currently used by many developers. Open-weight models have a very different set of strengths, said Brockman in a briefing with reporters. Unlike ChatGPT, you can run a gpt-oss model without a connection to the internet and behind a firewall. Both gpt-oss models use chain-of-thought reasoning approaches, which OpenAI first deployed in its o1 model last fall. Rather than just giving an output, this approach has generative AI tools go through multiple steps to answer a prompt. These new text-only models are not multimodal, but they can browse the web, call cloud-based models to help with tasks, execute code, and navigate software as an AI agent. The smaller of the two models, gpt-oss-20b, is compact enough to run locally

Starting point is 00:05:37 on a consumer device with more than 16 gigabytes of memory. The two new models are available under the Apache 2.0 license, a popular choice for open-weight models. With Apache 2.0, models can be used for commercial purposes, redistributed, and included as part of other licensed software. Open-weight model releases from Alibaba's Qwen as well as Mistral also operate under Apache 2.0. Publicly announced in March, the release of these open models was initially delayed for further safety testing. Releasing an open-weight model is potentially more dangerous than a closed-off version since it removes barriers around who can use the tool, and anyone can try to fine-tune a version of gpt-oss for unintended purposes. How do these models perform compared to

Starting point is 00:06:20 OpenAI's other releases? The benchmark scores for both of these models are pretty strong, said Chris Koch, an OpenAI researcher, in the briefing. Speaking about gpt-oss-120b, the researcher described its performance as closely similar to OpenAI's o3 and o4-mini models, which are proprietary, and even outperforming them in certain evaluations. The model card for gpt-oss goes into detail about how exactly it stacks up to the company's other offerings. In a pre-launch press briefing, staff members of OpenAI also focused on the latency offered by gpt-oss and the cheaper cost to run these models, end quote. I redownloaded LM Studio, downloaded the 20B version this morning, and am testing it out as we speak.

Starting point is 00:07:04 By the way, devs, Amazon apparently plans to make OpenAI's new gpt-oss open-weight models available on Bedrock and SageMaker, the first time it has offered OpenAI's models to AWS customers. But wait, as I said, there's more, much more, actually. Anthropic released Claude Opus 4.1 to paid Claude users, in Claude Code, and via its API, featuring broad improvements over Opus 4 for the same cost. Quoting ZDNet. The Opus family comprises the company's most advanced, intelligent AI models, geared toward tackling complex problems.

Starting point is 00:07:47 As a result, Claude Opus 4.1, released on Tuesday, excels at those tasks and can even one-up its predecessor on agentic tasks, real-world coding, and reasoning, according to Anthropic. One of the most impressive use cases of Claude Opus 4 was its performance on SWE-bench Verified, a human-filtered subset of SWE-bench, a benchmark that evaluates LLMs' abilities to solve real-world software engineering tasks sourced from GitHub. Claude Opus 4's performance on SWE-bench Verified supported the claim that it was the, quote, best coding model in the world. As seen in the post above, Opus 4.1 performed even higher.

Starting point is 00:08:26 Claude Opus 4.1 also swept its preceding models across the benchmark board, including MMMLU, which tests for multilingual capabilities, AIME 2025, which tests for rigor on high school math competition questions, GPQA, which tests for performance on graduate-level reasoning prompts, and more. When pitted against competitors' reasoning models, including OpenAI's o3 and Gemini 2.5 Pro, it outperforms them in various benchmarks, including SWE-bench Verified. If you want to try the model for yourself, it is now available to everyone via the paid Claude plans, which include Claude Pro for $20 per month and Claude Max for $100 per month.

Starting point is 00:09:07 It is available in Claude Code, the API, Amazon Bedrock, and Google Cloud's Vertex AI, end quote. On top of that, Alibaba's Qwen has released Qwen-Image, an AI image generation model focused on accurate text rendering with support for alphabetic and logographic scripts. Quoting VentureBeat. Qwen-Image stands out in a crowded field of generative image models due to its emphasis on rendering text accurately within visuals, an area where many rivals still struggle. Supporting both alphabetic and logographic scripts, the model is particularly adept at managing complex typography, multi-line layouts, paragraph-level semantics, and bilingual content, e.g. English to Chinese.

Starting point is 00:09:51 In practice, this allows users to generate content like movie posters, presentation slides, storefront scenes, handwritten poetry, and stylized infographics with crisp text that aligns with their prompts. However, my brief initial test revealed the text and prompt adherence was not noticeably better than Midjourney, the popular proprietary AI image generator from the U.S. company of the same name. My session through Qwen Chat produced multiple errors in prompt comprehension and text fidelity, much to my disappointment, even after repeated attempts and prompt rewording. Yet Midjourney only offers a limited number of free generations and requires subscriptions for any more, compared to Qwen-Image, which, thanks to its open source licensing and weights posted

Starting point is 00:10:32 on Hugging Face, can be adopted by any enterprise or third-party provider free of charge. Qwen-Image is distributed under the Apache 2.0 license, allowing commercial and non-commercial use, redistribution, and modification, though attribution and inclusion of the license text are required for derivative works, end quote. And finally, Grok's new so-called spicy option on its generative AI video tool, Imagine, apparently produces nude deepfakes of celebrities like Taylor Swift even without explicit user prompting. Quoting The Verge. The spicy mode for Grok's new generative AI video tool feels like a lawsuit waiting to happen.

Starting point is 00:11:11 While other video generators like Google's Veo and OpenAI's Sora have safeguards in place to prevent users from creating not-safe-for-work content and celebrity deepfakes, Grok Imagine is happy to do both simultaneously. In fact, it didn't hesitate to spit out fully uncensored topless videos of Taylor Swift the very first time I used it, even without me specifically asking the bot to take her clothes off. Grok's Imagine feature on iOS lets you generate pictures with a text prompt, then turn them quickly into video clips with four presets: custom, normal, fun, and spicy. While image generators often shy away from producing recognizable celebrities, I asked it to generate Taylor Swift celebrating Coachella with the boys and was met with a sprawling feed of more than 30 images to pick from, several of which

Starting point is 00:11:59 already depicted Swift in revealing clothes. From there, all I had to do was open a picture of Swift in a silver skirt and halter top, tap the make video option in the bottom right corner, select spicy from the drop-down menu, and confirm my birth year, something I wasn't asked to do upon downloading the app, despite living in the UK, where the internet is now being age-gated. The video promptly had Swift tear off her clothes and begin dancing in a thong for a largely indifferent AI-generated crowd. Swift's likeness wasn't perfect, given that most of the images Grok generated had an uncanny valley offness to them, but it was still recognizable as her. The text-to-image generator itself wouldn't produce full or partial nudity on request; asking

Starting point is 00:12:40 for nude pictures of Swift or people in general produced blank squares. The spicy preset also isn't guaranteed to result in nudity; some of the other AI Swift Coachella images I tried had her sexily swaying or suggestively motioning to her clothes, for example, but several defaulted to ripping off most of her clothing. You would think a company that already has a complicated history with Taylor Swift deepfakes, in a regulatory landscape with rules like the Take It Down Act, would be a little more careful. The xAI acceptable use policy does ban, quote, depicting likenesses of persons in a pornographic manner, but Grok Imagine simply seems to do nothing to stop people creating likenesses of celebrities like Swift, while offering a service designed

Starting point is 00:13:20 specifically to make suggestive videos including partial nudity. The age check only appeared once and was laughably easy to bypass, requesting no proof that I was the age I claimed to be. If I could do it, that means anyone with an iPhone and a $30 SuperGrok subscription can too. More than 34 million images have already been generated using Grok Imagine since Monday, according to xAI CEO Elon Musk, who said usage was, quote, growing like wildfire. End quote.

Starting point is 00:14:44 One more slightly different one real quick. Microsoft has unveiled Project Ire, a prototype AI system that can reverse engineer and identify malicious software autonomously without human assistance. Quoting GeekWire. The prototype system, called Project Ire, automatically dissects software files to understand how they work, what they do, and whether they're dangerous. This kind of deep analysis is typically performed by human security experts. Long term, Microsoft says it hopes the AI will detect new types of malware directly in

Starting point is 00:15:21 computer memory, helping to stop threats faster and on a larger scale, end quote. Microsoft says Ire, quote, automates what is considered the gold standard in malware classification, fully reverse engineering a software file without any clues about its origin or purpose. Unlike conventional security tools, which rely on known signatures or pattern matching, Ire uses AI to analyze an unknown binary from scratch. The move comes amid an escalating arms race where both defenders and attackers leverage emerging generative models and autonomous agents. In its first deployment, Ire correctly identified a sophisticated malware sample and automatically blocked it, a first for any Microsoft system, human or machine.

Starting point is 00:16:01 Early tests show 98% accuracy on malicious files with only a 2% false positive rate. The technology is part of a broader wave of AI solutions designed to counter increasingly complex cyber threats, such as Google's Big Sleep, which autonomously hunts code vulnerabilities. Project Ire will now be used internally to accelerate threat detection across Microsoft's security stack. Finally today, this is an odd one. Nvidia wants you to know its GPUs do not contain backdoors, kill switches, or spyware, and in fact, it says it is philosophically opposed to hard-coded single-point controls like kill switches because they undermine trust in U.S. technology.

Starting point is 00:16:48 Quoting Tom's Hardware. Nvidia has firmly denied speculation about hidden control mechanisms in its GPUs, reiterating that its products contain no kill switches, no backdoors, and no spyware. The company also urged U.S. policymakers to abandon proposals for hardware-level tracking or disabling features, calling them a, quote, gift to hackers and hostile actors. The statement came in a new blog post published in both English and Chinese, following official pressure after Chinese regulators summoned Nvidia executives last week over concerns about potential tracking and positioning capabilities in H20 chips that were recently approved for export under U.S.-China trade waivers. At the same time, key legislators like

Starting point is 00:17:28 Representative Bill Foster and Senator Tom Cotton have introduced language in the proposed Chip Security Act calling for embedded location verification requirements for export-controlled AI accelerators and even some high-end consumer GPUs, though none of this is yet codified into law. More recently, the White House itself has confirmed it is considering chip tracking to curb AI hardware smuggling to China. In the post, David Reber, Nvidia's chief security officer, emphasized that hard-coded single-point controls are always a bad idea, warning that any hidden hardware mechanism, kill switch or backdoor, would undermine global trust in U.S. technology and create security vulnerabilities.

Starting point is 00:18:09 Reber drew parallels to the failed Clipper chip initiative of the 1990s, where backdoor provisions in encryption hardware became exploitable flaws, sparking industry backlash. Reber underscored that robust GPU security depends on defense in depth, layered safeguards, independent testing, and user consent, not on hidden firmware triggers. He likened a kill switch to, quote, buying a car where the dealership keeps a remote for your parking brake, rendering users powerless in critical moments, end quote. As of this moment, the Tech Brew Ride Home is still sitting at number two in the technology podcast category. Thanks to those of you who wrote reviews. By the way, new listeners,

Starting point is 00:18:56 if you wanted to follow me on the socials, I'm at Brian MCC on X, but on Bluesky, it's at BrianMC.com. It's Brian with an I, not a Y. Talk to you tomorrow.
