The AI Daily Brief: Artificial Intelligence News and Analysis - Maybe AI Will Cure Cancer After All

Starting point is 00:00:00 This podcast is supported by Google. Hey folks, Steven Johnson here, co-founder of NotebookLM. As an author, I've always been obsessed with how software could help organize ideas and make connections. So we built NotebookLM as an AI-first tool for anyone trying to make sense of complex information. Upload your documents and NotebookLM instantly becomes your personal expert, uncovering insights and helping you brainstorm. Try it at notebooklm.com. Today on the AI Daily Brief, maybe we're going to get that AI cancer cure after all. And before that, on the headlines, Google announces Veo 3.1, but how does it hang compared to Sora 2?

Starting point is 00:00:41 The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. All right, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, Gemini, Notion, Blitzy, Superintelligent, and Robots & Pencils. To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can subscribe on Apple Podcasts. If you're interested in sponsoring the show, you can email us at sponsors at aidailybrief.ai. Two last quick announcements before we get into the episode. First, as I've mentioned over the last couple of days, I have a podcast growth role up. The role is simply to grow this show as big as humanly possible. And the way that you apply is doing something interesting

Starting point is 00:01:23 to grab my attention. All of that can be found at AIdailybrief.A.I.S.J.J.J. Lastly, I've had a super positive response to that episode from last Sunday, what 1,000 executives say about AI agents, basically the one where I got deep on what we've learned at Superintelligent. And so as a little bonus, I've decided to pull in Superintelligent's Head of Research, Nufar, who has been a guest on the show before, to do a three-part mini-series on trying to turn what we've learned into more practical, actionable lessons. So we've got an episode on culture, an episode on tech and data readiness, and an episode on use cases, and I'm going to use the next three Saturday slots, which, as you guys know,

Starting point is 00:01:59 is historically the one off day for the week to have those as bonus episodes. So look for the first one of those on Saturday. With that, though, let's get into today's episode. Welcome back to the AI Daily Brief Headlines Edition, all the daily AI news you need in around five minutes. We kick off today with a model update that many people have anticipated and that has been heavily hinted at over the last few days. Google has released an iterative update of their video model, Veo 3.1.

Starting point is 00:02:25 The new version improves the realism of outputs, boosts prompt adherence, and improves audio quality. Still, the big change is new edit features. Users can now include reference images for objects and characters, as well as prompting the model to remove a particular object from a previously generated clip. They can also provide a first and last frame for a video and prompt the model to fill in the rest. Additionally, there's a feature to extend clips based on the last few frames, allowing creators to easily string together clips into minute-long shorts. The update comes five months after the May release of Veo 3, which was the single big game changer for AI video, introducing synced audio for the first time as well as delivering state-of-the-art realism.

Starting point is 00:03:04 That said, a lot has changed since Veo 3 was released, and while it sparked wonder and creativity, the general sentiment around 3.1 has been far less enthusiastic. AI developer Matt Shumer wrote, my initial Veo 3.1 impression, disappointment. Unfortunately, it's not just noticeably worse than Sora 2, it's also quite a bit more expensive. One bright spot is the tooling they've added. It seems to me a little bit like the normal pattern, where the disappointment stems from the fact that this is just an iterative update, not some big state-of-the-art advance. VC Justine Moore noted

Starting point is 00:03:34 that we've passed the threshold where video models are good enough, so we shouldn't expect anything new to be all that mind-blowing. She commented, we have entered the product era for video models. The recent releases, Veo 3.1, Sora 2, Runway apps, aren't a huge leap forward in terms of underlying model capabilities, but they introduce critical features like extending video, character consistency, and editing. I think that is a perfect summary of where things are. A lot of the updates in the immediate term when it comes to AI-generated video are going to be around how usable it is in production environments. Speaking of new models, Anthropic has released Claude Haiku 4.5, the latest version of their small model. The model is intended to be fast and

Starting point is 00:04:14 cheap, with the claim being twice the speed of Sonnet 4 at a third of the cost. Anthropic also claims that the new version of Haiku outperforms the previous generation Sonnet 4 in software engineering on the SWE-bench Verified test. They're also seeing outperformance against Sonnet 4 on computer use tasks, which could make the new Haiku a very capable agentic model. Anthropic Chief Product Officer Mike Krieger said, it's opening up entirely new categories of what's possible with AI in production environments, with Sonnet handling complex planning, while Haiku-powered subagents execute at speed. We're giving people a complete agent toolbox where each model has the right combination

Starting point is 00:04:48 of intelligence, speed, and cost for different parts of the job. That's exactly what Cat Wu from Anthropic said. Haiku 4.5 is a workhorse that makes the coding experience in Claude Code feel really fast. While Sonnet 4.5 remains the default, Haiku 4.5 now powers the Explore subagent, which can rapidly gather context on your codebase to build apps even faster. Haiku 4.5 will be available to free users and can be used to squeeze more capacity out of the free service compared to Sonnet 4.5. Krieger again commented,

Starting point is 00:05:18 Even for my own use, even though it's not as smart as Sonnet, I've started defaulting to it on Claude, especially in the mobile app, because it's just much faster getting an answer. Putting the model through its paces, swyx was impressed, posting, more than twice the speed is underselling Haiku, to be honest. I built a way to directly compare Sonnet versus Haiku 4.5, and it's roughly 3.5 times faster, but the UX feels so much better because Haiku stays inside the flow window. Obviously, end-to-end latency varies a lot, so Anthropic can't report a real number without production usage, but you should try heads-up comparisons.

Starting point is 00:05:49 One other quick note about Anthropic. We got some absolutely monster revenue numbers reported by Reuters from that company. Their sources suggest that Anthropic is currently running at a $7 billion run rate and is on track to hit $9 billion by the end of the year, and then get to somewhere between $20 and $26 billion next year. I'm not going to go too deep into that today because I'm planning to do a deep dive analysis on all of the implications of that and the reported OpenAI numbers, probably for tomorrow's episode.

Starting point is 00:06:16 But suffice it to say that Anthropic's coding and enterprise business is going very, very well right now. One company whose AI business is not going so well, to the extent that it can even be said to exist, is, of course, Apple. Another high-profile Apple AI researcher has left Apple to sign on with Meta's superintelligence team. Bloomberg's Mark Gurman reports that Ke Yang has left Apple just weeks after being promoted to lead the Answers, Knowledge and Information team. That team was formed recently to develop a Perplexity-style AI search product, which was viewed as a central pillar of a major Siri revamp planned for release in March.

Starting point is 00:06:49 Yang was one of the most senior executives among Apple's broader AI and machine learning group. Gurman wrote that this is one of the most high-profile exits from Apple's AI organization, which has seen about a dozen departures this year. What's more, his sources said that even more departures are expected over the coming months. Gurman concluded, The continued departures underscore the instability within Apple's AI ranks at a time when it's racing to catch up with OpenAI and Google, both of which are advancing quickly in generative AI and search. He noted that Apple had also been interviewing outside replacements for John Giannandrea,

Starting point is 00:07:18 Apple's senior VP of AI and Machine Learning, who leads the entire AI organization. Lastly today, a study which I found to be just a real bummer as an American citizen. Pew Research has published the results of a new survey showing that global public sentiment is souring on AI. They interviewed people across 25 countries and found that in general they are far more concerned than excited about the increased use of artificial intelligence. Overall, 34% of respondents said that they were more concerned than excited, while only 16% said that they were more excited than concerned.

Starting point is 00:07:48 42% said that they were equal parts excited and concerned. Across all 25 countries, there was not a single one where excitement was the majority feeling about accelerated AI adoption. I should note here that there is one big notable exception among the countries that were included. There is no China here, and I would be very interested to see what those numbers look like. Still, of the countries that were surveyed, only three had more than 20% of their people saying that they were more excited than concerned. Nigeria was at 20%, Sweden and Korea had 22% each, and Israel was at

Starting point is 00:08:18 29%. Israel and South Korea were, in fact, the only nations where the people who said they were mostly excited about AI outweighed the people who said they were mostly concerned. The part that I said was disappointing to me was that right at the very top of the list for populations most concerned about AI was the U.S. at 50%. While technically we tied with Italy on that number, Italy had more people who were more excited than concerned than we did, 12% compared to just the one-tenth of Americans who are more excited than concerned. Now, it should be noted that this data is a little old at this point. Two U.S. surveys were conducted in March and June, while the international survey was conducted between January and April. But given that we've seen the rise of

Starting point is 00:08:57 protests around data center construction and a lot of negative news reporting on AI, I would be surprised if we had seen a major improvement in sentiment, and I wouldn't be surprised if we'd seen it actually get worse. Now, what's crazy about this is that you have significant portions of these populations actually using these tools, and yet they still have this anxiety. There is a lot more work to be done for those of us inside the AI industry, not only from a PR type of perspective, but in ensuring that people's concerns about their futures are actually addressed. My general thesis is that AI is the recipient of more generalized anxiety and that everything is downstream from economic insecurity. So who knows, one day maybe I'll have to spit out the

Starting point is 00:09:34 politics podcast. For now, I'll just leave it at more work to be done. And that's going to do it for the headlines. Next up, the main episode. Chatbots are great, but they can only take you so far. I've recently been testing Notion's new AI agents, and they are a very different type of experience. These are agents that actually complete entire workflows for you in your style, and best of all, they work in a channel that you already know and love because they are purpose-built Notion super users. Notion's new AI agents completely expand the range of what Notion can do. It can now build documents from your entire company's knowledge base, turn scattered information into organized reports, basically do tasks that used to take days and get them complete in minutes.

Starting point is 00:10:16 These agents don't just help with work, they finish it. Getting started with building on Notion is easier than ever. Notion agents are now your very own super user to help you onboard in minutes. Your AI teammates are ready to work. Try Notion AI for free at the link in our show notes. This episode is brought to you by Blitzy, the Enterprise Autonomous Software Development Platform with Infinite Code Context. Blitzy uses thousands of specialized AI agents that think for hours

Starting point is 00:10:41 to understand enterprise-scale code bases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously while providing a guide for the final 20% of human development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding co-pilot of choice to bring an AI-native SDLC into their org. Blitzy is providing a limited-time, 30-day free proof of concept for qualifying enterprises.

Starting point is 00:11:19 The team will provide a 5X velocity increase on a real development project in your org. Visit blitzy.com and press book demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native. That's BLITZY.com. Today's episode is brought to you by my company, Superintelligent. You've got a hundred what-if ideas, but which one becomes an agent? Superintelligent maps every AI use case across your company and helps you create an agent plan that you can actually execute. We match opportunities to your tech stack, your data profile, and your team.

Starting point is 00:11:53 No more guesswork, just a clear path from pilot to production. If you want agents that deliver business outcomes, start with planning. Go to besuper.ai and sign up for a demo. AI isn't a one-off project. It's a partnership that has to evolve as the technology does. Robots & Pencils works side by side with clients to bring practical AI into every phase: automation, personalization, decision support, and optimization. They prove what works through applied experimentation and build systems that amplify human potential. Welcome back to the AI Daily Brief.

Starting point is 00:12:27 The joke over the last couple of weeks, first as OpenAI launched Sora, a short-form video app, and then later, as it announced that it would be opening up to adult uses of its platform, goes along the lines of, we were promised great big cures for diseases and novel discoveries, and instead we got a new TikTok and a new place for porn. Major politicians even waded in on this line of meming, with Florida Governor Ron DeSantis saying, so much for curing cancer and beating China? And yet, even as that discourse was happening, we got this announcement from Google's Sundar Pichai.

Starting point is 00:13:00 He writes, An exciting milestone for AI and science. Our C2S-Scale 27B foundation model, built with Yale and based on Gemma, generated a novel hypothesis about cancer cellular behavior, which scientists experimentally validated in living cells. With more preclinical and clinical tests, this discovery may reveal a promising new pathway for developing therapies to fight cancer.

Starting point is 00:13:24 So today we're going to talk about first this particular discovery, and then more broadly how quietly, behind all the hype and noise, there have been some really interesting advancements which suggest that the whole idea that AI can't make or contribute to novel discoveries in science may be one now for the junk heap of history. So back to this discovery from Google. In their announcement post, Google wrote,

Starting point is 00:13:45 this announcement marks a milestone for AI and science. C2S-Scale generated a novel hypothesis about cancer cellular behavior, and we have since confirmed its prediction with experimental validation in living cells. The implications, they say, are new pathways for developing therapies to fight cancer. Now, they explain that one of the biggest challenges in cancer therapy is that many tumors are, quote unquote, cold. In other words, invisible to the body's immune system. A major strategy in cancer treatment, then, is triggering tumor cells to make them turn hot,

Starting point is 00:14:17 i.e. to display immune-triggering signals, in a process that's called antigen presentation. With this background, researchers gave C2S-Scale a single task, to find a drug that functions as a conditional amplifier, in other words, to boost the immune signal only in specific circumstances. Previous iterations of similar models were not capable of achieving this task, but C2S-Scale succeeded. The task effectively required a sort of conditional biological reasoning. They designed what they called a dual-context virtual screen, where they first provided the model with real-world patient samples with intact tumor-immune interactions and low-level interferon signaling, and then secondly provided the model with isolated cell line data with

Starting point is 00:14:58 no immune context. Google then simulated the effects of over 4,000 drugs and asked the model to predict which would boost immune signals only if certain conditions were met. Now, this highlights one of the areas where we're seeing AI-enhanced science really flourish. AI models generally excel in situations where a large volume of experimentation is required. In other words, a big part of the value is about speeding through simulated experiments and crunching large data sets that would take human researchers and traditional computing methods much, much longer to sift through. After simulating those 4,000 drugs, the experiment found a set of drug candidates.

Starting point is 00:15:34 Out of the drug candidates that the model highlighted, only 10 to 30% were already known in prior literature. The others had no prior link to the screen. Interestingly, the model made a core prediction on how the family of drugs would function, on which it based its result. They wrote, What made this prediction so exciting

Starting point is 00:15:51 was that it was a novel idea. The model was generating a new testable hypothesis and not just repeating known facts. Researchers then tested the hypothesis on actual cells and observed the predicted effect. The model seems to have correctly identified a new way of turning tumorous cells hot, under the desired conditions. Google concluded, while this is an early first step, it provides a

Starting point is 00:16:11 powerful experimentally validated lead for developing new combination therapies, which use multiple drugs in concert to achieve a more robust effect. This result also provided a blueprint for a new kind of biological discovery. It demonstrates that by following the scaling laws and building larger models like C2S-Scale 27B, we can create predictive models of cellular behavior that are powerful enough to run high-throughput virtual screens, discover context-conditioned biology, and generate biologically grounded hypotheses. One of the big implications here is that these larger science-specific models seem to actually have emergent capabilities in scientific reasoning, not just language-based reasoning.
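As a rough illustration of the dual-context screen described above, here is a minimal Python sketch of the filtering step only. It is not Google's or Yale's code. The drug names, the scores, the two thresholds, and the function name `conditional_amplifiers` are all hypothetical, made up for this example. The idea is simply that a conditional amplifier is a drug predicted to boost the immune signal in the immune-context samples but not in the isolated cell lines.

```python
from dataclasses import dataclass


@dataclass
class DrugPrediction:
    """Hypothetical model output for one drug in the two screening contexts."""
    name: str
    immune_context_score: float   # predicted boost where immune signaling is present
    neutral_context_score: float  # predicted boost in isolated cell lines


def conditional_amplifiers(predictions, min_boost=0.5, max_neutral=0.1):
    """Keep drugs with a strong predicted effect only when immune context is present.

    The thresholds are illustrative assumptions, not values from the announcement.
    """
    hits = [
        p for p in predictions
        if p.immune_context_score >= min_boost
        and abs(p.neutral_context_score) <= max_neutral
    ]
    # Rank by how much larger the effect is in the immune context.
    return sorted(
        hits,
        key=lambda p: p.immune_context_score - p.neutral_context_score,
        reverse=True,
    )


# Made-up scores for four made-up drugs.
predictions = [
    DrugPrediction("drug_A", 0.80, 0.05),   # strong and conditional: kept
    DrugPrediction("drug_B", 0.70, 0.60),   # strong in both contexts: dropped
    DrugPrediction("drug_C", 0.10, 0.00),   # weak everywhere: dropped
    DrugPrediction("drug_D", 0.60, -0.02),  # conditional: kept
]

selected = [p.name for p in conditional_amplifiers(predictions)]
print(selected)
```

In the actual work, the scores came from the model simulating more than 4,000 drugs. This sketch only shows why two contexts are needed: a drug like the hypothetical drug_B looks promising in either context alone, but is ruled out once both are compared.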

Starting point is 00:16:49 To the extent that this is a bitter lesson outcome, i.e. just the byproduct of a better, bigger, more dedicated model, that actually makes it more likely that this is a big unlock for future research rather than a one-off discovery. Basically, the implications of there being a general scaling law for scientific reasoning models are quite large. The reactions from many were excited. We got, of course, the jokes. Packy McCormick wrote, Everyone else: behold, an AI you can beat off to. Google DeepMind: protein folding, weather prediction, new materials, and now an AI that can make its own cancer discoveries.

Starting point is 00:17:19 There was, however, some skepticism. Lenny Usebi writes, A bit of a stretch to frame this as if they asked a chatbot to solve cancer and it spat out a novel idea. This is much more like they trained a narrow predictive model for a very specific task, and then it was able to do that task well enough to filter out a new candidate drug. Some version of this take is basically presented in every thread. The point, though, with this discovery is that the model demonstrated the ability to take a set of known facts about the science and synthesize them into a novel hypothesis that proved to be correct using reasoning.

Starting point is 00:17:48 If you go look at these threads where inevitably this critique comes out, there are scientists who follow up pointing out that there's really no such thing as scientific discovery created from whole cloth. Everything is built on the synthesis of existing ideas. Rob S follows Lenny's post with, yes, that's how science is done. VC Hemant Mohapatra writes, I've always believed new knowledge can be, one, built on existing knowledge but connecting the dots in unique ways, two, creating pure de novo knowledge through hypothesis, experiments,

Starting point is 00:18:14 etc., that might go against current thinking. LLMs are likely great at one, and that's where perhaps a vast majority of the net new knowledge lies. Even if LLMs, as they stand today, never get to number two, their impact on research will be tremendous. Now, what makes this story notable to me, even outside just the profound implications of AI actually being able to help us cure cancer, is that it is not an isolated story. For those who have been paying attention closely, and of course that's hard considering the absolute barrage of new models

Starting point is 00:18:42 and crazy bubble talk and all those things going on, there have been a lot of these really subtle indicators that some big barrier has been surpassed. OpenAI's Kevin Weil, who used to be their chief product officer but is now their VP of Science, about a week ago tweeted, GPT-5 crossed a major threshold. Over the last two months, we've heard repeated examples of scientists successfully directing GPT-5 to do novel research in math, physics, biology, computer science, and more. Now, he clarified, I'm not claiming GPT-5 is ready to prove the Riemann hypothesis. It's more at the lemma stage today. When guided by an expert, it can do bounded chunks of novel science, things that would maybe have taken a professor and her postdoc a few days or a week to work through.

Starting point is 00:19:21 But this is the beginning of accelerating science, because if each path takes a week, you can only explore so many of them. If it takes 20 minutes with ChatGPT Pro and you can run them in parallel, suddenly you can explore far more. And remember, the model you're using today is the worst it'll ever be for the rest of your life. The idea that ChatGPT could do novel science sounded crazy a year ago, but here we are. And by the way, this is not just Kevin speaking. Professor Ethan Mollick wrote, I'm hearing similar things in economics and the social sciences. Not autonomous work, but expert-directed AI is absolutely helping academics do novel research in significant ways. One example that got

Starting point is 00:19:56 a lot of conversation came from back in August. Sebastien Bubeck, a researcher at OpenAI, posed an academic mathematics problem to GPT-5, and it appeared to come up with a novel result. The problem was an extension of existing work, which Bubeck explained as, in smooth convex optimization, under what conditions on the step size eta in gradient descent will the curve traced by the function value of the iterates be convex? It's totally fine if that's gibberish to you, it is absolutely gibberish to me. The fact that you need to understand is that the original paper on arXiv found a general result if eta is larger than 1.75 divided by L, where L is the smoothness of the function. The paper also provided the result below 1 divided by

Starting point is 00:20:32 L, so there was a remaining gap between 1 divided by L and 1.75 divided by L. GPT-5 Pro appeared to produce a general result for 1.5 divided by L, narrowing that remaining gap. Bubeck commented that this was, quote, definitely a novel contribution that would be worthy of a nice arXiv note. However, he continued, the only reason I won't post this as an arXiv note is that the humans actually beat GPT-5 to the punch. Namely, the arXiv paper has a V2 with an additional author, and they closed the gap completely, showing that 1.75 divided by L is the tight bound. Still, he pointed out that GPT-5's proof was completely novel, commenting, the fact that it proves 1.5 divided by L and not the 1.75 divided by L proof also shows that it didn't just search for the V2. Also, GPT-5's proof is very different from the V2 proof.

Starting point is 00:21:16 It's more of an evolution of the V1 proof. Shortly after Bubeck published his results, others at OpenAI chimed in that this wasn't the only novel academic work that GPT-5 was capable of. Chief Research Officer Mark Chen posted, GPT-5 Pro is starting to develop new mathematics. I'm hearing similar stories in other scientific domains like physics too. Now, what's interesting about these math results is that as much as we are talking about AI's ability to generate new knowledge by synthesizing old knowledge as a pathway for medical and scientific discovery, this math result seems to be an emergent capability of reasoning models.
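For anyone who wants to see what the step-size question from Bubeck's example means in practice, here is a small Python sketch. It is an illustration under my own assumptions, not anything from Bubeck's post or the arXiv paper. The test function log(cosh(x)), the starting point, and the helper names are all hypothetical choices. That function is convex and smooth with L = 1, so the step size 1 divided by L sits at the edge of the range the original paper already covered. The sketch runs gradient descent, records the function value at each iterate, and checks numerically whether that sequence of values is convex, meaning every second difference is non-negative.

```python
import math


def f(x):
    """Illustrative convex, 1-smooth test function (an assumption, not from the paper)."""
    return math.log(math.cosh(x))


def grad(x):
    """Derivative of log(cosh(x))."""
    return math.tanh(x)


def optimization_curve(x0, eta, steps):
    """Run gradient descent and return f(x_0), f(x_1), ..., f(x_steps)."""
    values = []
    x = x0
    for _ in range(steps + 1):
        values.append(f(x))
        x = x - eta * grad(x)
    return values


def is_convex_sequence(values, tol=1e-12):
    """A sequence is convex if each second difference is >= 0 (up to a small tolerance)."""
    return all(
        values[k - 1] - 2 * values[k] + values[k + 1] >= -tol
        for k in range(1, len(values) - 1)
    )


L = 1.0  # smoothness constant of log(cosh(x))
curve = optimization_curve(x0=3.0, eta=1.0 / L, steps=20)
print(is_convex_sequence(curve))
```

A numerical check like this on one function proves nothing about the general case. The results discussed above are proofs that hold for every smooth convex function, and the known counterexamples for larger step sizes are more delicate than anything in this sketch.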

Starting point is 00:21:47 In coming up with the proof, GPT-5 Pro thought for 17 minutes, and then presented work that wasn't previously published. Then again, maybe we shouldn't be so surprised given recent performance. Both OpenAI and DeepMind entered LLMs in the International Math Olympiad this summer and were capable of gold medal performances. The notable thing is that these kinds of theoretical math problems have basically zero calculation. They're about manipulation of logic to come up with a mathematical proof. It's basically an entirely different category of scientific work.

Starting point is 00:22:13 Former quant investor Jeffrey Emanuel highlighted another interesting novel math paper that required a lot of manual labor to come up with a result. In a long thread, he suggested that this could be an example of a hidden discovery, a novel result that was already feasible based on current knowledge but required too much work for a human to reasonably obtain as an individual or an academic team. Which gets us to another point. A recent article in Frontiers was called 90% of Science Is Lost. And the broader point is that while modern science is about people with 20-year academic

Starting point is 00:22:42 careers of extreme specialization, often the largest scientific breakthroughs are about combining observations across fields. A ton of the big discoveries of the 20th century were, for example, chemistry slash physics or biology slash physics. As Frontiers puts it, most scientific data never fuel the discoveries they should. For every 100 data sets created, around 80 remain in the lab, 20 are shared but rarely used, and only one typically drives new findings. The result, delayed cancer treatments, climate models short on evidence, and research that cannot be reproduced. That is exactly the type of information that AI could be using and potentially putting

Starting point is 00:23:16 to better use. Andrew Curran recently had an interesting post on Twitter where he wrote, We're in a strange spot right now with AI. The anti-AI crowd believes progress has halted and are doing a victory lap. Insiders at all labs maintain advancement continues at pace. Only one of these versions of reality will survive the new year. OpenAI's Aidan McLaughlin summed up the lab point of view in this tweet. 2024 evals:

Starting point is 00:23:39 Can it count letters? Can it do college stuff? Are its solutions diverse? 2025 evals: Has it worked for 30 hours yet? Has it increased GDP? Has it discovered novel math? And yet, as we discussed in the headlines today, we're still at this point where the U.S. ranks dead last among many large economies in how much it's concerned

Starting point is 00:23:58 versus excited about AI. A full 50% of U.S. citizens surveyed by Pew were more concerned than excited about AI. I tweeted that this is a depressing indictment of the state of our national psyche, and that technology should be a beacon of better futures. Now, it's way beyond the scope of this particular show to get into all of the non-AI factors that I think show up in these numbers. But it's why it's so important to hold up and have conversations about this subtle ground shift that's happening right in front of our eyes. As these discoveries come up, we will certainly cover them here. For now, however, that's going to do it for today's AI Daily Brief.

Starting point is 00:24:32 Appreciate your listening or watching as always. And until next time, peace.

