The AI Daily Brief: Artificial Intelligence News and Analysis - What AI Builders Are Actually Excited About

Starting point is 00:00:00 Today on the AI Daily Brief, what AI builders are actually excited about right now, and before that in the headlines, DeepSeek drops their latest model. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. Hello, friends, quick announcements before we dive in. First of all, thank you to today's sponsors, Blitzy, Vanta, and Superintelligent, and to get an ad-free version of the show, go to patreon.com slash AI Daily Brief. Welcome back to the AI Daily Brief Headlines edition. All the daily AI news you need in around five minutes. We kick off today with a new model from DeepSeek, and there has been some interesting conversation lately around this company. DeepSeek obviously set the tone for a lot of

Starting point is 00:00:43 this year when they dropped a reasoning model that, thanks to its free access in a friendly consumer app and its exposed chain of thought, went sort of viral, rocketed to the top of the US app charts, and got everyone, including Wall Street, all totally freaked out about the capabilities of Chinese AI. Now, since then, and basically as soon as their R1 reasoning model launched, people were excitedly waiting for R2. We still don't have R2, but we did yesterday get their V3.1. Now, the V line of models are DeepSeek's non-reasoning models that serve as a base model to build on top of. Now, given that this is just a 0.1 version update, the model specs remain fairly similar. It's a 685 billion parameter model up from 671 for V3,

Starting point is 00:01:28 still utilizes a mixture of experts architecture. The model was released without a ton of fanfare, although the official commentary account did at least post the announcement of the update. One question is that they said in the announcement post on Twitter that there was a longer context window, but it appears as though both V3 and V3.1 have a 128K context window. Now, benchmarking is obviously still in its early stages, but it's looking fairly strong so far. The big thing here is absolutely the performance-cost ratio. V3.1 got a 71.6% score on the Aider Polyglot coding benchmark, which was in fact 1% more than Claude Opus 4 in its non-reasoning mode, but the main thing was that it was 68 times cheaper.

Starting point is 00:02:08 V3.1 is also DeepSeek's first hybrid model that can handle chat, reasoning, and coding functions within the same model. Researchers noted how the model now has special tokens to support reasoning and search, which has led people to wonder if we are going to see a shift away from having two separate model families, in the same way, for example, that OpenAI moved away from having their non-reasoning and reasoning models have different naming conventions. Journalist Poziao writes, DeepSeek quietly removed the R1 tag. Now every entry point defaults to V3.1. Looks less like multiple public models, more like strategic consolidation, a Chinese answer to the fragmentation risk in the LLM race. This suggests DeepSeek is consolidating,

Starting point is 00:02:44 rather than juggling multiple public versions, it's steering towards one coherent product line. In China's hypercompetitive AI market, such integration is less about branding and more about reducing fragmentation costs. It's a strategic move, stabilize the baseline, then expand capabilities. DeepSeek fan Teortaxes agrees, posting, I've long been saying that they hate maintaining separate model lines and will collapse everything into a single product and artifact as soon as possible. This may be it. A Chinese language account called Ace Taffy wrote, If the DeepSeek V3.1 update isn't meant to set the stage for DeepSeek V4,

Starting point is 00:03:14 I don't really see the point of this update because aside from lower token usage, the overall reasoning performance hasn't improved at all. Now, it's only been a very short period of time, but so far we're seeing something very similar to the immediate GPT-5 response, which is disappointment in what the model isn't. Teortaxes again writes, DeepSeek didn't hype anything, but is getting the OpenAI post-GPT-5 treatment.

Starting point is 00:03:34 I guess this speaks to them being maybe the only lab on the same level of memetic power. They screenshotted another Chinese language post from X that said, the release of DeepSeek V3.1 has sparked widespread criticism. It can only be said that to wear the crown one must bear its weight. Still others are betting that we're going to see a DeepSeek V4, and maybe before too long. Swyx writes, looks like DeepSeek is still on track to ship DeepSeek V4. This November and December is going to be pretty wild, I think. Speaking of models out of China, a new AI image editing tool from Alibaba's Qwen team

Starting point is 00:04:05 is getting a ton of chatter as a potential new disruptive force in that area. Based on their benchmark-topping Qwen-Image model released earlier this month, the team has now released an open-source editing tool called Qwen-Image-Edit. The tool can perform Photoshop-style edits using text prompting, similar to the editing modes released by OpenAI and Google based on their own image models. Still, people are initially really impressed with the quality and attention to detail. The team is making some pretty big claims about where the limits of performance are, with Junyang Lin posting, it can remove a strand of hair, very delicate image modification,

Starting point is 00:04:37 which even if that isn't exactly true, and I haven't tried it yet to know, you got to love the bravado from the people who built the model. Now, later in the main episode, we will be talking about another image model that's getting a ton of attention, the so-called nano-banana model, but taken together we might be in for a big upgrade in that particular area of AI. Looking at some fundraising news, Databricks is finalizing a round that will see their valuation jump to $100 billion. Sources say that Databricks has signed a term sheet with existing investors, including

Starting point is 00:05:06 Thrive Capital, Insight Partners, and Andreessen Horowitz to raise about a billion dollars. This is a 60% increase over Databricks' last raise in December, which happened at a $62 billion valuation. That round raised $10 billion and was one of the largest private fundraising rounds ever. Many noted back then that Databricks were using that gigantic Series J as something of a substitute for going public, and in that vein, going back to the private markets for a Series K is something of an unusual move. You might have seen one of the million tweets floating around that say some equivalent of what happens when we get past Series Z. Still, Databricks CEO Ali Ghodsi said in a statement, we're seeing tremendous investor interest because of the momentum behind our AI products,

Starting point is 00:05:45 which power the world's largest businesses and AI services. We're thrilled this round is already oversubscribed and to partner with strategic long-term investors who share our vision for the future of AI. In comments to the Wall Street Journal, he added that he was not planning on fundraising so soon, but that he's receiving daily inbound from investors trying to put money into the company. He said, it wasn't this way two months ago, but in the last month, it's just been constant. How that relates to some of the jitters on Wall Street, which again we'll talk about in the main episode, remains to be seen, but for now, at least in the private capital markets,

Starting point is 00:06:14 there is clearly still a lot of appetite. That, though, will do it for today's headlines. Next up, the main episode. This episode is brought to you by Blitzy, the Enterprise Autonomous Software Development Platform with Infinite Code Context. Blitzy uses thousands of specialized AI agents that think for hours to understand enterprise-scale code bases with millions of lines of code. Enterprise engineering leaders start every development sprint with the Blitzy platform, bringing in their development requirements. The Blitzy platform provides a plan, then generates and pre-compiles code for each task. Blitzy delivers 80% plus of the development work autonomously while providing a guide for the final 20% of human

Starting point is 00:06:50 development work required to complete the sprint. Public companies are achieving a 5x engineering velocity increase when incorporating Blitzy as their pre-IDE development tool, pairing it with their coding co-pilot of choice to bring an AI-native SDLC into their org. Blitzy is providing a limited time, 30-day free proof of concept for qualifying enterprises. The team will provide a 5x velocity increase on a real development project in your org. Visit blitzy.com and press book demo to learn how Blitzy transforms your SDLC from AI-assisted to AI-native.

Starting point is 00:07:20 That's BLITZY.com. As a founder, you're moving fast towards product market fit, your next round, or your first big enterprise deal. But with AI accelerating how quickly startups build and ship, security expectations are higher earlier than ever. Getting security and compliance right can unlock growth or stall it if you wait too long. With deep integrations and automated workflows built for fast-moving teams, Vanta gets you audit-ready fast and keeps you secure with continuous monitoring as your models, infra, and customers evolve. Fast-growing customers like LangChain, Writer, and Cursor trusted Vanta to build a

Starting point is 00:07:55 scalable foundation from the start. And look, as someone who lives in the world of enterprise procurement, I love how Vanta makes it easy to get compliance right. The last thing you need when you're trying to win that big deal is to have it scuttled by something that Vanta has solved for over 10,000 companies. Go to vanta.com slash NLW to save $1,000 today through the Vanta for Startups program and join over 10,000 ambitious companies already scaling with Vanta. That's V-A-N-T-A dot com slash NLW to save $1,000 for a limited time. If you are a regular listener, you will have heard about Superintelligent's Agent Readiness Audits at this point. But I wanted to tell you today about the full suite of Agent Readiness products that go beyond just the initial readiness report.

Starting point is 00:08:38 Over the last six months, Superintelligent has built out an entire agent planning suite. We help you move from discovery to planning to implementation. After you've completed your agent readiness audits, we help you double click on your most important use cases with what we call our use case planning reports. These reports are going to help you understand what sort of technical preparation you need to do to be ready for a use case, what challenges you might face in implementation, and whether you should be thinking about building, buying, partnering, or some combination. After that, you can even get a spec document in what we call our technical blueprint that gives either your developers or the developers of the partner you work with what they need to build exactly the agent that you're looking for. If you want to learn more about Superintelligent's agent planning suite, we've built a custom GPT to answer your questions. Just go to bit.ly slash super super agent. That's bit.ly slash super super agent, all one word. And if you have any

Starting point is 00:09:32 questions, the agent can even help you book an appointment with our team. Welcome back to the AI Daily Brief. Today we're doing something a little bit different. We are, of course, in the very last days of summer here in America. And this is always sort of a quiet, weird time, frankly, in advance of back to school and back to work and all the things that happen in September and beyond getting into the fall. It can be a time when there's a lot of volatility in markets because trading volume is low, liquidity is thin, and because of that, narratives tend to get really exaggerated, right? The bigger the market moves, the more people think big things are changing. Now, in AI right now, as I've talked about quite a bit over the last couple of weeks,

Starting point is 00:10:13 we've definitely been in a narrative ebb. And what I want to do today is get into all of that, talk about one thing that I do think is actually important outside of just the immediate term as relates to those narrative questions, and ultimately come out the other side talking about where I think all of the excitement in AI actually is right now and what people who are actually building in the field are looking forward to. So first, this narrative ebb. The TLDR is that the launch of GPT-5 opened up a huge seam for anti-AIism. If you were spending any time near social media or really any mainstream media outlet,

Starting point is 00:10:47 you're seeing tons of tweets and articles like this one. We're witnessing the most expensive act in tech history, pouring trillions into a dead-end GenAI LLM paradigm that can't deliver real intelligence ever. 500 billion later, no AGI, no cognition, no mind, no clue, just theater and autocomplete version 5. We have the New Yorker piece that I mentioned a couple of times, What if AI doesn't get much better than this?

Starting point is 00:11:08 And now added to that, this Atlantic piece that goes even farther, AI is a mass delusion event. Now, I'm not getting deep into the substance of these actual articles, because relative to our conversation here, at least, my point is just that there's a lot more of this happening all at once. Now, if you are a regular listener, you will also know that even as these articles have had their lag from the GPT-5 launch to publishing, the narrative inside the AI user community has shifted fairly dramatically, and people are actually getting a lot out of GPT-5. Not that they are without complaint, nor do they think it's the second coming or anything like that, but it is definitely not

Starting point is 00:11:43 the sort of antagonism that we had that produced the context for all these articles in the first place. As I mentioned, over in the markets, things are getting a little tense right now. Hedge funder Benn Eifert tweeted a screenshot of one of the articles talking about where Sam Altman had admitted that they screwed up the GPT-5 launch, adding, Altman is a huckster, few companies are deriving meaningful value from these products, the capex is rapidly depreciating waste, markets will correct rapidly when the hype falls through and spending collapses. Now, when it comes to that claim that few companies are getting value out of this, you might have seen this article ripping around the socials from Fortune, MIT report: 95% of generative AI pilots

Starting point is 00:12:20 at companies are failing. Now, once again, the specifics here are out of scope for today's episode, although I do believe that this weekend's big thing slash long reads episode is going to be all about why I actually think AI pilots fail. The TLDR on my take here is that while the naysayers are using AI pilots failing as an indictment of the technology itself, I think it's also, and perhaps even more, an indictment of the systems into which those technologies come. But again, relative to the narrative shift, that doesn't really matter. You are in fact seeing this study tortured into other headlines that warp even what

Starting point is 00:12:53 this one is trying to say. Like this one from The Hill, companies have invested billions into AI, 95% getting zero return, which, while wildly inaccurate in the substance, does accurately sum up some of the shift in sentiment that we're seeing. It's not just the study, though, that has the markets talking. In this confusing market environment where tariff and macro policy is still up in the air, where the markets are assuming that we're getting a rate cut in September, but we don't really know for sure because Powell is going to do what Powell is going to do, you've got a lot of people worried about just how dependent on technology and specifically AI the entire market structure is.

Starting point is 00:13:27 Citron Research is going hard after Palantir, arguing that the company is wildly overvalued. You've got this Meta news around their AI research organization being once again interpreted as big tech somehow pulling away from AI investment, and all of it is getting super amplified into this gloomy sort of narrative. Now, my strong instinct is that all of this reveals two things. One is that there is a lot of short-term instability right now, particularly around the market aspects of this, that I think it's important to understand and appreciate, but not get too attached to, as things are incredibly fast moving in markets and are always changing.

Starting point is 00:14:03 The other thing that it reveals, though, is that there is this big divide between the way that those in technology and the AI industry specifically talk about and think about AI and how the rest of America outside the tech industry looks at things. And in this case, I do think it's important to identify the specific country domain, given that there are wildly divergent levels of optimism in how, for example, American citizens versus citizens in places like China view AI. Part of this gap is a much longer structural problem around tech media. I think about a decade ago, after the 2016 election, when tech started being blamed on the left for the election of Donald Trump, and on the right for censoring

Starting point is 00:14:40 conservatives, an entire separate or alternative tech media infrastructure was built up, which has created lots of very powerful voices in the technology industry itself, but hasn't necessarily done a great job of representing the technology industry opinion in shaping larger American discourse. It seems to me that those chickens are coming home to roost with AI. Capturing this well is an opinion essay from the New York Times yesterday by former Google CEO Eric Schmidt. The essay is called Silicon Valley is drifting out of touch with the rest of America. Schmidt and his co-author Selina Xu write, Building a machine more intelligent than ourselves. It's a centuries-old theme, inspiring equal amounts of awe and dread, from the agents in the

Starting point is 00:15:18 Matrix to the operating system in Her. To many in Silicon Valley, this compelling fictional motif is on the verge of becoming reality. Reaching artificial general intelligence or AGI, or going a step further, superintelligence, is now the singular aim of America's tech giants, which are investing tens of billions of dollars in a fevered race. And while some experts warn of disastrous consequences from the advent of AGI, many also argue that this breakthrough, perhaps just years away, will lead to a productivity explosion, with the nation and company that get there first reaping all the benefits. And yet, the authors here argue that this obsession with AGI is, as they put it, alienating the general public by not really taking into consideration whatever concerns they might have, but also

Starting point is 00:16:00 failing to appreciate what has already been built. Or, as they write, bypassing crucial opportunities to use the technology that already exists. The authors compare all of this AGI excitement to the way that AI is happening and being received in China. They write, the country's scientists and policymakers aren't as AGI-pilled as their American counterparts. They talk about how Chinese leaders have emphasized, quote, the deep integration of AI with the real economy, and how because of that, they're focused on actual practical uses.

Starting point is 00:16:30 The authors write, in rural villages, competitions among Chinese farmers have been held to improve AI tools for harvests. Alibaba's Quark app recently became China's most downloaded AI assistant, in part because of its medical diagnostic capabilities. Last year, China started the AI Plus initiative, which aims to embed AI across sectors to raise productivity. It's no surprise that the Chinese population is more optimistic about AI as a result. At the World AI Conference, we saw families with grandparents and young children milling

Starting point is 00:16:56 about the exhibits, gasping at powerful displays of AI applications, and enthusiastically interacting with humanoid robots. Over three quarters of adults in China say that AI has profoundly changed their daily lives in the past three to five years, according to an Ipsos survey. That's the highest share globally and double that of Americans. Another recent poll found that only 32% of Americans say they trust AI compared with 72% in China. Coming around to their big point, they write, many of the purported benefits of AGI in science, education, healthcare, and the like, can already be achieved with the careful refinement and use of powerful existing models. For example,

Starting point is 00:17:28 why do we still not have a product that teaches all humans essential cutting-edge knowledge in their own languages and in personalized, gamified ways? Why are there no competitions among American farmers to use AI tools to improve their harvests? Where's the Cambrian explosion of imaginative, unexpected uses of AI to improve lives in the West? When a technology eventually goes mainstream, that's when it's truly game-changing. It's paramount that more people outside Silicon Valley feel the beneficial impact of AI on their lives. AGI isn't a finish line. It's a process that involves humble, gradual, uneven diffusion of generations of less powerful AI across society. Now, regular listeners will know that I have sometimes said that I think AGI is the least

Starting point is 00:18:06 useful term in all of AI. Certainly when it comes to businesses figuring out how to use AI, it's an incredibly distracting concept. It makes sense why technologists and entrepreneurs who are trying to build the next thing are focused on what they see as the big blinking goal there on the horizon. But what it does for everyone else is it basically creates this artificial linearity of progress, best expressed in the numerical naming convention of GPTs, where it becomes the case that all that matters is how much better five is than four and then six is than five

Starting point is 00:18:36 and so on and so forth into infinity. The better question is, of course, not how much better five is than four, but what new things can five do that four couldn't that make my life or my work better, easier, or more full of some new type of opportunity? The good news is, for as much as it seems like all that matters to the AI industry is how much better five is than four, there's actually so much exciting work happening on real problems of applied AI that are going to bear fruit totally outside of just the latent capabilities of the underlying foundation models. Let's talk through a couple examples. And by the way, each of these is something that I'm planning on exploring in more depth, probably over the next couple weeks,

Starting point is 00:19:15 as this is arguably, maybe other than Christmas, the slowest news time in AI and technology. One of those things is memory. Upon announcing that he was joining OpenAI, James Campbell wrote, Memory will fundamentally change our relationship to machine intelligence. And I plan to work extraordinarily hard to make sure we get it right for humanity. Cameron from Letta writes, anytime you see some big name in AI talk about how AGI is supposed to work, you hear them talk about memory,

Starting point is 00:19:39 continuous learning, and maintained state. AGI is a memory problem. Now here again, we can't resist making it an AGI discourse. Andrew Pigninelli also wrote a post recently, Memory is the last problem to be solved to reach AGI. But hold aside the AGI implications of memory.

Starting point is 00:19:56 The reality is that better work on and more solutions around memory, have the potential to massively improve the actual practical applications of AI in the immediate term. In that piece by Andrew, he writes a section called Agents Are Great Processors, but largely lack memory. He says, why are your coworkers valuable to you or your friends, even family? Because they know things about you and know things about your life. They care about you to different degrees because you also know things about them in their lives, and they can interact with you and you with them.

Starting point is 00:20:24 Our systems today get the interaction part right, but that's only half of what's needed to make a digital self. Memory is severely lacking. Now from there, he goes on to talk about where we are in terms of different agent capabilities that would make agents better, spending a lot of time on the challenges of every type of memory from long term to episodic. Like I said, I'm going to get much more into this in a complete episode, but there is so much exciting work being done right now on memory that whether it ultimately translates to AGI or not will not matter in the slightest to you, because what it will mean to you

Starting point is 00:20:55 is AI systems and agentic tools that are radically better at working with you to actually accomplish whatever it is you're trying to do. Next up, another thing that people are working on that's incredibly exciting right now is world models. We talked a bunch about Genie 3 a couple weeks ago when it was announced. Jack Parker-Holder, the co-lead of Genie 3 at Google DeepMind, wrote, we can now generate multi-minute, real-time interactive simulations of any imaginable world. And while again, Jack writes, this could be the key missing piece for embodied AI,

Starting point is 00:21:24 it also just opens up incredible new use cases, not least of which is a totally different approach to customized gaming that could radically alter the face of entertainment. For what it's worth, the big breakthrough of Genie 3 is also related to memory, showing how these things are all connected. In fact, here's Jack again talking about that on an a16z podcast. Folks expected that at some point, you know, video generation, for example, would become real time. Like, you know, when I saw the Genie 3 post, it was like, okay, they actually went and did it. But the spatial memory, the persistence was when I kind of sat up in my chair and I was like,

Starting point is 00:21:58 how did that happen? Could you talk a little bit about when did you discover that as an emergent property or was that a specific design goal? So the TLDR is it was totally planned for, but still incredibly surprising when it worked that well, right? So that specific sample, when I saw it, it was hard to believe. I actually wasn't sure that was the model generating for a second. I was like, I had to watch a few times

Starting point is 00:22:23 and like really check and like freeze the frames and look back and check that it was the same. But so going back a few steps, so obviously, Genie 2 had some memory, right? So this got kind of lost because, I mean, Genie 2 came at a time when there were lots of announcements, very exciting announcements. I mean, Veo 2 only a few days later. It was a busy time of the year and the main headline act was that we could do, generate new worlds at all, right?

Starting point is 00:22:50 So that was the thing that we wanted to emphasize. But it did have, you know, a few seconds of memory. And we had a couple of examples. And then for Genie 3, we basically went much more ambitious on the same sort of approach, right? And we made it like a headline goal for ourselves. It's like, can we make the memory be what it is? Right. We said we want minute plus memory and real time and this higher resolution, all in the same model. Now, the other thing about Genie 3 and Veo 3, in fact, is that another discussion that gets lost in the AGI talk, and really just the focus on chatbot LLMs in general, is how

Starting point is 00:23:30 much crazy progress is being made on other modalities. I think a great example of this is this week, where the hyper-enfranchised AI Twitter people have all been talking about this model that showed up on LM Arena called nanobanana. Sheikar Patel writes, there's a new mystery AI image model called nanobanana. It appeared on LMSYS Arena with no announcement. Users report it's shockingly fast, under five seconds, and can do 2D to 3D conversion. No one knows who built it, although the speculation is Google. The hunt is on. And people have been kind of gobsmacked about Nanobanana. D-Studio Project writes, I'm still amazed at how Nanobanana can take a single image and turn it into exactly what's in my head. The consistency is insane: tone, detail,

Starting point is 00:24:12 and vibe all stay perfectly aligned with the original shot. Later they wrote, now I'm testing nanobanana with product replacement. Even with product photos that have complex patterns, nanobanana can still match them perfectly. On average, it only takes me two to three tries to get a solid result. Daria Sarkova also used it for product design, writing, I've been testing nanobanana on my own designs with different lights, settings, and styles, and this tool feels like a game changer for brand designers. She then shows how she integrated product brands with existing photos. Summing it all up, MadPencil writes, Nanobanana is bananas. Now, it seems like we might be getting some news about this soon. Google's Logan Kilpatrick got everyone chattering by simply

Starting point is 00:24:50 posting the banana emoji on Tuesday night, which was followed up by Josh Woodward from Google posting this is banana emoji, leaving everyone to suspect that this is a new Google model that we are going to get announced very soon. And the point is that for all of the focus on GPT-5 and all of the questions that have resulted, there is still so much going on beyond just how performant GPT-5 is compared to the models that came before it with a lower number. There's also a new stealth model in Cursor that people are thinking is the Grok 4 coding model. And basically my strong prediction is that very, very soon, we're going to once again have a narrative shift away from this slightly dreary is-this-as-good-as-it-gets kind of moment into a realization that there is just

Starting point is 00:25:33 so much more happening all the time across so many different dimensions that I continue to think we've still really barely scratched the surface of what in practice all these models and tools can do. We'll have to see, but that is my best bet. For now, as I posted earlier today, it took us till August 20th, but we finally hit the touch-grass part of the summer. And so maybe we all just need to check out for just a little bit. I certainly hope you don't, and instead choose to continue to listen to the show every day. But for now, that is going to do it for the AI Daily Brief. Appreciate you guys listening and watching as always. And until next time, peace.
