The AI Daily Brief: Artificial Intelligence News and Analysis - Stability AI Releases Stunning Stable Video Diffusion Model

Episode Date: November 23, 2023

The company claims the video generator outperforms Pika and Runway. Note: this is the Brief that was supposed to come out Wednesday, which got bumped for OpenAI news. ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI. Subscribe to The AI Breakdown newsletter: https://theaibreakdown.beehiiv.com/subscribe Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@TheAIBreakdown Join the community: bit.ly/aibreakdown Learn more: http://breakdown.network/

Transcript
Starting point is 00:00:00 Today on the AI Breakdown, Stability AI has released a new Stable Video Diffusion model. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to breakdown.network for more information about our YouTube, our newsletter, and our Discord. Hello, friends, quick note before we get into the brief today. So I actually wasn't planning on doing a Thanksgiving episode. However, I had recorded this brief on Tuesday night before Sam Altman had been restored to the CEO role at OpenAI. And obviously, that whole story had to bump our normal format of a brief and then a main episode, and I figured why not give you this shortened brief as a
Starting point is 00:00:43 full episode, just in case you're in one of those situations where maybe you need to duck outside for a few minutes to collect yourself and want to zone back into the wild world of AI before you throw yourself back into the fray of family. In any case, hope you enjoy this slightly shorter than normal episode. Happy Thanksgiving. Welcome back to the AI Breakdown Brief, all the AI headline news you need in around five minutes. Well, yesterday I was very excited to see something new from Stability AI. Just as everyone was getting so, so sick of the OpenAI discussion, Stability raced in with a very cool new model that they are calling Stable Video Diffusion. It is, according to their launch tweet, their first foundation model for generative video based on the image model Stable Diffusion.
Starting point is 00:01:27 So this is an area that I pay very close attention to. Obviously, video generation is a bit behind text-to-image generation, but text-to-image generation has become one of the most used tools for generative AI. Generative video certainly has gotten a lot better. Companies like Runway and Pika Labs have been making great strides, and a lot of early adopters have done some pretty amazing things stringing together very short clips into complete, interesting works of art and storytelling. And while I don't think that the use cases are as ubiquitous as those of image generation, which can be everything from creative to professional to useful and practical in the context of YouTube
Starting point is 00:02:04 cover images, the fact that video generation models are trending in a direction where there's a radically reduced barrier to entry for people to create video means it's very likely that more people will create video, and new types of content, new types of storytelling, and new modalities of all of this will emerge. Now, as Stable Diffusion is wont to do, they're making the code for Stable Video Diffusion available on their GitHub, and they're also sharing the weights required to run the model locally on their Hugging Face page. It's quite clear from their announcement that this is just a first step.
Starting point is 00:02:31 They write, our video model can be easily adapted to various downstream tasks, including multi-view synthesis from a single image with fine-tuning on multi-view datasets. We are planning a variety of models that build on and extend this base, similar to the ecosystem that is built around Stable Diffusion. Now, the first release is two image-to-video models that can generate 14 and 25 frames at customizable frame rates between 3 and 30 frames per second. Stability argues that based on external evaluation, these models are currently surpassing their competitors
Starting point is 00:02:59 in Runway and Pika. Now, for those who are excited to dig in, however, these models are released with a research license only, and as they put it, are not intended for real-world or commercial applications. Still, it's a super cool advance in an area that I think is very, very interesting, and one that I'm going to continue to keep an eye on. Next up, because OpenAI doesn't have enough going on right now, it and Microsoft are the subject of yet another author copyright lawsuit around model training. Hollywood Reporter editor Julian Sancton is leading a new class action lawsuit that was filed in Manhattan federal court, alleging that OpenAI copied tens of thousands of nonfiction books without
Starting point is 00:03:33 permission. So what's new about this suit? Well, it's the first one that names Microsoft as a defendant as well. And, well, actually, that's kind of the only new thing. Sancton's attorney said, while OpenAI and Microsoft refused to pay nonfiction authors, their AI platform is worth a fortune. The basis of OpenAI is nothing less than the rampant theft of copyrighted works. OpenAI, perhaps unsurprisingly, declined to comment. Meanwhile, however, these lawsuits are not necessarily going all that well. The Hollywood Reporter writes, Sarah Silverman hits stumbling block in AI copyright infringement lawsuit against Meta. TL;DR, a federal judge has dismissed most of Sarah Silverman's lawsuit against Meta over the unauthorized use of the author's copyrighted books
Starting point is 00:04:12 to train its generative AI model. Writes The Hollywood Reporter, U.S. District Judge Vince Chhabria on Monday offered a full-throated denial of one of the authors' core theories that Meta's AI system is itself an infringing derivative work made possible only by information extracted from copyrighted material. Wrote the judge, this is nonsensical. There is no way to understand the LLaMA models themselves as a recasting or adaptation of any of the plaintiffs' books. Now, another of Silverman's arguments, that every result produced by their tools constitutes copyright infringement, was dismissed for similar reasons, in other words, because her lawyers didn't offer any evidence that any of the outputs, quote, could be understood as recasting, transforming, or adapting the plaintiffs'
Starting point is 00:04:49 books. Now, overall, Chhabria gave her lawyers a chance to re-plead that claim, along with five others that weren't allowed to advance. Now, as this article points out, this is the second recent case in which a judge has denied large portions of one of these class action lawsuits around copyright claims when it comes to AI training. In both cases, it appears that plaintiffs are going to have to present evidence of, quote, infringing works produced by AI tools that are identical to their copyrighted material. Now, whatever setbacks they have, I don't anticipate this to be the end of the lawsuits that come around this area, and I don't think it's going to get fully resolved until it hits the Supreme Court, and maybe not even then. Over in markets, Nvidia had its earnings report, and once again,
Starting point is 00:05:28 it reported another quarter of record sales. In the fiscal third quarter, sales more than tripled to $18.1 billion, which was well above Wall Street forecasts. Profits also rose to $9.2 billion, up from $680 million a year earlier. And yet, even with these impressive numbers, it was not enough to get markets really excited. Nvidia's shares were flat in after-hours trading. And part of the reason for that might be that, as well as they're doing now, there are serious headwinds in terms of the new regulations and restrictions around the export of AI chips that clearly investors are nervous will ultimately have an impact on Nvidia's bottom line. Speaking of market excitement, the CEO of HP told Jim Cramer on CNBC that the advance of AI
Starting point is 00:06:07 is likely to double the growth of the PC category. Enrique Lores said, this will drive significant momentum in the category, some in '24, more in '25, more in '26. As we've said before, we think this is going to double the growth of the PC category starting next year. Now, of course, what we're talking about here is the ability for PCs to actually run complex AI models locally without having to touch the cloud. It's very clear that this is something that Apple is prioritizing and driving towards, and they are certainly far from alone. Now, part of why markets are so excited about AI is the belief that it will lead to major increases in productivity. Well, one new think tank study suggests that the use of AI could create a four-day workweek
Starting point is 00:06:45 for almost one-third of workers. Writes The Guardian, the report from the think tank Autonomy found that projected productivity gains from the introduction of AI could reduce the working week from 40 to 32 hours for 28% of the workforce, 8.8 million people in Britain and 35 million in the U.S., while maintaining pay and performance. Said the director of research at Autonomy, quote, too many studies of AI, large language models, and so on, solely focus on either profitability or a jobs apocalypse. This study tries to show that when the technology is deployed to its full potential, and the purpose of the technology is shifted, it can not only improve work practices, but also improve work-life balance. Now, obviously this is super interesting. However, one thing
Starting point is 00:07:22 that is worth noting, especially as we have broader conversations about AI policy and direction, is that when it comes to shifting from a 40-hour workweek to a 32-hour workweek, that's not just a productivity question alone. It's also a social contract question. It's a societal expectation question. In other words, it's not going to happen unless society decides that that's a good way to move forward. For it to do that, people have to advocate for it, and it likely has to become policy. But in either case, the fact that this is the way that we're talking is one more indication of just how high-potential so many people think this technology is. However, for now, that is going to do it for today's AI Breakdown Brief.
Starting point is 00:07:59 Happy Thanksgiving.
