The AI Daily Brief: Artificial Intelligence News and Analysis - How Just Released Sora Stacks Up to Other AI Video Generators

Starting point is 00:00:00 Today on the AI Daily Brief, Sora has finally been released by OpenAI, and Google announces a breakthrough in quantum computing. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes. Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes. There has been this longstanding pattern of Google trying to command a news cycle and OpenAI swooping in and finding a way to front-run them. Yesterday, though, we kind of got the inverse, where theoretically Sora was the biggest announcement, until all of a sudden Google came out with this announcement, which had people's jaws on the floor. Google has announced a new quantum computing chip called Willow.

Starting point is 00:00:50 They claim that, quote, Willow performed a standard benchmark computation in under five minutes that would take one of today's fastest supercomputers 10 septillion (that is, 10 to the 25) years, a number that vastly exceeds the age of the universe. Part of the announcement is a claim that they've solved the scaling problem with quantum computing. The chip architecture is capable of reducing errors exponentially as more qubits, the quantum equivalent of bits, are added. Errors due to interaction with the surrounding environment were the key issue with the technology and stood as an unresolved problem for almost 30 years. Google Quantum AI founder Hartmut Neven wrote,

Starting point is 00:01:24 this historic accomplishment is known in the field as below threshold, being able to drive errors down while scaling up the number of qubits. You must demonstrate being below threshold to show real progress on error correction. And this has been an outstanding challenge since quantum error correction was introduced by Peter Shor in 1995. If that was all completely Greek to you, the TLDR implication is that this is the first time it appears that there's been a viable pathway to quantum computing at scale. Until now, all experiments have been extremely small proofs of concept, novel and important, but not the first step on the path towards building a useful quantum computer. The big idea with quantum computing is the ability to process

Starting point is 00:01:57 certain computations at an unfathomable speed. Traditional computing only moves in a straight line, with the processor testing solutions in sequence before coming up with an answer. Quantum computing allows all solutions to be tested simultaneously. In terms of why we're discussing it here, the technology could be a massive unlock for AI training once it's commercially viable. Neven wrote, my colleagues sometimes ask me why I left the burgeoning field of AI to focus on quantum computing. My answer is that both will prove to be the most transformational technologies of our time,

Starting point is 00:02:23 but advanced AI will significantly benefit from access to quantum computing. Quantum computing will be indispensable for collecting training data that's inaccessible to classical machines, training and optimizing certain learning architectures, and modeling systems where quantum effects are important. This includes helping us discover new medicines, designing more efficient batteries for electric cars, and accelerating progress in fusion and new energy alternatives. Many of these future game-changing applications won't be feasible on classical computers. They're waiting to be unlocked with quantum computing. Now, it seems incredible, but what is the hype mitigation version of this story? Former NVIDIA leader Bojan Tunguz said, I've been holding off saying

Starting point is 00:02:57 more about Google's purported quantum computing breakthrough until I read a bit more about it. It turns out, as I had suspected, it was way overhyped. Yes, it's good science, but in terms of any kind of practical applications, we are probably at least a decade away. Even then, it will most likely be specialized areas of application like molecular dynamics. A good rule of thumb is that quantum computers are really good at doing stuff that comes naturally down to quantum mechanics, which is literally all about randomness. Deterministic computations that are relevant for conventional computation are on par with what

Starting point is 00:03:24 conventional computers can do. Still, he kind of just shrugged off the idea that quantum computing is a decade away. And I think that's where a lot of the excitement is coming from. It's about what the future timeline looks like. Educator Paul Couvert writes, so in less than 24 hours we got Google unveiling a quantum chip that solves in five minutes what would take the best supercomputers 10 septillion years,

Starting point is 00:03:43 OpenAI launching Sora with almost photorealistic AI video quality. The timeline is unreal. Venture investor Neil Khosla writes, stuff is about to get really weird if AGI and useful quantum computing timelines line up. Add a little robotics to the mix, and there's a growing probability the world 25 years from now looks literally nothing like it does today.

Starting point is 00:03:59 Alex Treese points out, best part of this quantum announcement is Google saying, yeah, I don't know, we probably live in a multiverse then. The specific line he pulled from the blog post is, it lends credence to the notion that quantum computation occurs in many parallel universes, in line with the idea that we live in a multiverse. Again, a throwaway line. All in all, while it may not be technology for today, it is still pretty cool. Speaking of AGI, Microsoft AI lead Mustafa Suleyman has weighed in on the AGI debate and thinks something very different than Sam Altman. During a Reddit AMA, Suleyman said that the technology is still a decade away, adding, I don't think it can be done on NVIDIA Blackwell GB200s. I do think it is going to be plausible at some point in the next two to five generations.

Starting point is 00:04:38 I don't want to say I think it's high probability that it's two years away, but I think within the next five to seven years, since each generation takes 18 to 24 months now. So five generations could be up to 10 years away depending on how things go. This, of course, flies in the face of Sam Altman's recent comments where he said AGI was coming, quote, sooner than most people in the world think, and it will matter much less. Suleyman also dove into the philosophical question of what AGI is, which of course has a pretty big impact on when we think it's going to arrive. He said, to me, AGI is a general purpose learning system that can perform well across all human level training environments. So knowledge work, by the way,

Starting point is 00:05:11 that includes physical labor. A lot of my skepticism has to do with the progress and complexity of getting things done in robotics. But yes, I can well imagine that we'll have a system that can learn without a great deal of handcrafted prior prompting to perform well in a very wide range of environments. I think that that is not necessarily going to be AGI, nor does that lead to the singularity, but it means that most human knowledge work in the next five to ten years could likely be performed by one of the AI systems that we develop. And I think the reason why I shy away from the language around singularity or artificial superintelligence is because I think they're very different things. Of course, the definition of AGI matters a great deal to Microsoft as it

Starting point is 00:05:42 would trigger the termination of their deal with OpenAI if the lab manages to achieve it, or at least if OpenAI's board declares that AGI has been achieved. Recent reporting, however, suggests that this clause in the contract is being reconsidered, with OpenAI potentially removing it in order to smooth the process of converting to a for-profit company. Suleyman touched on the reported tensions between the two firms, stating, every partnership has tension. It's healthy and natural. I mean, they're a completely different business to us. They operate independently and partnerships evolve and they have to adapt to what works at the time. So we'll see how that changes over the next few years.

Starting point is 00:06:11 For what it's worth, this is the least denial-y response to any of the questions about tensions we've seen, which could be a reflection of the fact that it's gotten more sour, or could also simply reflect the fact that Suleyman was sort of brought in as Microsoft's hedge against whatever the heck is going to go on in OpenAI. Lastly today, xAI have officially announced the release of their cutting-edge image model by the end of the week. Late on Friday night, xAI released their in-house image model named Aurora. The model was only available for a few hours, but that was enough time for users to be amazed by its capabilities. It produced some of the best photorealistic images seen to date from generative AI and seemed to excel at replicating celebrities

Starting point is 00:06:46 in particular. Once it was pulled down, users were left wondering when they would get another chance to play with the model, or indeed whether there had been a critical safety issue, perhaps associated with the near complete absence of copyright and deepfake guardrails. The team has now confirmed Aurora's release in select regions with a full rollout within the week. In an announcement blog post, they wrote, Grok can now generate high-quality images across several domains where other image generation models often struggle. It can render precise visual details of real-world entities, text, logos, and can create realistic portraits of humans. xAI developer Ethan Knight wrote, earlier today we released a new model codenamed Aurora

Starting point is 00:07:18 that gives Grok the ability to generate extremely photorealistic images, and in the future, even edit them. It's free to use for all of X, try it out, and send us what you're creating. This model was trained entirely in-house with a very small team and we're excited to finally show it off. Elon Musk confirmed that the model was developed internally in around six months, which settles a few questions from Friday night's sneak preview, primarily whether the model was a collaboration with Black Forest Labs, who provides the Flux model that currently drives xAI's image generation capabilities. But the fact that the model was trained so quickly by a small team suggests either

Starting point is 00:07:48 that we're about to see a massive improvement in image generation across the board, or that the xAI team is simply as cracked as they seem to think they are. In either case, another contender in the image generation space. That, however, is going to do it for today's AI Daily Brief Headlines edition. Next up, the main episode. Today's episode is brought to you by Plumb. Want to use AI to automate your work but don't know where to start? Plumb lets you create AI workflows by simply describing what you want.

Starting point is 00:08:15 No coding or API keys required. Imagine typing out, AI, analyze my Zoom meetings and send me your insights in Notion, and watching it come to life before your eyes. Whether you're an operations leader, marketer, or even a non-technical founder, Plumb gives you the power of AI without the technical hassle. Get instant access to top models like GPT-4o, Claude Sonnet 3.5, AssemblyAI, and many more. Don't let technology hold you back. Check out Use Plumb, that's Plumb with a B, for early access to the future of workflow automation. Today's episode is brought to you by Vanta. Whether you're starting or scaling your company's

Starting point is 00:08:46 security program, demonstrating top-notch security practices and establishing trust is more important than ever. Vanta automates compliance for ISO 27001, SOC 2, GDPR, and leading AI frameworks like ISO 42001 and the NIST AI Risk Management Framework, saving you time and money while helping you build customer trust. Plus, you can streamline security reviews by automating questionnaires and demonstrating your security posture with a customer-facing trust center, all powered by Vanta AI. Over 8,000 global companies like LangChain, Lila AI, and Factory AI use Vanta to demonstrate AI trust and prove security in real time. Learn more at Vanta.com slash NLW. That's Vanta.com slash NLW. Today's episode is brought to you, as always, by Superintelligent. Have you ever wanted an

Starting point is 00:09:31 AI Daily Brief but totally focused on how AI relates to your company? Is your company struggling with AI adoption, either because you're getting stalled figuring out what use cases will drive value or because the AI transformation that is happening is siloed at individual teams, departments, and employees and not able to change the company as a whole? Superintelligent has developed a new custom internal podcast product that inspires your teams by sharing the best AI use cases from inside and outside your company. Think of it as an AI Daily Brief, but just for your company's AI use cases. If you'd like to learn more, go to besuper.ai slash partner and fill out the information

Starting point is 00:10:08 request form. I am really excited about this product, so I will personally get right back to you. Again, that's besuper.ai slash partner. After months of waiting, OpenAI has finally released Sora. When the company announced last week that they were going to be doing 12 Days of Shipmas, rumors immediately started swirling that Sora was going to be a part of it, and frankly, a lot of people felt like if it wasn't, there was something wrong. Well, it turns out that those rumors were correct, as yesterday OpenAI tweeted, our holiday gift to you, Sora is here. They continued, now you can generate entirely new

Starting point is 00:10:44 videos from text, bring images to life, or extend, remix, or blend videos you already have. We've developed new interfaces to allow easier prompting, creative controls, and community sharing. Since previewing Sora in February, we've been building Sora Turbo, a significantly faster version of the model to put in your hands. We're releasing it today as a standalone product to Plus and Pro users. We hope this early version of Sora will help people explore new forms of creativity. We can't wait to see what you create. A couple things that I think are interesting about this. As you just heard, the model can generate videos based on text or image inputs.

Starting point is 00:11:15 It can create clips up to 20 seconds long and up to 1080p in resolution. But alongside the model, they also put a lot of thought into the interface. Users can easily toggle between generated videos that are displayed on a grid or a list. I think more interesting, though, is the storyboarding mode, which allows users to arrange multiple clips into a continuous video. The model will attempt to create seamless transitions between individual clips, but users have control over the speed of cuts. Storyboard also allows frame-by-frame inputs. This is one of those interface updates that makes a huge difference in terms of how usable the model actually is from a

Starting point is 00:11:48 real production standpoint. In terms of availability, for Plus-tier subscribers, Sora is now available with a limit of 50 videos per month at up to 480p resolution or fewer videos at 720p. Pro-tier users can access higher resolutions, longer durations, and up to 500 videos per month. And if you manage to burn through all 500 videos, the model can still be accessed at a lower speed. That is, of course, assuming that you can actually sign up. As you can see, when you go to Sora.com, account creation is not currently available. Altman tweeted, we significantly underestimated demand for Sora. It's going to take a while to get everyone access, trying to figure out how to do it as fast as possible.

Starting point is 00:12:23 My strong guess is that they did not underestimate demand for Sora. It's just that they are constrained, and they decided that it was better to launch and then deal with this to create a sense of urgency and hype than to slow-trickle people who officially had access to it. Another side note of this, Sora is not available in the EU and UK, showing once again how these regions' regulatory stances are denying their citizens access to the cutting edge. When Tegakanch tweeted, bring it to Europe, please, Altman responded, we want to offer our products in Europe and believe a strong Europe is important to the world. We also have to comply with regulation. I would generally expect us for new products to have delayed launches in Europe,

Starting point is 00:12:58 and that there may be some we just can't offer. Now, when it comes to why it took so long to get Sora, part of it seemed to be a safety consideration. For now, OpenAI seems to be going with the plan of releasing a model and fine-tuning the safety parameters as they see how people use it. They wrote, we're introducing our video generation technology now to give society time to explore its possibilities and co-develop norms and safeguards that ensure it's used responsibly as the field advances. The model complies with the C2PA standard, which ensures that videos are identifiable as AI generated, and have watermarks. Uploads of people are limited at launch, they said, but they also said that they would become available as the company refines their deepfake mitigations.

Starting point is 00:13:35 They said early feedback from artists indicates that this is a powerful creative tool that they value, but given the potential for abuse, we are not making it initially available to all users. During the live stream, Sora product lead Rohan Sahai acknowledged that this is a pretty steep tradeoff, but said, we obviously have a pretty big target on our back as OpenAI. We want to prevent illegal activity on Sora, but we also want to balance that with creative expression. We know that it will be an ongoing challenge. We might not get it perfect on day one. We're starting a little conservative, and so if our moderation doesn't get it quite right, just give us that feedback. And indeed, some users are running into that, with Nick St. Pierre, for example, failing to generate a bear eating a salmon.

Starting point is 00:14:10 But let's talk about what people are finding with this. Marques Brownlee, who is a famous product reviewer and has recently been ragging on a lot of products coming out of the AI space, although admittedly it's AI hardware, tweeted a review yesterday, saying, the rumors are true. Sora, OpenAI's video generator, is launching for the public today. I've been using it for about a week now. I've learned a lot testing this. The video has a bunch of garbled text, the telltale sign of AI generated videos.

Starting point is 00:14:36 But the cutaways, the moving text ticker, the news style shots, those were all things Sora decided to do on its own, and those news anchors looked very real. It's still a product, though, with pros and cons, and one of the cons is that physics is still hard. Without a quote-unquote understanding of the objects in the video, the model is still prone to hallucinations

Starting point is 00:14:52 in the form of movements that don't make sense, and lack of object permanence. He then shared a few examples of that, where the physics of the objects in the videos just don't quite work. And that's something we'll come back to in a moment. Brownlee continues, it can be really good at landscapes. Almost any drone shot of a significant landmark could pass for stock footage

Starting point is 00:15:08 or is very close to usable for an establishing shot in a documentary or low-budget film. Turns out it can do a passable job with cartoon style or stop-motion style, since the irregularities in movement and physics appear more stylistic. The other features like remixing videos or turning images into videos can be useful tools if you know what you're doing, but the most consistent finding for me was that the models don't know what direction or speed makes sense for objects in that specific picture. So sometimes it gets it right and sometimes it gets it really wrong. To the extent there was a clear critique, it was

Starting point is 00:15:36 definitely about Sora not having solved physics. Christopher Bryant showed a video of a set of birds flying in a way that really was unnatural, to which Mr. Bizarro said, the bird test, no model has passed it yet. Anjney Midha from a16z writes, big props to the Sora team for actually shipping. The feed's full of embarrassing displays of how the model isn't a world simulator. But the torrent of human preference data OpenAI will unlock via the new interface is gold. If Google Cloud, Azure, AWS, Adobe, et cetera, aren't updating their urgency priors right now, they've missed the point. That idea of a world simulator or a world model is something that we explored in last week's Long Reads episode, and this is a good context to go back and see why some people think that that's going to be

Starting point is 00:16:15 key for helping AI reach the next level. Victor M writes, Sora's videos look impressive, but physics understanding and consistency is still not there yet. This prompt was quite simple, a humanoid robot standing near the table with red, green, and blue cubes on it, performs a cube stacking task with red in the bottom and blue on the top. It then shows the robot kind of doing it, but not really. And yet it's not like the physics is always off. Edwin Arbus showed a video of a prompt, a golden retriever with a shiny wet coat skillfully balances on a surfboard as it rides a gentle wave at Pacifica Beach. The dog's tongue hangs out in excitement and its eyes are focused on the horizon. The backdrop includes a wide expanse of the ocean with rolling waves and clear blue sky.

Starting point is 00:16:52 And that one looks really natural. The physics are much, much closer to the real world. The other thing that you see a lot of is people comparing their results from Sora to other models including Kling and Runway. PJ Ace, the CEO of Filmport.AI, said, my review of Sora after paying $200, sometimes it produces something great but you can get better results elsewhere. Kling, MiniMax, and Runway have nothing to worry about in the near future. One specific thing that he pointed out was that Sora seemed to have more trouble with image-to-video, which is something that Marques Brownlee had pointed out as well. In terms of the status of where everything else is, Pika Labs released their latest version, 1.5, back in October, and it's known for Pikaffects like exploding items, squishing items, and cake-ify. It seems designed to be useful for social media clips right now, as opposed to a

Starting point is 00:17:34 fully professional tool. Runway released Gen-3 Alpha Turbo in October. The big news at the time was that they had partnered with Lionsgate, so presumably that studio at least thinks that Runway is studio quality and useful in a professional environment. Luma Labs updated Dream Machine back in November, which has been notable for how accessible it is. There have been a lot of examples of it being used for fashion, marketing, and filmmaking. And finally, there's Kling, a Chinese model whose latest version was released in September and was for many people mind-blowing. And so that's where we stand. It's been less than 24 hours with a very small number of people having access, so it's hard to know so far how it really compares. What's for sure is that video generation in 2025 is going to be radically more accessible

Starting point is 00:18:14 and much more used than it was this year, and there are going to be some pretty big implications of that. Later in the week, I'll come back and explore some business use cases for Sora. For now, though, that is going to do it for today's AI Daily Brief. Appreciate you listening or watching, as always. Until next time, peace.
