The AI Daily Brief: Artificial Intelligence News and Analysis - 5 Use Cases for Veo 2
Episode Date: December 18, 2024
Google has responded to Sora with their own updated video model Veo 2, and it's getting rave reviews. NLW explores a number of use cases, from social media to advertising to b-roll to establishing shots and beyond. Brought to you by: Vanta - Simplify compliance - https://vanta.com/nlw The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief, Google has announced Veo 2, and today we're discussing the most interesting
use cases that are available right now. The AI Daily Brief is a daily podcast and video about the
most important news and discussions in AI. To join the conversation, follow the Discord link in our show notes.
Hello, friends, quick note before we dive into today's episode: the main episode, 5 Use Cases for Veo 2,
got fairly long, and so we're just doing a main episode, no headlines today. We will be back
with the headlines as normal tomorrow. But for now,
let's dig in and see how people are actually getting value out of AI video generation right this moment.
Welcome back to the AI Daily Brief.
Something really interesting has been happening recently.
And it was summed up in this tweet from Riley Brown who said,
why does it feel like Google is the underdog that everyone is rooting for?
Of course, what he's referring to is the fact that Google is really getting its groove back from a product perspective.
We've talked extensively and we'll continue to talk about NotebookLM.
But today it's all about Veo, or more specifically Veo 2, which has really stolen a lot of thunder
from OpenAI's Sora. What we're going to do today is to discuss the announcement a little bit and then dig
deep on a set of use cases that the combination of Veo 2 and Sora opens up. So what is included in this
announcement? First of all, it's both Veo 2 and Imagen 3, so it's not just their video generation
model, but also their image generation, although our focus will be on video generation. Veo 2 can produce
two-minute clips and resolutions up to 4K. That's four times the maximum resolution of Sora and six
times the duration, with both being industry leading for a consumer-facing model. A big part of
Google's selling point is improved physics and user control. Physics in particular seems like a notable
weak point for Sora. And the difference did seem clear across a number of clips that circulated yesterday.
The model can generate a pair of hands cutting a tomato, a task that Sora failed at spectacularly.
Then there's a very impressive video of a deck of cards being shuffled, a task that Dennis Kardonski
of Sovereign AI referred to as the Turing test for video. There's this video
of a truck speeding down a road and then veering off to go over a waterfall. And by the way,
at this point, if you are a listener, I would suggest you either subscribe to the YouTube or just
fire up Spotify where you can watch the video as well, because this is definitely an episode that
benefits from the visual. Anyways, this truck video that we're referring to demonstrates a range
of really tricky physics problems where other models have been challenged. Then of course,
there's the classic throwback to those Will Smith videos from about a year and a half ago,
with a successful video creation of a man eating spaghetti. It's definitely the
control of physics that has people most excited. AI design consultant Marco wrote,
What stands out the most to me about Google Veo 2 model is that it appears to actually understand
physics. That's a big leap forward. Another leap forward is the ability to recreate professional
cinematography techniques, like replicating camera motion and the look of different equipment.
Google wrote, Veo 2 understands the language of cinematography: ask it for a genre, specify a lens,
suggest cinematic effects, and Veo 2 will deliver. Ask for a low-angle tracking shot that glides
through the middle of a scene or a close-up shot on the face of a scientist looking through her
microscope and Veo 2 creates it.
Video benchmarks are inherently subjective, but Google is claiming that Veo 2 outperforms
Sora and other rival models on preference and prompt adherence.
The model is now available through Google Labs' VideoFX platform, but you will have to
join the wait list for the moment, which is probably the biggest downside.
So as you would expect, there are a ton of comparisons to Sora.
Marques Brownlee writes, if these handpicked examples are real, they look better than anything
I've gotten out of Sora. And entrepreneur Bindu Reddy writes, Google has officially turned the tables on
OpenAI. All you have to do is out-announce and drown out the other side. OpenAI was hoping for a
big press cycle against Google, given that their search is now free. However, Google stole the limelight
with video and image models. Still, I think for me, the conversation about whether Veo 2 or Sora is better
is much less interesting, if only because it's very, very temporal. What's more interesting to me is thinking about,
given where the overall state of video generation is, inclusive not only of Sora and Veo 2,
but also Pika 2.0, Runway, and Luma Labs, what are the use cases that are actually online right now?
The first use case I want to discuss is social media creations.
And this is really where Pika has tried to carve out a niche.
For example, with Pika, they've preloaded a bunch of effects, such as this Cakeify effect,
which you can see in this video that looks like a hot air balloon in the sky,
but then is actually a giant piece of cake.
There's also their squish effect, where people can take a photo of a daily life object,
and then Pika will squish it in a video that's definitely purpose-built for social media.
Similarly, there's a crush it feature, a melted feature, a dissolve feature,
which looks very much like what happens when Thanos snaps his fingers in Avengers: Infinity War.
And the point is that when it comes to really creative and cool social media videos,
we absolutely have the tools right now to totally change what you can do.
Now, of course, thinking about this from a business context,
that means that brands can be doing more creative social media generation right now.
The line, however, between social media and advertising is increasingly blurry.
Pierrick Chevallier combined reference images of a woman, a Red Bull, a particular set of
kitty-eared headphones, and a neon gamer girl background to show how quickly a branded video
could come together.
He pointed out that we're just at the beginning, saying, just imagine the power once we
achieve 100% object consistency.
And when it comes to advertising, some companies are already jumping ahead and going
full AI for their ads. Last month, for example, eToro released a completely generated ad
featuring a dancing bear and bull in the middle of Times Square. The results were far from perfect,
particularly the scenes where the animals were dancing, where the physics were far from perfect.
What's more interesting, though, was that the Dor Brothers, who produced the ad, said that the entire
project was wrapped up in one and a half weeks from conception to final cut. The idea of producing
an entire ad in a week and a half is absolutely insane and totally game-changing.
This doesn't mean that everyone is going to use AI for all sorts of advertising,
but the dramatic collapse of the cost of advertising production will inevitably change the way
that industry works.
Certainly, it's going to democratize ad creation for smaller companies and brands.
Video advertising is likely to move from something that requires an ad agency and a production
team to something that an intern can whip up in a few days.
Also, the speed of production means that people will be able to respond to pop culture and
cultural moments with near real-time ad generation as well.
All of this hits my thesis that I've shared here before, that the word that most sums up the AI
future is more.
We're just going to have more of everything.
And certainly we're going to have more advertising.
The advertising is going to be more customized, more of the moment, probably more fleeting,
and represent more of the world of business.
In the fashion and lifestyle space, you're already seeing a ton of this.
Flair AI is a platform that specifically optimizes video and image models for ad creation.
And back in October, they showed an example of a very professional-looking commercial
for Mulberry handbags that was made 100% with AI.
Salma on X, who focuses on AI product photography and video, has also done tutorials on creating
ads for a makeup brand, once again, entirely using AI.
Today's episode is brought to you by Plumb.
Want to use AI to automate your work, but don't know where to start?
Plumb lets you create AI workflows by simply describing what you want.
No coding or API keys required.
Imagine typing out, AI, analyze my Zoom meetings and send me your insights in Notion,
and watching it come to life before your eyes.
Whether you're an operations leader, marketer, or even a non-technical founder, Plumb gives you the power of AI without the technical hassle.
Get instant access to top models like GPT-4o, Claude 3.5 Sonnet, AssemblyAI, and many more.
Don't let technology hold you back. Check out Use Plumb, that's Plum with a B, for early access to the future of workflow automation.
Today's episode is brought to you by Vanta. Whether you're starting or scaling your company's security program, demonstrating top-notch security practices, and establishing trust is more important than ever.
Vanta automates compliance for ISO 27001, SOC 2, GDPR, and leading AI frameworks like ISO 42001 and NIST AI RMF,
saving you time and money while helping you build customer trust.
Plus, you can streamline security reviews by automating questionnaires and demonstrating your security posture with a customer-facing trust center all powered by Vanta AI.
Over 8,000 global companies like LangChain, Leela AI, and Factory AI use Vanta to demonstrate AI trust and prove security in real time.
Learn more at vanta.com slash nlw.
That's vanta.com slash nlw.
Today's episode is brought to you, as always, by Superintelligent.
Have you ever wanted an AI Daily Brief,
but totally focused on how AI relates to your company?
Is your company struggling with AI adoption,
either because you're getting stalled,
figuring out what use cases will drive value
or because the AI transformation that is happening
is siloed at individual teams, departments, and employees
and not able to change the company as a whole?
Superintelligent has developed a new custom internal podcast product
that inspires your teams by sharing the best AI use cases
from inside and outside your company.
Think of it as an AI Daily Brief,
but just for your company's AI use cases.
If you'd like to learn more,
go to besuper.ai slash partner
and fill out the information request form.
I am really excited about this product,
so I will personally get right back to you.
Again, that's besuper.ai slash partner.
Today's episode is brought to you by Rocket Money.
We are coming up on the beginning of the new year, and that is a perfect time to get
organized, set goals, prioritize what matters most, which for many of us is going to be financial
wellness.
Thanks to Rocket Money, those goals, especially around money, feel achievable.
Rocket Money shows you all of your subscriptions right in one place, helping you easily
cancel those that you've maybe forgotten that you're actually paying for.
Rocket Money also pulls together all of your spending across your different accounts so that you can
clearly track spending habits and see where you can cut back. Rocket Money is a personal finance app that
helps find and cancel unwanted subscriptions, monitors your spending, and helps lower your bills so you can
grow your savings. Their dashboard gives you a clear view of your expenses across all of your
accounts. You can easily create a personalized budget with custom categories. You can see your
monthly spending trends in each category to know exactly where your money is going. Rocket Money will
even try to negotiate lower bills for you. They automatically
scan your bills to find opportunities to save, and then you can ask them to negotiate.
They'll deal with customer service so that you don't have to.
Rocket Money has over 5 million users and has saved a total of $500 million in canceled subscriptions,
saving members up to $740 a year when using all of the app's premium features.
Cancel your unwanted subscriptions and reach your financial goals faster with Rocket Money.
Go to rocketmoney.com slash AI breakdown today.
That's rocketmoney.com slash AI breakdown.
Our third use case that is available right now is establishing shots, B-roll, and drone footage.
This type of video is used in everything from ads to social media content to professional film.
Big budget productions can afford to go out and get their own, but in many cases this sort of
imagery is purchased from stock libraries.
That's a business model that seems very suspect in the future, as Veo and Sora are both
already incredibly adept at these sorts of establishing natural world shots.
AI design consultant Marco did an entire sizzle reel with this sort of shot, showing just what an incredible
library of video imagery is now available to people on demand and based solely on their imagination.
A fourth use case for AI video right now is storyboarding and brainstorming.
One of the things that people are most excited about when it comes to Sora is the fact that
they've built a storyboard timeline editor directly into the product.
So you can basically plan out an entire sequence of videos that add up to a complete story.
Now, initially, this will, I think, be used by filmmakers, but in the long run, it wouldn't surprise
me to start to see video brainstorming as something that actually becomes part of a broader
set of business activities. You could see internal teams doing video brainstorming to lay out their ads,
even if they work with an ad agency. You could see events and marketing teams planning out and
experimenting with what their event setup might look like at big trade shows.
One of the most important things to remember about generative AI is that it's very hard for us to
not think in terms of the one-to-one replacement phase; in other words, how it replaces things that
already exist. For example, I was just talking about how stock video libraries are going to have a
tough time, given that Veo and Sora can now create all the types of things that they've previously
made money on. However, I think when the dust settles in a decade or so, the far more interesting
use cases will be the things that simply weren't possible before. So ultimately, I don't know
whether trade show and event sponsorship planning is going to involve video storyboarding
and brainstorming, but it wouldn't surprise me.
Lastly, as much as we've talked about these use cases that are for advertisers and businesses
and social media creators, it's very clear that this is going to infiltrate Hollywood
and professional filmmaking very soon.
The fact that we're starting to understand physics, the fact that Veo can imitate
cinematography techniques.
All of this means that these models are more ready for prime time, even at the highest
levels of production than they've ever been.
The coolest thing about that, though, is that it's not just going to be Hollywood that has
access to them. There could be an absolute renaissance of film and video storytellers,
given the decreased cost of production and the expanded realm for creativity. As Andrew Marman,
a research engineer at Google DeepMind, put it, the world we will be able to create. Veo 2 looks
awesome. I'm excited to try to get a chance to play around with it. For now, that is going to
do it for today's AI Daily Brief. Appreciate you listening as always. Until next time, peace.
