The AI Daily Brief: Artificial Intelligence News and Analysis - What Runway Gen-3 and Luma Say About the State of AI
Episode Date: June 18, 2024
Explore the latest in AI video technology with Runway Gen-3 and Luma Labs Dream Machine. From the advancements since Will Smith’s AI spaghetti video to the groundbreaking multimodal models by OpenAI... and Google DeepMind, this video covers the current state of AI development. Discover how companies are pushing the boundaries of video realism and accessibility, and what this means for the future of AI-generated content. Learn how to use AI with the world's biggest library of fun and useful tutorials: https://besuper.ai/ Use code 'youtube' for 50% off your first month. The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief, a look at AI video generation and what the release of Runway Gen-3, Luma, and Kling in the last few weeks says about the broader state of AI development.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
To join the conversation, visit the Discord link in our show notes.
Welcome back to the AI Daily Brief Headlines edition, all the daily AI headline news you need in around five minutes.
Today we kick off with a warning from the International Monetary Fund over the potential that
AI will increase inequality.
This is from a new report from the IMF published Monday, June 17th, that basically said that
the IMF had what they called profound concerns about significant labor disruptions and rising
inequality.
What's more, it pointed out that unlike previous disruptive technologies, AI was coming not
just for blue-collar jobs, but for higher-skilled occupations as well.
While acknowledging that generative AI held significant potential
to boost productivity and even advance the delivery of public services, the IMF also believed it was
going to cause some big problems. So what were the suggested remediations? Some of them were things
like improving unemployment insurance, but a big focus was on policies around education and training.
From the Financial Times, in its report, the IMF said that policies on education and training needed
to adapt to new realities to help prepare workers for a rapidly changing job market in the future,
with a greater focus on offering lifelong learning. Sector-based training, apprenticeships, and reskilling
could play a greater role in helping workers with the transition to new tasks and sectors.
Said Era Dabla-Norris, deputy director of the IMF's fiscal affairs department and co-author of the report,
quote, the transition could be painful for workers, and older workers may not have the skills
that are needed in the age of AI, and it may require more time than in the past to acquire those new
skills. We want people to be able to benefit more broadly from the potential that this technology
holds, and we want to ensure that there are opportunities created for people.
Obviously, it will come as a surprise to none of you, given the fact that I've chosen to spend all my time
on Superintelligent, a scalable platform to help educate the world on how to use AI, that I also
think that a big part of the answer to this question is in education. Next up, we move over to NATO.
Something you might not know is that NATO has a $1.1 billion innovation fund, the NIF.
This fund was unveiled back in summer 2022, a couple months after the Russian invasion of Ukraine.
One of the stated goals of the fund was to invest in technologies that could enhance NATO's
defenses. The fund is currently backed by 24 of NATO's 32 member states. Earlier today on Tuesday,
June 18th, the NATO Innovation Fund announced that it had invested in four European tech companies,
including Fractile, which Reuters described as a London-based computer chipmaker aiming to make
LLMs like those that power ChatGPT run faster; Germany's ARX Robotics, which is an unmanned
robotics company; iCOMAT, which makes lighter materials for vehicles; and Space Forge, a company
that uses specific conditions of space, including microgravity and vacuum conditions, to build
semiconductors in orbit. So there you have it. One of the biggest military alliances in the world is
officially investing in AI. Speaking of interesting collaborations, OpenAI has continued to expand its footprint
in the healthcare space, teaming up with Color Health on a new cancer copilot. Color was founded as a genetic
testing company back in 2013 and has developed an AI assistant using the GPT-4o model. According to the
Wall Street Journal, quote,
the copilot helps doctors create cancer screening plans as well as pre-treatment plans for people
who have been diagnosed with cancer. Said Othman Laraki, the co-founder and CEO of Color,
the copilot is intended to assist doctors, not replace them. Indeed, the use of that term
copilot is meant to reference engineering copilots that are augmenting software engineers
rather than replacing them. This is not the first time that OpenAI has done a deal with a company
in the health space. Back in April, for example, they announced a deal with Moderna, where that
company would use AI to speed up business processes such as selecting optimal doses for clinical trials.
And Brad Lightcap, OpenAI's COO, said we see a perfect fit for AI technology because they can help
bring relevant information to the surface faster. They can give clinicians more tools to understand
medical records, to understand data, to understand labs and diagnostics. We spent a lot of time on
this show and in the media in general talking about the impact of AI on creative industries
and entertainment, but in more heavily science-based industries, there is a ton of interest and excitement
as well that could lead to some incredible benefits from artificial intelligence.
Finally, just to make it clear once again how hot the AI agent or AI assistant space is,
You.com is raising $50 million to get into that space.
You.com had previously been positioned as an AI-infused search engine, but as all search engines
have started to AIify themselves, they've started to move more directly into this AI assistant space.
Reuters writes, You.com's 11 million visitors in May reflected a year-to-date rise in web traffic,
but the number was still below its 20 million peak in February 2023.
Its app downloads have decreased an estimated 69% so far in 2024.
Against this backdrop, the company has morphed You.com into an AI assistant,
one that is focused on productivity as well as internet search.
Lots and lots of activity in this AI assistant space, but also tons of competition,
something we will continue to keep an eye on here at the AI Daily Brief.
However, for now, that is going to do it for the headlines.
Up next, the main episode. A quick note before we get back to the show: today's episode is
brought to you by Superintelligent. Super is of course the platform that we built and released a couple
months ago to help people learn how to use AI. It's built around fun, fast tutorials that get you
actually using AI in minutes, not hours, and certainly not days. The learning all happens in the
context of an engaged community, and we've got a bunch of exciting features rolling out in the
weeks to come. One of those is a new teams version of the platform, which includes a custom
curated playlist, as well as a showcase where people on your team can share their use
cases and projects in AI across the organization. If you are interested in being a part of the
Super for Teams beta, go to besuper.ai slash partner so you can learn about the program.
Welcome back to the AI Daily Brief. There has been a ton of activity over the last few weeks
in the AI video space, and I want to use this episode as a way to not only catch
you up on what's been happening there, but also to use it as a way to ground our sense of where we
are in AI development more broadly. And for that, I think we have to go back to about 15 months ago.
You may remember in late March of last year when AI generated videos of Will Smith eating spaghetti
first started hitting the internet. This was right around the time of the six-month pause letter.
It was right as the AI safety narrative was starting to take hold in mainstream media.
And it was certainly still in a period where people were being dismissive of AI
because of the weirdness in its generations, the weird extra fingers, strange extraneous body parts
happening with no apparent connection to anything else going on. And these videos were sort of
Exhibit A in all that. They were at once disturbing and also weirdly compelling, and people
just watched them on repeat. However, for our purposes, they matter much more as a way marker
of where things were, given the advances we've seen since then. At the end of last year,
our friends over at Latent Space published something that they called the Four Wars of the AI
stack. These were basically the big sets of questions they saw as being unanswered and for which
the answers had a major impact on how the AI field was likely to develop. One of those wars was
what they called the multimodality war, specialist models versus everything models. So on the one
hand, you had specialist models, text-to-image startups like Midjourney, text-to-audio startups, etc.,
versus the everything models that companies like OpenAI and Google DeepMind were pursuing.
And at the time that this came out and we were first talking about it on this show, I think
for as advanced as the everything models were, there was still a sense that those specialist
models really could remain differentiated. Specifically, I think a lot of people felt that
Midjourney was still ahead, for example, of OpenAI's DALL-E 3, even if DALL-E 3 had made some
advances, like being better with text. But then in February, OpenAI released the first
generations from its video model called Sora. Just less than a year from when we saw those
Will Smith eating spaghetti examples, we got these hyper-photorealistic videos
that absolutely blew people away.
There was the famous woman walking through a Tokyo street,
woolly mammoths running through snow,
a wide panning shot of a lighthouse
on what looks like the California coast,
and then there were examples like the pirate ship
swirling around in what appears to be a cup of coffee,
which showed just how much better this model seemed to understand physics.
For many people, the release of Sora was a moment
that absolutely reignited our sense of the unfettered possibilities of AI generation.
The problem was that Sora hasn't really
been widely available. Sure, OpenAI has done a lot of sharing of Sora with very select
artists and creators, and they've been clearly making a pitch towards Hollywood and the traditional
film and entertainment industry, but it hasn't been available broadly to the public. The same is true
for Google Veo, which was their answer to Sora shown off a couple months later, which also was
limited in its distribution. So the situation we've been in for the last couple months is one in which
the state of the art when it comes to video generation has not been widely available. The last few weeks
have seen some major moves in that area.
Fast forward to yesterday, Monday, and Runway became the third AI video company in a matter
of three weeks to release their latest model.
Runway's was called Gen-3 Alpha.
The announcement post on Twitter reads,
Gen-3 Alpha can create highly detailed video with complex scene changes, a wide range of cinematic
choices and detailed art directions.
Runway continues, Gen-3 Alpha is the first of an upcoming series of models trained by
Runway on a new infrastructure built for large-scale multimodal training and
represents a significant step towards our goal of building general world models.
Gen-3 Alpha was trained on both videos and images,
and powers Runway's text-to-video, image-to-video, and text-to-image tools.
Apparently, Runway has also been collaborating with entertainment companies,
thinking about the obvious applications of this technology,
and while it wasn't available immediately upon announcement,
they said that it would be rolling out to everyone over the coming days.
The early impressions of Gen-3 definitely felt that it represented a significant realism upgrade.
And in addition to the fidelity of the actual generations,
they also promised more fine-grained control.
The Runway website reads,
Gen-3 Alpha has been trained with highly descriptive,
temporally dense captions,
enabling imaginative transitions
and precise key-framing of elements in the scene.
Almost immediately, people started doing their threads
of the most impressive generations,
although initially many of them did come from the Runway team itself,
and there were some interesting notes as well.
While a lot of folks were focused on the increase in the realism
enabled by the new model,
others like habitualization on Twitter,
who does support at Runway,
tweeted a very weird,
stylized video and said, hyperrealism is sick, but I am unfortunately someone who will die on the hill
of avant-garde experimentalism and thus spent a while making surreal, grainy, noisy craziness in Runway's
new Gen-3 model to great success. They then shared a set of five examples of these distinctly
non-realistic generations. Tom's Guide went a little bit further in explaining what Runway means when
it says that its end goal is general world models. They describe them as, quote, an AI system that can
build an internal representation of an environment and use it to simulate events inside that
environment. Tom's Guide also noted that, quote, each video is about 10 seconds long, which is about
twice as long as a Luma default and of a similar length to Sora videos. It is also nearly three
times the length of the current Runway Gen-2 videos. Andrew Curran also mentioned Luma, saying
Luma has started the avalanche. What he means is that last week, Luma Labs absolutely dominated
the AI Twitter conversation with their release of Dream Machine. While
it wasn't nearly on the same level for most as the realism in something like Sora,
the fact that it was widely available made all the difference in the world.
One of the first use cases that popped off online was people animating classic memes.
Even Andrej Karpathy, formerly of OpenAI and Tesla, said, wow, the new model from Luma
Labs extending images into videos is really something else.
I understood intuitively that this would become possible very soon, but it's still something
else to see it and think through future iterations of.
It should be noted though that really when it comes to what started this flood, it wasn't
Luma Labs, but actually Kling, a model from Chinese company Kuaishou.
Kuaishou is basically a TikTok competitor with 400 million plus daily users, and when Kling was released,
it was immediately available right then and there.
Kling was good enough that some folks, like Deedy at Menlo Ventures, said China might be surpassing
the U.S. at AI.
Now, of course, one of the big questions for folks with all this is whether OpenAI would
actually respond and get Sora out to the public.
Matt Wolfe writes, maybe we'll see Sora soon amongst all this competition popping up?
When we do, will people still be excited by it?
Then again, as Dogenioro writes,
the leap of Sora was so big that it feels like nothing surprises me anymore.
Anyone else feel this way?
AI and design, however, responded, the surprise lies with the fact that while OpenAI has been
dicking around pandering to Hollywood, other players have achieved comparable quality and are
making it available to the public in four months.
That's the pleasant surprise.
It's clear that the race is on.
In fact, even as Runway Gen-3 was announcing itself, Luma Labs was talking about what was coming
next.
They tweeted, coming soon to Dream Machine: powerful editability and intuitive controls.
They shared a video of a new editing suite that allows for much more fine-grained precision
editing when it comes to these video generations, which is something that hasn't been widely
available yet.
They also released something called extend video, which they said is aware of what's happening
in your video and extends it in a consistent way to follow instructions.
It's clear that as many advantages as the big companies have, the smaller startups in
this space have some advantages as well.
Dan Kay formerly of Google responded to someone who was tweeting about Luma,
Now you know why I left Google to join Luma.
I was in the team that developed Veo early on, but knew it would never be shipped to the masses for quite a long time, same as Sora.
Not until a company like Luma forces their hand, that is.
For me, it's hard to look back 15 months ago at Will Smith eating spaghetti to see this variety of models that have come to fruition in the last three weeks
and take seriously any of these claims that have started to float around that AI is somehow plateauing or slowing down.
The speed of evolution is unbelievable and is going to come with changes that we can't even imagine yet.
Investor Jared Hecht writes, given the pace at which Luma, Sora, and Kling are emerging and improving
this may be the final generation of the global movie star.
There is a future where Hollywood quality film can be created by anyone with an idea,
computer, and internet connection.
Who knows how this will all go down, but right now, it is an absolutely flowering moment for AI video generation,
and that's pretty cool to see.
Quick shout out if you're interested in learning more about
how to use these tools. Even if you haven't signed up for Superintelligent yet at besuper.ai,
we did just release one of our Luma tutorials for free on our YouTube channel. You can find that
at YouTube.com slash at besuperai. I'll also include a link in the show notes, but for now,
that is going to do it for this AI Daily Brief. Until next time, peace.
