The AI Daily Brief: Artificial Intelligence News and Analysis - OpenAI Sora Has Leaked
Episode Date: November 27, 2024
OpenAI's long-awaited video generation model, Sora, has reportedly leaked, sparking debates across the AI community. This video explores the model's capabilities, its potential impact on video creation, and the controversy surrounding OpenAI's approach with early testers. How does Sora compare to advancements from competitors like Luma and Pika Labs? Brought to you by: Vanta - Simplify compliance - https://vanta.com/nlw Plumb - AI automation that just works - https://useplumb.com/ The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief, it appears that OpenAI's video generation model, Sora, has been leaked.
Before that in the headlines, President-elect Trump is apparently thinking about a lead White House AI position.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
To join the conversation, follow the Discord link in our show notes.
Welcome back to the AI Daily Brief Headlines edition, all the daily AI news you need in around five minutes.
One of the big things that people are watching right now is how presidential appointments might
impact AI policy in the years to come. Now we have rumors of what might be the most direct position
to influence this space, with Axios reporting that President-elect Donald Trump is considering
naming an AI czar. This is coming from sources inside the Trump transition team. And the way that they
framed it to Axios is that the role is likely but not certain. In terms of the details, they are,
of course, sparse. Sources suggest that this won't be Elon Musk himself, but that he, along with Vivek
Ramaswamy, the two who are, of course, leading the new Department of Government Efficiency, or
DOGE, will have a big role in determining who is the AI czar. That is somewhat concerning for
other tech leaders with whom Elon has a touchy relationship. Interestingly, this is not the only
czar being considered for the Trump White House. Bloomberg reported last week that the Trump
transition team had also been vetting cryptocurrency executives for a similar role for the
crypto industry. There is also a possibility, say the sources, that the AI and crypto roles could be
combined under a single emerging technology czar. I, for one, am hugely hoping that that doesn't
happen. I think that these spaces, while having a relationship with one another, and while I'd love to
see those two czars work in close concert with one another, are fundamentally different and require
different things. And I'd like to see them get their own consideration. In terms of what the AI czar will
do, Axios says the AI czar will be charged with focusing both public and private resources to keep
America in the AI forefront. Something that was established under President Biden's AI executive order,
and which might be kept even though Trump plans to repeal that order, is that government agencies
have all named chief AI officers. Theoretically, the White House AI czar could play a coordination
role across all of those individuals. Another potential area of activity? We've discussed how
DOGE, the Department of Government Efficiency, might not only be focused on trying to find obvious
waste, but also think about how new AI-enabled processes could make things more efficient.
Reports speculate that both of those functions could be supported by this AI czar.
In other words, using AI to root out, quote, waste, fraud and abuse, but also thinking about
how AI could reshape processes going forward.
It also seems likely that if this role is established, they will have to be closely connected
to energy policy, given that one of the big constraints for future leadership is going
to be the availability of energy.
And Axios notes that an AI czar would not require Senate consent, allowing the person to get
to work much more quickly. Speculation has, of course, started ramping up. One of the names thrown around,
for example, is Max Tegmark, who's an MIT professor and AI safety advocate, and who some have
reported has been influential in shaping Trump's views on controlled AI development. But then again,
all of that is just speculation. The more interesting thing here is that Tegmark is a reminder
of how much this role could shape the way the U.S. approaches things. The difference between
someone who is accelerationist-minded versus safety-minded could be enormous when it comes to how
different policy is pursued. And so if this is a real thing, it is going to be worth following very
closely. In the meantime, politicians continue jockeying to make AI policy. The latest comes
from Senator Peter Welch, who has introduced a new bill aimed at making copyright enforcement
easier when it comes to AI model training. Called the Transparency and Responsibility for Artificial
Intelligence Networks, or TRAIN Act, the bill theoretically increases transparency into
data sets. Copyright holders would be able to subpoena the training records of AI models if they have a
good faith belief their work was used to train the model. Developers would only need to reveal
training material to the extent that it is, quote, sufficient to identify with certainty
whether copyrighted works were used. Failure to produce would create a legal presumption that the AI
developer did, in fact, use the copyrighted material in question. Welch said that the country
needs to, quote, set a higher standard for transparency around AI training, adding, this is simple.
If your work is used to train AI, there should be a way for you, the copyright holder,
to determine that it's been used by a training model, and you should get compensated if it was.
We need to give America's musicians, artists, and creators a tool to find out when AI companies
are using their work to train models without artists' permission.
So far, attempts to sue AI labs around copyright infringement have been progressing at a fairly
slow pace.
The New York Times lawsuit against OpenAI is probably the most advanced.
In that case, the judge ordered OpenAI to produce a searchable version of their training
data for New York Times attorneys to scour through.
We don't know whether they've found anything at this stage, but the process was marked by controversy
when OpenAI accidentally deleted search logs, setting the process back.
This bill is aimed at streamlining a similar process, but the concern, of course, is that
it swings too far in the other direction.
We don't have right now solid legal precedent on whether using data to train AI models
constitutes copyright infringement.
Many labs have been signing licensing agreements in order to avoid lawsuits and the associated
PR damage, but no court has had an opportunity to make a ruling on this point of law.
This bill also doesn't settle the question of law. It simply introduces a clearer subpoena power when
copyright infringement is alleged. Meanwhile, jurisdictions like Israel, Japan, and Singapore have created
laws that classify training data as fair use. a16z and others in the AI industry have likened
use in training data to reading a book rather than copying one. Ultimately, this is going to be
one of the most challenging balancing acts that we face, protecting authors, musicians, and creatives
on the one hand, while advancing strategic AI on the other.
Moving back to the technical side of things, Anthropic has launched a new tool for connecting AI
assistants to external data sources.
Called the Model Context Protocol or MCP, Anthropic are proposing it as an open-source standard
for data connectivity.
MCP allows any model, not just ones produced by Anthropic, to draw data from business tools
and software or content repositories.
They wrote in a blog post, as AI assistants gain mainstream adoption, the industry has invested
heavily in model capabilities, achieving rapid advances in reasoning and quality.
Yet even the most sophisticated models are constrained by their isolation from data,
trapped behind information silos and legacy systems. Every new data source requires its own custom
implementation, making truly connected systems difficult to scale.
Alex Albert, the head of Claude relations, provided a series of examples of MCP being used
to connect to GitHub and a generic search engine to demonstrate its flexibility.
He wrote,
We're building a world where AI connects to any data source through a single elegant protocol.
MCP is the universal translator.
Integrate MCP once into your client and connect to data sources anywhere.
Get started with MCP in less than five minutes.
We built servers for GitHub, Slack, SQL databases, local files, search engines, and more.
Like LSP did for IDEs, we're building MCP as an open standard for LLM integrations.
Build your own servers, contribute to the protocol, and help shape the future of AI integrations.
Even if that sounds like Greek to you, what's important to know is that open connectivity
standards have a long history of being a powerful unlock once they reach mass adoption.
Even something we take for granted, like the standard USB port, used to be dozens of different
proprietary variants that were all incompatible. It's unclear whether OpenAI and other frontier
labs will adopt an open standard, but Block, Apollo, Replit, Codeium, and Sourcegraph are all building
MCP support into their platforms. Anthropic have also shared pre-built MCP servers for Google,
Slack, and GitHub. The company wrote, instead of maintaining separate connectors for each data
source, developers can now build against a standard protocol. As the ecosystem matures, AI systems will
maintain context as they move between different tools and datasets, replacing today's fragmented
integrations with a more sustainable architecture.
What I think is relevant here is that it's so telling about where we are as an industry.
So much of what's actually exciting right now is not big huge advances in model capabilities.
It's these fundamental infrastructure building blocks that are coming online that in a few years
or even a few months, it will be very hard to imagine a time before them.
They're going to unlock a huge number of use cases and new opportunities.
And so as small as they might seem relative to getting Orion or GPT-5,
this is, I think, very big news indeed.
For now, that's going to do it for today's AI Daily Brief Headlines edition. Next up, the
main episode.
Today's episode is brought to you by Plumb.
Want to use AI to automate your work but don't know where to start?
Plumb lets you create AI workflows by simply describing what you want.
No coding or API keys required.
Imagine typing out AI, analyze my Zoom meetings and send me your insights in Notion, and watching
it come to life before your eyes.
Whether you're an operations leader, marketer, or even a non-technical founder,
Plumb gives you the power of AI without the technical hassle.
Get instant access to top models like GPT-4o, Claude Sonnet 3.5, AssemblyAI, and many more.
Don't let technology hold you back.
Check out useplumb.com, that's Plumb with a B, for early access to the future of workflow automation.
Today's episode is brought to you by Vanta.
Whether you're starting or scaling your company's security program,
demonstrating top-notch security practices, and establishing trust is more important than ever.
Vanta automates compliance for ISO 27001, SOC 2, GDPR,
and leading AI frameworks like ISO 42001 and NIST AI Risk Management Framework,
saving you time and money while helping you build customer trust.
Plus, you can streamline security reviews by automating questionnaires
and demonstrating your security posture with a customer-facing trust center all powered by Vanta
AI.
Over 8,000 global companies like LangChain, Leela AI, and Factory AI use Vanta to demonstrate
AI trust and prove security in real time.
Learn more at vanta.com slash NLW.
That's vanta.com slash NLW.
Today's episode is brought to you, as always, by Superintelligent.
Have you ever wanted an AI Daily Brief but totally focused on how AI relates to your
company?
Is your company struggling with AI adoption, either because you're getting stalled, figuring
out what use cases will drive value, or because the AI transformation that is happening
is siloed in individual teams, departments, and employees, and not able to change the company
as a whole?
Superintelligent has developed a new custom internal podcast product that inspires your team
by sharing the best AI use cases from inside and outside your company.
Think of it as an AI Daily Brief, but just for your company's AI use cases.
If you'd like to learn more, go to besuper.ai slash partner and fill out the information request form.
I am really excited about this product, so I will personally get right back to you.
Again, that's besuper.ai slash partner.
Welcome back to the AI Daily Brief.
We have a spicy one today as the Internet is exploding with the report
that OpenAI's Sora video generation model has just been leaked.
Now, Sora is probably the most anticipated AI product that we've heard about this year that we haven't
gotten yet.
All the way back at the very beginning of the year, OpenAI blew people away with what was possible
when they demoed Sora.
For many, it transformed their sense of what AI video could do, and it really did feel like
video was going to have its Midjourney or Stable Diffusion moment and become a big part of
the texture of 2024.
And that is sort of what happened, but it wasn't led by OpenAI and it wasn't led by Sora.
Pika Labs came out with a new version of their model, which was much more advanced.
Luma Labs' Dream Machine became a popular option for artists and creators.
And Runway, in addition to releasing a new version of their model, also started forming
partnerships with big Hollywood studios like Lionsgate.
And what made that extra interesting is that even as OpenAI Sora got delayed, there was a
sense that maybe it was because they wanted to roll it out first to Hollywood, to have this be a product
that came in through the traditional entertainment industry rather than as a bottom-up sort of service.
Now, there are a ton of reasons why an advanced video model might not get released.
It's an extremely expensive proposition for one, and there are a lot of safety concerns when
it comes to deepfakes and the use of AI-generated video for nefarious purposes.
But still, most people have spent the back half of this year wondering, where is Sora?
Well, now we apparently have access to it via a model that was uploaded to Hugging Face.
You can generate a video in the PR Puppet Sora Space, which is at the moment of recording having a seriously hard time, presumably being crushed under the weight of people hitting it.
But the group has also published an open letter.
The letter reads, Dear Corporate AI Overlords, we received access to Sora with the promise to be early testers, red teamers, and creative partners.
However, we believe instead we are being lured into artwashing to tell the world
that Sora is a useful tool for artists. Artists are not your unpaid R&D. We are not your free bug
testers, PR puppets, training data, validation tokens, etc. Hundreds of artists provide
unpaid labor through bug testing, feedback, and experimental work for the program for a $150 billion
valued company. While hundreds contribute for free, a select few will be chosen through a
competition to have their Sora-created films screened, offering minimal compensation which pales in
comparison to the substantial PR and marketing value OpenAI receives. De-normalize billion-dollar brands
exploiting artists for unpaid R&D and PR. Furthermore, every output needs to be approved by the OpenAI
team before sharing. This early access program appears to be less about creative expression and critique
and more about PR and advertisement. Corporate artwashing detected. We are releasing this tool to give
everyone an opportunity to experiment with what around 300 artists were offered, a free and
unlimited access to this tool. We are not against the use of AI technology as a tool for the arts.
If we were, we probably wouldn't have been invited to this program. What we don't agree with is how
this artist program has been rolled out and how the tool is shaping up ahead of a possible public
release. We are sharing this to the world in the hopes that OpenAI becomes more open, more artist-friendly,
and supports the arts beyond PR stunts. As you might imagine, immediately, first, everyone tried
to figure out whether it was real or not. Tibor Blaho writes, Why I think it's real. This is using
the OpenAI Sora API endpoint to generate and download videos with hard-coded request headers and
cookies from the Hugging Face Space environment config. Chubby on Twitter writes, confirmed,
OpenAI Sora really has been leaked. However, on the other side,
Harper writes, I mean, if it's just an API request directly to OpenAI, it gets shut down in
T-minus 10 seconds. If it doesn't, it's BS. The other big discussion, of course, is what the
artists wrote. There's a little bit of a sense of yeah, duh, of course OpenAI is behaving like
this. Mike Butcher quoted TechCrunch writing, they claim that OpenAI is pressuring Sora's
early testers, including red teamers and creative partners to spin a positive narrative around
Sora and failing to fairly compensate them for their work. Butcher added, who would have
thought OpenAI would do that, dot, dot, dot, dot. Of course, the other discussion
is people sharing what they've generated, and the video quality does seem high.
Although at first glance, it's a little harder to tell how much better this is than other options
that are out there, given how much the rest of the video generation space seems to have caught up.
I tried to create a video of a turkey getting up off a table and running away, but alas,
the site was too crushed.
Anyway, this is certainly something that I'm going to keep watching, and I will report back
tomorrow on whether this is confirmed as an actual leak, or if it's just a big PR stunt.
Ironically, the leak came on the same day that Luma AI released a massive update to their Dream Machine video model platform.
The end-to-end upgrade includes a new interface, a new mobile app, and a new image generation foundation model called Luma Photon.
To get a sense of how many people have been excited about this space,
Luma says they have over 25 million registered users since they launched in June 2024.
CEO Amit Jain said,
We built Dream Machine as a visual thought partner powered by a whole new image model called Luma Photon.
It's creative, intelligent, and designed for the people who build our world:
designers, creators in fashion, media, and entertainment. The model also has improved functionality
with natural language and can also draw from reference images. Jain said, unlike prompt engineering
where you have to carefully craft specific commands, Dream Machine lets you talk to it like you're talking
to a person. This conversational interface makes editing and creating intuitive. With Dream Machine,
you can give it reference images, color, structures, or textures, and it will intelligently combine
and iterate until you get exactly what you want. Luma also says that they've cracked consistent
characters, which is going to be essential for creating longer content pieces that have a coherent
throughline. So maybe taken together the story of this plus OpenAI is that the video generation
space is very much in full swing. Like I said, a spicy one today, but for now, that is going to do it
for the AI Daily Brief. Until next time, peace.
