The AI Daily Brief: Artificial Intelligence News and Analysis - The AI Data Wars: Elon Threatens to Sue Microsoft While Viral AI Drake Track Stuns Music Industry
Episode Date: April 20, 2023
We're seeing more and more instances of industries, companies and individuals fighting back against their data and IP being used to train AI. On this episode, we look at Elon threatening to sue Microsoft, Reddit changing their terms for AI companies, and the music industry losing their ever-loving mind about a viral AI-produced track featuring the synthetic voices of Drake and The Weeknd. Subscribe to our YouTube: https://www.youtube.com/@TheAIBreakdown
Transcript
This episode of the AI Breakdown originally premiered as a YouTube video on Thursday, April 20th.
In it, we discuss Elon Musk threatening to sue Microsoft and OpenAI over their use of Twitter data to train their models,
the music industry freaking out at the viral AI track featuring Drake and The Weeknd that didn't feature them at all,
and the broader question of the emerging AI data wars.
Welcome back to the AI Breakdown.
Today we are talking Elon Musk suing Microsoft, Reddit charging AI companies for access to its data,
It's the Data Wars, baby.
All right, guys.
Well, yesterday you might have been one of the 13 million people to see this little interaction.
Twitter Daily News writes, News:
Microsoft drops Twitter from its advertising platform as they refuse to pay Twitter's API fees.
Elon Musk, not being one to let something like that go, writes,
They trained illegally using Twitter data.
Lawsuit time.
Now, again, this has 13 million views, 207,000 likes, over 1,100 bookmarks,
19,000 retweets, 1,700 quote tweets. So what's going on and what does it have to do with a larger
phenomenon happening in AI? Well, the specific offense from Microsoft was this notice. It's on
the Microsoft advertising website and it says, starting on April 25th, smart campaigns with
multi-platform will no longer support Twitter. As of April 25th, 2023, you'll be unable to access
your Twitter account through our social management tool, create and manage drafts
or tweets, view past tweets and engagements, or schedule tweets.
Other social media channels such as Facebook, Instagram, and LinkedIn will continue to be available.
However, that was clearly not what Elon was focused on.
Elon shifted the conversation to be about AI.
So specifically, Elon is accusing, or at least implying, that Microsoft and presumably OpenAI
used Twitter data to train their large language models behind things like ChatGPT.
Now, OpenAI and Microsoft, for the purposes of this tweet, I think, were assumed to be one and the same, even though technically OpenAI remains an independent company in which Microsoft owns a big chunk based on its last couple of investments.
Now, holding aside the specifics of OpenAI or Microsoft, this is an issue that is starting to come up more and more and more.
You might have heard of a Getty lawsuit last summer where Getty Images set out to sue Stable Diffusion for copyright infringement.
What happened was that people started noticing that in the images that they were generating with the Stability AI-based tool,
there were little sections that looked an awful lot like the Getty Images logo that's on unauthorized Getty images,
or rather Getty images that haven't been licensed yet.
So you'll know if you've ever Googled an image and seen exactly the perfect one, but then it's a Getty image.
So there's a little bar down at the bottom that says Getty Images with the name of the photographer.
And that is a protection so that people can't use these images unlicensed, right?
They can't just copy paste them from Google search.
Now, the way that Stable Diffusion was pulling images back, it showed something like that
little patch.
And that's because the AI wasn't able to understand that that wasn't supposed to be a part
of the image.
When it was searching or trying to draw inspiration to create its AI version of the image,
it was looking at all the available materials, all the available images that it had been trained on,
and if a meaningful portion of them were Getty images, then it would just assume that that little
box was supposed to be there. In the suit, Getty is accusing Stability AI of, quote,
brazen infringement of Getty Images' intellectual property on a staggering scale. It claims that
Stability AI copied more than 12 million images. Now, interestingly, in this lawsuit,
not only was Getty focused on the fact that Stability AI had used their data set to train their
model, but also that they presented the watermark in a way that is
bizarre or grotesque, their words, and thus dilutes the quality of the Getty Images mark by blurring
or tarnishment. So they were really hitting them from both sides. Now, fast forward,
this issue is coming up more and more. Obviously, we see Elon talking about it, but he could just
be talking, right? It's not for sure that he's going to actually follow through and sue Microsoft or
go after OpenAI, but a company that has actually made moves specifically in this area already
is Reddit. So Reddit is starting to charge AI models for learning from its huge data set.
An Ars Technica article says LLMs can no longer lurk, learn, and profit from 18 years of links and
chatter. This came out first in the New York Times, where Reddit founder and CEO Steve Huffman
said that it basically knew that the 18 years worth of content that it had that was generated by
humans was immensely powerful and that it wasn't going to just allow companies training their large
language models to access it for free. So the Reddit API will remain free for developers
working on bots or other Reddit tools, as well as to researchers, but there is a new
premium category for those companies that need more information. So this is from the Reddit
announcement that followed that news story. They write, we are introducing a new premium access
point for third parties who require additional capabilities, higher usage limits, and broader
usage rights.
Now, what's interesting about this Reddit story is that there has been some speculation
that one of the reasons that Elon really wanted to get his hands on Twitter was to access
the huge, huge amount of human language data that the platform represents.
As far as I've seen, Elon hasn't even really hinted at that, but it's an interesting
theory. The question of data and intellectual property that underlies AI training models has also
recently been in the news in the context of the music world. Over the weekend, an anonymous TikTok
user posted a track called Heart on My Sleeve. It was a new Drake and The Weeknd track,
except it wasn't. It used AI to put Drake and The Weeknd on this song, and it was a banger.
Frankly, it went extraordinarily viral. McKay Wrigley says AI music is here.
This is the first example of AI-generated music that really wowed me.
This guy, Ghostwriter977 on TikTok, made a Drake x The Weeknd track that's actually kind of insane.
You'll soon be able to make unlimited music by your favorite artists on demand with AI.
Roberto Nixon wrote this extremely viral tweet, probably viral, because it was attached to this track, and says,
listen to this AI-generated song featuring Drake and The Weeknd.
It goes so damn hard.
It's by Ghostwriter977 on TikTok, and it's blowing up
on socials and streaming platforms.
UMG, which controls around one-third of the global music market,
has already asked streaming platforms to ban AI,
a modern Napster moment.
Will be fascinating to watch this all unfold in real time.
Now, very soon after he posted this,
Twitter showed the little
"this media has been disabled in response to a report
by the copyright owner" notice on the post,
and this track was really eliminated from a lot of different parts of the internet.
Although, of course, people downloaded it,
and so it's still existing.
On April 17th, Rob Abelow tweeted,
This AI-generated song featuring Drake and The Weeknd,
trading lines about Selena Gomez dropped on Saturday.
It now has 20 million streams in under 48 hours.
TikTok, 13 million. Twitter, 5.3 million. Spotify, 254,000.
YouTube, 144,000, SoundCloud, 84,000.
The track is impressive and the artist leans hard into AI in the branding.
The title says it's AI.
Artist's name is Ghostwriter.
They wear a cloth like a ghost in
the videos. The fact it's AI is a feature. It's why everyone wants to talk about it. What about when
that's not interesting anymore? Will AI mashups of celebrity artists become a new art form like
some mutant child of the remix? Troy Carter, who was formerly Lady Gaga's manager and who now
runs Venice Music, wrote, Ghostwriter is the new Banksy. He should never reveal his identity.
Now, there are a lot of opinions around this. Some people feel like this is a clear
infringement of artists' IP and copyrights and are with the industry that this shouldn't be
allowed.
Others, whatever they think about whether it should or shouldn't be allowed, feel like it's
entirely inevitable, which is obviously exactly how so many of these conversations in AI go.
Then there's also the set of people who think that there is a new thing coming for creators
in this new quote-unquote art form.
Joey Politano tweets, that sound you hear is the assembling of an army of copyright lawyers,
the largest the world has ever seen, who have been in cryosleep since the Napster lawsuits,
and have just now been awakened for one last job.
Traffic Jam Master Jay writes something similar.
Only Gen Z would dare to wake the sleeping giant that is the RIAA in the battle over AI.
Y'all don't remember how they absolutely beat the brakes off of Napster and LimeWire.
The RIAA is undefeated.
We are exiting the F-around phase and speed running towards Find Out.
Now, all joking aside, I don't think that that's wrong.
This is going to force a confrontation in the legal system around AI training and data and IP,
at least in the context of music, but I wouldn't be surprised if the implications go a lot farther.
Last year, even before ChatGPT came out, The Verge wrote an article called
The Scary Truth About AI Copyright is Nobody Knows What Will Happen Next.
The last year has seen a boom in AI models that create art, music, and code by learning from others' work,
But as these tools become more prominent, unanswered legal questions could shape the future of the field.
This is really the point of this.
We don't know what courts are going to decide.
We don't know, on top of that, how people are going to be able to enforce it.
Ryan Hoover from Product Hunt writes, free startup idea that will likely get you sued.
AI Spotify.
How it works?
AI Spotify hosts AI generated music of your favorite artists.
Anyone can submit music and the best songs surface based on listens and likes.
Music with the most listens earns a pro-rata share of subscription revenue reserved for the original artists.
For example, Drake could claim money generated from his likeness on the platform.
Artists that do not want to participate can opt out entirely banning any music that uses their likeness
or individually allowing songs they endorse.
Of course, there are many ethical and legal issues with this model, especially with labels.
But maybe this is a germ of a shower thought that has potential.
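The pro-rata idea in that tweet can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything Ryan Hoover specified: the function name `artist_payouts`, the track fields, the placeholder artist names, and the choice to split a track's listens evenly among the artists whose likeness it uses are all hypothetical.

```python
from collections import defaultdict


def artist_payouts(tracks, artist_pool, opted_out=frozenset()):
    """Split artist_pool across original artists in proportion to listens.

    Assumptions (not from the tweet): a track's listens are divided evenly
    among the artists it imitates, and a track is dropped entirely if any
    of those artists has opted out.
    """
    listens_per_artist = defaultdict(float)
    for track in tracks:
        artists = track["artists"]
        # An opt-out bans any track that uses that artist's likeness.
        if any(artist in opted_out for artist in artists):
            continue
        for artist in artists:
            listens_per_artist[artist] += track["listens"] / len(artists)

    total = sum(listens_per_artist.values())
    if total == 0:
        return {}
    # Each artist's payout is their share of attributed listens times the pool.
    return {
        artist: artist_pool * listens / total
        for artist, listens in listens_per_artist.items()
    }


# Hypothetical data: placeholder names, made-up listen counts.
tracks = [
    {"title": "track_a", "artists": ["Artist A", "Artist B"], "listens": 600},
    {"title": "track_b", "artists": ["Artist A"], "listens": 300},
    {"title": "track_c", "artists": ["Artist C"], "listens": 100},
]
payouts = artist_payouts(tracks, artist_pool=1000.0, opted_out={"Artist C"})
for artist, amount in sorted(payouts.items()):
    print(f"{artist}: {amount:.2f}")
```

In this made-up example, Artist C has opted out, so track_c earns nothing and the pool is shared between Artist A and Artist B in a two-to-one ratio. The per-song endorsement option from the tweet is left out to keep the sketch short.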
I like this view from Ryan.
I think that it splits the difference between the take that this is going to be prevented activity,
but also the reality that this is likely inevitable activity and that no amount of legal pressure
and court decisions can really stop technology like this once it's out of the bottle.
This at least would be a way for the AI generated set of music that's built on artist catalogs
to, A, benefit them in some way, B, be clearly segmented from other types of music, and C, still not
undermine all the incredible creativity that's poised to go into this.
Anyways, this is the beginning of an issue that is
going to do nothing but grow more and more intense. And frankly, we could see a very different
landscape for AI startups in a few months to say nothing of a few years. But anyways, guys,
the data wars, they are here, they have begun. That's the story for today. It's a story
that we will hear lots more about. Until next time, peace.
