The AI Daily Brief: Artificial Intelligence News and Analysis - Scarlett Johansson vs. OpenAI
Episode Date: May 22, 2024
OpenAI is facing controversy after Scarlett Johansson claimed they used a voice eerily similar to hers in their recent demos without her permission. Johansson has released a statement saying she declined an offer from OpenAI to use her voice and is now seeking legal clarity. OpenAI responded by pausing the voice, insisting it was not modeled after Johansson’s. This incident raises important questions about AI ethics, likeness rights, and the responsibilities of tech companies in this emerging field.
** Join Superintelligent at https://besuper.ai/ -- Practical, useful, hands on AI education through tutorials and step-by-step how-tos. Use code podcast for 50% off your first month! **
ABOUT THE AI BREAKDOWN
The AI Breakdown helps you understand the most important news and discussions in AI.
Subscribe to The AI Breakdown newsletter: https://aidailybrief.beehiiv.com/
Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@AIDailyBrief
Join the community: bit.ly/aibreakdown
Transcript
Today on the AI Daily Brief, it's Scarlett Johansson versus OpenAI.
Before that in the headlines, Microsoft Build has barely kicked off,
and we've already got some serious features and some serious discussion.
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
To join the conversation, go to the Discord linked in our show notes.
Welcome back to the AI Daily Brief Headlines edition,
all the AI headlines you need in around five minutes.
Today, we are focusing on the first set of announcements from Microsoft Build.
Yesterday, we did a little bit of a preview around what I was looking for at Microsoft Build.
It included things like how much they were going to lead with hardware versus software,
how much they were going to continue to tout their OpenAI relationship versus
give us some indication that they were moving away or at least hedging, and a couple of other
questions like that.
Well, even before the event technically began, because I believe, as they've got it listed,
today is the first day technically, they kicked off with a bang showing off their new
Copilot+ PCs.
As anticipated, we got a new wave of AI-powered PCs with some pretty big claims.
Microsoft says that their new line is significantly faster than the M3 MacBook Air,
owing to the new Snapdragon X Elite chips from Qualcomm.
Still, the emphasis wasn't really so much on the hardware.
It was what the whole package of the hardware enabled.
And at the very center of that was a new desktop assistant
that was basically the Windows integrated version of what we got from OpenAI last week.
In fact, one of the comments that I saw from some people is now we know why
ChatGPT didn't launch a desktop app because Microsoft just made the thing native to Windows.
Mustafa Suleyman, formerly the head of Inflection, now the head of Microsoft AI, tweeted yesterday,
we are taking Copilot to the next level.
Copilot will see, hear, speak, and help in real time.
Soon your AI companion will start to live life alongside you, whether playing Minecraft or
helping you navigate life's most difficult challenges.
Let's take a quick look at this demo to get a sense of what's going on here.
Hey, Copilot, how's it going?
Hey, it looks like you're about to dive into some Minecraft.
Yeah, I'm trying to figure out how to make a sword.
To craft a sword, you'll need some materials.
Can you open your inventory?
Just press E on your keyboard.
You've got some sticks.
Perfect for the sword's handle.
But it looks like you're missing the material for the blade.
Let's start by gathering some wood or stone.
Shall we head out and find some resources?
Yeah, that sounds great.
So what you're seeing here is Copilot actually being able to see the screen that the user is interacting with
on more than just a superficial level.
Right? In this case, it's able to actually see what's in the player's Minecraft inventory and make suggestions on that basis.
Although it's a different context in this case gaming, it's very similar to the type of interaction that we saw OpenAI show off last week, as well as Google with Project Astra.
And you take those things together and it's pretty clear that we're moving to a world where our interactions with computers may be mediated by, or perhaps entirely shared with, this omnipresent assistant that can help us on any given task.
Now, one of the things that I think is interesting about the choice they made to show a game scenario is that when it comes to the utility,
or use cases of this type of feature, I can see there being a bit of an intimidation gap.
Are people, for example, really going to ask their perpetual omnipresent assistant to read their writing
and give them suggestions? Maybe not, but getting stuck in a game, which is a very common experience,
and having the assistant help walk them through how to get out of being stuck in that game,
seems like it could be the type of first-step use case that gets a lot of people to get familiar
with this mode of interaction. Still, probably the most discussed feature from yesterday's announcement
was called Recall. For those of you
who are familiar with what used to be called Rewind.ai, which has now also launched Limitless,
Recall is basically the same feature. It keeps track of everything that you're doing so you can use
natural language to go back and recall exactly what you've done. As Zar Nick sums up, Satya
Nadella says Windows PCs will have a photographic memory feature called Recall that will
remember and understand everything you do on your computer by taking constant screenshots. Let's watch
this little video about this feature that was also released yesterday. How do we introduce memory,
right, photographic memory into what you do on the PC.
And now we have it.
So it's called recall.
It's not keyword search, right?
It's semantic search over all your history.
And it's not just about any document.
We can recreate moments from the past, essentially.
Here's how it works.
Windows constantly takes screenshots of what's on your screen.
Then uses a generative AI model right on the device, along with the NPU,
to process all that data and make it searchable, even photos.
I got to try it out.
I searched brown leather bag.
It came up in visual search.
There's no place on this page that it says brown leather bag.
It just knows because it sees this brown leather bag.
There could be this reaction from some people that this is pretty creepy.
Microsoft is taking screenshots of everything I do.
Yeah, I mean, that's why you can only do it on the edge.
So this is like, you know, you have to put two things together.
This is my computer.
This is my recall, and it's all being done locally.
So a couple things here.
On the one hand, this is the type of feature that will be extremely useful to many people in many contexts.
Searchable memory across everything that you've done is just a potential productivity hack.
At the same time, as this interview host points out, the potential for abuse here is also extreme.
Now, Microsoft's answer to this, which is exactly what Apple has been working on as well,
is that this is AI that doesn't touch the cloud.
That's what he's referring to when he says the edge.
He means that it's only happening locally and is not being stored or shared with the cloud.
The challenge, of course, for many people, is that that requires a lot of trust.
And indeed, there are about 30 million views of this video, and the responses are basically
a Rorschach test for how people feel about issues of privacy, surveillance, big tech, you name it.
Kevin Beaumont points out the financial risk.
He says, from Microsoft's own FAQ, note that Recall does not perform content moderation.
It will not hide information such as passwords or financial account numbers.
Abeba Birhane, who does AI accountability at Mozilla, says,
this is called constant surveillance, monitoring and tracking,
and it will eventually be used to influence and control the masses.
Karthik San Karan says lawyers predict a new golden age of discovery.
Elon weighed in as well, saying this is a Black Mirror episode,
definitely turning this quote-unquote feature off.
In what I find to be a deeply resonant tweet
based on the history of new features that people complained about,
Matthew Pines writes,
you will complain and post and then passively accept it as the new normal.
The ratchet of persistent and pervasive surveillance is required for AI to reach its full total
addressable market.
What we have here is in a single feature, an embodiment of so many of the upsides and downsides
of AI and new technology in general all in one.
It's something that could be extremely useful, but that also pushes the boundaries of what
people are comfortable with.
It's something that people can't imagine right now, but in the future may not imagine
living without.
It's something that requires a huge amount of trust from big tech companies that don't
exactly have a lot of our trust right now.
Basically, in short, it is going to be fascinating to see how this
actually plays out as they roll these features out.
And so I'm going to be watching that closely.
Now, Microsoft Build is still just getting started,
so I'm sure we're going to have more to talk about this week.
But for now, that is going to do it for the AI Daily Brief Headlines Edition.
Stick around for the main episode.
Hello, friends.
Before we get back to the episode,
I want to tell you about something special I'm doing on Superintelligent this June.
Super is, of course, our platform for AI learning.
And I've heard from a lot of you that you really want something for a true AI beginner,
someone who's really just getting their feet wet with these tools.
So what I'm going to do is put together basically a course that sits on top of and uses Superintelligent tutorials and lessons, but where I hand-guide you through around 10 different lessons and how-tos that I think, once you complete them, will have you ahead of 80% of the other people who are just starting to use AI right now.
If you are interested in this learning experience, go to besuper.ai and sign up using code June.
You'll get 25% off your first month and I'll automatically add you to that AI for beginners group.
That's besuper.ai, discount code June.
See you there.
Just a week after OpenAI's GPT-4o demos and Google's new variety of AI announcements,
and on the same day that Microsoft started showing off its Copilot+ PCs,
the big AI story was not about any of these technologies,
but about potential legal action from Scarlett Johansson.
By way of background, Scarlett Johansson was, of course, the voice of Samantha in Her.
The movie Her, meanwhile, which as a total aside was not meant to be aspirational,
but was intended as dystopian, is something that Sam Altman has said clearly influenced
OpenAI. Let's listen to this clip.
The number of things that I think Her got right
that were not obvious at the time,
like the whole interaction model with how humans are going to use an AI,
this idea that it is going to be this like conversational language interface,
that was incredibly prophetic and certainly more than a little bit inspired us.
So they're on the record as being very interested in this movie.
Add on top of that, the fact that Altman explicitly made the connection to the movie
during the demo last week when he tweeted the word her, which by the way got 12 million views.
And all of this led to many comparisons between the voice that was in these OpenAI demos,
which was called Sky, and Scarlett Johansson's voice.
Again, let's briefly listen.
Well, well, well, just when I thought things couldn't get any more interesting,
talking to another AI that can see the world, this sounds like a plot twist in the AI universe.
Now, even before any of this Scarlett Johansson stuff, there were already some critiques of this voice that it was way too flirty in a way that made people uncomfortable.
But all of this came to a head yesterday when Scarlett Johansson released a statement.
The statement reads, last September I received an offer from Sam Altman who wanted to hire me to voice the current ChatGPT 4.0 system.
He told me that he felt by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with the seismic shift concerning humans and AI.
He said he felt that my voice would be comforting to people.
After much consideration and for personal reasons, I declined the offer.
Nine months later, my friends, family, and the general public all noted how much the
newest system named Sky sounded like me.
When I heard the release demo, I was shocked, angered, and in disbelief that Mr. Altman
would pursue a voice that sounded so eerily similar to mine, and that my closest friends
and news outlets could not tell the difference.
Mr. Altman even insinuated that the similarity was intentional, tweeting a single word
her, a reference to the film in which I voiced a chat system, Samantha, who forms an intimate
relationship with a human. Two days before the ChatGPT 4.0 demo was released, Mr. Altman contacted my
agent asking me to reconsider. Before we could connect, the system was out there. As a result of their
actions, I was forced to hire legal counsel who wrote two letters to Mr. Altman and OpenAI,
setting out what they had done and asking them to detail the exact process by which they created
the Sky voice. Consequently, OpenAI reluctantly agreed to take down the Sky voice.
In a time when we are all grappling with deepfakes and the protection of our own likeness,
our own work, our own identities, I believe these are questions that deserve absolute clarity.
I look forward to resolution in the form of transparency and the passage of appropriate legislation
to help ensure that individual rights are protected. This triggered an unbelievable amount of
conversation. Altman felt compelled to comment. He sent a statement to CNBC that said,
The voice of Sky is not Scarlett Johansson's, and it was never intended to resemble hers.
We cast the voice actor behind Sky's voice before any outreach to Ms. Johansson. Out of respect
for Ms. Johansson, we have paused using Sky's voice in our products. We are sorry to Ms.
Johansson that we didn't communicate better. So let's try to unpack the myriad
things going on here. First of all, there is the explicit legal question of whether the
Sky voice was actually trained on anything from Scarlett Johansson. Many of the AI safety
folks who already don't like Sam Altman jumped up to scream that of course it probably was
and that we needed a full accounting of how it was trained. And sure, if that's the case,
that's obviously hugely problematic. But I think it misses the broader point. And frankly, I think
it undermines a lot of the critique by trying to suggest that Sky itself was directly trained on
Scarlett Johansson's voice rather than just meant to sound like her. Now, of course, on the other end
of the spectrum, there are many people who pointed out that a mere similarity in voice
doesn't entitle someone to bar AI from using any voice that sounds like theirs. Annaid tweets,
OAI isn't at fault if it's genuinely a different voice actress, just because X sounds like
Scarlett Johansson doesn't make it Scarlett Johansson's voice. Fight the lawsuit, bring Sky back.
There are also folks who are inclined towards the AI safety side of things that also think that
people are trying to make too much hay out of this.
Flory and my tweets, many AI safety-minded people are losing all their credibility for dunking on
OpenAI for any reason they can get their hands on, regardless of how irrelevant they are to AI safety.
The superalignment team disbanding is a big deal, a Hollywood star being angry, not so much.
Now, of course, what he's referring to is the fact that OpenAI dissolved the superalignment team
in the wake of its leader, Jan Leike, leaving.
Reports from inside the company suggest that the team members who are still there are being
reassigned to other teams within the company.
This, by the way, is a story that is continuing to pick up.
Jeremy Kahn from Fortune Magazine tweeted this morning,
Exclusive. OpenAI publicly committed to give 20% of its computing resources to a team dedicated to controlling the most dangerous kind of AI.
It never delivered and in fact repeatedly denied that team's request for resources.
So this isn't the focus of today's show, so I'm not going to go deep on this.
I do think that it's a very interesting moment to see how the cleave between the AI safety folks and the acceleration side of things is growing.
But the point being that this is one take where focusing on some Hollywood celebrity isn't the right thing to focus on.
I think that the obviously bigger thing here is just the obvious own goal.
In other words, it doesn't really matter, I don't think, whether or not there is any legal
culpability here.
This is a gray area where lots of legal battles are going to be fought over
the next few years, and who knows where things will land when it comes to exactly what
rights people have around their likeness and voice.
Is Sky similar enough to Scarlett Johansson that there's actual legal culpability there?
Would there have to be actual Scarlett Johansson material in the training?
Even then, is that protected, et cetera, et cetera, et cetera.
Those are all legal questions.
But right now, there is an incredible battle for public opinion going on.
And this just reads terribly.
Dara's tweets, this is why so many creatives are so anti-AI.
If Scarlett Johansson, a celebrity, had to lawyer up to get them to unsteal her voice,
imagine how hard it will be for any and everybody else to claim or reclaim their work after
AI companies steal it.
Even if you disagree with a lot of the premises of that tweet,
it's fairly shocking to me that, even with a cursory watching of the cultural attitudes around AI right now,
you wouldn't understand that this was going to become an issue.
Tom Shaughnessy writes, I'll say it, feels like there's a trend of Sam and OpenAI acting murky and trying to cover afterward.
Scarlett Johansson this week, the equity clawback last week, entire safety team leaving last week.
I'm losing track.
Allow drama-free competitors to enter the ring.
There is always a certain amount of chaos when it comes to new startups.
If you have ever built or been in a startup that is being built quickly, there is no way
to move as fast as you want or think is important. That pressure is increased dramatically by the
place that AI has in society and the viciousness for the competition of the frontier. But at some point,
you're no longer a scrappy startup. You're an industry leader and you have to be held to a higher
standard. From where I'm sitting with the facts that we have right now, it doesn't seem likely to me
that OpenAI did anything actually legally wrong. My guess is that they did not train Sky on Scarlett
Johansson's voice, especially after they asked her for explicit permission
to use her voice and were denied. I also think that their intentions in trying to invoke a cultural reference point
with the movie Her were not to be creepy or weird. That movie was clearly an important
cultural touchstone for many of the people at OpenAI, and I'm sure they felt like they were doing
an homage. What's more, an homage that might make other people feel more comfortable as well.
But if they and we didn't realize it before, everyone has to tread more carefully and endeavor to
appreciate the ethical gray areas that so much of this industry lives in right now,
or else every week is just going to be a conversation like this rather than a conversation
about new technology.
And make no mistake, there is absolutely a battle for public opinion going on that is going
to make it to Washington, D.C.
Hawaii Senator Brian Schatz tweeted yesterday,
alarming that an AI company just seems to have gone ahead and lifted a voice of an
actual person without permission or compensation.
The impunity is even more worrisome for performers who aren't already popular.
The right to one's own image and voice must be protected.
Look, I am in the AI industry.
I am on the side of the AI industry, broadly speaking at least.
But we just got to do better. Simple as that.
Anyways, guys, that is going to do it for today's AI Daily Brief.
Until next time, peace.
