The AI Daily Brief: Artificial Intelligence News and Analysis - Scarlett Johansson vs. OpenAI

Episode Date: May 22, 2024

OpenAI is facing controversy after Scarlett Johansson claimed they used a voice eerily similar to hers in their recent demos without her permission. Johansson has released a statement saying she declined an offer from OpenAI to use her voice and is now seeking legal clarity. OpenAI responded by pausing the voice, insisting it was not modeled after Johansson’s. This incident raises important questions about AI ethics, likeness rights, and the responsibilities of tech companies in this emerging field. ** Join Superintelligent at https://besuper.ai/ -- Practical, useful, hands on AI education through tutorials and step-by-step how-tos. Use code podcast for 50% off your first month! ** ABOUT THE AI BREAKDOWN The AI Breakdown helps you understand the most important news and discussions in AI.  Subscribe to The AI Breakdown newsletter: https://aidailybrief.beehiiv.com/ Subscribe to The AI Breakdown on YouTube: https://www.youtube.com/@AIDailyBrief Join the community: bit.ly/aibreakdown

Transcript
Starting point is 00:00:00 Today on the AI Daily Brief, it's Scarlett Johansson versus OpenAI. Before that in the headlines, Microsoft Build has barely kicked off, and we've already got some serious features and some serious discussion. The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI. To join the conversation, go to the Discord linked in our show notes. Welcome back to the AI Daily Brief Headlines edition, all the AI headlines you need in around five minutes. Today, we are focusing on the first set of announcements from Microsoft Build.
Starting point is 00:00:33 Yesterday, we did a little bit of a preview around what I was looking for at Microsoft Build. It included things like how much they were going to lead with hardware versus software, how much they were going to continue to tout their OpenAI relationship versus give us some indication that they were moving away or at least hedging, and a couple of other questions like that. Well, even before the event technically began, because I believe, as they've got it listed, today is the first day technically, they kicked off with a bang showing off their new Copilot+ PCs.
Starting point is 00:01:01 As anticipated, we got a new wave of AI-powered PCs with some pretty big claims. Microsoft says that their new line is significantly faster than the M3 MacBook Air, owing to the new Snapdragon X Elite chips from Qualcomm. Still, the emphasis wasn't really so much on the hardware. It was what the whole package of the hardware enabled. And at the very center of that was a new desktop assistant that was basically the Windows integrated version of what we got from OpenAI last week. In fact, one of the comments that I saw from some people is, now we know why
Starting point is 00:01:31 ChatGPT didn't launch a desktop app because Microsoft just made the thing native to Windows. Mustafa Suleyman, formerly the head of Inflection, now the head of Microsoft AI, tweeted yesterday, we are taking Copilot to the next level. Copilot will see, hear, speak, and help in real time. Soon your AI companion will start to live life alongside you, whether playing Minecraft or helping you navigate life's most difficult challenges. Let's take a quick look at this demo to get a sense of what's going on here. Hey, Copilot, how's it going?
Starting point is 00:01:58 Hey, it looks like you're about to dive into some Minecraft. Yeah, I'm trying to figure out how to make a sword. To craft a sword, you'll need some materials. Can you open your inventory? Just press E on your keyboard. You've got some sticks. Perfect for the sword's handle. But it looks like you're missing the material for the blade.
Starting point is 00:02:16 Let's start by gathering some wood or stone. Shall we head out and find some resources? Yeah, that sounds great. So what you're seeing here is Copilot actually being able to see the screen that the user is interacting with on more than just a superficial level. Right? In this case, it's able to actually see what's in the player's Minecraft inventory and make suggestions on that basis. Although it's a different context, in this case gaming, it's very similar to the type of interaction that we saw OpenAI show off last week, as well as Google with Project Astra. And you take those things together and it's pretty clear that we're moving to a world where our interactions with computers may be mediated by, or maybe all shared with, this omnipresent assistant that can help us on any given task.
Starting point is 00:02:54 Now, one of the things that I think is interesting about the choice they made to show a game scenario is that when it comes to the utility or use cases of this type of feature, I can see there being a bit of an intimidation gap. Are people, for example, really going to ask their perpetual omnipresent assistant to read their writing and give them suggestions? Maybe not, but getting stuck in a game, which is a very common experience, and having the assistant help walk them through how to get out of being stuck in that game, seems like it could be the type of first-step use case that gets a lot of people to get familiar with this mode of interaction. Still, probably the most discussed feature from yesterday's announcement was called Recall. For those of you
Starting point is 00:03:30 who are familiar with what used to be called Rewind.ai that has now also launched Limitless, Recall is basically the same feature. It keeps track of everything that you're doing so you can use natural language to go back and recall exactly what you've done. As Tsarnick sums up, Satya Nadella says Windows PCs will have a photographic memory feature called Recall that will remember and understand everything you do on your computer by taking constant screenshots. Let's watch this little video about this feature that was also released yesterday. How do we introduce memory, right, photographic memory into what you do on the PC. And now we have it.
Starting point is 00:04:03 So it's called recall. It's not keyword search, right? It's semantic search over all your history. And it's not just about any document. We can recreate moments from the past, essentially. Here's how it works. Windows constantly takes screenshots of what's on your screen. Then uses a generative AI model right on the device, along with the NPU,
Starting point is 00:04:24 to process all that data and make it searchable, even photos. I got to try it out. I searched brown leather bag. It came up in visual search. There's no place on this page that it says brown leather bag. It just knows because it sees this brown leather bag. There could be this reaction from some people that this is pretty creepy. Microsoft is taking screenshots of everything I do.
Starting point is 00:04:49 Yeah, I mean that's why that you can only do it on the edge. So this is like, you know, you have to put two things together. This is my computer. This is my recall, and it's all being done locally. So a couple things here. On the one hand, this is the type of feature that will be extremely useful to many people in many contexts. Searchable memory across everything that you've done is just a potential productivity hack. At the same time, as this interview host points out, the potential for abuse here is also extreme.
Starting point is 00:05:20 Now, Microsoft's answer to this, which is exactly what Apple has been working on as well, is that this is AI that doesn't touch the cloud. That's what he's referring to when he says the edge. He means that it's only happening locally and is not being stored or shared with the cloud. The challenge, of course, for many people, is that that requires a lot of trust. And indeed, there are about 30 million views of this video, and the responses are basically a Rorschach test for how people feel about issues of privacy, surveillance, big tech, you name it. Kevin Beaumont points out the financial risk.
Starting point is 00:05:48 He says from Microsoft's own FAQ, note that Recall does not perform content moderation. It will not hide information such as passwords or financial account numbers. Abeba Birhane, who does AI accountability at Mozilla, says, this is called constant surveillance, monitoring and tracking, and it will eventually be used to influence and control the masses. Karthik Sankaran says lawyers predict a new golden age of discovery. Elon weighed in as well, saying this is a Black Mirror episode, definitely turning this quote-unquote feature off.
Starting point is 00:06:14 In what I find to be a deeply resonant tweet based on the history of new features that people complained about, Matthew Pines writes, you will complain in post and then passively accept it as the new normal. The ratchet of persistent and pervasive surveillance is required for AI to reach its full total addressable market. What we have here is in a single feature, an embodiment of so many of the upsides and downsides of AI and new technology in general all in one.
Starting point is 00:06:36 It's something that could be extremely useful, but that also pushes the boundaries of what people are comfortable with. It's something that people can't imagine right now, but in the future may not imagine living without. It's something that requires a huge amount of trust from big tech companies that don't exactly have a lot of our trust right now. In short, it is going to be fascinating to see how this actually plays out as they roll these features out.
Starting point is 00:06:55 And so I'm going to be watching that closely. Now, Microsoft Build is still just getting started, so I'm sure we're going to have more to talk about this week. But for now, that is going to do it for the AI Daily Brief Headlines Edition. Stick around for the main episode. Hello, friends. Before we get back to the episode, I want to tell you about something special I'm doing on Superintelligent this June.
Starting point is 00:07:14 Super is, of course, our platform for AI learning. And I've heard from a lot of you that you really want something for a true AI beginner, someone who's really just getting their feet wet with these tools. So what I'm going to do is put together basically a course that sits on top of and uses Superintelligent tutorials and lessons, but where I hand guide you through around 10 different lessons and how-tos that I think once you complete them will have you ahead of 80% of the other people who are just starting to use AI right now. If you are interested in this learning experience, go to besuper.ai and sign up using code June. You'll get 25% off your first month and I'll automatically add you to that AI for beginners group. That's besuper.ai, discount code June. See you there.
Starting point is 00:07:52 Just a week after OpenAI's GPT-4o demos and Google's new variety of AI announcements, and on the same day that Microsoft started showing off its Copilot+ PCs, the big AI story was not about any of these technologies, but about potential legal action from Scarlett Johansson. By way of background, Scarlett Johansson was, of course, the voice of Samantha in Her. The movie Her, meanwhile, which as a total aside was not meant to be aspirational, but was intended as dystopian, is something that Sam Altman has said clearly influenced OpenAI. Let's listen to this clip.
Starting point is 00:08:25 The number of things that I think Her got right that were not obvious at the time, like the whole interaction model with how humans are going to use an AI, this idea that it is going to be this like conversational language interface, that was incredibly prophetic and certainly more than a little bit inspired us. So they're on the record as being very interested in this movie. Add on top of that, the fact that Altman explicitly made the connection to the movie during the demo last week when he tweeted the word her, which by the way got 12 million views.
Starting point is 00:08:57 And all of this led to many comparisons between the voice that was in these OpenAI demos, which was called Sky, and Scarlett Johansson's voice. Again, let's briefly listen. Well, well, well, just when I thought things couldn't get any more interesting, talking to another AI that can see the world, this sounds like a plot twist in the AI universe. Now, even before any of this Scarlett Johansson stuff, there were already some critiques of this voice that it was way too flirty in a way that made people uncomfortable. But all of this came to a head yesterday when Scarlett Johansson released a statement. The statement reads, last September I received an offer from Sam Altman who wanted to hire me to voice the current ChatGPT 4.0 system.
Starting point is 00:09:38 He told me that he felt by my voicing the system, I could bridge the gap between tech companies and creatives and help consumers to feel comfortable with a seismic shift concerning humans and AI. He said he felt that my voice would be comforting to people. After much consideration and for personal reasons, I declined the offer. Nine months later, my friends, family, and the general public all noted how much the newest system named Sky sounded like me. When I heard the release demo, I was shocked, angered, and in disbelief that Mr. Altman would pursue a voice that sounded so eerily similar to mine, and that my closest friends and news outlets could not tell the difference.
Starting point is 00:10:07 Mr. Altman even insinuated that the similarity was intentional, tweeting a single word, her, a reference to the film in which I voiced a chat system, Samantha, who forms an intimate relationship with a human. Two days before the ChatGPT 4.0 demo was released, Mr. Altman contacted my agent asking me to reconsider. Before we could connect, the system was out there. As a result of their actions, I was forced to hire legal counsel who wrote two letters to Mr. Altman and OpenAI, setting out what they had done and asking them to detail the exact process by which they created the Sky voice. Consequently, OpenAI reluctantly agreed to take down the Sky voice. In a time when we are all grappling with deepfakes and the protection of our own likeness,
Starting point is 00:10:39 our own work, our own identities, I believe these are questions that deserve absolute clarity. I look forward to resolution in the form of transparency and the passage of appropriate legislation to help ensure that individual rights are protected. This triggered an unbelievable amount of conversation. Altman felt compelled to comment. He sent a statement to CNBC that said, The voice of Sky is not Scarlett Johansson's, and it was never intended to resemble hers. We cast the voice actor behind Sky's voice before any outreach to Ms. Johansson. Out of respect for Ms. Johansson, we have paused using Sky's voice in our products. We are sorry to Ms. Johansson that we didn't communicate better. So let's try to unpack the myriad of
Starting point is 00:11:13 things going on here. First of all, there is the explicit legal question of whether the Sky voice was actually trained on anything from Scarlett Johansson. Many of the AI safety folks who already don't like Sam Altman jumped up to scream that of course it probably was and that we needed a full accounting of how it was trained. And sure, if that's the case, that's obviously hugely problematic. But I think it misses the broader point. And frankly, I think it undermines a lot of the critique by trying to suggest that Sky itself was directly trained on Scarlett Johansson's voice rather than just meant to sound like her. Now, of course, on the other end of the spectrum, there are many people who pointed out that just a similarity in a voice
Starting point is 00:11:49 doesn't entitle someone to not having any voices that sound like that used by AI. Annaid tweets, OAI isn't at fault if it's genuinely a different voice actress, just because X sounds like Scarlett Johansson doesn't make it Scarlett Johansson's voice. Fight the lawsuit, bring Sky back. There are also folks who are inclined towards the AI safety side of things that also think that people are trying to make too much hay out of this. Flory and my tweets, many AI safety-minded people are losing all their credibility for dunking on OpenAI for any reason they can get their hands on, regardless of how irrelevant they are to AI safety. The superalignment team disbanding is a big deal, a Hollywood star being angry, not so much.
Starting point is 00:12:24 Now, of course, what he's referring to is the fact that OpenAI dissolved the superalignment team in the wake of its leader, Jan Leike, leaving. Reports from inside the company suggest that the team members who are still there are being reassigned to other teams within the company. This, by the way, is a story that is continuing to pick up. Jeremy Kahn from Fortune Magazine tweeted this morning, Exclusive. OpenAI publicly committed to give 20% of its computing resources to a team dedicated to controlling the most dangerous kind of AI. It never delivered and in fact repeatedly denied that team's request for resources.
Starting point is 00:12:51 So this isn't the focus of today's show, so I'm not going to go deep on this. I do think that it's a very interesting moment to see how the cleave between the AI safety folks and the acceleration side of things is growing. But the point being that this is one take where focusing on some Hollywood celebrity isn't the right thing to focus on. I think that the obviously bigger thing here is just the obvious own goal. In other words, it doesn't really matter, I don't think, whether or not there is any legal culpability here. This is a gray area space that there's going to be lots of legal battles fought on over the next few years, and who knows where things will land when it comes to exactly what
Starting point is 00:13:25 rights someone has around their likeness and voice. Is Sky similar enough to Scarlett Johansson that there's actual legal culpability there? Would there have to be actual Scarlett Johansson material in the training? Even then, is that protected, et cetera, et cetera, et cetera. Those are all legal questions. But right now, there is an incredible battle for public opinion going on. And this just reads terribly. Dara's tweets, this is why so many creatives are so anti-AI.
Starting point is 00:13:51 If Scarlett Johansson, a celebrity, had to lawyer up to get them to unsteal her voice, imagine how hard it will be for any and everybody else to claim or reclaim their work after AI companies steal it. Even if you disagree with a lot of the premises of that tweet, it's fairly shocking to me that even with a cursory watching of the cultural attitudes around AI right now, that you wouldn't understand that this was going to become an issue. Tom Shaughnessy writes, I'll say it, feels like there's a trend of Sam and OpenAI acting murky and trying to cover afterward. Scarlett Johansson this week, the equity clawback last week, entire safety team leaving last week.
Starting point is 00:14:21 I'm losing track. Allow drama-free competitors to enter the ring. There is always a certain amount of chaos when it comes to new startups. If you have ever built or been in a startup that is being built quickly, there is no way to move as fast as you want or think is important. That pressure is increased dramatically by the place that AI has in society and the viciousness of the competition at the frontier. But at some point, you're no longer a scrappy startup. You're an industry leader and you have to be held to a higher standard. From where I'm sitting with the facts that we have right now, it doesn't seem likely to me
Starting point is 00:14:53 that OpenAI did anything actually legally wrong. My guess is that they did not train Sky on Scarlett Johansson's voice, especially after they asked her and were denied explicit permission to use her voice. I also think that their intentions in trying to invoke a cultural reference point with the movie Her were not trying to be creepy or weird. That movie was clearly an important cultural touchstone for many of the people at OpenAI, and I'm sure they felt like they were doing an homage. What's more, an homage that might make other people feel more comfortable as well. But if they and we didn't realize it before, everyone has to tread more carefully, and endeavor to appreciate the ethical gray areas that so much of this industry lives in right now,
Starting point is 00:15:34 or else every week is just going to be a conversation like this rather than a conversation about new technology. And make no mistake, there is absolutely a battle for public opinion going on that is going to make it to Washington, D.C. Hawaii Senator Brian Schatz tweeted yesterday, alarming that an AI company just seems to have gone ahead and lifted a voice of an actual person without permission or compensation. The impunity is even more worrisome for performers who aren't already popular.
Starting point is 00:15:56 The right to one's own image and voice must be protected. Look, I am in the AI industry. I am on the side of the AI industry, broadly speaking at least. But we just got to do better. Simple as that. Anyways, guys, that is going to do it for today's AI Daily Brief. Until next time, peace.
