Today, Explained - AI Video Killed the Video Star

Starting point is 00:00:00 A few weeks ago, Google dropped Veo 3: generative AI video, but now with generative AI sound to go with it. This is video from Veo 3. What do you think about the idea that we're just a bunch of prompts? If I'm generated from a prompt, how come I don't have six fingers? So is this. About to do the first plunge into an active volcano. Let's send it.

Starting point is 00:00:23 And this. Breaking news, the Secretary of Defense, Pete Hegseth, has died after drinking an entire liter of vodka on a dare by RFK. But how are the reviews? A slopmonger's dream, says The Verge. It might actually take my job, says YouTuber Matthew Berman. The world is not ready, says Mashable. We're so cooked, say thousands of people on social media. But are we? Maybe not. That's our take at Today Explained.

Starting point is 00:01:00 You hear that? Ugh, paid. And done. That's the sound of bills being paid on time. But with the BMO Eclipse Rise Visa Card, paying your bills could sound like this. Yes! Earn rewards for paying your bill in full

Starting point is 00:01:17 and on time each month. Rise to rewards with the BMO Eclipse Rise Visa Card. Terms and conditions apply. Time to drive away from the grind and unwind in your new 2025 Mitsubishi Eclipse Cross. No phones, no signal, no meetings. Just the smell of adventure. Lease the Eclipse Cross today

Starting point is 00:01:40 for the equivalent of $89 weekly at 3.99% for 36 months. Plus, get a no charge, two-year maintenance package. Visit your local Mitsubishi dealer today or see Mitsubishimotors.ca for details. Conditions apply. This is an artificial intelligence version of Drake and you're listening to Today Explained. Joanna Stern is a personal technology columnist

Starting point is 00:02:09 at the Wall Street Journal. She is not a filmmaker, but that didn't stop her from trying to harness all the latest AI tools to make a short film. So I worked on this film with a close friend of mine and producer named Gerard Cole. He works here at the Wall Street Journal and he's a seasoned audio and video journalist

Starting point is 00:02:25 who just really has become obsessed with testing and playing with AI video tools. We started on this project probably at the end of March. I sort of challenged Gerard, I said, hey, we'll make a film. I think we should try to make something that's like a real film. We come to this place for magic.

Starting point is 00:02:46 I'm gonna make them an offer, kid. I'm gonna make them an offer. So, oh. I mean like a two minute short film. You're not sure. Sure. It's not Spielberg here. And what was so crazy about it is that every week there would be new tools that would come out.

Starting point is 00:03:01 The companies keep getting in touch and saying, well, actually we have a new update next week, so you might want to hold off on publishing that video, or you might want to hold off because we have a new tool that you can test. And so in May, Google announced Veo 3, which is the third version of their video model. They also announced a new tool called Flow,

Starting point is 00:03:20 which makes it easier to edit with AI video. And so we kind of had to uproot the project a little bit to get this going. But this stuff is moving so fast that every night we'd go to sleep, we'd wake up in the morning, and there'd be a new AI video tool that we thought we should try. The one that has gotten a ton of buzz

Starting point is 00:03:37 over the last couple of weeks is Google Veo. And this is from Google. This is Veo 3. What they did here with Veo 3 is they just created a new model that really blew people away. Previously with AI video, not only did you kind of have some weird wonkiness to some of the visuals and maybe things didn't look as realistic, but also there was no audio to them.

Starting point is 00:03:57 And now with Veo 3, you can put in a prompt, you can say, a woman working out alongside a robot. And now with Veo 3, you have audio. Go for the burn, sweat. When I see the woman boxing with a robot, you hear sprinting, you hear sounds of the robot's mechanics,

Starting point is 00:04:14 you hear punching sounds, you hear all kinds of audio to make the scene come to life. Which is to say, it just took like a massive jump, this technology, because it just feels a lot more real. Yeah, it feels a lot more real. And I said, OK, well, what if we can see if we can actually

Starting point is 00:04:38 tell a story here? We really wanted to see if that was possible. And we learned very quickly, it is possible. It's just really hard and time consuming. The film itself is about my, if we want to say that I'm me in this film, getting a sort of a humanoid robot and these robots were designed to make people and humans more efficient.

Starting point is 00:05:02 Time for your coffee IV drip. Because I thought, okay, maybe we can have some fun playing off of this idea that AI is all about making us efficient in our jobs and in everything else. I let people watch, but the robot lives with me. We have some good times together. We have some not so good times together. It really wants me to keep working.

Starting point is 00:05:21 My ultra sensitive microphones indicate you are not engaged in elimination activities. And then I can't ruin the end. But you know, let's just say I come out on top in the end. Wait, can you ruin the end? Hmm. Okay, fine. I'll do it.

Starting point is 00:05:37 But I don't know. It's not usually how it works with movie interviews. But yeah, I mean, in the end, Spoiler alert. Spoiler alert. I get frustrated with this robot and I had no other choice, but I have to reprogram him. Joanna, please don't do this. Oh. Yeah. But I will say there were a lot of constraints to making this.

Starting point is 00:06:01 You'll notice when you watch everybody, you'll see like the robot doesn't talk, right? The robot has a voice, but it doesn't have mouth movements. And so that was one of the constraints we had. And you'll see, I never talk. Like my mouth never moves in the piece because we had that technical constraint. You can't really have the dialogue work very well

Starting point is 00:06:21 between two people. You can't really make that consistent. And so when you watch with an eye for the technical constraints, you can really see like, oh yeah, they kind of had to make something that was like this. Tell people exactly how you made this short film. What exactly are you doing to make this?

Starting point is 00:06:39 Because this isn't like shooting a little short film on your phone where you hit record, you capture some footage, then you edit it. Yeah. No, and I'll take you through as simply as I can, but it is pretty complicated. So, we decided we wanted to have two characters, me, and I exist in real life, and this robot, which does not exist in real life.

Starting point is 00:06:58 And so, we created these digital versions of the characters. The robot named Max, or OptiMax 5000, we created using an AI image generator called Midjourney. We iterated in that, we worked through what does he look like, what does he look like? We finally landed on some images we liked. As for me, I took a bunch of photos of myself, different angles. Then we went into Runway,

Starting point is 00:07:22 which is an AI video generation tool, and we uploaded those photos. And then we said, okay, create a scene where you see the robot working out alongside Joanna and make it in a suburban background with houses on a paved street. And so then Runway would spit out what we would call the first frame of that. And so we'd have an image. And then we would take that image

Starting point is 00:07:49 and we put it into Veo, Google's tool, and say what we wanted the motion to look like. And here's where things got really complicated. And Gerard really did a lot of this work. But you really have to give the model very specific instructions on what you want to be done. And so he worked alongside Google's Gemini, which is their large language model,

Starting point is 00:08:07 to really craft detailed prompts of what we wanted the videos to look like. And so these were long texts, I mean, like hundreds of words that you would put in with the photo and the text into Google Veo, tell it what we wanted, and out we would get a bunch of videos. And we'd pick from those videos what would look the best for the scene.

Starting point is 00:08:30 When your video dropped, what did people think of it? What was crazy was how mixed the reviews were. A lot of people wrote in saying they were blown away and they could not believe how real it looked. They could not, they laughed because we played, we played a lot of bloopers. So there was a lot of people that really enjoyed watching this. Joanna is so good at doing these and brings in the mainstream in such a great way. But then there was a very loud and vocal group that just hated this. Here are some of the reviews that I read on X or on TikTok.

Starting point is 00:09:09 Wow, that was just awful. Ugly, soulless, nonsensical. Garbage. This is an abomination and you should feel ashamed for making it. Absolute soulless shit. Wow. Shit from the butt. Shit from the butt?

Starting point is 00:09:24 That's my favorite. Why? Why are people so mad at your video? They're mad at AI video. They're not mad at my AI video. They're mad at AI video in general for existing. Can you trust what you see? Because people don't let you know they're using AI.

Starting point is 00:09:41 Deep fakes and misinformation could get a serious upgrade. Synthetic video evidence might become harder to distinguish from the real thing. You can also see in the quality right now, it's not really Hollywood level. Is that where there's like a more practical use for this technology right now? That's the goal of many of these AI companies.

Starting point is 00:10:02 I mean, yeah, I mean, that's where it really gets interesting. So some will say like, look, this is a moment to democratize video tools, right? Those folks who aspire to be filmmakers, well, they can now just do this. They can sit in front of their computer and they can make things that they once never would have been able to make before.

Starting point is 00:10:22 But then you have the other side of this where what might we see on the big screen that might actually be AI generated. And so we've seen a bunch of AI film studios and production houses start popping up. The goal is for the makers, the Googles, the runways of the world to be working with Hollywood. Their hope is to start working with film studios

Starting point is 00:10:40 to generate stuff that will end up in the films we see on the big screen. Or the small screen, whatever you watch your Netflix on. You can watch Joanna Stern's short film at wsj.com or on YouTube, where it's called "We Tested Google Veo and Runway to Create This AI Film. It Was Wild." We're heading to Hollywood in a minute at Today Explained. So you've always been picky about your produce, but now you find yourself checking every label

Starting point is 00:11:40 to make sure it's Canadian. So be it. At Sobeys, we always pick guaranteed fresh Canadian produce first. Restrictions apply. See in-store or online for details. Support for today's show comes from Bombas. Bombas wants to make your summertime in the sun a little more comfortable with socks that they say are perfect for your next marathon or just your next trip down to the bodega. Bombas says their running socks help wick sweat, keep you cool, and fight blisters. And they don't

Starting point is 00:12:10 just stop at socks. Bombas says they also offer those white tees, those waterproof slides, and those sweat-wicking undies. Nisha Chittal is our colleague here at Vox, and she's tried Bombas herself. I am part of a whole family of Bombas wearers. My daughter, who's three, also wears Bombas. She has several pairs in toddler kid sizes, and they're great. The kids ones have little grips on them, which is great because she runs around a lot, so the grips help her to make sure she's not slipping on wood floors. So she's a fan, too. Bombas also wants you to know about their mission,

Starting point is 00:12:45 which is for every comfy pair you purchase, they say they donate another comfy pair to someone facing homelessness. You can head over to bombas.com slash explained and use code EXPLAINED for 20% off your first purchase. That's B-O-M-B-A-S dot com slash explained, code EXPLAINED at checkout. Support for the show today comes from Jerry, and Ben's nowhere to be seen.

Starting point is 00:13:13 This is not them. Jerry is an app that says they can make finding the right car insurance a breeze from comparing quotes to getting you covered. Everything can be found in the Jerry app. Just answer a few quick questions and then they can instantly pull quotes from like over 50 top rated insurers, you guys. You can stop needlessly overpaying for car insurance.

Starting point is 00:13:37 Jerry says drivers who save with Jerry save over $1,300 a year on average. Before you renew your policy, you can download the Jerry app or head to jerry.ai slash explained. In just a few minutes, you can compare quotes and coverages from up to 50 top insurers. Jerry says they make car insurance simple, smart,

Starting point is 00:14:00 and finally, on your side. Based on drivers who switched and saved with Jerry over the past 12 months, over 20% of drivers who switched with Jerry found a monthly premium of $87 or less. Not all drivers find savings. We come to this place, Today Explained, for magic because we need that. Devin Gordon wrote a big piece titled "What If AI Is Actually Good for Hollywood?" for the New York Times Magazine late last year.

Starting point is 00:14:32 We asked him, how dare you? Here's what he had to say. The premise and starting point was my sense that if you were listening to the discourse about AI and Hollywood, you would either hear that it was going to be the end of Hollywood and wipe out everyone's jobs and turn the future of cinema over to robots, or that it was the greatest creative-unlocking magic wand ever handed to filmmakers in the history of humankind. And I had also been hearing and reading these stories in places like the Hollywood Reporter. Everyone is using AI, but they're scared to admit it. It's the dirty little secret. AI is being used for scripting, for shooting, and producing movies. You go into a little booth

Starting point is 00:15:33 that's a 360-degree camera, and you're asked to do 30 different expressions. And so I was like, okay, well, what are people actually using it for? What is actually happening with AI? So I started with a visual effects company that works with AI called Metaphysic. The reason why I wanted to start with them is because everything I kept hearing was that when AI descended upon Hollywood, it was going to hit visual effects first and hardest, so I wanted to start with a visual effects company. And this particular visual effects company, Metaphysic, their specialty was sort of taking the deepfake logic of digitally creating a photorealistic copy of a famous person's face and applying that to all sorts of aspects of the filmmaking industry, from special effects to dubbing to reshoots, animation, aging and de-aging, etc. And so I went and spent time with them, and one of the first things they did was they sat me down in a chair, pointed a camera at me, and

Starting point is 00:16:48 there was a television screen opposite me and my face was on the screen. And then the Metaphysic guy clickety-clacked a little bit on his keyboard, and suddenly my face had Tom Hanks's face sort of pasted on top of mine. Tom Hanks, the actor? Yes, Tom Hanks, the actor. My mom always said, life was like a box of chocolates. I could see my face.

Starting point is 00:17:14 It was still me. And if I talked, it was moving. But I was also very recognizably Tom Hanks. You never know what you're gonna get. The reason I was Tom Hanks is because the film project that Metaphysic was then working on was a movie called Here that starred Tom Hanks and Robin Wright. Hey, Dad. I'd like you to meet Margaret.

Starting point is 00:17:35 Nice to meet you, Margaret. Nice to meet you, Mr. Young. It was directed by Robert Zemeckis, the team from Forrest Gump reunited again, using AI technology to a degree that it had not been deployed in a Hollywood movie before. In fact, it was central to the making of it. She's pregnant. She's what? She's pregnant.

Starting point is 00:18:00 Margaret is pregnant. You're just 18 years old. In this case, they were using it to enable Tom Hanks to play the same character from the age of 18 to the age of 80. And the way they were able to do that was using Metaphysic's AI technology. And one of the reasons why I wanted to focus on this movie, Here, which is not a particularly good movie. I wouldn't necessarily recommend you Netflix and chill with it. Get the fuck out of my house.

Starting point is 00:18:31 But I was interested in this movie because this movie is probably the first mainstream Hollywood movie that would not have existed without AI technology. And the reason why is because it's effectively a small, domestic, emotional, serious drama. The only reason why this movie could happen is because the visual effects that it required were cheap enough with AI. It's as good as CGI now, and it's a lot cheaper, and it's a lot faster, and it gives directors a lot more creative control on the set. So that's why in the visual effects space there's such an expectation that AI is very quickly, and already is, in a lot

Starting point is 00:19:14 of ways transforming that industry, in good ways but also in ways that are probably gonna cost a lot of people their jobs. I mean, and let's talk about all those people for a moment here. Yeah. Let's start with Tom Hanks, because one thing that really surprised me about your piece was that you asked Tom Hanks how he felt about the potential for AI to enable him to star in movies 100 years after his death. Yeah. And he was like, bring it on, right? Surprisingly unconcerned. Wow. He was just sort of like, well, let's just get the paperwork sorted out. Amazing. And I was a little surprised, to be honest, about how cavalier he was.

Starting point is 00:19:52 For instance, I mean, it is easy to imagine a scenario, maybe not in the Hanks family. I'm sure the Hanks family is going to, I trust Chet. Do you trust Chet? Big up, big up the whole island, massive. It's your boy Chet, and coming straight from that golden gloves, you know what I'm saying? No, but I do trust Colin. I trust Colin.

Starting point is 00:20:12 I trust Colin. You always want to work with good people, and obviously I think my dad's good people. But OK, what about Colin's grandkids, and they're down on their luck, and all of a sudden, 100 years from now, Tom Hanks's legend, his imagery, is being sullied because, you know,

Starting point is 00:20:31 his image is being used to make bucks in porn or whatever? He's not thinking that far in advance, let's put it that way. I think the takeaway for me, no shots at Tom Hanks, was that it did sort of reflect a class divide in AI worriedness and how worried you should be. Right, because not everyone is Tom Hanks. I mean, what did you learn about all the people in VFX or costumes or makeup or what have you that are terrified about what's about to happen

Starting point is 00:21:06 to their industry? You know, one of the things that I kept hearing on the makeup front, with AI, is a director going to have to have a makeup department do a character's makeup every single day? Or can the makeup department do it once, right? At the start of the production, that becomes a file that gets saved and mapped onto the character's face later.

Starting point is 00:21:33 And now, instead of having a makeup artist for the entire run of the set, you've only got the makeup artist for one day. You go from makeup artists being paid by the day to some sort of almost license or copyright for how many days that that makeup work gets used, right? The entire economics of the industry has to change. Does it mean that we're not going to need makeup artists? Of course not. We're still very much going to need makeup artists. They're going to need them as much as ever. But how they work and how they get compensated is going to radically transform. And you could go through every department in the filmmaking process, and each of them

Starting point is 00:22:14 would have different ways in which AI will disrupt how they work. The thing about all these ways is that none of them are as grandiose as the worst of our imagining, right? You know, the people who were the most skeptical about AI's ability to overtake human creativity that I spoke with are the people who understand AI the most and use it the most. They understand its limitations and also how to best use it. Like, they understand how to use this tool. When we're talking generative AI,

Starting point is 00:22:48 when we're talking creative orientations or applications of AI, they understand how indispensable the human mind is to that equation. It just doesn't work without it. The notion, the theory, who knows if this will come to pass, but the positive theory, the flip side of this, is that AI lowers the barrier of entry to so many more films,

Starting point is 00:23:16 that even though the size of the crew and production is shrinking because of AI, the amount of productions that can exist grows because more people can afford to make more movies. You can accuse that of being sanguine and overly sunny. I would say in the defense of the sanguine people, the indie film movement does provide an interesting parallel here, right? When filmmaking went from very, very expensive, limited film in the 90s to small handheld digital filmmaking, where anybody could make cinema quality movies,

Starting point is 00:24:00 all of a sudden you did have a lot more movies, right? You had a lot more movies being made for a lot less money. So there is a test case, right? Can AI do that? Well, I feel like in some ways that brings us back to our friend Joanna Stern at the Wall Street Journal. To her haters out there, I think you're missing the point. I don't think that Joanna Stern is in any way trying to make a film that could go air on ABC or air in the movie theater.

Starting point is 00:24:35 What she's trying to demonstrate is how easy it is for even someone like her to effectively sit there and make something that looks at the worst a bad knockoff, but look at all the things that she can do without having anybody. Exactly. Or experience. Anybody. Or experience. Right?

Starting point is 00:24:57 Right. And now take that capacity out of her hands and put it in the hands of people who actually do this for a living. Right. And the question is, how dangerous does this get? How many people is this going to replace? And I just don't think we know. I don't think we really know. In some ways, what Joanna's film leaves me with is both fear and relief. Read Devin Gordon's great piece on Hollywood and AI at nytimes.com. This episode was made by humans. Their names: Peter Balonon-Rosen and Gabrielle Berbey,

Starting point is 00:25:50 Amina Al-Sadi and Avishay Artsy, Patrick Boyd and Andrea Kristinsdottir, and I'm Sean Rameswaram. And here are some more humans who didn't work on today's show. Noel King, Miranda Kennedy, Jolie Myers, Hady Mawajdeh, Miles Bryan, Victoria Chamberlin, Devan Schwartz, Denise Guerra. We use music by Breakmaster Cylinder

Starting point is 00:26:09 and Laura Bullard is our senior researcher. Today Explained is distributed by WNYC. The show is a part of Vox. You can listen to this podcast ad free by signing up for a membership at Vox.com slash members. Right now, you can pay 30% less than normal for that membership. So get in there while you can.

Starting point is 00:26:27 Thank you and have a weekend.
