Better Offline - OpenAI's Video Generating AI Is Dead On Arrival

Episode Date: May 15, 2024

Earlier in the year, OpenAI debuted Sora, an AI that can generate videos that almost look realistic. In this episode, Ed walks through why generating video with AI is a near-impossible task, and speaks with Walter Woodman of Shy Kids, who made a movie called "Air Head" using the tool. LINKS: Shy Kids' Air Head - https://www.youtube.com/watch?v=G4wJ4WeJrz4 Mira Murati Interview with Wall Street Journal: https://www.wsj.com/video/series/joanna-stern-personal-technology/openai-made-me-crazy-videosthen-the-cto-answered-most-of-my-questions/C2188768-D570-4456-8574-9941D4F9D7E2 See omnystudio.com/listener for privacy information.

Transcript
Starting point is 00:00:00 This is an iHeart podcast.
Starting point is 00:01:20 Cool Zone Media. Hello and welcome to Better Offline. As usual, I'm your host, Ed Zitron. A few months ago, OpenAI showed off Sora,
Starting point is 00:01:55 a product that can generate videos based on a short text prompt, kind of like ChatGPT does for text or DALL-E does for images. These videos, which are usually no more than 60 seconds long, can at times seem impressive until you notice a little detail that breaks the entire facade, like in a video where a cat wakes up its owner, but the owner's arm appears to be part of the cushion, and the cat's paw explodes out of its arm like an amoeba. Reactions to Sora's AI-generated videos, and indeed the existence of the model itself, have ranged from kind of a breathless hype to genuine fear that this will be used to replace video producers,
Starting point is 00:02:32 in that it can create reality-adjacent videos that for a few seconds kind of seem real, especially in the case of some of OpenAI's hand-picked demo videos. Yet even in these hand-picked Sora outputs, you'll find these weird little things that immediately shatter the illusion, like one where a woman's legs awkwardly shuffle, then somehow switch sides as she walks around, or blobs of people merging in the background of images. These are on some level genuinely remarkable technological achievements, until you consider what they are and what they might do, and that there are problems in them that run through the entire fabric of artificial intelligence. A little over a month after Sora was announced,
Starting point is 00:03:16 OpenAI would debut a series of short films, including one called Air Head, where filmmakers Shy Kids told the story of a man with a balloon for a head. And because this is AI, said balloon changes sizes 23, 24, 26, 27, 29, 32, 34, 39, 41, 42, 43, and 45 seconds into the piece, at which point I stopped counting because it got boring and I really don't want to be mean to Shy Kids, as this really isn't their fault. The very nature of filmmaking is that you take different shots of the same thing, something that I anticipated Sora was incapable of doing as each shot is generated fresh,
Starting point is 00:03:53 as Sora itself, much like all generative AI, does not actually know anything. When one asks for a man with a yellow balloon as his head, Sora must then look at the parameters spawned during its training process and create an output, guessing what a man looks like, what a balloon looks like, what a man's features are on his body, what color yellow is, what the man's doing, and so on and so forth. This becomes extremely problematic when you're working in film or television,
Starting point is 00:04:22 where viewers are far more likely to see when something just doesn't look right, a problem exacerbated by moving images, high-resolution footage, and big television screens, which are now ubiquitous. Yet the press, as usual, credulously accepted Sora's, quote, stunning videos that were amazing and scary, suggesting to the public that we were on the verge of some sort of artificial intelligence takeover of the film industry, helping buoy Sam Altman, their CEO, and his dumbass attempts to convince Hollywood that Sora won't destroy the movie business. These stories only serve to help Sam Altman, who desperately needs you to believe that Hollywood
Starting point is 00:05:01 is scared of Sora and even more scared of generative AI, because the more you talk about fear and lost jobs and the machines taking over, the less you ask a very, very simple question. Does any of this shit actually work? The answer, it turns out, is not very well. In a piece for fxguide, Mike Seymour sat down with Shy Kids, the people behind Air Head, and revealed how Sora is in many ways a little bit useless for making films. Sora takes 10 to 20 minutes to generate a single 3 to 20 second shot, something that isn't really a problem until you realize that until the shot is rendered,
Starting point is 00:05:38 you really have absolutely no idea what the hell it's going to spit out. Sora has no mechanism to connect one shot to another, even with hyperdescriptive prompts. It hallucinates extra features when you haven't asked for them, and Shy Kids were shocked by how surprised OpenAI's researchers were when they requested the ability to use a prompt to request a particular angle in a shot, a feature that was initially unavailable. This is what kind of drives me crazy here, and you'll hear this in the interview with him later.
Starting point is 00:06:10 These people, OpenAI people, they were making this tool for making visual images, for making moving images, and they didn't think that people might want different shots. I'm so glad these are the people who are in control of the future. Anyway, to quote the piece, it took hundreds of generations at 10 to 20 seconds apiece
Starting point is 00:06:29 to make a minute and 19 second long film. And what's really fun about this is that the movie's fine. It was kind of fine. I have nothing really to say about it. It's a minute and 20 seconds long. It kind of works, but also the balloon looks different in every other shot. This isn't Shy Kids' fault, but also, this isn't going to get better, and I will get into why as we go along.
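As a rough, back-of-the-envelope sketch of the figures quoted above: the 10 to 20 minute wait per generation and the film length of a minute and 19 seconds come from the episode. The count of 300 generations is an assumption standing in for "hundreds," and the sketch assumes every generation runs one after another, which may not reflect how Shy Kids actually worked.

```python
# Back-of-the-envelope estimate of time spent waiting on Sora generations.
# From the episode: 10 to 20 minutes of waiting per generated shot.
# ASSUMPTION: 300 generations stands in for "hundreds", and runs are sequential.

ASSUMED_GENERATIONS = 300
WAIT_MINUTES_LOW, WAIT_MINUTES_HIGH = 10, 20
FILM_SECONDS = 79  # "a minute and 19 second long film"


def total_wait_hours(generations: int, minutes_each: int) -> float:
    """Hours spent waiting if every generation runs one after another."""
    return generations * minutes_each / 60


low = total_wait_hours(ASSUMED_GENERATIONS, WAIT_MINUTES_LOW)
high = total_wait_hours(ASSUMED_GENERATIONS, WAIT_MINUTES_HIGH)
print(f"Waiting time: {low:.0f} to {high:.0f} hours for a {FILM_SECONDS}-second film")
```

Under those assumptions, the waiting alone would land somewhere between two and four full days for under a minute and a half of finished film.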
Starting point is 00:06:59 These tiny little problems I've mentioned, though, they all lead to one overwhelming issue: that Sora isn't so much a tool to make movies as it is a big fat slot machine that spits out footage that may or may not be of any use at all. Almost all of the footage in Air Head was graded, treated, stabilized, and upscaled. And that 10 to 20 minute lead time on generations was for 480p resolution footage, meaning that even useful footage needed significant post-production work to look good enough. And just to give you an idea for the non-technical members of the audience, and this is fair,
Starting point is 00:07:36 the video you see on YouTube is usually somewhere between 720p, 1080p, or 4K. The TV shows you watch are usually 1080p, 4K, or upscaled 1080p. These are all lots of numbers. What I'm saying is the stuff that Sora spits out
Starting point is 00:07:51 that takes burning a small zoo to spit out is incredibly low resolution on top of not being specific. Look, to put it as plainly as possible, every single time that Shy Kids wanted to generate a shot, even a three-second long shot, they would give Sora a text prompt, and then they would wait at least 10 minutes to find out if it was right. And they'd have to accept footage that was subprime or inaccurate. And there's a really good example of this. If you watch Air Head, a lot of the shots are in slow motion, and you may think, oh, this is a cinematic choice, right? Because you're kind of just admiring this man with a balloon for a head going about his business.
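To put numbers on the resolution gap mentioned above, here is a small sketch comparing pixels per frame. It assumes standard 16:9 frame sizes, including 854x480 for 480p; the episode names the resolutions but gives no frame dimensions, so those values are illustrative.

```python
# Pixels per frame for common 16:9 video resolutions.
# ASSUMPTION: 480p is treated as 854x480 widescreen; the episode gives no dimensions.

RESOLUTIONS = {
    "480p": (854, 480),
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
}


def pixels(name: str) -> int:
    """Total pixels in one frame at the named resolution."""
    width, height = RESOLUTIONS[name]
    return width * height


def scale_factor(target: str, source: str = "480p") -> float:
    """How many times more pixels the target frame has than the source frame."""
    return pixels(target) / pixels(source)


for name in RESOLUTIONS:
    print(f"{name}: {pixels(name):,} pixels, {scale_factor(name):.1f}x a 480p frame")
```

On these assumed dimensions, a 1080p frame holds roughly five times as many pixels as a 480p frame and a 4K frame roughly twenty times as many, which is the detail an upscaler has to fill in.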
Starting point is 00:08:36 No, no, no, no, no. They found that this was just what Sora wanted to give them when they asked for it. This was, in and of itself, a hallucination. In the same way that ChatGPT will authoritatively tell you that something is true that is not, Sora will spit out a man running in slow motion, despite your not asking for that. And it's so weird, they had to, to quote them, do quite a bit of adjusting to keep the whole thing from feeling like a big slow-mo project. And it still kind of does. And that's rough.
Starting point is 00:09:13 That's really rough. But you know, I'm a curious little critter. So I decided to sit down with Shy Kids' Walter Woodman to talk about his experience with Sora and have him delve a little deeper into his experience with the product. And I'd say he had a far more utopian experience and perspective on the whole thing than I expected. Now, some of you might critique Walter for being so positive about it, but I actually caution you to just listen to what he's saying. Because Walter's perspective is interesting. He sees this as a tool. He doesn't see it as a replacement. And I think it's a valid perspective to come at Sora with. I also think it's a perspective that kind of accepts a conceit of OpenAI's marketing strategy, that these things will get better.
Starting point is 00:10:00 If they do, perhaps Walter is right. Perhaps this will be an essential tool in filmmaking, even though he didn't say essential. Don't want to put words in the man's mouth. But I don't think that's the case. Let me talk to him. You decide for yourself.
Starting point is 00:12:51 All right. So how did the relationship between Shy Kids and OpenAI actually begin? The relationship between Shy Kids and OpenAI began when we made an installation for a film called Dalíland, which was premiering at the Toronto International Film Festival. And we were the only people that our friends at Pressman Film knew in Toronto.
Starting point is 00:13:33 And so we made an installation that looked like Salvador Dalí's, like, studio inside of the basement of the St. Regis, which is where he lived and made work out of. And inside of that installation, you could make your own surrealist painting. And the way that you could make that was using DALL-E, the OpenAI program. And so the OpenAI people came to visit and check out what we were working on and make sure that it was something that they wanted to be a part of. And so they met our producer Sydney, who they loved, she's easy to love, and we sent them our previous work. And so from there they asked us to join this artist group. And then when Sora came out, we saw it the same time as everyone else. And we got tapped on the shoulder and they said, hey, would you like to check this out and try this out? And we said, of course. And that's how it came to be. So how did you onboard? Were you just given access? Did they give you instructions? Did they physically come to you? What was that like?
Starting point is 00:14:58 It was top secret. They gave us a briefcase in a cloudy room. No, it was a, yeah, there was a very simple onboarding process where they walked us through the technology as well as some of its features. And yeah, it was pretty. It was pretty. And then from there, they gave us access to begin using it and making things. And you were allowed to use it without their presence. You had direct access.
Starting point is 00:15:31 Yep. Yep. So, okay, did you get instructions on how to write effective prompts, or did you just kind of do trial and error? No, nothing like that. I mean, in the artist group itself, there's a lot of really amazing and thoughtful, creative people who kind of show their work and show how they got to make the things that they did. But no, there was no real engineering of our prompts. They were very much just, play, kind of see what comes out of you, you're creative people that we trust. Why don't you just see what works? Throw spaghetti at the wall.
Starting point is 00:16:21 That's cool. So, in the piece from fxguide, in the interview, someone from Shy Kids said that OpenAI's researchers were surprised when they were asked about being able to specify particular shots. What happened there? Was it just that you tried to ask Sora to do specific shots and it didn't work, or was it just not a feature? I think that's maybe taken a little bit out of context. I think more so it's just people come from different disciplines. And when I say a wide shot on a 130 millimeter lens, people from my area of expertise know sort of
Starting point is 00:17:08 immediately what I'm talking about, whereas the researchers, they are more invested in sort of other things. And so it's not so much that they didn't understand or that Sora didn't understand. It's more so just there's all these terms in films, like a zolly or a Hitchcock zoom or all of these different things, that are very understandable. But even when you go from set to set, they mean something different. So I think it's about trying to create a lingua franca between all of these very different people and very different ways of using a tool. What I may call a zoom, you may call a dolly shot, et cetera, et cetera. So that feels like a training data challenge.
Starting point is 00:18:01 Yeah, I think it's about trying to figure out how and, yeah, exactly, what to train on. Yeah. So tell me, what was the interface like? Was it a chat box? Did you have, like, just tell me about what it actually looked like. Sure. There's limitations of what I can say about things like that. But I think the way that I've described it to people without giving too much away is I think
Starting point is 00:18:34 if you're familiar with using something like the Adobe suite, I think that there's some commonalities. Whether you're using After Effects or Premiere or whatever, Illustrator, there's like commonalities, and if you can use one, you can sort of futz your way around the others. I would say it's very similar like that with OpenAI's tools and models, that if you are used to things like ChatGPT and DALL-E and those types of models, I think you will find an ease of use in using Sora. So within that article, they mentioned that there was like a 300 to one shooting ratio, which, correct me if I'm wrong, means like 300 seconds of material for each second of usable material. How does that compare to conventional filmmaking in your experience?
Starting point is 00:19:35 It would be even more seconds than that. I would say just 300 shots at probably 10 to 20 seconds apiece. So whatever the math is on that. I would say that that's pretty common with shooting. You know, when you are shooting a fiction film, or like even a documentary is even crazier for that, you shoot all day and all day. We shot a documentary recently, and I actually had to go back and watch all the dailies. We counted about 90 hours of footage that we had, and from that 90 hours, you're making an hour and a half movie. So, you know, you are really trimming things down. And I think also it's like you are getting the five seconds that work or the, you know, the section of that shot that works. And I would say that's pretty common to filmmaking.
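"Whatever the math is on that" can be sketched quickly. The figures below come from the episode: 300 shots at 10 to 20 seconds apiece, a finished film of a minute and 19 seconds, and a documentary cut from about 90 hours of footage down to an hour and a half. Treating every generated second as "footage shot" is an assumption made here so the two ratios can be compared.

```python
# Shooting ratio: seconds of raw footage per second of finished film.
# ASSUMPTION: every generated second counts as "footage shot", as on a real set.


def shooting_ratio(footage_seconds: float, final_seconds: float) -> float:
    """Return how many seconds of footage exist per second of final cut."""
    return footage_seconds / final_seconds


# Documentary example from the interview: 90 hours cut to 1.5 hours.
documentary = shooting_ratio(90 * 3600, 1.5 * 3600)

# Sora example: 300 shots at 10 to 20 seconds each, for a 79-second film.
sora_low = shooting_ratio(300 * 10, 79)
sora_high = shooting_ratio(300 * 20, 79)

print(f"Documentary: {documentary:.0f}:1")
print(f"Sora short: {sora_low:.0f}:1 to {sora_high:.0f}:1")
```

By this measure the raw-footage ratio is in the same range as the documentary Woodman describes. What the ratio does not capture is the 10 to 20 minute wait before each generated shot can even be reviewed.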
Starting point is 00:20:36 How about narrative filmmaking? Because I know documentary you have a lot of stuff, but I'm just wondering what the burden of selection is like compared to the amount of shots you take in just a regular movie or regular short film even? Again, I would say at least I can only speak for the way that I shoot films, you know, if you had John... Oh, it's subjective. It's subjective for sure. If you're David Fincher, you're shooting 800 takes of like someone picking up a pencil or
Starting point is 00:21:04 Stanley Kubrick, you know, is like famous for a thousand takes. I would say that the burn rate was very similar. I would say that the challenges with Sora are, like, it's unbelievable at making these images that are unbelievable and so interesting to look at. But at its current state, it can sometimes be difficult to do things that in traditional shooting would be much easier, where you say, hey, can that guy go over here? Or can that person move from one side of the screen to the other? Things like that are more difficult.
Starting point is 00:21:48 But again, this is baby steps. We are in like the toddler phase. So I assume that those things will get better. So you mentioned, well, Shy Kids mentioned in the interview that by default it tries to prevent you from creating videos that violate copyright law, existing copyrights. Did you accidentally bump into this regularly, or was this something that just didn't really bother you? No, you couldn't generate things that... So when I was mentioning like a Hitchcock zoom,
Starting point is 00:22:20 you couldn't mention Hitchcock. So you had to find a different way to describe that as opposed to like using public figures. Anything that would have a public figure or a title, you would not be allowed to generate. From my experience, there weren't too many logos or brands or anything like that in any of the things that I generated and
Starting point is 00:22:45 but something copyrighted? Did you generate anything that looked copyrighted? No, not to my eye. That's fine. So, well, I know you don't know how much Sora will cost, and we don't even know when it will launch. Can you talk about how much you'd be willing to pay for it? What do you think it's worth? And I realize that this is a vague question. For sure. I think that there is this illusion that Sora will be this solution to all problems, and I don't think that that is the case. I think Sora is a tool amongst many tools, and for certain things it will be very valuable. And so in terms of value, it's like, well, how much is a glass of water?
Starting point is 00:23:39 Well, yes, if a glass of water is just like right now in my kitchen, I wouldn't like to pay that high for it. If a glass of water is for a person in the desert who desperately needs that glass of water, you can really name your price. And I would say that for some projects, I think that the usage of Sora would be absolutely invaluable. And I don't know how much exactly that would be. It would depend on the budget, would depend on the limits and the scales. But I would say that there's other projects where I think it would be like totally inappropriate or like just not worth. Like what? Well, just when I think of Studio Ghibli films that are hand drawn, and I think the reason that those films work is because of the way that they're made.
Starting point is 00:24:31 Or I think that when you think of Aardman Animations, it's like I feel that you could feel the fingerprints in that clay. So I don't think maybe for those types of films that it would be appropriate, but I think for other types of films like Air Head or others, I think it would be extremely appropriate. I think it's up to the artist's sort of discretion how much they think that that tool is needed. But doesn't the inconsistency of shots make this deeply impractical? Because that's the thing I kept coming back to. Yeah, I mean, depends on what project you're working on. And again, I think that this is like early days. I think that these are kinks and bugs that are going to be changed and already from day one where we started using it to where we are today, massive improvements
Starting point is 00:25:30 have happened, and actually improvements where they've listened to things that we have suggested and things that we'd like to see and tools we'd like to see. So I think that, for example, for Air Head, the inconsistency of having a protagonist, having a protagonist that stays true through all these different shots, that's the reason why we put a balloon in front of their head, because while different bodies can sort of be accepted, a different face and a different head is going to be a little bit difficult. And so we turned the limitation into our sort of main attribute.
Starting point is 00:26:13 And I would say that, again, that works for that story. But I don't think that all stories are going to find this valuable. And I also don't think every single shot needs to come from Sora. I think that there's a world where it can be an addition, or it can be the start of a story where instead of just brainstorming and just having a script, you make a sort of moving mood board or a trailer. So I think that there's like tons of stages along the pipeline where it would be extremely valuable and help elucidate concepts and bring them to life.
Starting point is 00:26:58 So, thematic question. So you avoided filming locations and all of this, but you spent a lot of time writing prompts and you were waiting for Sora to generate clips, then upscaling and all that. Do you think you could make Air Head, assuming you could get around the balloon head thing, do you think you could make it quicker in real life than with Sora? Or was Sora kind of essential to get it done in the timeline you did? Because it was like a week and a half, two weeks, I think. Yeah.
Starting point is 00:27:28 I don't know. That's an interesting question. I mean, we definitely wouldn't be able to fly around the world and get the shots at the car race and all of those things. So I think it would probably be shorter. But I think in general, the conversations about time and money are super reductive in a way, in that I think that without Sora, this wouldn't exist. And I think that that is the more interesting conversation. As a director, most directors I know have a folder of unrealized ideas. And I think that my hope is that Sora will allow us to dust off those folders and breathe new life into concepts.
Starting point is 00:28:18 And when people see what those concepts could be, my hope is that it gives a lot more people opportunities to have their ideas illuminated. And whether that means to go and shoot it now traditionally or some hybrid, I think that that to me is what's most exciting. So where do you see Sora going? I know you're considering looking at it as kind of a complementary tool, but do you think that that's its use case or do you think it'll ever do end-to-end filmmaking? I think let a thousand flowers bloom. You know, I think that there is people who are going to just use it for small, complementary things to maybe help with, in the same way we use stock footage now. I think some people are going to use it as a way, say you are from a community that has maybe a little bit of a less established film community. and it's a way to have you compete with the big boys in terms of special effects and usage.
Starting point is 00:29:30 And again, I don't just think it's as easy as bleep, bloop, blop, type in the prompt, here comes the thing, but rather it allows you to just have a really powerful collaborator that can help you make maybe larger concepts and bigger ideas. And then, yeah, I think that there's some people end to end who are going to make things that are completely generated or most of the shots in it are generated or things like that. In general, the thing that feels interesting to me is like helping to deepen humanity. Whereas the more you sort of simplify the process, I think that that is like, I don't know, it's never a simple process.
Starting point is 00:30:21 Any time you hear about something that is going to make it all easy and make all your troubles go away, I'd be very wary of that. I think film is going to always be difficult and a challenge, and I think the benefit of Sora will be to help lead us into new paths and lead us into new directions. If I were to tell you, hey, we made this film called Lord of the Rings, and it uses CGI orcs and it makes massive orc fights, if I told you that in the 1930s, you'd probably gasp. Or if I told you that CGI is going to be a predominant way in which we make films in 2024, I think you would go, ah, that's not real filmmaking. And I don't think...
Starting point is 00:31:09 I think you kind of saw that in the 90s, actually. Yeah. I don't think history is too kind to those people that go, this is not going to work. This is not art. This technology is not the way. I just think it's, it depends on the artist. And it depends what they want to bring to it.
Starting point is 00:31:28 I think that's the key X factor here. One final question. With that all in mind, do you think that Sora is going to hurt filmmakers? Do you think it's going to replace people? I mean, I hope not. I mean, that's my job. So I would very hope not.
Starting point is 00:31:48 No, I very much understand people's fears. And I think that, you know, I'm a student of history. So when I look back in history and the camera obscura comes out, painters are talking about how we aren't going to need painters anymore because now we can capture reality. Why do you need a painter to go and paint it? And it's a very valid point.
Starting point is 00:32:20 but painters didn't go away and then there was this whole new industry called photography and then after photography there was this whole new industry called film and then after film there was this whole new industry called home video and then after home video there was this whole new industry called cell phone video
Starting point is 00:32:39 and then there was this whole new industry called TikToks and Vines. And I just think that when people don't come in contact with things, their immediate, as humans, our immediate reaction is fear, and we're worried about things that are new because we do not yet understand them. And I think that for us, we like to face those things face on. And I think that the other side of that coin is that there's some kid right now in rural Bangladesh who has this amazing big idea and maybe doesn't have all the resources that everyone else has.
Starting point is 00:33:23 And with these types of technologies, it may level the playing field for kids like that to compete with the Avatars of the world, compete with the Marvels of the world. And then I think we're going to all be on this level playing field and what's going to matter is not just who has the highest budgets and who has the most resources, but who has the best stories. And for me, that's the exciting part.
Starting point is 00:33:49 We work with groups of collaborators that we love and respect, and our hope is never, let's work with them less. Our hope is always, let's enrich those relationships and hopefully grow them and hopefully bring more people into our collective and more people into our process. So that's our hope. Maybe I'm utopic. Maybe I'm wrong, but that's the way we're choosing to look at this.
Starting point is 00:36:56 In Woodman's mind, Sora is a tool, an extension of creatives' methods rather than a replacement of filmographers or actors, what have you. And that very much lines up with Sam Altman and OpenAI's sales pitch for Sora. His utopian perspective, his words not mine, is predicated on both film studios acting with integrity, something they've proven to never do, and OpenAI being able to make Sora a significantly better tool, something that's going to require masses more training data and compute
Starting point is 00:37:35 than I think is actually in existence. Paul Trillo, an LA-based artist and filmmaker speaking to Business Insider in April, described Sora as a research project in alpha, mentioning that it was a little confusing who the market was for the service. And I think that gels with another problem that Woodman raised: that what might be a zoom-out shot for you would be a completely different term for someone else, which in turn would require OpenAI
Starting point is 00:38:07 to have both the right training data of a zoom shot and many, many, many of them, to be clear, but they'd need to know the multitudes of different terminologies that go into filmmaking. Now, if they don't give a shit, maybe that's a completely different story. In short, Sora faces both the intractable problems of AI that I've mentioned in the previous episode, Peak AI, go and listen to it, but also a few of its own. Namely, that generating moving images isn't just about ingesting a bunch of footage, but it's about understanding said footage well enough to generate something else based on a multitude of different perspectives, descriptions, and cultural contexts.
Starting point is 00:38:50 I'm not sure that OpenAI, or really most people, realize how complex even the simplest movie is, how much work goes into making a film. And I think that that's actually what excites people about this, because making films can be inefficient, it can be extremely taxing, it can be extremely expensive. But the problem here, and I'll get into the other ones as well, is that Sora is being sold to film studios. That is who Sam Altman is going to. And thus, it's going to be built for people who don't make movies. I'm actually really happy to hear that Shy Kids and other artists are involved,
Starting point is 00:39:29 so it'll actually be tuned to be somewhat useful. But I don't think people realize how gigantic the task is that Sora is going after and how I think it's impossible it can go any further. But I digress. I just don't believe that Sora actually works if you're making a movie. While Pixar movies may take years to render, they've got supercomputers and specialized hardware,
Starting point is 00:39:57 and more importantly, the ability to actually design and move characters in a 3D space. If you are putting something in Sora, what are you designing? If you put a character in this, again, you cannot have consistency between these things. That is a problem across all generative AI. You cannot do that,
Starting point is 00:40:18 unless of course using copyrighted footage, Mr. Altman. But seriously, though, with no consistency across shots, what the hell are you doing? While there are unexpected things that might happen in a 3D animated movie or a CGI situation, you still have complete control over the thing you are putting on there, the thing you are animating, you can make subtle tweaks to it. That doesn't seem to be the case with Sora. You can adjust what's on the screen, but even though this is AI generated, it doesn't have the benefits of regular generative stuff like CGI,
Starting point is 00:40:55 which stands, of course, for a computer generated image, I believe, and if I'm wrong, you're going to yell at me in the emails. But seriously, though, the practical use cases for Sora, they're just kind of not there. Sora's attempts to replace filmmakers, if that is OpenAI's goal, and I really believe it is, they're dead on arrival, because it's an impractical and ineffective solution, and the problems it's solving are really only ones created by Hollywood executives. The AI hype bubble, as I have noted repeatedly, is one entirely reliant on us accepting the idea of what these companies will do rather than interrogating their ability to actually do it.
Starting point is 00:41:37 Sora, much like all generative AI, suffers from an imprecision and an unreliability caused by hallucinations, an unavoidable result of using mathematics to generate things. And the massive power and compute requirements are just prohibitively expensive. If this is going to end up as a VFX tool or a productivity tool or as a fill-in tool, it's going to need to be a lot cheaper than it is to run. Generative AI is already unprofitable. To make Sora any kind of useful,
Starting point is 00:42:11 OpenAI will have to find a way to dramatically increase the precision of the prompts, reduce hallucinations to pretty much nothing, and vastly increase processing power across the board. Sora hasn't even been launched, save for, of course, these hand-picked companies that got to test it, meaning that this 10 to 20-minute wait between generations of moving images, that's likely to increase once people use the product, and that's before you consider how expensive it's going to be to run the bloody thing. This is a significantly more complex model than ChatGPT, which is already unprofitable.
Starting point is 00:42:46 Sam Altman can make money, but can he make profit? I severely bloody doubt it. He hasn't before, and I don't think he's going to in the future. He's still begging Daddy Satya over at Microsoft to give him a supercomputer so his things can fart out things more profitably. It just drives me a little insane. And these things I've talked about, they're intractable problems that OpenAI's failed to solve.
Starting point is 00:43:13 They failed to make a more efficient model for Microsoft last year in 2023. Their Arrakis model. Jesus Christ. And while GPT-5 is meant to be materially better, to quote Mr. Altman, it isn't obvious what better means when GPT-4 performs worse at some tasks than its predecessor. I do believe Sam Altman is telling the truth when he says that the future of AI requires an energy breakthrough. But the thing I think he's leaving out is that it may take an energy breakthrough and indeed more chips for generative AI to approach any level of necessity. And he's hoping that people will buy the hype without asking too many annoying questions like, what does this stuff actually do? Or is this useful? Or does this actually help me? Or will this be around in 10 years?
Starting point is 00:44:00 To be clear, Sam Altman is the single most well-connected and well-funded man in AI with a direct connection to Microsoft, a multi-trillion dollar tech company, and a Rolodex that includes effectively every major founder of the last decade. And he still can't get past any of these problems, partly because he is not technical, and thus can't really solve the problems himself, and partly because the problems he's facing are burdened by the laws of maths and physics. Generative AI hallucinates because it doesn't have a consciousness or any ability to learn or know anything. It's extremely expensive because even the simplest prompts require GPT-4 to run highly complex mathematical equations on graphics processing units that cost upwards of $10,000 apiece. Even if generative AI were cheaper or more efficient or required less power, it would still be a process that generates answers based on the extremely complex process of ingesting an increasingly dwindling amount of training data.
Starting point is 00:45:02 These problems are significantly compounded when you consider the complexity, size and massive legal ramifications of training a model on videos. A problem that nobody has seen fit to push Altman or Murati or anyone else at OpenAI about. That's a mistake, really. Seems like an obvious one. Like, hey man, you need a bunch of training data to train ChatGPT, which does words. How are you getting all these videos? Again, big credit to Joanna Stern, who asked Mira Murati,
Starting point is 00:45:33 CTO of OpenAI, whether Sora was trained on YouTube videos, and then Mira Murati, of course, made that incredible face. Go look up that video. I'll link it in the notes. That's ultimately the problem with the current AI bubble. So much of its success requires us to tolerate and applaud these half-assed, half-finished tools that only sort of kind of do the things they're meant to do, and we're meant to nod and smile and clap and say, great job, Sammy, like we're talking to a bloody child, rather than a startup with $13 billion in funding with a CEO that has the backing of God damn Microsoft. And Sora is the ugliest, messiest problem of them all. Its videos, while superficially impressive, are still deeply, deeply flawed. They take way too
Starting point is 00:46:19 long to generate, a problem that's only going to get worse, and they're just far too inconsistent, which is a problem created by the nature of how generative AI works and its approach to generating things using mathematics. And if it's planning to be a VFX tool, if it's planning to be a sidearm for filmographers, it's going to have to be a lot cheaper than it's really practical to make it. Again, nothing OpenAI makes is profitable. They may make over a billion dollars of revenue,
Starting point is 00:46:51 but everything is burning money. It's just very frustrating. It's all very frustrating. Sora seems kind of cool, but when you take away the cool side and you just look at it for what it is, it's just another con from Sam Altman. It's just another unfinished product
Starting point is 00:47:11 that is not able to fit the task. It's just another thing that you look at and you say, oh, if that was just a bit better, it'd be really good, except in this case it would need to be a lot better. Yeah, all the press writes about it's incredible, it's amazing. And you can separate the technological achievement of using maths to generate a visual moving image. That's genuinely cool. But you've got to stop for a second and say, as cool as this is, the people in the back of the shot, they're molding into each other, it's like The Thing.
Starting point is 00:47:45 It's disgusting. Hey, that monkey's got like five arms. That's weird. I don't know. I just feel like normal people don't get this much leniency. You and I don't get people saying, great job, when we do kind of a shitty job. And if we brought something to someone that was insanely expensive, only really did 10% of the job you needed it to, and also the things it created took forever and looked horrifying, I don't think we'd get told great job. I think we'd be told we'd wasted a lot of money and that someone was quite mad at us. I'm tired of this. I'm tired of these companies announcing these half-completed products
Starting point is 00:48:27 and having the media dance around and act like they've delivered something truly incredible. I'm tired of the public being expected to do the mental and emotional labor for Sam Altman and other AI companies saying it's remarkable that they're even able to do this and assume and give them credit for some inevitable future
Starting point is 00:48:45 where all of these problems are gone despite little proof that such a thing is possible and plenty of proof that it isn't. And as I've suggested, I really don't think it is. I think Sora is dead on arrival. I think it's too expensive, too imprecise, and there is no fixing those problems. You can iterate on them, you can improve them, but without some kind of energy or chips breakthrough, they're not even going to have the compute, or really the money, to build this thing into anything even half-functional. And I'm calling on the press
Starting point is 00:49:21 to push back on these companies. I'm calling on them to refuse to declare this quasi-functional software as complete. I'm tired of seeing the media back these companies and do marketing work for them when they're not done. They don't deserve the credit. And I'm demanding that people like Sam Altman actually change the world before anyone says that they're doing so. Thank you for listening to Better Offline. The editor and composer of the Better Offline theme song is Matt Osowski.
Starting point is 00:50:00 You can check out more of his music and audio projects at mattosowski.com. You can email me at ez@betteroffline.com or visit betteroffline.com to find more podcast links and, of course, my newsletter. I also really recommend you go to chat.wheresyoured.at to visit the Discord and go to r/BetterOffline to check out our Reddit. Thank you so much for listening. Better Offline is a production of Cool Zone Media. For more from Cool Zone Media, visit our website, coolzonemedia.com,
Starting point is 00:50:34 or check us out on the iHeartRadio app, Apple Podcasts, or wherever you get your podcasts.