Hacked - Deepfaking It

Starting point is 00:00:00 So sometime last year, the coach of the Victory Vipers, a cheerleading squad in Bucks County, Pennsylvania, gets texted a bunch of pictures from an anonymous number. According to the criminal complaint filed this month, the images painted a compromising portrait of three of the cheerleading squad's underage members. In the pictures, the three girls were consuming drugs, they were drinking, they were naked, and these images had been sent to the head coach of the high school cheer team. If the intent of this message, a targeted shaming, wasn't plain enough, shortly after, a different phone number starts sending similarly incriminating images, this time to the girls themselves. Each of these girls opens up their phone to a picture of themselves that could damage their reputation, their high school career, their future prospects, accompanied with a text message reading, you should kill yourself.

Starting point is 00:01:00 And for as messed up a situation as that is to find yourself in, these girls knew something that would set this whole drama on the path that it is on now, not towards their internet shaming and expulsion from the cheer squad, but towards criminal charges against a woman named Raffaela Spone. The girls knew that these pictures are fake. You can make a deep fake right now. It's pretty easy. You can do a really good one on a computer.

Starting point is 00:01:38 You can do a pretty good one on your phone. You can make a deep fake picture where you swap someone's face. You can make a deep fake video of a person who never existed. You can even make deep fakes of a person's voice. And once you start thinking in deep fake, in the editing tools and deep learning and generative neural networks that power all of it, you realize you can kind of construct lies that never would have been possible before.

Starting point is 00:02:04 You can look someone in the eye and pretend to be someone you're not. You can pretend someone did something they didn't. Or, according to those criminal charges, you can cyber bully your daughter's cheerleading rivals, which is what Raffaela Spone allegedly chose to do with it. She was arrested and charged with six counts of harassment, and her first court date is set for the day this comes out. If you've listened to this show before, you know that elaborate hacks and cybercrimes typically at some point come back to social engineering, the human interaction

Starting point is 00:02:37 part of hacking, the part where you trick someone, the part where you lie. And Raffaela Spone might be the first mom to allegedly cyber bully her daughter's rivals with it, but she is one in a long history of people using artificial intelligence-created synthetic media to do some pretty wild stuff on the internet. Her story is provocative and a little bit scandalous, but sophisticated actors armed with deepfakes have done some pretty extraordinary stuff with them. So we wanted to know, what will cybersecurity look like in a world where deepfakes are mature and accessible? What's left to verify an identity when we can fake all of the evidence of an identity? And did you notice anything funny about this introduction? Narrated by me, Jordan.

Starting point is 00:03:27 Just the real normal human Jordan. Trust me, this is deep faking it. Here on Hacked. Hi, I'm DeepFake Jordan. Okay. Am I supposed to respond to it? Nice to meet you. Oh, hey, bud.

Starting point is 00:04:02 How you doing, little guy? Would you like to host Hacked with me instead? Oh, host it with my best neural net, bud? Of course. Please. Please, please, Scott. I want to host Hacked. Of course.

Starting point is 00:04:13 Anything for you, Deepfake Jordan. Isn't that just the most unsettling thing you've ever heard? So that was 10 minutes of training data in a free trial of a piece of software we're going to talk about a little bit later. But I found it pretty wild how easily I was able to create an even, like, okay version of my own voice that I could type in. And it kind of got me thinking that there's sort of like a trust hierarchy.

Starting point is 00:04:38 And I think at the bottom you have text and then you have images and then you have audiovisual. Like text, if I get an email, I don't trust it at all. It can tell me I won a million bucks and I scoff and I throw it away. And a photo, I also don't trust even a little bit because I know about Photoshop. But everything above that, audiovisual, I trust in just a totally different way because in my head, faking that is still very time consuming and expensive. Intellectually, I know about deepfakes, but I don't feel it yet. That term deepfake was coined in 2017. So for most of your history in cybersecurity, this specific technology wasn't really on the table, was it? No, definitely not. This is definitely, you know, more of a contemporary tool, you know, for social

Starting point is 00:05:26 engineering. Like the, I think if you were to look back, you know, through the 80s and 90s, you would get people doctoring VHS videotapes for evidence and stuff like that. And there were people that specialized in, like, you know, inspecting tape to see if tape, you know, literally like audio and videotape, had been modified or messed with in some way. And now in the digital age it's, you know, a totally different beast. Yeah, I feel like I remember seeing, like, CSI-type shows in the mid-aughts, and it was really going in and looking for, like, seams in audio, like artifacts of linear editing. And that seems so antiquated now. Well, like, you know, we are essentially professional audio editors at this point, and I can tell you, if you pop open Adobe Audition, there's not really a lot you can't do.

Starting point is 00:06:15 Yeah, Adobe comes up in this story a little bit later. Oh, I'm down to talk about a bit of Adobe Voco. So that term, first used in 2017, by a Reddit user of the same name. They do a post, and they were using a desktop open source face-swapping

Starting point is 00:06:31 technology to create videos that they called deepfakes. And I'm curious if you, with even a basic knowledge of the history of where new technology tends to kind of bloom online, can guess what kind of video that was? Porn.

Starting point is 00:06:47 It was porn. Of course it's porn. Yeah. And that's an ethical subject, I think, outside the bounds of our discussion. I think we've touched on this in other episodes where it's like if you were to literally just sit back and make a tree of all technical innovations, you know, at the top of that tree, you're going to find pornography, you know, war and maybe one or two other things that like have driven most technical innovations.

Starting point is 00:07:11 It's a porn tree. It's like porn, porn war, and existentialism. Everything links back to. And since that Reddit user coined that term, I think we've all just watched about a million videos of like X actors face swapped out for Y actor or person X saying this thing they never said before

Starting point is 00:07:32 and it's actually an impersonator with deepfakes layered on top. There's that Tom Cruise one that's been blowing up lately. Mm-hmm. Mm-hmm. And I think it's worth going into how deepfakes work before we talk about how they fit into hacking and cybercrime. Just get the jargon out of the way right up front. Sure. So the two terms that you hear coming up a lot are GAN and VAE.

Starting point is 00:07:50 VAE stands for variational autoencoder. So let's say you're making like a Matt Damon deepfake. What a VAE does is it's trained to encode an image down into this really low dimensional representation and then decode those representations back onto images. So what that means in practice is that you have two of these deep learning algorithms running: one trained on a huge diversity of different faces so that it can encode and decode the random person who's going to be wearing Matt Damon's face, and another one trained on just the celebrity. And then you just wire them together.

Starting point is 00:08:25 You encode Matt Damon's face and you decode it onto the face of the random person. I think that is the basic, basic mechanism of how this works. That makes sense to me. You know, you've got one algorithm that you've essentially trained to identify motions and movements on the random person, and then you just match that up with something that you've trained for the

Starting point is 00:08:49 specific subject that you're trying to fake. You know, you connect those dots and allow the computer to take over. That makes total sense. And then on the other side of it, I think in terms of how you make them really good

Starting point is 00:09:06 is the GAN. And you read about this one a lot. So making a good deep fake, and potentially detecting deep fakes, involves using these generative adversarial networks, which I can't find a research paper that's high level enough to explain how they interact with the VAEs, but dumb enough to explain it to me. But the basic idea is that with the GAN, two different machine learning models are essentially duking it out. One is trained to detect forgeries, and the other one is attempting to create the forgery. And the forger creates a fake until the detection machine learning model can't

Starting point is 00:09:40 detect it anymore. So the more good fakes you have training one, the better it's going to be at creating a deep fake that is undetectable by the best detection technology. Does that make sense? Yeah, for somebody who didn't know how to explain it and couldn't find something high enough level, I think that you did a pretty damn good job. You've essentially told an algorithm to keep going until another algorithm is satisfied. And the algorithm that needs to be satisfied is the one that's like specifically trained to detect faking. So the other one just keeps going until it makes the other guy happy. And essentially it just keeps trying things and variations and tuning, you know,

Starting point is 00:10:18 millions of tiny little numbers and things to allow for a curve fit to happen in an n-dimensional space. And, you know, until it satisfies the other one, it just keeps going and going and going. So essentially, you know, it's, I don't know what you call it. Like it's self-testing itself. Totally. Yeah, you're letting it like try and run this lie. Until it believes itself. Until it believes itself.
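The forger-versus-detector loop described above can be sketched as a toy. This is not a neural-network GAN: there are no gradients and no networks, just a one-number "forger" that hill-climbs against a nearest-mean "detector" that is refit every round. All names (`fit_detector`, `fooled_rate`, `train_forger`) and the one-dimensional data are assumptions made for illustration.

```python
import random
import statistics


def fit_detector(real, fake):
    """Return a nearest-mean detector: True means 'this sample looks real'."""
    mu_real = statistics.mean(real)
    mu_fake = statistics.mean(fake)
    return lambda x: abs(x - mu_real) <= abs(x - mu_fake)


def fooled_rate(detector, fake):
    """Fraction of forged samples the detector accepts as real."""
    return sum(1 for x in fake if detector(x)) / len(fake)


def train_forger(real, noise, step=0.25, target=0.45, max_rounds=200):
    """Nudge the forger's single parameter until the detector is near chance."""
    shift = 0.0
    for round_no in range(max_rounds):
        fake = [z + shift for z in noise]
        detector = fit_detector(real, fake)  # detector retrains on the latest fakes
        if fooled_rate(detector, fake) >= target:
            return shift, round_no
        # The forger tries a small change in each direction and keeps
        # whichever one fools the current detector more often.
        shift = max(
            (shift - step, shift + step),
            key=lambda s: fooled_rate(detector, [z + s for z in noise]),
        )
    return shift, max_rounds


# Hypothetical data: "real" samples cluster around 4.0, the forger starts at 0.0.
rng = random.Random(0)
real_samples = [rng.gauss(4.0, 1.0) for _ in range(2000)]
noise_samples = [rng.gauss(0.0, 1.0) for _ in range(2000)]
final_shift, rounds_used = train_forger(real_samples, noise_samples)
print(f"forger settled at shift={final_shift:.2f} after {rounds_used} rounds")
```

The loop stops once the detector is fooled nearly half the time, which is the "keep going until the other algorithm is satisfied" idea in miniature. A real GAN replaces both pieces with neural networks trained by gradient descent.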

Starting point is 00:10:43 Yeah. And if you swap out facial recognition for voice recognition, you can run the same basic process for audio. So all that jargon aside, deepfakes really just come down to how much data you can heave into this thing. How many pictures of the person's face, how many samples of their voice. I don't want this episode to just be examples of deepfakes,

Starting point is 00:11:00 but I remember they were able to feed just reams of Joe Rogan's voice into one of these things because he's probably the most recorded human being on the planet right now. And they made a pretty good one. Friends, I've got something new to tell all of you. I've decided to sponsor a hockey team made up entirely of chimps. Yeah, I guess when you've got, what, three hours a day of him rambling on his podcast, you could probably,

Starting point is 00:11:26 and not to mention all of the episodes of Fear Factor, all of the UFC things. You could probably get like a, you know, a good couple thousand hours of Joe Rogan just audio without any issues. You could even probably tune it so that you could do like young Joe Rogan and then like kind of middle age Joe Rogan. Sure you got a knob, a dial, you can move around. Yeah, exactly. So as you've always explained, most hacks tend to rely on some kind of social engineering. And I'm curious, just before we dive into it,

Starting point is 00:12:01 what application do you think that deepfakes have in social engineering? Well, I think, like, you, I think, you know, when we started talking about making this episode, you know, based, you know, substantially on the intro story because it's just so funny. It's scary, but funny. Is, you know, bypassing checks is like a big part of getting access to things that you shouldn't have access to. So, you know, proving that you're something you're not or somebody that you're not. And it's like, you know, Photoshop does it for photos. You know, we've got deepfakes now doing it in video, which is like insane. You know, Adobe Voco and a ton of voice

Starting point is 00:12:42 things that do it for audio and it's like we're really running out of ways to verify people except for like seeing them in person and testing their DNA or something like that you know like we're running out of simple things you know we've gone to two-factor authentication by being like do you have the same phone number, question mark. It's like I guess that's kind of an alternative way to verify who you are. All of these checks and balances that we've put in society kind of depend on identity verification and a lot of identity verification can be faked at this point, I guess would

Starting point is 00:13:19 be at the end of a long ramble, what I'm trying to say. In 2019, the CEO of a UK-based energy company gets a phone call from his boss, an executive with the firm's German parent company, asking him to send funds to their Hungarian supplier. So the CEO is on the phone with his boss. It's really, really urgent. If the funds aren't transferred within the hour, the project isn't going to be completed on time. So they send the money. After the $243,000 transfer went through, the German boss calls back.

Starting point is 00:13:52 And this time, the number is coming from Austria. And they wanted the UK CEO to send another payment. And by the time all of that kind of piles up into one really big red flag, it was too late. The first transfer had already gone through. It was gone, bouncing first to a bank in Mexico and off kind of into the financial dark. And it turns out that the voice on the other end of that call, the German, you know, lilt, was a deep fake. The insurance firm that covered this told the Wall Street Journal that this was, I think, the first case of AI being used in a hack like this that they'd ever heard of.

Starting point is 00:14:25 And I guess I'm curious, with that story in mind, knowing that this tech is out there, how would you verify someone's identity remotely? You get that call. How do you check? I call them back, I guess. I'm just trying to think because this is all collateral damage to the fact that we've all been pushed so remote. So it's like, yeah, come on by the branch. It's like, well, I'm not allowed to. It's like, okay. So it's, you know, I think the, you know, I think a traditional way to do that would be to call back on a line so that you know that you're getting them, if you have their actual number. But then we're going back to something else that you can intercept and hack. But, like, you know, I think that's probably step one is call them back on their personal cell number.
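The call-back check described here can be written down as a small policy. This is a minimal sketch under stated assumptions: `PaymentRequest`, `may_release_funds`, the directory of numbers on file, and the phone numbers are all hypothetical, and the code only captures the rule "confirm on a number you already had, never on the number that called you."

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PaymentRequest:
    claimed_sender: str  # who the caller says they are
    inbound_number: str  # number the request arrived from (attacker-controlled)
    amount: float


def may_release_funds(
    directory: Mapping[str, str],
    request: PaymentRequest,
    confirmed_via: Optional[str],
) -> bool:
    """Allow the transfer only if it was confirmed on the number already on file.

    `confirmed_via` is the number the employee dialed themselves and got a
    confirmation on, or None if nobody confirmed. The inbound number and any
    claimed urgency are deliberately ignored.
    """
    on_file = directory.get(request.claimed_sender)
    if on_file is None:
        return False  # no trusted number to call back, so no transfer
    return confirmed_via == on_file


# Hypothetical directory and request, loosely modeled on the story above.
numbers_on_file = {"parent-company-executive": "+49-000-0001"}
request = PaymentRequest("parent-company-executive", "+43-000-0999", 243_000.0)

print(may_release_funds(numbers_on_file, request, confirmed_via=None))
print(may_release_funds(numbers_on_file, request, confirmed_via="+49-000-0001"))
```

As the hosts point out, this only moves the trust to the phone line itself, which can also be intercepted, so it is a first step, not a guarantee.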

Starting point is 00:15:12 And if they don't pick up and, you know, say, yeah, that was me. Then, you know, maybe you call the police. But that is a, that is a tough one, especially if things are done in crisis, like in temporal crisis, where it's like, this money needs to get out now or else we're going to have adverse reactions and effects. You know, people skip checks and balances when things need to get done. in a hurry. So yeah, I don't know if there's an easy way to bypass or, you know, add a check and balance into somebody faking someone's voice. Like imagine your boss called you today and said, you know, this is what's happening. You need to do this, this, this and this. And you were like,

Starting point is 00:15:51 okay. Like, would you question it? Would you pick up the phone and dial them back? Would you, you know, like, how inherently do you just trust that when you hear someone's voice that it's actually them. And I'd say probably pretty substantially. I'm not that distrusting and I host a podcast about techno crimes. It's just not on my radar yet. It's interesting also that you went to a like practical solution, which is just I need to call you back to make sure that I'm talking to who I think I am. Yeah, I guess you could do it like a like you know, how would you verify that I am me right now, Jordan. You'd ask me a question that only I would know the answer to is probably the easiest. Totally. And it's like, you know, that's a great form of verification, but it's like you have to

Starting point is 00:16:41 pre-define that or understand it. Like it's like, you know, you know, when did we first meet, Jordan? And it's like you should probably only know that. Maybe a few other people do, but it's like, you know, be pretty tough for someone not to. So on the software detection side, using a piece of software to detect this software-based deception, there are some tools available, but the metaphor that came up is that it's sort of a virus/antivirus analog. So Hao Li, an associate professor at the University of Southern California, developed deep fake detection software using visual markers known as soft biometrics, which are like little visual things that are too subtle for

Starting point is 00:17:21 an AI to mimic right now. So say like the way Trump purses his lips before answering a question, that's their example, and I don't like thinking about Trump pursing his lips, or how Elizabeth Warren raises an eyebrow to emphasize a point. And you can train an algorithm to spot these little person-specific movements by studying past footage of them. And the end result was a tool that in 2019 was at least 92% accurate at spotting deepfakes. But even Li says that it's not going to be long until that work is completely useless.
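To make the soft-biometrics idea concrete, here is a toy sketch, not the researchers' actual method. It assumes someone has already measured a couple of person-specific habits from past footage (the feature names and numbers below are invented), builds a per-person profile, and scores a new clip by how far its measurements sit from that profile.

```python
import statistics


def build_profile(past_measurements):
    """Map each feature to (mean, standard deviation) from verified past footage."""
    return {
        name: (statistics.mean(values), statistics.stdev(values))
        for name, values in past_measurements.items()
    }


def suspicion_score(profile, clip):
    """Average absolute z-score of the clip's features against the profile."""
    z_scores = [
        abs(clip[name] - mean) / stdev for name, (mean, stdev) in profile.items()
    ]
    return sum(z_scores) / len(z_scores)


# Hypothetical measurements taken from known-genuine footage of one speaker.
profile = build_profile({
    "lip_purse_seconds": [0.42, 0.40, 0.45, 0.43, 0.41],
    "brow_raises_per_minute": [6, 7, 5, 6, 6],
})

genuine_clip = {"lip_purse_seconds": 0.43, "brow_raises_per_minute": 6}
suspect_clip = {"lip_purse_seconds": 0.20, "brow_raises_per_minute": 1}

print(suspicion_score(profile, genuine_clip))
print(suspicion_score(profile, suspect_clip))
```

A clip whose score is far above what genuine footage produces would be flagged for review. The point made in the conversation still applies: once a faking algorithm learns to reproduce these habits, a check like this stops working.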

Starting point is 00:17:55 They said, quote, at some point, it's not going to be possible to detect AI fakes. So a different approach is going to be needed to kind of resolve this. I think if you just, you know, think about an algorithm that's built to detect fakes, and then we go back five minutes to talking about adversarial networks and literally having an algorithm that is trained to detect fakes as the, you know, as the check and balance to your faking algorithm, you know, I think, you know, really what you're doing is by creating a better algorithm to detect fakes, you're creating a better algorithm to train the algorithm to make fakes. So it's like, you know, we're in a world now where, you know, the solution could also be part

Starting point is 00:18:44 of the problem, I guess. So it's a strange, strange world. And it's kind of where we get into the question of where this deception is pointed. Like, are you lying to one person, the social engineering case study, or are you lying to a bunch of people as in propaganda? So let's take the antivirus metaphor. And let's say that Facebook gets, there's a huge public outcry. There's a bunch of congressional hearings about deep fakes. And Facebook decides they need to become really, really good at detecting these things. They invest a bunch of money in it.

Starting point is 00:19:16 And maybe in the future they roll out some kind of a watermark thing that says, hey, we have been able to confirm that this is a deep fake. I think that A, companies are going to go as long as humanly possible without ever wanting to claim that responsibility because of how hard it would be to do reliably. I think there's, you know, if they're saying this is a deep fake and a deep fake slips through, people are going to hold them culpable. But that propaganda question is sort of different than using a deep fake to like lie to one person. To use it for social engineering as part of a larger hack. You know, it's really great in an informed public fighting misinformation kind of thing, but it doesn't stop me from using the same tech to impersonate a company's CEO and run a con on someone, you know?

Starting point is 00:20:00 Totally. Truthfully, I think you'd probably have, you know, more success as an individual hacker running a con on somebody, using it for a one-off thing, rather than trying to convince everybody of, you know, whatever you feel the need to socially manipulate the world for. Like, I agree with your tendency to think that Facebook's not going to want to be culpable for, you know, being the gatekeeper of what is a deep fake and what's not, you know, for both sides of that coin too, because there's also going to be the issue of, you know, false positives. It's going to flag things as being deep fakes that aren't, and that's going to be as problematic. Like, I don't know

Starting point is 00:20:42 if you've tried to advertise on Facebook lately, but, you know, everything gets rejected, essentially, and you have to appeal everything because they're so worried about having an ad with some form of, you know, controversial content in it. Sure. So, yeah, Facebook's got themselves in a unique pickle there as being the content platform that all of this terrible stuff lives on. Hi, Pedro. This is, and I need your immediate assistance to finalize an urgent business deal. That was audio from a corporate phishing attempt, similar to the one from the story earlier.

Starting point is 00:21:18 And that one didn't work because the mark thought, I think, rightfully that it sounded a little bit suspicious. But this question of deepfakes for social engineering kind of came up at scale in the U.S. surrounding fraudulent unemployment insurance claims, as you mentioned earlier. In a report from the Labor Department's Office of Inspector General, they found that from March through October 2020, some three and a half billion in fraudulent job benefits,

Starting point is 00:21:43 which is only two-thirds of all of the phony claims, were paid out to individuals with Social Security numbers filed in multiple states. 100 million went to more than 13,000 ineligible people who are currently in prison. And what this is, it's not a person who doesn't qualify for unemployment applying for it. This is people pretending to be other people applying on their behalf and then taking the money.
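The red flag named here, one Social Security number filed in multiple states, is simple to check for once claims are pooled. The sketch below is an illustration only: the claim records are invented, the placeholder numbers are deliberately invalid, and `multi_state_ssns` is a hypothetical helper, not any agency's actual process.

```python
from collections import defaultdict


def multi_state_ssns(claims):
    """Return {ssn: sorted list of states} for SSNs filed in more than one state.

    `claims` is an iterable of (ssn, state) pairs.
    """
    states_by_ssn = defaultdict(set)
    for ssn, state in claims:
        states_by_ssn[ssn].add(state)
    return {
        ssn: sorted(states)
        for ssn, states in states_by_ssn.items()
        if len(states) > 1
    }


# Invented claim records with placeholder (invalid) SSNs.
claims = [
    ("000-00-0001", "PA"),
    ("000-00-0002", "OH"),
    ("000-00-0001", "CA"),
    ("000-00-0001", "PA"),  # repeat filing in the same state is not flagged here
    ("000-00-0003", "TX"),
]

print(multi_state_ssns(claims))
```

The hard part in practice is not the grouping but getting separate state systems to share the data at all, which is outside what this sketch covers.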

Starting point is 00:22:06 It's identity theft. And apparently applying for unemployment insurance in the name of like a dead person, a person who's in prison or just someone in a different state is one of the most commonly used tools in an identity thief's toolbox, which makes a lot of sense, right? Like you get mailed a check that the victim was never expecting to receive and they don't find out something's wrong, maybe ever. So they've started using this tool developed by a private company called ID.me. ID.me is a federally certified identity provider, specializing in helping people verify their identity online.

Starting point is 00:22:40 These days, there are lots of criminals out there stealing other people's identities and committing fraud. Our job is to keep the bad guys out. Krebs on Security had a big breakdown of ID.me, and it started out as an e-commerce tool that is now being used to verify identities for unemployment insurance claims. Essentially, they just ask for a lot more information to verify your identity: an image of a driver's license, utility bills, details about a mobile phone service. And when an application doesn't have one or more of the above, or something kind of triggers a red flag, ID.me typically requires a recorded live video chat with the person applying for benefits,

Starting point is 00:23:19 which is where masks, deep fake or otherwise, start to come in. People have been caught wearing Halloween masks to make them look like the person whose identity they're trying to steal. When you look up this story on like cybersecurity forums, the question that emerges in comments is always how long until that mask is a deep fake mask. According to ID.me, a really major driver of these phony jobless claims

Starting point is 00:23:42 comes from social engineering where people have given away personal data in response to like a sweepstakes scam or applying for what they thought was a legitimate job. And when I look at these photos of this person who got caught wearing a rubber mask on this like identity verification call. This is just speculation, but I have to think that at this point, at least some of those billions of dollars in successful fraudulent claims were using

Starting point is 00:24:10 deep fake technology. I have to think this has already been done. I would say guaranteed. You know, we've created a game, essentially. You know, like there's a lot of things in hacking that are essentially games. You know, you're trying to outsmart, outwit, bypass. You know, you're trying to get somewhere you're not supposed to be. You're trying to bypass a check and balance that's not supposed to be bypassable. You're trying to be more clever than the person who set up the checks and balances. And I feel like this is that kind of game. You know, there's definitely a distinct reward at the end, right?

Starting point is 00:24:46 Like if you can fraudulently submit, you know, $3.5 billion in, you know, unemployment insurance claims, you know, that's a substantial amount of money. You know, if you can bypass these checks and balances and figure out a way to get approved and get on these lists, you know, the payments are there and the money's there and, you know, they probably don't have enough people to enforce it. So really what they've created is a race of the algorithms. You know, can their checks and balances algorithms

Starting point is 00:25:18 catch the algorithms that are going to be faking and trying to bypass them? And, you know, we have a, you know, a bit of a war, an arms race, but it's more of an algorithms race. It's a really big, it's a really big carrot. Like, it's just a huge incentive for people to get good at this on both sides. Three and a half billion dollars. Like, that's, you know, tangible to me anyway. And that's one grift. Like, that's a fraction of one thing people can do with this.

Starting point is 00:25:49 Yeah. To go back a pretty long ways. There are records of ancient Romans permanently deleting a person's identity and history by chiseling their name and their little portrait off of a stone record keeping block. People have been editing photos for as long as photos have existed. I think it was Stalin who famously used image editing to scrub people out of history, which is to say that people have always manipulated media. But these specific tools have only existed since 2017.

Starting point is 00:26:19 And they're existing in a media ecosystem that works very, very, very differently. And I think that means they kind of have the potential to be like an order of magnitude more powerful. When we first started talking about this as our subject, you brought up some of the tech that's sort of coming around the bend about what's next. And I want to talk about that now, right after the break.

Starting point is 00:29:19 We're entering an era in which our enemies can make it look like anyone is saying anything at any point in time, even if they would never say those things. So, for instance, they could have me say things like, I don't know, Killmonger was right. So I remember being in our office in like November of 2016. And, you know, we're obviously, you know, an advertising agency and a creative company and we use a lot of Adobe products, and Adobe MAX, one of their big trade shows, was on and they were demoing and showcasing this, you know, beta product that they

Starting point is 00:30:09 were going to be releasing and they were so excited about. Let's hear from Zeyu about Photoshop voiceovers. Introducing Project Voco. Project Voco allows you to edit speech in text. So let's bring it up. And they could just do this automatically, and they were so ready to release it. Like it was in draft form.

Starting point is 00:30:31 It was in beta. And they were like, yeah, this is coming. It's going to be great. And then the world was like, hold up. This is going to break everything. Like if you can get recorded audio of anyone's voice, and it wasn't even a lot of it, you can just become them.

Starting point is 00:30:48 And that's going to be very challenging for tons of things. So please, for the love of God, do not release this. And it still hasn't been released. I remember that same Adobe Creative Cloud event, and there was a section where they had Jordan Peele on stage. And he was just kind of like politely nodding along as they rolled that product out.

Starting point is 00:31:11 And the guy takes a little snippet of Jordan Peele's voice. He just said it like five minutes earlier. So I'll just load this audio piece into Voco. So wait a second, and we have this. And you can see the waveform of what he says and the text of it is right underneath. So as you can see, we have the audio waveform above it

Starting point is 00:31:36 and we have the text under it. And Jordan Peele kind of nods and says, oh, that's cool. And then the guy just changes the text. We can actually type something that's not here. So I... Of what Jordan Peele said, and it buffers, and we see the waveform above changing, and then he hits play.

Starting point is 00:31:56 And I kissed Jordan and my dogs. And this is Jordan Peele responding. You're a witch. You're a demon. They explained it as Photoshop for voiceovers. It's a quote, we've already revolutionized photo. It's time to do the same for voice.

Starting point is 00:32:22 And Peele, I would say, really accurately and quickly summarizes every single comment and think piece that came out after this, your take, my take. He said, quote, if this technology gets into the wrong hands. And then they kind of cut him off and they clamor to clarify that, oh, they're going to be able to detect

Starting point is 00:32:39 when a voice has been faked. Don't worry, don't worry. We actually have researched how to like prevent a forgery. We have like, think about like a watermarking detection. You don't got to worry about it. They're going to watermark it. It's going to be fine.

Starting point is 00:32:56 And kind of like you said, the tenor of that presentation was, oh, this is a product announcement. Like, we're building this and you're going to be able to use it. But that was in 2016, and five years later, that product is still sitting on the shelf. They were pumped about it.

Starting point is 00:33:09 They were pumped about it. They were ready to launch that thing and they thought the world needed it. And the world, like, I'm not going to lie, it would be very useful. 100%. It's just opening Pandora's box. Like, we've never had the ability, at least in 2016,

Starting point is 00:33:23 there was no ability to really deep fake voice that easily, and this was just like an all-in-one pre-packaged tool that came for $19.99 a month or whatever Creative Cloud costs. You're just like, yeah, here, you want to fake someone's voice? You want to call somebody and pretend to be Jordan? Do we want to record a podcast but actually just type the whole thing in and just like have it manufacture our comments? So Adobe Voco, this unreleased thing that they have since said

Starting point is 00:33:47 was a research demo, which I respectfully kind of call bullshit on. I was there, man. I was there, bro. No, no. It's not a research demo if it's the like fifth product in a sequence of products, the rest of which you are rolling out as things people can use. Yeah. And you know what?

Starting point is 00:34:10 Good for them. Like good for them for opening this thing a little bit, letting everyone peek inside, immediately getting so much feedback being like you are about to unleash a demon. And then just being like, we're just going to close this and we're just going to put it back up on the shelf. That's a good thing. I like when big companies do that. You think about a world based on liabilities and based on lawsuits and how many, like Photoshop has already ruined so many people's lives.

Starting point is 00:34:44 Imagine what, you know, doing the same thing to voice would do. Like, it's just, you're just keeping that going. You know, I just can't, I don't know. I'm glad they shelved it, even though it just essentially told the world that it was really easy and totally possible to do. So there's alternatives to it now, but still. Yeah. So on that note, Adobe Voco works a lot like, it seems like it's better than, but it works a lot like the software product called Descript that I used to create the deep fake voice at the top of the show. I just fed it 10 minutes of data, half of what they used to create the deep fakes at the Voco presentation,

Starting point is 00:35:20 and like a third of what they actually recommend. But it still kicked out a voice puppet that I could use to type in my own voice. Think, Jordan. Like, how many podcast episodes do we have? How many hours of footage is just freely available of your voice online? I am a thoroughly recorded human being. Like, if we wanted to just train a voice puppet of you and just, you know, be rid of you, that would be possible.

Starting point is 00:35:46 You could, you could host this show with DeepFake Jordan like it wants. Like DeepFake Jordan wants. Researchers from Imperial College London and Samsung's AI research center in the UK were able to create a deep fake using a single image, a technology that was recently commercialized with an app that lets you sort of animate old photos of relatives. And the morning that we recorded this, someone sent me a tech demo that Nvidia just rolled out of a video call platform that eliminates the need to send high bandwidth video data at all. All it does is it grabs a single frame from the start of the call, and then motion point cloud data in real time, and it creates a real-time deep fake using that one frame and the motion data.
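To put rough numbers on that bandwidth idea, here is a back-of-envelope sketch. Every figure in it is an assumption picked for illustration (the keypoint count, bytes per coordinate, frame rate, the 1 Mbps comparison stream, and the 100 KB reference frame); none of them come from the demo being described.

```python
# Back-of-envelope comparison: sending facial motion points vs. sending video.
# All constants are illustrative assumptions, not figures from any real product.

KEYPOINTS_PER_FRAME = 68         # assumed number of tracked facial points
COORDS_PER_KEYPOINT = 2          # x and y
BYTES_PER_COORD = 4              # one 32-bit float per coordinate
FPS = 30                         # assumed frame rate
VIDEO_BITRATE_BPS = 1_000_000    # assumed compressed video stream (1 Mbps)
REFERENCE_FRAME_BYTES = 100_000  # assumed size of the single frame sent up front


def keypoint_bitrate_bps() -> int:
    """Bits per second needed to stream only the motion points."""
    bytes_per_frame = KEYPOINTS_PER_FRAME * COORDS_PER_KEYPOINT * BYTES_PER_COORD
    return bytes_per_frame * 8 * FPS


def keypoint_call_bytes(seconds: int) -> int:
    """Total bytes for a call: one reference frame plus the motion stream."""
    return REFERENCE_FRAME_BYTES + keypoint_bitrate_bps() * seconds // 8


def video_call_bytes(seconds: int) -> int:
    """Total bytes for the same call sent as ordinary compressed video."""
    return VIDEO_BITRATE_BPS * seconds // 8


if __name__ == "__main__":
    one_minute = 60
    print("motion points:", keypoint_call_bytes(one_minute), "bytes")
    print("video stream: ", video_call_bytes(one_minute), "bytes")
```

Under these made-up numbers the motion stream is a fraction of the video stream, and the gap widens the longer the call runs, because the reference frame is only paid for once.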

Starting point is 00:36:33 So your bandwidth is just that motion data and a single frame. You've got to be kidding. No, it's very, very doable. It's brilliant. Which it's absolutely brilliant, but it also means that I see no way that we can't almost right now

Starting point is 00:36:47 create very, very usable deep fake puppets of real people without their participation. Not just celebrities who have tons of recordings and images of themselves out there in the world, but just anyone you can find a photo and a sample of their voice of. You can turn on your mic, turn on your webcam, and you can pilot like a deep fake avatar of another human being. This is also just going to open up the next Pandora's box of being like, yeah, I don't even have to get ready for a video call anymore. I just literally load in the photo of how I want to look for this call and like train it to

Starting point is 00:37:23 grab the point data on my face and then I'm just done. Like I don't even need to like shower and put on clothes to have this call because it's just going to be all deep faked anyway. Sure. You'll know when someone's fancy because their deep fake like video presence is like a nice studio portrait that they took of themselves. Exactly. They look really good and well lit.

Starting point is 00:37:42 Yeah. Perfect lighting. Perfect everything. And it's, it's tough because I don't, I don't think the answer is ever to like get rid of this tech because as we can see even right there, there's such cool applications for it. Like letting people with no internet connectivity have video calls is a great use of this technology. But holy crap, those are some implications. Seriously. Well, the other thing is, too, is that you're essentially transmitting, like, you know, we talk about recorded audio

Starting point is 00:38:10 where we've got, you know, hours of Jordan Bloemen speaking, you know, saved into WAV files all over the internet. At some point, if I had to start recording my voice calls, I will literally just get facial point data for everybody that I talk to on video chats. And then I could literally just manufacture them if I needed to. Like, you know, once you start transmitting all that information around, not that it's not, you know, capable of getting it from post-processing video footage and stuff, but still, like, man. You brought this up earlier, the kind of moment we're living in, but at a time when the most socially responsible thing you can do is avoid seeing and talking to another person face to face, the ability to

Starting point is 00:38:50 just transform into another person digitally is almost as good as doing it IRL. It's like just this side of like Mystique in X-Men. Totally. Like imagine I could just be you on Zoom. Like I just had to turn on the like Jordan filter. I sounded like you. I looked like you. I was literally, I have a photo of your apartment. You know, anyone that's looking for environmental clues would miss them. I could pretty much do anything. I could call your parents. I could call your, you know, your partner, I could call your friends. I could just do what I want to do. And who's going to be able to tell that it's not you?

Starting point is 00:39:28 The only way that voice deepfakes work is if the deep fake creator can do a real-time transcription of the person's voice, it needs to be able to render text out of the audio. Right. And if you can do that, you can then feed that same text into the, you can decode that into someone else's voice, which means all of the individual parts of, I talk into a microphone, it transcribes it,

Starting point is 00:39:56 and then it replays that in someone else's voice, already exist. It just hasn't been all wired together in that way yet. But the tech is there. Going through the different things that we used to verify ourselves and essentially checking the box next to that they're easily able to be faked now.
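The chain described a moment ago (talk into a microphone, transcribe it, replay the text in someone else's voice) can be sketched as two stages wired together. This is only a toy outline: `make_voice_pipeline`, `fake_transcribe`, and `fake_synthesize` are hypothetical names invented for illustration, and the stubs just pass text through. A real system would need an actual speech-to-text model and a text-to-speech model trained on the target speaker, neither of which is shown here.

```python
from typing import Callable

# Hypothetical stage types: audio bytes in, text out; text in, audio bytes out.
Transcriber = Callable[[bytes], str]
Synthesizer = Callable[[str], bytes]


def make_voice_pipeline(transcribe: Transcriber,
                        synthesize: Synthesizer) -> Callable[[bytes], bytes]:
    """Wire a speech-to-text stage into a text-to-speech stage."""
    def convert(audio_chunk: bytes) -> bytes:
        text = transcribe(audio_chunk)   # step 1: render text out of the audio
        return synthesize(text)          # step 2: speak that text in another voice
    return convert


# Toy stand-ins so the sketch runs without any speech models.
def fake_transcribe(audio_chunk: bytes) -> str:
    # Pretend the "audio" is already text; a real stage would run recognition.
    return audio_chunk.decode("utf-8")


def fake_synthesize(text: str) -> bytes:
    # Pretend to speak; a real stage would return synthesized audio samples.
    return f"[target voice] {text}".encode("utf-8")


pipeline = make_voice_pipeline(fake_transcribe, fake_synthesize)
output = pipeline(b"hello from my microphone")
print(output.decode("utf-8"))
```

Keeping the two stages as separate callables mirrors the point being made: each part already exists on its own, and the only new work is connecting them.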

Starting point is 00:40:16 Back in the day, it was like, you know, photos, photos couldn't be doctored. Oh, they figured out how to do that. And then it's like, okay, that was even pre-digital. And then you went post-digital or into the digital era. And it's like, okay, then we got Photoshop. And it's really easy to doctor photos.

Starting point is 00:40:30 It's like, okay, okay, but we can't fake video. It's like, well, actually, we can fake video. And it's like, okay, okay, but it's still not that good. And you're like, yeah, yeah, it's not that good. Okay, we got deep fakes. Now it's impossible to tell when it's, like, faked. It's like, okay, next. It's like the box, we just keep going down the list.

Starting point is 00:40:47 And it's like, you know, we got fingerprints. Yeah, you can't fake a fingerprint. Yeah, you can't fake a fingerprint. Okay, we scan a fingerprint in, we turn it into, you know, byte code. Okay, byte code is just essentially text or some way of verifying what the fingerprint is. Okay, we can fake that. Okay, that's fakable. What's next?

Starting point is 00:41:00 And it's like, you just keep going. And it's like, okay, there's really nothing left. You know, what is left besides like sitting down across the table from you and being like, Jordan, on the day we met, we were talking about this. What did you say about that? It's like, that is the correct response. You know, as long as nobody heard that, we now have verification. And it's like, you know, there's very little left.

Starting point is 00:41:34 As our ability to do wilder and wilder stuff grows, I'm always fascinated by the things that we choose to do with it. And I think that that really nicely brings us back to sort of our opening story. And I think it brings up a different, similar story from 1991. And that is the story of a woman named Wanda Holloway. And Wanda Holloway was a Texas mother who approached her former brother-in-law looking for help to hire a hitman. Her target was the mother and daughter who Holloway believed were standing in the way of her

Starting point is 00:42:05 daughter being elected cheerleader. And when Holloway decided that she could not afford two whole murders, she just sort of settled for knocking off the mother, reasoning that that death would traumatize the girl so much that she would not be able to compete in cheerleading to the best of her capacity. Which is to say that people's reason for doing really wild shit never really changes. It was a competitive cheer mom then and it is a competitive cheer mom now. But the technology that empowers people definitely changes. And as we said in this episode, like, I don't think that we should try and do away with deep fakes or say that they're not allowed in some way. I don't think

Starting point is 00:42:44 that's how tech works. I don't think it's how tools work. And seeing all the cool stuff that can be done with it, I have no desire to see that stifled. But I do think it's worth looking at those behaviors. What have people done before that they're going to do with this tech in new, more powerful ways? And how can we respond to that preemptively? Catfishing, identity fraud, you know, people have always bullied each other and stolen from each other. We should be thinking about how this tech is going to empower that. So we can respond to it now before it gets too bad. Agreed. Hi, I'm DeepFake Jordan. Thanks for listening, everybody. If you would like to support the show, you can find us on Patreon.

Starting point is 00:43:26 at patreon.com slash hackedpodcast, H-A-C-K-E-D-P-O-D-C-A-S-T. Huge shout out to our new patrons, Blake, Balge, and Kyle. Your support means the world to me, DeepFake Jordan. Thanks for listening.
