The Journal. - The Glitch That Crashed Millions of Computers

Episode Date: July 23, 2024

Last Friday, 8.5 million computers around the world stopped working. All kinds of businesses were impacted, from airlines to banks to hospitals. The cause was a routine update sent out by a software company called CrowdStrike. WSJ’s Robert McMillan explains how the meltdown happened and why Microsoft’s software was especially vulnerable.

Further Reading:
- Blue Screens Everywhere Are Latest Tech Woe for Microsoft
- CrowdStrike Made Its Name Fighting Technology Problems. Now It Has Caused One.

Further Listening:
- The Computer Glitch That Caused Nearly 1,000 Convictions
- Hacking the Hackers

Transcript
Starting point is 00:00:00 Last Friday, 8.5 million computers around the world stopped working. It was a massive outage that stalled all kinds of industries. Many of us woke up this morning and discovered we lost access to mobile banking, maybe even the use of debit and credit cards. Travelers here at LAX stuck dealing with flight cancellations and delays. Non-urgent surgeries postponed at hospitals across the country. Passports couldn't be verified, so real IDs couldn't be processed. It affected banks. It affected UPS. It affected Starbucks. It affected Tesla. It affected the MTA. I talked to our colleague Bob McMillan yesterday afternoon.
Starting point is 00:00:47 There were schools that went out, hospitals, News Corp. Yes, our parent company. I just talked to our IT guy. He's still, this is Monday, he's still trying to fix stuff. Wow. The crash happened because a little-known company made a mistake during a routine software update. It was an update that went horribly, horribly wrong. A lot of people in corporate environments run this product called CrowdStrike Falcon. And CrowdStrike Falcon keeps your computer safe, but it gets these updates about what the bad stuff is all the time.
Starting point is 00:01:26 But for some reason, this update that went out on Friday morning contained data that caused the Falcon software to blow up inside the brains of Windows computers. And once that happened, those computers became very, very difficult to fix. Welcome to The Journal, our show about money, business, and power. I'm Jessica Mendoza. It's Tuesday, July 23rd. Coming up on the show, how one software update caused a global IT meltdown. Well, let's say I'm at a food truck I've never tried before. Am I going to go all in on the loaded taco? No, sir.
Starting point is 00:02:27 I'm keeping it simple. Starting small. That's trading on Kraken. Pick from over 190 assets and start with the 10 bucks in your pocket. Easy. Go to kraken.com and see what crypto can be. Not investment advice. Crypto trading involves risk of loss.
Starting point is 00:02:44 See kraken.com slash legal slash ca dash pru dash disclaimer for info on Kraken's undertaking to register in Canada. So CrowdStrike sent out the faulty update on Friday. When did people start to notice the problem? Well, right away. Because all these computer systems just stopped working. People watching Sky News noticed it because it suddenly went off the air. People in airports, the baggage handling system wasn't working. So it was immediate.
Starting point is 00:03:17 I was actually flying back from Milwaukee from the Republican National Convention, and it was chaos at the airport because there were all these blue screens everywhere that were just showing kind of like recovery or error. Yeah, yeah. What was that blue screen exactly? That's called the blue screen of death. The blue screen of death is a problem specific to Microsoft computers. It shows a blue screen with an error message, and in some cases, a sad-faced emoticon.
Starting point is 00:03:46 But blue screen of death means the computer is not working. It's not going to work until you do something to make it start up again. And usually what happens when you get a blue screen of death is you just reboot, and everything kind of sorts itself out. And what happens when you get a blue screen of death? Well, they call it bricking, right? When your computer becomes as useful as a brick. And so that happened to all kinds of computers all around the world. And what made this so tricky is that in order to fix the problem, you couldn't just reboot it.
Starting point is 00:04:18 You know, often with the blue screen of death, you just start all over and everything works fine. Right. But in this case, you had to physically go to the machine. You had to start it up in a certain way. Then you had to surgically go in and remove a file. So we're talking, I don't know, 20 minutes like every computer, but also you have to physically get to all these computers. So all these people, even today, are showing up at their corporate headquarters saying, like, my computer hasn't worked since Friday.
Starting point is 00:04:48 Could you get it going again for me, please? On Friday, CrowdStrike's chief executive said that the company was working to restore operations for its customers. So tell us about the company at the center of this crash, CrowdStrike. I had never heard of it until this outage happened. What is it? What does it do?
Starting point is 00:05:13 Well, CrowdStrike, they were founded in 2011. So what does that make them? 13 years old. They're a very fast-growing company. They're very well-respected. Also, I think it should be pointed out that CrowdStrike is like an incredibly flashy cybersecurity company. George Kurtz, the CEO of the company, races sports cars. CrowdStrike once sent me a calendar that for every month it had like a cartoon picture of a hacking group. Wow. And they give their hacking groups colorful names like Fancy
Starting point is 00:05:46 Bear and Cozy Bear, and they have Scattered Spider. So they're sort of like a cybersecurity group with a little pizzazz. Yeah, I would say so, for sure. CrowdStrike was founded at a time when hackers were getting better at getting around traditional antivirus software. The company seemed to offer an effective alternative. CrowdStrike came up and they said, we're going to really pay attention to what the hackers are doing. We're going to really focus on understanding the hackers and we're going to create more behavior-based software. We're going to create a new kind of software that's better than traditional antivirus. And their software was better than traditional antivirus, and they were
Starting point is 00:06:25 very, very successful. So they went from like a small startup in 2011 to, they're about a 8,000 person, $73 billion market cap type company right now that's publicly traded. And they were extremely popular in the Fortune 500. They really focused on the big companies and doing sales to satisfy these very large corporate clients. So would you say CrowdStrike came to be known as sort of the premier software for protecting? Yeah, they're considered one of the best cybersecurity companies to go to if you're a large corporation. Yeah, big enough to put up an ad during the Super Bowl, right? Protecting your business from cyber attacks can be unrelenting.
Starting point is 00:07:15 Today's adversaries move fast. CrowdStrike moves fast. I can't think of another cybersecurity company that's done a Super Bowl ad. There might be one, but, and their ads were pretty good too, I gotta say. On Friday, almost as soon as CrowdStrike got wind of the outage, it tried to fix the bug. And the company was able to, just over an hour after the update went out.
Starting point is 00:07:44 But dealing with the aftermath was another thing. They clearly stayed up all night. Because, yeah, we saw George Kurtz, you know, on the Today Show Friday morning. He looked tired. And I want to start with saying we're deeply sorry for the impact that we've caused to customers, to travelers, to anyone affected by this, including our company. So they were very quick to say, like, look, we weren't hacked. This isn't some kind of side... I mean, sure, they wish that they could blame somebody else, but they were very clear that this wasn't somebody taking over our product. They
Starting point is 00:08:16 100% took responsibility for it. They've been a little unclear on the precise nature of the problem. And so even now, we don't have like the 100% crystal clear, precise understanding of how this flaw got introduced, when it got introduced, who introduced it. That flaw affected a lot of computers, but only ones running Microsoft Windows. Why is next. CrowdStrike makes security software. Any company can buy the program and keep computers safe from potential hackers. But the reason last weekend's faulty update impacted only Microsoft computers has to do with something called the kernel. The kernel is like the very, very center of it. It's like the first thing you boot up in it. It's kind of like command central.
Starting point is 00:09:22 Like think of, you know, just the brain is really the best way to think of it. But it's the thing at the very center of all of it that starts up at the beginning, that has control over everything. Apple and Android operating systems restrict software programs' access to a computer's kernel. But Microsoft doesn't. A holdover from the way its programs were originally designed. In the olden times, when Windows was coming up, it was really common to just like allow the software access to the kernel. It could be much more powerful then. So if you had security software, it could do a really much better job of finding bad stuff. It was just like, once you're in the kernel, you're in like this super powerful place,
Starting point is 00:10:10 so you can do anything you want to do. And so it became really, it's a great place for security software, but it's also a really dangerous place if the security software goes wrong. This is why the faulty update was so bad for Microsoft computers. And it's why other computers like Macs weren't affected. This put Microsoft in a position where some company that they have no control over can introduce an issue that can crash eight and a half million of their users, and they can't do anything about it. And so from Microsoft's perspective, this wasn't an oversight or anything. This was just the way that Windows has been designed. It allows these different kinds of software to access the kernel.
Starting point is 00:11:00 I mean, what happened this week was 100% not Microsoft's fault, but they have designed their product in a way that allowed this to happen. What has Microsoft said about the issue? Well, they've tried to help their customers with it, right? So they've published some guidance about how to fix things. They're in a tough position because they didn't cause this, but they're sort of, you know, they're the operating system vendor that was affected by it. So they've tried to be as helpful with their customers as they can be, but there's only so much they can do.
Starting point is 00:11:37 In a blog post, Microsoft said that the outage affected less than 1% of its global footprint. A Microsoft spokesman said the company can't legally wall off its operating system in the same way Apple does, because of an understanding it reached with the European Commission in 2009 following a complaint. So who is at fault? Is it Microsoft or is it CrowdStrike? It's CrowdStrike. CrowdStrike really bungled this. But I think that if Microsoft had really been pushing the envelope on the design of their operating systems and really prioritized security, they could have made some changes to the kernel
Starting point is 00:12:13 that would make this less likely to happen. So the whole question about why they haven't done that, it's a tough one. But, you know, clearly Apple did make this move a few years ago, and Microsoft has not. As of Tuesday, some of the problems caused by this outage haven't been resolved yet. Some companies are still fixing computers, and travelers are still dealing with the aftermath of thousands of canceled flights. And for CrowdStrike, its stock has gone down by 25%. Long-term, in my experience, these kind of problems, you can bounce back from them, but it's really damaged what was a pristine reputation. And they're going to have to make sure that something like this doesn't happen again.
Starting point is 00:13:01 There is a point at which you can really seriously erode your customer's trust if you're making them go through this on a frequent basis. It's also like kind of, it's sort of incredible that after all these years with all this experience and software reliability and building systems, like you can still have like eight and a half million computers go out like that. What does this crash, you know, that brought down so many industries, it left so many travelers stranded, it turned computers into bricks. What does that tell you about the state of our technology? Well, I've been writing about computers and computer security for a long time now.
Starting point is 00:13:49 And I'm always watching to see when an outage or a computer issue transcends inconvenience, right? So this was like incredibly inconvenient and financially costly, but no lives were lost. So, you know, the kind of pessimistic cliche would say this shows that we're more dependent than ever on technology and that when it goes down, it can have wide ranging and very annoying and costly effects. But it's also showing that, you know, we haven't hit that point where it's as catastrophic as like a hurricane, you know. And will we get there? Like, I think we probably will. That's all for today, Tuesday, July 23rd. The Journal is a co-production of Spotify and The Wall Street Journal.
Starting point is 00:14:55 Additional reporting in this episode by Tom Dotan. Thanks for listening. See you tomorrow.
