Today, Explained - Breaking the internet

Episode Date: February 19, 2025

The Trump administration’s effort to purge government websites is accelerating digital decay. It’s a trend that imperils our record of ourselves. This episode was produced by Amanda Lewellyn, edited by Jolie Myers, fact-checked by Laura Bullard, engineered by Patrick Boyd and Andrea Kristinsdottir, and hosted by Sean Rameswaram. Transcript at vox.com/today-explained-podcast. Support Today, Explained by becoming a Vox Member today: http://www.vox.com/members. A photo illustration of the U.S. Agency for International Development (USAID) website. Photo Illustration by Justin Sullivan/Getty Images.

Transcript
Starting point is 00:00:00 President Donald Trump has been back in office for one month and what a year it's been. We've covered a lot of Trump on Today, Explained this past month, from pardons to executive orders to Greenland to Guantanamo to tariffs to MAHA to Elon and Elon and even more Elon. But today we're gonna talk about the websites. DEI would have ruined our country, and now it's dead. I think DEI is dead, so if they want to scrub the websites, that's okay with me. Government webpages are disappearing.
Starting point is 00:00:34 Sometimes they come back, sometimes they don't, and it's part of a greater problem we have online. Some call it digital decay, others call it link rot. Whatever you call it, our internet is disappearing, and we're going to help you understand why it matters and what we can do about it on the show today. Human eggs are only the size of a grain of sand, but the space they can take up in your mind can be gargantuan. Now there are a lot of concerns with some experts saying this procedure really just serves as another way for companies to make money from stoking women's anxieties. Egg freezing has been presented as a kind of girl boss panacea, but what's the reality?
Starting point is 00:01:28 That's this week on Explain It To Me. New episodes every week, wherever you get your podcasts. What's up besties? On this week's episode of Net Worth and Chill, I'm sitting down with Lexi Alford, aka Lexi Limitless, the youngest person to have been to every single country in the world. We chatted about how she turned her passion for travel into breaking a world record and transforming that momentum into a seven-figure business. Listen wherever you get your podcasts or watch on the Your Rich BFF YouTube channel. This is good. They're listening to you. This is it.
Starting point is 00:02:03 Today, Explained. Today, Explained. Sean Rameswaram here with Addie Robertson, senior editor at The Verge, here to tell us about the websites. What is going on with the government's websites? So Trump signed a couple of executive orders, one of which defined officially the idea that there are only two genders, male and female. And another one that ends, quote unquote, diversity, equity, and
Starting point is 00:02:30 inclusion in the government. We will forge a society that is colorblind and merit-based. And so the result here has been that more or less across the government, in addition to the kind of thing that we saw in the first Trump administration, which included purging information about climate change and some other general climate-related issues, we've seen just a massive cut of anything that involves racial equity or
Starting point is 00:03:01 transgender people or really anything that is sort of a subject of Republican culture wars. The CDC is currently scrubbing information from their website right now to be in compliance with a recent executive order. Here are some of the pages that have gone down. The Trump administration has taken away reproductiverights.gov from the federal website. They also have scrubbed federal websites for any search of abortion. Within hours of President Trump's inauguration, the Spanish version of the official White House website disappeared. The website now gives users an Error 404 message. A lot of the stuff initially happened very quietly. Reporters noticed it, people who used the information on these sites, which included
Starting point is 00:03:50 data on the CDC or even transportation statistics, they have ended up uncovering a lot of this. And from there, the way that the Trump administration has mostly addressed it is in response to lawsuits: there were claims that they deleted this data improperly, and there was a court order that required them to put it back up. And they have responded by putting it back up with a big banner that says, we reject this information, we were forced to keep it online, but it violates something like, say, our dictate that there are only two sexes. So we find it unscientific or we find it against our policies.
Starting point is 00:04:27 Any information on this page promoting gender ideology is extremely inaccurate and disconnected from the immutable biological reality that there are two sexes, male and female. Is there presidential precedent for something like this happening? Or is Donald Trump and DOGE and Elon Musk and the gang like the first administration to come in and just start ripping apart websites? First of all, just for context, every time
Starting point is 00:04:52 there is a new presidential administration, there is data that changes, there are priorities, there are new programs or old programs that get retired. So it's not necessarily surprising that some things have changed. But we have, as part of this, seen just a massive and really unprecedented deletion of information, including information that is required for people to do their jobs outside of the White House. And so it's a really huge issue right now. I don't think we've ever seen this kind of scale of data purging, especially of records and scientific research. Obviously, the first Trump administration deleted some data in ways that seemed very ideological, aimed at suppressing information
Starting point is 00:05:44 about climate change. The White House and other federal agencies are also revamping their websites, for instance, scrubbing mentions of climate change. And Trump is blasting. And obviously there have been pages that just disappeared at the end of terms, but that tended to be more about oversight. It tended to be more that there was a changing of the guard and they didn't really know where everything was.
Starting point is 00:06:07 So some websites are disappearing, some websites are disappearing and coming back. Some websites are still up. Is there anyone who has like a full grasp of what exactly is gone forever? There are nonprofit groups and journalists that are working to preserve this information. There were already groups before Trump took office, like the Environmental Data and Governance Initiative that we saw a little of this in Trump's last term. And so there was this effort preemptively to preserve information, which includes not
Starting point is 00:06:39 just web pages, but also just collections of data from groups like the CDC. So there are all of these, not necessarily fragmented, but individual and private efforts. And also one of the really big load-bearing institutions here is the Internet Archive and the Wayback Machine, which has always maintained this project that archived data at the end of every term, but now has become a place where you can go and check and see what's disappeared and has become part of this process of identifying and trying to recover data. Beyond the American people perhaps needing access to some of this information, beyond any number of institutions needing access to this information. It points at a bigger problem we have on our internet right now, right?
Starting point is 00:07:28 Something called link rot? Link rot or digital decay, which is a general phenomenon where web pages either disappear or they move in a way that makes them more difficult to find. And so the internet, which is a series of links that point to information, ends up with all of these little dangling ends and dead links and places where you can no longer find information that someone has referred to, or when you can simply no longer find a record of it at all.
Starting point is 00:08:01 A 2013 Harvard study, for example, found that half the hyperlinks in Supreme Court cases, today's equivalent to footnotes, are broken, a phenomenon known as link rot. Why do web pages disappear? The most obvious case is when a page is just taken down, maybe sometimes because the entire website went under, maybe sometimes because they think that page is no longer valuable. Government agencies remove documents and companies fail and with them the sites they host. Think of GeoCities, Yahoo Video, and more recently the news site Gawker. There are also incidents where just the URL of it, the link that points to that information changes and so it's harder to find. So if you previously linked to it from another web page then
Starting point is 00:08:51 that's just not going to go there anymore. The wonder of it is it's very, very simple. Anybody could go and set up a web server on their computer and make it available to the world. Unfortunately, it's too simple. It's fragile. If something happens to that piece of equipment, that website just, blink, is gone. So you've been covering this issue, Addie, for more than 10 years. Is link rot getting worse online or is it sort of, you know, continuing apace? Link rot has been an issue that people have been identifying in some ways since really the beginning of the internet. But for definitely at least a decade, a really significant proportion of web pages and links have no longer functioned.
Starting point is 00:09:35 I think the latest research was something like 38 percent of web pages that existed in 2013 are no longer available. This is, I think, not necessarily an issue that has suddenly snowballed, but I think we're seeing some unique circumstances now that have added to it. One of them is something like search engine optimization, where Google rewards pages, or at least people think it rewards pages, that regularly refresh or that seem like they are providing new information. And so for instance, CNET, which is a really venerable tech publication, removed a bunch of its older articles because it wanted to
Starting point is 00:10:18 appear in Google search results more highly. And so there was this sense that, okay, it makes people more likely to find current articles, but also just this trove of information disappears. Right, I mean, I think we can all, you know, mourn the loss of, like, our GeoCities home page from 2003. But it's a lot rougher when, like, I don't know, some billionaire buys out an alternative newspaper and just decides one day to shut down its website. Sometimes it's a billionaire that buys something and shuts it down. There are also just more insidious phenomena that I think really kind of speak to the commercialization of the internet and the sort of cannibalization
Starting point is 00:11:06 of anything that can be turned toward profit. So you have old websites that, say, have a name people recognize and then they get resurrected, but they no longer have the old information. They've been filled with AI-generated new articles that can sort of capitalize on this old name as this zombie site. Or you have issues where there's a link that goes dead and somebody tries to kind of hijack that link, and they either contact the website administrator or they find some other way to get that to point to a new page that will then build their own credibility but doesn't provide the original
Starting point is 00:11:44 information. So there are all these cases where archival gives way to profit. That information was useful sometimes because it provided, say, statistics or it provided evidence if you're, say, looking at Wikipedia and there's a dead link that no longer provides the information it used to. And sometimes just because these things are a valuable record of what the Internet used to be and of how people lived, there are a lot of things that at one point would have been written down on paper or in some other medium that's just a hard document and people can look
Starting point is 00:12:19 back on it. But at this point, a huge amount of our culture takes place on the Internet, and the Internet is a very fragile place. Addie Robertson. Read her at theverge.com. When Today, Explained returns, we're heading into the Wayback Machine to hear from the people trying to archive the entire internet one webpage at a time. Support for this show comes from Robinhood. With Robinhood Gold, you can now enjoy the VIP treatment, receiving a 3% IRA match on retirement contributions.
Starting point is 00:13:10 The privileges of the very privileged are no longer exclusive. With Robinhood Gold, your annual IRA contributions are boosted by 3%. Plus, you also get 4% APY on your cash and non-retirement accounts. That's over 8 times the national savings average. The perks of the high net worth are now available for any net worth. The new gold standard is here with Robinhood Gold. To receive your 3% boost on annual IRA contributions,
Starting point is 00:13:32 sign up at robinhood.com slash gold. Investing involves risk, rate subject to change. 3% match requires Robinhood Gold at $5 per month for one year from first match. Must keep funds in IRA for five years. Go to robinhood.com slash boost. Over eight times the national average savings account interest rate claim is based on data from the FDIC as of November 18th, 2024. Robinhood Financial LLC, member SIPC. Gold membership is offered by Robinhood Gold LLC.
Starting point is 00:14:01 Support for Today, Explained comes from Hydrow. Maybe you kicked off the week strong, hitting the gym on Monday with every intention of getting the rest of the week in. But then life happened, you know, your friends called you over, there was a game, there was a movie, there was a rough day of news and you needed to come home and lie on the floor for a while. Anyway, the mental back and forth about working out turned out to be more exhausting than the workout itself. With the Hydrow Rower, they say,
Starting point is 00:14:29 you can get a full body workout in just 20 minutes, no overthinking required. You can stick to the plan and get a full body workout, all from the comfort of your home with Hydrow. Head over to hydrow.com and use the code EXPLAINED to save up to $475 off your Hydrow Pro Rower. That's H-Y-D-R-O-W.com. Code EXPLAINED to save up to $475.
Starting point is 00:15:00 Hydrow.com. Code EXPLAINED. Hello, podcast listeners. I'm Sean Rameswaram here from the Today, Explained show, and I've got some news you can use. We're taking Vox Media podcasts on the road and heading back to Austin, Texas for the South by Southwest Festival. March 8th through 10th, we'll be doing special live episodes of hit shows, including our show, Today, Explained, Where Should We Begin with Esther Perel,
Starting point is 00:15:33 Pivot, A Touch More with Sue Bird and Megan Rapinoe, Not Just Football with Cam Heyward, and more, presented by Smartsheet. The Vox Media Podcast stage at South by Southwest is open to all South by Southwest badge holders. I'll be the guy in a Mr. T costume. We hope to see you at the Austin Convention Center soon. You can visit voxmedia.com slash S-X-S-W to learn more.
Starting point is 00:15:57 That's voxmedia.com slash S-X-S-W. This is Today Explained. So let's just have you start by saying your name and what it is you do. Sure. Hi. My name is Mark Graham and I am the director of the Wayback Machine at the Internet Archive. Which is a not-for-profit that has been preserving the web since 1996. Journalists use it all the time, but for the uninitiated, I asked Mark to show us around the Internet Archive.
Starting point is 00:16:38 Where do I begin? It's like walking into a very large library and saying, show me your favorite book. Well, for example, last year, it was a big news story that MTV News was shut down. And the founding editor of MTV News wrote about it on LinkedIn. And there were a lot of other editors talking about it. It was like, oh my God, all of our articles are gone. They're missing. And I just casually, you know, waded into the conversation
Starting point is 00:17:04 and go, hi, check here, Wayback Machine. And they were like, oh my God, you guys like got it all, pretty much. Yeah. And they said, well, you know, people say, well, what did you do? I mean, did you, what did you do when it went down? You must have, I say, we didn't do anything when it went down because we've been doing our job all along. We've been working to archive the public web as it's published on an ongoing and continuous basis.
Starting point is 00:17:32 So if we have to start paying attention to something after it's gone down, that means we screwed up. So with that example, with MTV News, give us a sense of what you guys were doing in advance of that website going down to make sure that people could find out, you know, I don't know, what Everlast was singing about in 2004. Hello, Jancee Dunn here, and joining me now is former House of Pain member Everlast. Welcome, Everlast.
Starting point is 00:18:00 Thank you. So for any one of a number of thousands of reasons, we set our web crawlers and archiving software out on a mission every day to identify and to download web pages and related web-based resources. We bring in millions and millions of URLs every day that are signals to us, signals of where new material is being published on the web. And we make sure that we archive all of those URLs, all the web pages associated with those URLs.
Starting point is 00:18:39 And then we look at those pages and we identify links to other pages. And then we go to those pages and we identify links to other pages. And then we go to those pages and we archive them, etc., etc., etc. That's where you get this metaphor of crawling like a spider throughout this web. And the net result of it is that we add more than a billion archived URLs to the Wayback Machine every day. And this material, as it's added to the Wayback Machine, is indexed and is immediately available
Starting point is 00:19:09 to people who go to web.archive.org, enter in a URL, and then are able to see a history of archives that we have of the web page that was available from the URL at any given time. I want to talk about government websites now because that's sort of the reason we're having this conversation today. I think most people probably think the government will take care of archiving government websites,
Starting point is 00:19:41 but here we are in a new administration and websites are disappearing, coming back online, and people are worried. When you, an archivist of the internet, see government websites disappearing, coming back online, becoming unreliable, how do you react to that? Is that better or worse than regular websites that are non-governmental going offline? Well, as an American, my tax dollars help pay for some of this stuff. And then much of it is maybe a benefit to people. Certainly, my first reaction is, hmm, that might not
Starting point is 00:20:21 be such a good thing. I do want to underscore that there is the National Archives and Records Administration that does do archiving as well. But for whatever reason, we seem to be like one of the main players in the space of trying to archive much of the public web, including, and right now especially, U.S. government websites, and making those archives available in near real time. Hmm. Were you caught off guard when you saw the new administration removing web pages, removing
Starting point is 00:20:56 websites? This is pretty normal in some respects. It's normal and expected, and it's what's happened, frankly, for each administration in the time that we've been working on this effort. I mean, look, it's under new management, right? For example, I wouldn't expect the WhiteHouse.gov website under any new presidential administration to be the same as it was before. So we go out of our way to try to anticipate the frequency in which web pages
Starting point is 00:21:27 should be archived so that we got a pretty good shot at getting those changes. You're saying, you know, the WhiteHouse.gov site obviously changes administration to administration. I think to some degree, people understand that, that Joe Biden's administration probably wouldn't have been posting trolly valentines about immigration, you know, a year ago, this time, to their Instagram account. But what we're seeing here is websites that people need, websites that record public health information going offline, briefly, permanently, what have you. No, that's true.
Starting point is 00:22:07 Is that a different degree of sort of erasing the historical record or messing with the historical record than we've seen? It's different. It's certainly different in terms of the number, seemingly. I mean, we're still in the early stages
Starting point is 00:22:25 of this administration, but yeah, I'd say on the face of it, you're right. Historically, we haven't seen major US government websites taken offline like we did, say for example, with regard to USAID. And I'm gonna leave that kind of analysis to others and really just focus on trying to archive the material. The Wayback Machine, the Internet Archive, mostly funded through donations, the generosity
Starting point is 00:22:58 of people, institutions, even governments. Is that going to be enough to archive the internet to the extent that future generations will want to see and need? Enough is a very subjective term. Well, as an archivist, for me, it's never enough because you don't know, no one knows what is gonna be of use, value, importance in the future, maybe even the near future of tomorrow, much less like the very far off future. And since millions of people use our site on a daily basis,
Starting point is 00:23:41 we get a lot of feedback from them. It motivates us, but it also helps direct us and inspires us to continuously try to do a better job at being the best library that we can be. Godspeed. There you have it. Let me ask you one last question, Mark. You guys have been at this for nearly three decades.
Starting point is 00:24:03 Certainly you've saved a lot of stuff and certainly a lot of stuff has fallen through the cracks. I wonder, is there something that slipped through the cracks that you could tell us about that might suggest to our audience, what is lost when we can't archive to the extent we want to or need to? Okay, so it kind of caught me up with that question.
Starting point is 00:24:28 I'll just say, I don't know right now. I can't say that thing. Gosh, I wish, okay, I got one. I mean, this is just in recent history. Apparently there was a page up on the CDC website about bird flu last week. It apparently was only up for a few minutes and no one got it.
Starting point is 00:24:47 Huh. And by losing that fleeting webpage, that one, you know, maybe minor, maybe major webpage about bird flu on the CDC's website, what are we losing? Well, we're losing part of the story, right? We're losing part of our understanding of the story, right? We're losing part of our understanding of the evolution
Starting point is 00:25:06 of arguably a significant health issue, concern. We don't know where this is going to go. I don't know. I guess that's the other point, right? I mean, you don't know necessarily now that which is going to be very important in the near or longer term. In the time of Martin Luther, there were raging, raging debates.
Starting point is 00:25:29 And much of that debate took the form of things that were written on pamphlets. The pamphlets at the time were considered of little value. I mean, people read them and they shared them, but they didn't necessarily save them. So today, a scholar of that time, or someone like me, who was just kind of strangely curious, what I would give for a collection of those pamphlets. Yeah. I mean, and you are,
Starting point is 00:26:02 you are comparing in a way, a CDC website to the Protestant Reformation, but I think you mean it, don't you? I do, because I don't know. And one really can't know without the benefit of the long historical view. And that's not something that we have access to today. Why? Because we don't have a real time machine.
Starting point is 00:26:34 Mark Graham, known exclusively to Amanda Lewellyn as WebMG. Check out the Wayback Machine at web.archive.org. Amanda produced the show today, Laura Bullard helped and wore the hat, Jolie Myers edited, Andrea Kristinsdottir and Patrick Boyd mixed, and Andrea even made some original music. Thanks to the free state of Aftonia for the wifi. Oh, and it's Today, Explained's seventh birthday today. What did you get us? Maybe show some love in the comments and the ratings and the reviews. They say it helps. And thank you for listening for however long you've been listening. If you're new to the show, feel free to browse the archive.
