Endless Thread - Introducing Outside/In: 'The Papyrus and the Volcano'

Episode Date: July 5, 2024

Endless Thread presents an episode from the podcast Outside/In. While digging a well in 1750, a group of workers accidentally discovered an ancient Roman villa containing over a thousand papyrus scrolls. This was a stunning discovery: the only library from antiquity ever found in situ. But the scrolls were blackened and fragile, turned almost to ash by the eruption of Mount Vesuvius. Over the centuries, scholars’ many attempts to unroll the fragile scrolls have mostly been catastrophic. But now, scientists are trying again, this time with the help of Silicon Valley and some of the most advanced technology we’ve got: particle accelerators, CT scanners, and AI. After two thousand years, will we finally be able to read the scrolls?

*****

Reported, produced, and mixed by Justine Paradis
Outside/In host: Nate Hegyi
Edited by Taylor Quimby
Our team also includes Felix Poon
NHPR’s Director of Podcasts is Rebecca Lavoie
Music in this episode came from Silver Maple, Xavy Rusan, bomull, Young Community, Bio Unit, Konrad OldMoney, Chris Zabriskie, and Blue Dot Sessions.
Volcano recordings came from daveincamas on Freesound.org, License Attribution 4.0 and felix.blume on freesound.org, Creative Commons 0.
Outside/In is a production of New Hampshire Public Radio.

Transcript
Starting point is 00:00:00 Support for Endless Thread comes from MathWorks, creator of MATLAB and Simulink software, to design and develop engineered systems, accelerating the pace of discovery in engineering and science. Learn more at mathworks.com. Support for WBUR comes from Is Business Broken?, a podcast from the Mehrotra Institute at Boston University that explores questions like: Why is innovation in healthcare so hard? Is ESG just greenwashing? And of course, is business broken? Listen wherever you get your podcasts. What's up, Endless Threaders? I hope you had a good Fourth of July. You got to grill some things, be they meat-based or veggie-based. We're hitting you with one of our favorite shows, called Outside/In, this week. They've got an episode about artificial intelligence, but it's pretty different from a lot of the stories we're hearing right now about AI, in that it's kind of about the positive promise of AI.
Starting point is 00:01:04 It's not about the scary stuff. It's about the fascinating way in which AI connects to our history. So take a listen to Outside/In, and we'll see you next week. Hey, Nate. Hey, Justine. Our journey today begins in Italy, way back in 1750. We're on the southwestern coast, right on the Mediterranean, and a group of people are busy digging a well.
Starting point is 00:01:33 And while they're digging, their shovels start to hit, not soil, but this. Whoa. They hit what looks to be, is that tile? Yeah, it's like a mosaic. To me, it's almost fractally. It looks like a marble rug. It's beautiful. It's beautiful.
Starting point is 00:01:56 It turned out that this was a patterned marble floor, which had been buried in volcanic ash, because this villa was once part of the ancient town of Herculaneum, just outside of Pompeii. Oh, that Pompeii. The one that was almost completely buried by the eruption of Mount Vesuvius almost 2,000 years ago. People eventually figured out that this villa probably belonged to Julius Caesar's father-in-law. And so it was lavish, complete with dozens of bronze and marble statues and an Olympic swimming pool. But the real treasure was in the library.
Starting point is 00:02:31 Excavators eventually found hundreds of papyrus scrolls, each of them containing invaluable writings on classical philosophy. Philosophy in antiquity means a lot of different things. Ethics, rhetoric. It can mean poetics, it can mean music in this case, it can also mean physics. That's Federica Nicolardi, a papyrologist at the University of Naples. Was that the study of papyrus? Isn't that such a nice word? It's a whole discipline. I'm a papyrologist.
Starting point is 00:03:12 We don't have a ton of writings from antiquity. A lot has been lost to time. And so this collection of scrolls in the villa, it's actually the only surviving library ever discovered from the Roman era. Hundreds of never-before-seen original texts. It's hard to emphasize how big of a deal this is. Each of the book that we have from the library is a unique book. So this texts are all completely new.
Starting point is 00:03:40 We don't have them from medieval manuscripts or any other tradition. So everything we read there is completely a new achievement, a new result. Or they would be, if only the scrolls hadn't been completely carbonized by the scorching temperatures of the volcanic gases. Oh no. So since they were discovered in the 1750s, these 800 or so scrolls have pretty much sat in drawers, in libraries, as people fought the American Revolution, when Abe Lincoln was shot, when your great-grandparents were born and when they died, when the internet was invented, when you applied to college, they were sitting in drawers. Pretty much. Most of them. Together, at least, they likely have the power to not only enrich but potentially transform our understanding of antiquity and ancient philosophy.
Starting point is 00:04:38 Right, but we can't read them. If only we could read them. This is, of course, Outside/In. I'm Nate Hegyi, with our producer Justine Paradis. For centuries, people have tried to read these damaged scrolls. All of them have basically failed, until now. I was like, so can I be happy about it? Is this real, or is there any chance that it is a fake or something? To know I was like among the first people maybe.
Starting point is 00:05:15 To set eyes on these like unread texts, I was almost in tears. What you got now, Vesuvius? What you got now? Take it away, Justine. Let's start with paper. Modern paper is manufactured from tree pulp. Medieval European parchment, which is the stuff of illuminated manuscripts, that's made from animal skins.
Starting point is 00:05:54 But before either, there was papyrus. It's named for the papyrus reed, which grows plentifully in Egypt along the River Nile. To make papyrus, the stem of the reed is sliced into strips and then arranged in a kind of grid, one layer of overlapping strips laid horizontally, and then another layer laid vertically. So each sheet of papyrus is actually two sheets thick. Papyrus is flexible. You can roll it into scrolls. It's also light, far lighter than earlier writing surfaces like stone or clay, and thus it's more portable. The invention of papyrus helped enable the spread of writing, knowledge, and culture across the ancient world.
Starting point is 00:06:38 But papyrus is also fragile. When Mount Vesuvius erupted, hot volcanic gases flooded into the villa in Herculaneum. They scorched and blackened the scrolls. But paradoxically, they also preserved them. If they had not been carbonized and buried, the humidity of the Italian coastline would have destroyed the papyrus centuries ago. Today, these scrolls don't really look like scrolls. They look a bit like the charred wood pulled out of a campfire the next day. Back in the 1750s, when the villa was being excavated, at first, people actually thought they were just lumps of charcoal and threw many of them away.
Starting point is 00:07:29 They look terrible. They're completely blackened, fragile in a way that's almost scary. You can't pick one up without having little pieces of dust flake off. That's Brent Seales. He's not a classical scholar or a papyrologist, but a computer scientist. For decades, he's been on a quest to read the Herculaneum scrolls. When Brent was starting his career, this thing called the internet was just starting to get big. And he was thinking a lot about digitizing library materials. So I was coming at the digital library in the 90s with the idea
Starting point is 00:08:07 that Google had, which is that we wanted to get everything online and be able to index it and find it and use it. But he quickly realized that libraries and museums have a lot of materials which are not at all easy to digitize. And in that category were these things from the ancient world, like manuscripts and ultimately the scrolls that were discovered at Herculaneum. Many of the scrolls found in the Villa dei Papiri are still unopened,
Starting point is 00:08:33 rolled up like a burrito. Unrolling them has not always gone well. There was the museum director who cut them open. There was the Vatican scholar who invented an unrolling machine. And there was the Neapolitan prince who thought he could use mercury to open them. He tried three times and destroyed three scrolls. Instead of physically unrolling a fragile manuscript, Brent wanted to try scanning them, using x-rays and CT scans.
Starting point is 00:09:10 Yeah, I mean, that was exactly what we thought. If we could image something wrapped up, or a book that can't be opened, then we could virtually pull the pages out. Brent calls his technique virtual unwrapping. This is extremely hard to describe, so we'll put a link to a video in the show notes, but I'm going to give it a go. Brent's idea was to X-ray the scroll in extremely high resolution, high enough to create a three-dimensional model that could depict not only the internal spiral of the scroll, but also reveal the ink on its surface. Then he would digitally unroll and flatten out the image. He'd already used these techniques to image a medieval copy of Beowulf,
Starting point is 00:09:53 and to read the En-Gedi scroll, a Hebrew Bible written in the third or fourth century on animal skin, found in an ancient synagogue. But to read the Herculaneum papyri, he'd have to take his technique to the next level. In 2019, Brent scanned two scrolls using a synchrotron, an actual, honest-to-goodness particle accelerator. It was capable of scanning objects at resolutions that are literally microscopic.
Starting point is 00:10:23 At the super high resolution of 8 microns per voxel, which 8 microns is about the size of a red blood cell. A voxel, by the way, is just a 3D pixel. Pixels are 2D. Clearly it's the golden data set. But now there was a new challenge. The synchrotron had produced an absolutely enormous amount of data. These 3D models of the scrolls would take years to virtually unwrap, let alone read.
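To put the 8-micron figure in perspective, here is a rough back-of-the-envelope sketch in Python. Only the 8 microns per voxel comes from the episode. The scroll dimensions (50 mm by 50 mm by 200 mm) and the 2 bytes per voxel are illustrative assumptions, so the result is an order-of-magnitude estimate and not the size of the actual scans.

```python
def voxel_count(width_mm, depth_mm, height_mm, voxel_um=8.0):
    """Number of voxels needed to cover a box at the given voxel size."""
    per_mm = 1000.0 / voxel_um  # voxels along one millimetre
    return (
        round(width_mm * per_mm)
        * round(depth_mm * per_mm)
        * round(height_mm * per_mm)
    )


def raw_size_terabytes(n_voxels, bytes_per_voxel=2):
    """Uncompressed size in terabytes (assumes 2 bytes per voxel)."""
    return n_voxels * bytes_per_voxel / 1e12


# Hypothetical scroll-sized bounding box: 50 mm x 50 mm x 200 mm.
n = voxel_count(50, 50, 200)
print(f"{n:,} voxels, about {raw_size_terabytes(n):.2f} TB uncompressed")
```

Under these assumptions a single scroll-sized volume already runs to roughly a trillion voxels, which is consistent with the point that unwrapping the scans by hand would take years.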
Starting point is 00:10:49 So Brent had his work cut out for him. But then, something happened he wasn't expecting. I got cold called by Nat Friedman, summer of 2022. Didn't really believe it was him. If you're a developer, Nat Friedman is a big deal. He founded GitHub, a platform for developers, particularly in the open source community. He's now an investor, active in Silicon Valley, putting a lot of money into AI.
Starting point is 00:11:20 At this point, with the synchrotron scans, the bottleneck to reading the scrolls was really time and labor. It needed a lot of people giving the problem their attention. So, Nat wondered, why not bring an open-source mindset to the problem? It was Nat who suggested, well, you know, maybe we could just run a contest and the competitors could contribute in ways that, you know, you're taking money. for your research team wouldn't be able to contribute. I went back to Kentucky and talked with my research team,
Starting point is 00:11:53 and it was actually really tricky to decide whether we were going to do this or not. Going open source was a scary idea for Brent. Foundationally, the scientific method is meant to be collaborative. But despite those aims, it's also competitive, and there are incentives to be possessive of your discoveries. Brent was worried that he'd lose control of the project and waste years of work, not just for himself, but also for the PhD students in his lab.
Starting point is 00:12:26 You can imagine that conversation where I come back and say, I'm really happy you're working on your PhD. I'm going to now make you compete with 2,000 competitors worldwide who are going to be working on the same thing as you, so good luck with your thesis defense, you know. But another possible outcome was it would be wonderful. With Nat Friedman and his investing partner Daniel Gross, their contest had the potential to bring hundreds and even thousands of people to focus on the problem. In a way, it was kind of the opposite of academia, which can be bottlenecked by a lack of funding and time.
Starting point is 00:13:03 And eventually we, we being my research team, decided it was absolutely worth the risk. And everything that Nat said and did meant that he was going to be a great collaborator. So we just went ahead and took the risk and said, we want to do this. And things moved really quickly after that. At Radiolab, we love nothing more than nerding out about science, neuroscience, chemistry. But we do also like to get into other kinds of stories. Stories about policing or politics. Country music.
Starting point is 00:13:55 Hockey. Sex. Of bugs. Regardless of whether we're looking at science or not science, we bring a rigorous curiosity to get you the answers. And hopefully make you see the world anew. Radiolab, adventures on the edge of what we think we know. Wherever you get your podcasts.
Starting point is 00:14:11 There is something powerful about the sound of the human voice. Beautifully produced audio has the unique power to connect and inspire. Tell your organization's story with a custom podcast from City Space Productions, the creative studio from WBUR's Business Partnerships Team. Become a thought leader. Recruit new talent. reach new audiences, whatever your goal, we can help. Discover how the magic is made at WBUR.org slash creative studio.
Starting point is 00:14:44 It was the Ides of March, 2023, and Yusuf Mohamed Nader was online. Yusuf is Egyptian, but he was working on his master's degree in Berlin, Germany. That time I had finished the fun part of the masters, I just needed to write it up. He's now part of a research group which uses AI models to study the group behavior of fish and bees. But at that point in the semester, he was looking for a palate cleanser. Yeah, that was like the less fun part of the thesis.
Starting point is 00:15:14 So I like to have like a fun project on the side to work on. He often browsed Kaggle, a website where people post AI puzzles and competitions. And something caught his eye, an announcement of something called the Vesuvius Challenge. Resurrect an ancient library from the ashes of a volcano, the post read. The organizers of the competition had published Brent's high-resolution scans from the synchrotron,
Starting point is 00:15:44 made available to the world for anyone to use. The task was: use machine learning models, or AI, to figure out how to read the Herculaneum papyri. And it listed a grand prize of $700,000. This was different from any other challenge on Kaggle that Yusuf had ever encountered. Everything about it was appealing. Having access to historical data, working on a very hard problem, the aspect of like the papyrus, which is interesting for me as an Egyptian. My wife used to joke, like, okay, it will be very cool if like, you know, your Egyptian DNA sort of kicks in and like solve the papyrus problems.
Starting point is 00:16:30 Meanwhile, in California. Yes, I'm Arefeh Sherafati. Arefeh was working as a postdoc in a lab in San Francisco, working at the intersection of neuroscience and artificial intelligence. Arefeh had spent time in medical settings studying CT scans, the very same technology used to scan the scrolls. So it made sense when a friend reached out about the Vesuvius Challenge and asked her to join his team.
Starting point is 00:16:58 Yeah, it wasn't hard to convince me to work on it. Her friend warned her, this is ambitious. It's going to be like a second job. But like Yusuf... The idea, I mean, there's nothing more interesting to me personally, to think that this is the only library from antiquity that's there, but we can't read it. And it might actually be the right time for us to attempt to read those scrolls they wrote two millennia ago. It's just that mind-boggling. The Vesuvius Challenge was
Starting point is 00:17:38 structured very intentionally. The idea was to bring that open-source approach to the challenge of the papyri. So there were plenty of incentives for competitors to share their data instead of hoard it. For example, there were lots of incremental prizes awarded throughout the process. And that way, if people didn't make it all the way to the Grand Prize, which would take nine months, they might still be rewarded, in cash, for participating even a little bit. And in fact, the very first one was called the Open Source Prize, $2,500, awarded to competitors who built stuff and released it publicly. And whenever a contestant won a prize, under the rules of the competition, they had to share their methods and data with the rest of the community.
Starting point is 00:18:22 People could work alone or they could join up in teams. By the end of the competition, there were 1,249 teams, and a total of over 25,000 submissions for all the prizes. One of the scrolls they'd be working with, if you unrolled it, would be about 13 meters long. It's not meant to be opened vertically, like a medieval hear-ye, hear-ye announcement. Instead, it would have been unfurled horizontally, like columns of a newspaper laid out on a long ticker tape. But again, all rolled up.
Starting point is 00:19:01 That kind of looks like a tree stump if you just slice it, and you have those yearly rings of the tree. But instead of a circle, it's a spiral, and this spiral represents the surface of the papyrus. That's Julian Schilliger. He's another Vesuvius Challenge participant. At the time of the announcement, he was a master's student studying robotics in Switzerland. The first major step of the process was figuring out where the surface of the papyrus actually was,
Starting point is 00:19:31 which meant literally tracing the spiral inside the scan. This is a process which needs to be basically pixel-perfect, or that is, voxel-perfect. Julian got pretty good at this. For the process there, you have this spiral and you somehow have to track it. It's not always very obvious to the human eye either, where this spiral exactly is positioned in this 3D data. And when you trace it over multiple different layers in this tree stump, so to speak, then you can extract the surface and virtually unroll this scroll.
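The tree-stump picture above can be illustrated with a toy example. The sketch below is not the contestants' software: it assumes a single 2D cross-section, an idealized Archimedean spiral with a known center and pitch, and nearest-neighbor sampling. Real scans are 3D, and the spiral has to be traced by hand or by a model. The function names `trace_spiral` and `unroll` are hypothetical.

```python
import numpy as np


def trace_spiral(center, r0, pitch, turns, spacing=1.0):
    """Sample an idealized spiral at roughly uniform arc-length spacing.

    Returns x and y coordinates (in pixels) plus the total path length.
    """
    theta = np.linspace(0.0, 2 * np.pi * turns, int(turns * 2000) + 1)
    r = r0 + pitch * theta / (2 * np.pi)  # radius grows by `pitch` per turn
    x = center[0] + r * np.cos(theta)
    y = center[1] + r * np.sin(theta)
    # Cumulative distance travelled along the spiral.
    s = np.concatenate([[0.0], np.cumsum(np.hypot(np.diff(x), np.diff(y)))])
    s_uniform = np.arange(0.0, s[-1], spacing)
    return np.interp(s_uniform, s, x), np.interp(s_uniform, s, y), s[-1]


def unroll(slice_2d, xs, ys):
    """Read pixel values along the traced path: one row of the flat sheet."""
    rows = np.clip(np.rint(ys).astype(int), 0, slice_2d.shape[0] - 1)
    cols = np.clip(np.rint(xs).astype(int), 0, slice_2d.shape[1] - 1)
    return slice_2d[rows, cols]


# Toy cross-section: paint a spiral "sheet" into an empty image, then unroll it.
image = np.zeros((200, 200))
xs, ys, length = trace_spiral((100.0, 100.0), r0=10.0, pitch=8.0, turns=5)
image[np.rint(ys).astype(int), np.rint(xs).astype(int)] = 1.0
strip = unroll(image, xs, ys)
print(f"unrolled {len(strip)} samples from a path {length:.0f} pixels long")
```

Repeating this for every cross-section and stacking the strips would give a flat image of the sheet, which is why the tracing has to be so precise: a path that drifts onto a neighboring wrap would mix two layers of papyrus.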
Starting point is 00:20:10 That would be the first step. The next major task was identifying the writing on the page, aka ink detection. Here's Arefeh. If you have a CT scan of your body, you can see the contrast. Like, you can see your bone, you can see organs. So you don't need sophisticated machine learning models. The doctor can just look at it. But on most of the scrolls, there's no contrast between the black ink and the black paper.
Starting point is 00:20:37 So instead, they had to look for morphological differences, like density or texture. How a surface of the glass is different from, like, a fabric, they just feel different. they look different. It could be geometrical distances. It could be bumps. The task was to train their AI models to detect that barely detectable ink. They started with small scraps of the scroll that had fallen off already, where writing was much more visible. Contestants would zoom in, sometimes down to the pixel, and label the data. This is ink, this is not ink. Then you give the AI something new and see if it can do that on its own. It was also about making sure that the AI has like this,
Starting point is 00:21:19 room to disagree. So even if I tell, hey, this is, for example, this looks, this looks like a pi, it's like, no, this is a phi and you like miss this part. Having like this ability to kind of argue back and kind of gives like authenticity that the model is not just memorizing and you know, hallucinating. It wasn't long before thousands of people were working on virtually unwrapping the scrolls. For Brent, the computer scientist who'd spent decades on his virtual unwrapping technique, this was a big change. Well, I mean, I have a very robust research group and have had for an academic setting. Six or eight undergraduates who are always interested, a postdoc, several staff members,
Starting point is 00:22:03 three PhD students. It's actually big by academic standards. But I realized that I didn't really know what big was. When Brent logged on to the contest Discord, he might see 500 people online at any given time, talking about the project from Australia, Brazil, China, the US. The vibe on that community Discord, by the way, was pretty raucous and joyful. Oh, yeah, of course. Julian told me, like, there were a lot of memes. If someone did something really cool, the emoji that we would use would be a hot flame or an erupting volcano.
Starting point is 00:22:49 The burrito emoji also saw a lot of use on the Discord. One contestant even restored an old CT scanner to experiment with. And one day, as a joke, scanned a burrito instead of a scroll because they kind of look similar, I guess. The community is vibrant. You always have someone to talk to, to share your new findings. If you're part of this, you don't feel alone. This kind of camaraderie and information sharing led to big leaps,
Starting point is 00:23:26 just as the organizers had hoped. Like when one contestant identified a crackly, textural pattern to the ink, he shared it with the entire competition. He called it a crackle signal. Here's Arefeh. And that was just the breakthrough where everyone needed to be that, oh my God, it seems like there's actually morphological difference that you might actually be able to see with the naked eye. And I got so good at it that I think I can just totally recognize the handwriting of the who wrote this and everything about those letters, basically. By October, seven months in, things were starting to change very quickly.
Starting point is 00:24:16 One of the progress prizes was called the First Letters Prize, awarded to the first person to identify a word. But remember, this was in ancient Greek. There's no spaces between the words, so it was very difficult to know when they'd actually identified a word. I kept going all over the crazy ideas of, okay, maybe the letters start here or here or here and then try to find words. And yet, though he didn't know what the word was until later,
Starting point is 00:24:43 Yusuf was one of two contestants to generate an image in which the same word was readable. And that first word was... They saw the image and they were like, oh, this is the word purple. Purple. So the first word was this porphyras, so it's a form of porphyra in Greek, which means purple. It can be the color or also the material. That's Federica Nicolardi again, the papyrologist we heard from earlier. She was also one of the judges on the Vesuvius Challenge.
Starting point is 00:25:23 In that moment, Federica wasn't wowed by the single word purple, which she explained is neither common nor completely rare, and it didn't necessarily tell her a lot about the subject of the rest of the scroll. But it was still cool, and the word could just as easily have been a filler word, like and or the. But along with the word, Yusuf's model had generated five columns of text,
Starting point is 00:25:48 a huge achievement. It was crazy. I remember the first time Brent showed me an image, I was like, so can I be happy about it? Is this real or is there any chance that it is a fake
Starting point is 00:26:06 or something? And he said yes. After the First Letters Prize, it was a grueling last couple months to the final deadline. To be eligible for the grand prize, contestants would need to submit four passages of 140 characters each. By the end, Yusuf ended up forming a team with two other people, including the other person who had identified the word purple and with Julian, the Swiss robotics student. And Yusuf had developed an iterative approach where he trained a consecutive series of AI models, each one trained on the predictions of the last.
Starting point is 00:26:51 The AI model makes predictions. I take these predictions, clean them up a little bit, give them to a new AI model to train. The new AI model finds more ink. I use this ink to train a new AI model. The newer model finds more ink and just rinse and repeat, maybe more than 15 times between the first letters and the grand prize. The final deadline was midnight on New Year's Eve. They missed a lot of the holidays. We lost a lot of sleep. They set programs to run while they slept. Sometimes in Berlin, Yusuf wouldn't see the sun for days. That's like the deep darkness, but it was like really really fun. I felt like my brain was not stopping. We submitted our results, I think, maybe less than half an hour before the deadline.
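The rinse-and-repeat loop Yusuf describes resembles what machine learning practitioners call self-training or pseudo-labeling. The sketch below is a deliberately tiny stand-in and not his actual pipeline: it assumes two-dimensional toy features and a nearest-centroid classifier in place of a deep network, and it replaces his manual clean-up step with a simple confidence threshold. All function names and the `min_margin` parameter are hypothetical.

```python
import numpy as np


def fit_centroids(X, y):
    """'Train' a model: one mean feature vector per class (0 = no ink, 1 = ink)."""
    return np.stack([X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)])


def predict_with_margin(centroids, X):
    """Predict the nearest class and report how clear-cut each prediction is."""
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    margin = np.abs(dist[:, 0] - dist[:, 1])  # larger means more confident
    return labels, margin


def self_train(X_lab, y_lab, X_unlab, rounds=15, min_margin=1.0):
    """Repeatedly retrain on hand labels plus the model's confident predictions."""
    X_train, y_train = X_lab.copy(), y_lab.copy()
    for _ in range(rounds):
        centroids = fit_centroids(X_train, y_train)
        labels, margin = predict_with_margin(centroids, X_unlab)
        keep = margin >= min_margin  # stand-in for cleaning up predictions
        X_train = np.concatenate([X_lab, X_unlab[keep]])
        y_train = np.concatenate([y_lab, labels[keep]])
    return fit_centroids(X_train, y_train)


# Toy data: four hand-labeled points and 400 unlabeled ones in two clusters.
rng = np.random.default_rng(0)
X_unlab = np.concatenate([rng.normal(0.0, 1.0, (200, 2)),
                          rng.normal(4.0, 1.0, (200, 2))])
X_lab = np.array([[0.0, 0.0], [0.5, -0.5], [4.0, 4.0], [3.5, 4.5]])
y_lab = np.array([0, 0, 1, 1])
model = self_train(X_lab, y_lab, X_unlab)
```

The design choice worth noting is the filter between rounds. Without some clean-up, each new model would also inherit the previous model's mistakes, which is the memorizing-and-hallucinating risk the contestants describe earlier in the episode.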
Starting point is 00:27:47 And just too much adrenaline, I couldn't sleep after that. So it's just really staring. Okay, what just happened? And then they waited. They waited. Arafa was catching up on sleep and reflecting on how much time she'd just spent studying the handwriting of a scribe
Starting point is 00:28:10 who had lived 2,000 years ago. It was really emotional. There was one moment she remembers in particular, looking hard enough and long enough at a string of letters to realize she was looking at the root of the word theory. And I was almost in tears because all my life
Starting point is 00:28:30 I loved math, and I never spoke Greek, but to know I was, like, among the first people, maybe, to set eyes on these, like, unread texts, it connects you to all those people, like, thousands of years ago, who thought about these words
Starting point is 00:28:47 in a philosophy context. It was really, really powerful. Finally, after about a month, they found out the results. Out of 18 grand prize submissions, Arefeh's team were runners-up, and they had won 50 grand. And Yusuf and Julian's team,
Starting point is 00:29:15 they'd won the grand prize, $700,000. We were like really thrilled and almost in disbelief. I remember, like, texting Julian like a couple of days later, and do you realize we won? And it hadn't really sunk in. After 2,000 years, we can finally read the scrolls, tweeted Nat Friedman. So, after all that, what does the scroll say?
Starting point is 00:29:43 In a way, this might sound a little anticlimactic. Papyrologists are still interpreting it. But they do know a couple things. The winning submission included 15 columns of about 160 total in the scroll. They actually don't even know the title yet, because that's usually written at the very end, at the most interior, protected part of the papyrus. But they do know it's a work of Epicurean philosophy.
Starting point is 00:30:13 In this scroll, it talks about sight, taste, knowledge, and music. In fact, the word that appears most often so far is pleasure. So in a way, the work has just begun. There are hundreds more scrolls in the Herculaneum collection. And this virtual unwrapping technique can be applied to a lot more stuff, even to objects like mummies, whose wrappings sometimes had writing on them as well. And as for Brent, the computer scientist who's been developing this technique for decades, he's now joined by thousands of people who are now personally invested in the story of the scrolls.
Starting point is 00:30:58 When he reflects on that old fear of losing control and credit, he tries to put it in perspective. You know, sometimes those instincts are right, But a lot of times they're the exact opposite of what you really want. Sometimes I still think, you know, what happens if, you know, I don't get the credit I think I deserve? And in my life experience, the answer to that is absolutely nothing. It's going to be okay. After everything, it's not about getting credit. It's not about the money, though admittedly all that stuff is nice and obviously makes a big difference in life.
Starting point is 00:31:40 These scrolls are part of a broader human story. Brent says that no one owns the scrolls. They belong to everyone. I've always known that once we start reading reliably three, four, 500 things, nobody's going to keep going back and saying, oh, you remember this guy from 2009, right? I shouldn't even expect that. What they're going to say is, we have a new work.
Starting point is 00:32:04 It's Livy's history of Rome. We never had it before. This is amazing, right? That's the payback. and that's what I want. All right. So, Justine, the Vesuvius Challenge, it's not over yet, right? Right, it is not over.
Starting point is 00:32:27 They've actually announced lots of new prizes, and their community goal for 2024 is to read 90% of four of the Herculaneum scrolls. We're going to put a link in our show notes in case you want to check it out. We'll also share links to pictures of the scrolls and visualizations of the process of virtual unwrapping. Yes, and thank you to everyone who spoke to me for this episode,
Starting point is 00:32:49 I do want to emphasize that we talked to just a few of the competitors, but this was such a group achievement. And just shout out to a couple people we didn't hear from in the episode. Louis Schlessinger, who was runner-up for the Grand Prize alongside Arefeh Sherafati, and Luke Farritor, who also found the word purple, and he was on the Grand Prize-winning team alongside Yusuf and Julian. We've also got another special thanks this week to an eagle-eared listener. George wrote in from Salt Lake City, Utah, to point out an error,
Starting point is 00:33:19 in our recent episode about aluminum. In 1943, when the Allies bombed a German cryolite factory, all but one of the 180 bombers returned safely. But we inadvertently reported the opposite. We erroneously said that all but one was destroyed. Not true. Amazing how a single word can totally flip the meaning of a sentence. I know, right?
Starting point is 00:33:44 That episode has now been corrected. This episode of Outside/In was reported, produced and mixed by you, Justine, and edited by Taylor Quimby. Our staff also includes Felix Poon. NHPR's director of podcasts is Rebecca Lavoie. Music in this episode came from Silver Maple, Xavy Rusan, bomull, Young Community, Bio Unit, Konrad OldMoney, Chris Zabriskie, and Blue Dot Sessions. Outside/In is a production of NHPR.
