99% Invisible - 489- Pandemic Tracking and the Future of Data

Episode Date: May 4, 2022

Data is the lifeblood of public health, and has been since the beginning of the field. But essential data gathering for the COVID pandemic was hindered by a couple of underlying weaknesses in the US public health apparatus. We have a fractured system where the power lies in US states that don't always coordinate effectively. Also, there has been inconsistent funding. When there was an immediate crisis, there would be an infusion of cash. But then, when the crisis passed, the resources would evaporate. We take a look at data gathering for public health from the 1600s to today and how it might change in the future. Support for this episode was provided by the Robert Wood Johnson Foundation (RWJF). The views expressed here do not necessarily reflect the views of the Foundation. RWJF is working to build a culture of health that ensures everyone in the United States has a fair and just opportunity for health and well-being. For more information, visit www.rwjf.org. If you have a hunch about how changes to the way we live, learn, work and play today are shaping our future, share it here: www.shareyourhunch.org

Transcript
Starting point is 00:00:00 This episode is one in a four-part series that we're calling The Future Of. We'll be exploring how changes to the way we live, learn, work, and play may shape our health and well-being in years to come. Thanks to the Robert Wood Johnson Foundation for supporting this episode. The Robert Wood Johnson Foundation is committed to improving health and health equity in the United States. Learn more about them at rwjf.org. This is 99% Invisible. I'm Roman Mars. Hey Roman.
Starting point is 00:00:31 Hey, producer Delaney Hall. So you're here with the next story in our series called The Future Of, dot dot dot. And you're here to talk about the future of data. Yes, that's right. And I feel like I should just level with you and our listeners. This is a story about spreadsheets, okay? It's a story about spreadsheets and data entry and data systems. A story about spreadsheets? It's not even my birthday, Delaney. How do I get this marvelous gift?
Starting point is 00:00:59 I should have known I wouldn't have to convince you. You do not have to convince me, I'm already riveted, so let's go. Okay, so we'll start in the early days of the pandemic. Today, the entire city of Wuhan is on lockdown. And as COVID spread in Asia, and then in other parts of the world, weighing the most active hotspot right now, is Italy nearly two months. You were seeing officials take these wildly unprecedented steps
Starting point is 00:01:23 to control the disease, lockdowns, quarantines, massive amounts of testing. And I think a lot of people thought we'd see the same kind of response in the U.S. when the virus arrived here. You know, as someone who grew up reading The Hot Zone, I expected that we had the world's foremost infectious disease fighting agency in the world. This is Alexis Madrigal. At the time the pandemic started, he was a journalist at The Atlantic, and he's still a contributing writer there. So I was expecting to see the United States of America assume global leadership and come up with ways for us to deal with it. And like Alexis,
Starting point is 00:02:02 I was basically imagining scenes from Outbreak and Contagion in my head. You know, like brave epidemiologists in hazmat suits working with the best technology available to get a scary situation under control. And so of course we all wanted that to be the reality. We all wanted that to be what our pandemic response looked like. Here's Robinson Meyer, Alexis's colleague at The Atlantic. Right, the CDC seemed like the last version of like absolutely competent American technocracy. Right, like they were on top of it.
Starting point is 00:02:43 But as we started to see cases of COVID here in the US, Alexis and Rob felt like the response from our public health agencies was surprisingly muted. We were not seeing mass testing, like was happening in Asian countries. And in fact, when they started looking, it was hard to find any concrete numbers at all about how much testing was happening here. Which was not good because testing was the most crucial data point that we had for understanding the pandemic at this point. So I guess the CDC was not tracking that at the time.
Starting point is 00:03:17 Well, at first it was, but then in early March 2020, the agency stopped reporting the total number of nationwide tests. And basically, they said, most testing is happening at the state level. So if you want to know, go ask the states. The CDC just wasn't saying. So I was like, fine, we're going to go to the states. So they reached out to all the states and asked some really basic questions, you know, how many people have you tested? How many positive cases have you had?
Starting point is 00:03:47 And how many tests can you do each day? And they were shocked by what they found. Oh, man, we tested almost nobody. So what's that mean, almost nobody? Like, what are the numbers for almost nobody? Well, on March 6th, which was just a few days after the CDC stopped reporting testing numbers, Rob and Alexis could only verify that 1,895 people nationwide had been tested for COVID.
Starting point is 00:04:14 Even though the White House was saying that tens or hundreds of thousands of people could be tested per day by that point. So I do remember some of this. There was this huge shortage of tests for hospitals to use at the beginning of the pandemic. And there were pretty strict rules by the CDC about who could get tested at all, because, you know, these tests were in short supply. Yeah. And that ended up having enormous implications, because without much testing, it was hard
Starting point is 00:04:40 to know what was happening on the ground. If you looked at the state of the data, there was no data. Many states could test a dozen people a day. If the virus was circulating, we had no way to know. And Alexis and Rob felt like they had wandered into this weird twilight zone, where people at the top of our public health agencies were talking about carrying out a data-driven pandemic response,
Starting point is 00:05:04 but without actually having much data. And it'd be like, well, it seems like the virus could be everywhere then, and they'd be like, well, the data doesn't say it is. But the data was sh- And it wasn't just the testing data that was limited. It was also hard to find numbers on how many people were hospitalized and how many people had died. And so Rob and Alexis decided to team up with a guy named Jeff Hammerbacher. He's a scientist and software developer. And they started compiling their own nationwide data sets based on what states were publishing.
Starting point is 00:05:42 And they thought of this as a stopgap. These are not fancy things, right? Cases, hospitalizations, deaths, number of tests performed. Like this is basic stuff. Like you look around the world and like, all these other places just like have this stuff, you know? So we were just like, well, obviously,
Starting point is 00:05:56 we'll also soon have this stuff. Because how could you run a public health response without good data? It's like the lifeblood of public health, and it has been since the beginning of the field. Okay, before we continue with the saga of Rob and Alexis, I wanna tell you about some of that early public health history because some of it's fun.
Starting point is 00:06:20 Honestly, that's a big reason, but also because it shows why data is so important. And why our system makes it so hard to collect. Okay, let's do it. Okay, so there were many innovators in the realm of health data, you know, people like John Snow, William Farr, W.E.B. Du Bois. But I'm going to zoom in on one of the earliest examples. We'll start with John Graunt and the Bills of Mortality. The Bills of Mortality. I like this one, this is very ominous. I like it a little bit.
Starting point is 00:06:54 So the Bills of Mortality were these mortality reports. They were published in London in the 1600s. And every week a group of parish clerks would gather and share information about who had died in the neighborhood and how. And people would read them and they would kind of pick up the local parish thing and be like, oh, you know, John died. So this is Steven Johnson. He wrote a book called Extra Life: A Short History of Living Longer. And he says the bills were a source of fascination for many people, people liked to gossip about them, but they weren't what you would call structured data. Like there wasn't a way to detect useful patterns in the information there.
Starting point is 00:07:35 But then John Graunt comes along, and Graunt was a haberdasher slash amateur demographer. That's a very 1600s combination of interests. Yeah, you don't really get that kind of mix anymore. So he was a classic kind of engaged amateur and he looked at the bills of mortality and thought there is more to be learned from these documents. And so it occurred to him at some point in the early 1660s that if you try to assemble that data and look at it systematically, that you might be able to tell a lot more about what was really happening to the health of Londoners. And so he spent all this time going from
Starting point is 00:08:20 kind of parish to parish and reading through all these reports and kind of tabulating basically the data and trying to organize it in a more structured way. And Graunt ended up publishing a pamphlet with a very catchy title. It was called Natural and Political Observations Mentioned in a Following Index, and Made upon the Bills of Mortality. And what it enables both Graunt to do, but also health officials around the city, is to suddenly be able to answer the question, what is really killing people? You know, where are the real threats? And how are those threats changing over time? Graunt's work represented a huge conceptual breakthrough. His pamphlet was basically the founding document of medical statistics and public health data. But it ultimately wasn't that useful because there was still a limited understanding of what was actually killing people.
Starting point is 00:09:13 For example, let me share with you the causes of death tabulated in this report. The list is really pretty funny. According to Graunt in 1662, about 1,300 people in London died of apoplexy, 38 died of cut of the stone, 74 died of falling sickness, 243 died of dead in the streets. Only six died of leprosy, actually, which is like you would have expected more in 1662. 158 died of lunatic,
Starting point is 00:09:45 and then my favorite category, 454 died of suddenly. You can totally imagine the scenario where you get the answer suddenly, you ask, you know, some official comes up and says, how did he die? And someone says, suddenly. Which is, I'll just write down suddenly.
Starting point is 00:10:02 Totally, yeah, like I wonder if gradually shows up in the tabulations. Well, then I can see, if bad data comes in, there's not much you can do about that, but I can still see how it's groundbreaking. Yeah, it was groundbreaking in that it took these anecdotes and turned them into data about a population. But it wasn't actionable, partly because of answers like suddenly, but also because to act effectively on that data, they needed institutions that could coordinate
Starting point is 00:10:34 a public health response. And those did not really come into being at least in the U.S. until the middle of the 19th century. In the 1800s, a number of cholera epidemics swept across the United States. And in response, cities and states across the country started to establish local public health departments. But the system here grew in a mostly bottom-up way. It was cities and states first. It took decades before any real national public health institutions came along.
Starting point is 00:11:11 So when did the federal government begin to get more involved with public health across the whole nation? Pretty late in the game. There was one national agency called the Marine Hospital Service, and it cared for sick and injured sailors. Eventually, in 1912, that agency became the Public Health Service. It started taking on greater powers, but it was still very hesitant to interfere in anything that was local. This is David Rosner. He's a historian of public health at Columbia University. The federal government is basically the weakest part of the national structure.
Starting point is 00:11:49 This was, you know, not a country that saw the federal government as much more than a bunch of buildings in a swamp in the DC area. Over the next few decades, the national system grew. The CDC eventually emerged from the Public Health Service and became, in many ways, the best disease-fighting agency in the world. They pioneered the whole idea of a data-driven response, you know, using stats to figure out who was being hit by disease and how to intervene. It's this moment when a lot of American culture begins to turn to technology and science in general as a means of addressing all these very sticky social problems that plagued us.
Starting point is 00:12:36 And for that, you needed data. You needed some sense of what society looked like. So if you saw statistics that showed high death rates in one community or another, you could begin to rationalize your resources and identify resources. And the CDC had some huge successes. The agency helped eradicate smallpox. It started the fight against HIV.
Starting point is 00:13:00 It stopped Ebola more than once. And over time, it developed a really heroic reputation. But there was always this underlying weakness in our system, which is that it's very fractured. It wasn't a coordinated system like some countries have. Instead, our system is this patchwork of thousands of state and local health departments who all operate pretty independently. So the CDC can issue guidance, but ultimately state and local health departments answer more to their local elected officials
Starting point is 00:13:35 than they do to the CDC. Yeah, I mean, it sounds like federalism as a concept is a real challenge to public health. Like a lot of power resides in the states and it's made that way, you know, on purpose. But I mean, when it comes to public health, these things don't have state boundaries. I mean, a flu or a COVID can pass through state boundaries and does not care about federalism at all as a concept. Right. Viruses do not care about states, right? They do not care about state jurisdictions.
Starting point is 00:14:07 Over time, there was another damaging pattern that began to develop with the CDC and the public health system more widely, which was that it struggled for consistent funding. So when there was an immediate crisis, there would be an infusion of cash, but then when the crisis had passed, the resources would evaporate. And that only accelerated from the 1980s onward during the Reagan era. Yeah. Reagan is at the start of many of these things when it comes to like the social safety net being eroded. Yeah. There's a familiar story for a lot of agencies. Yeah. And you know, this is a complicated part of the story that we're not going to wade super deep into.
Starting point is 00:14:47 But to simplify, public health agencies saw their budgets get cut decade after decade all the way into the 2000s. The CDC's budget dropped overall from 2010 to 2019. Over the same time period, local public health departments lost more than 50,000 jobs due to funding cuts. And we also saw a ton of privatization during this time. So the hiring of private contractors to do what the government used to do. And so this was the underfunded and very complicated system
Starting point is 00:15:25 that Robinson Meyer and Alexis Madrigal encountered when they set out to gather their own COVID-related data at the start of the pandemic. Here's Alexis. The federal government itself like doesn't actually have the people to do the things that other governments do because we just decided from Reagan onwards that we were essentially
Starting point is 00:15:46 going to take out state capacity and instead pay consulting fees and contracts to companies because I guess somehow that's like less government-y or something. And I imagine all of this makes data gathering and sharing really tough. Yeah, it's kind of hard to imagine a system worse than this one when it comes to data sharing. All these entities report data on what's happening locally, often a little bit differently, using different systems and data conventions. You just have so many reporting entities and officials and jurisdictions, and it's not just
Starting point is 00:16:20 because we're a large country. It's also because we have built these data systems over time in this kind of sedimentary pile. And it's very difficult to change them when it's not a crisis. You know, like it's just hard. And so did Alexis and Rob like fully understand the complexities of the health system
Starting point is 00:16:43 when they started to gather these nationwide statistics about COVID? Not really. Like they just knew that the government wasn't releasing much information about what was going on. And so they started their own DIY efforts and recruited volunteers to help with it. And it turned out I was literally the first person to fill out the volunteer form. Erin Kissane has worked on a bunch of web-focused projects over the years, most recently as an editor and director of content with Knight-Mozilla OpenNews. And she had been following COVID really closely and just jumped right into the data gathering.
Starting point is 00:17:25 And the work immediately was just things like, hey, can someone set us up a website? And at the same time, like, how do we open up this spreadsheet and get it to hold up with 20 people going and collecting data, and what kind of quality process should be set up? Because we obviously need to have people double checking these numbers. So it was sort of everything all at once. This group came to be known as the COVID Tracking Project. And full disclosure, the COVID Tracking Project received funding from the Robert Wood Johnson
Starting point is 00:17:56 Foundation, which also funded this story. Within a few weeks, the COVID Tracking Project had several hundred volunteers helping out. There were students, there were people from tech, people from medicine, journalists. And so what exactly did the work they were doing look like? Like how do you begin to gather it? Well, this was very simple and very complicated at the same time. Every day they would reach out to every state and territory in the US to find out how many tests they'd done and numbers of positive cases, hospitalizations, and deaths. And they used a range of methods to get those numbers. They watched a ton of press conferences, for example.
Starting point is 00:18:37 And they would also scrape numbers from state websites. Then they'd put everything into a spreadsheet. And then like once a day, you sort of commit those numbers and you're like, okay, those are the day's numbers. And it is easy to do one time. And it's easy to even do, you know, 20 times. But as time went on, it started to get much harder because they began to understand
Starting point is 00:18:56 the idiosyncrasies of the system and also of the data itself. For instance, take the most basic unit of this data, a case of COVID. Like I would think that would be pretty easy to define and count, right? Yeah, yeah, totally. Well, it is not.
Starting point is 00:19:16 We had to sort of figure out over time that a case is not a case. In some places, a case is a confirmed case. In some places, it's confirmed and what's called probable cases. Well, what are those? Okay, here's the definitions. Is this state combining confirmed and probable cases? We can't tell.
Starting point is 00:19:34 Let's call the state. Oh my God. This is making my head hurt just thinking about it. I know, and this was true for most of the metrics they were trying to track. There were all of these different ways that what the states were sending to the federal government was just slightly different. You had different data definitions, you used different systems, and no one really had any idea of how to standardize those things, particularly in the midst of a crisis like
Starting point is 00:19:59 this. And there are just so many examples of how our COVID data was unstandardized or incomplete. Like another example, did you know that a lot of our local health departments are still at the mercy of fax machines? Oh my god. I can't say I'm surprised. I'm still appalled. Yeah.
Starting point is 00:20:19 It caused all kinds of problems. Like some states only reported electronically submitted lab results, others combined electronic submissions and faxed submissions. But because people had to enter the faxed information to their tallies manually, there would be these big distortions in the data where all of a sudden, a backlog of faxed results would just be dumped into the numbers all at once.
Starting point is 00:20:46 Right. You hold all the faxed ones to the end and then you enter them and then all of a sudden the numbers leap. And it makes people mistrust data is what it does. Or here's another example. When there would be a bad surge somewhere and lots of deaths, the people who fill out death certificates would get really behind. And so death numbers would lag. Or, yet another example, really critical demographic data would just be missing. There really were whole areas, especially race and ethnicity, where most states and territories never developed a really robust way to collect that kind of demographic data. And when they did, it wasn't standardized.
Starting point is 00:21:31 Which of course meant it was hard to tell who was being most impacted by the disease. And this stuff can feel kind of dry, like ultimately we're talking about spreadsheets and data standardization. But there was stuff going on at this time, early in the pandemic, where really good data could have helped. But these N95 respirator masks in particular are in high demand and short supply. I mean, for example,
Starting point is 00:21:57 there was a massive shortage of protective gear for healthcare workers. But still, I have spoken to healthcare workers in San Francisco, Oakland and San Jose today, all who say the shortage of supplies in their hospitals is a problem. If we had had the hospitalization data, knowing who had PPE, who had access to medications, who had staffing shortages, the federal government can, in fact, step in and help with that stuff.
Starting point is 00:22:22 Instead, that became this weird, crazy scramble that benefited nobody, as far as I can tell. So was the U.S. government doing this type of data collection from the states and just not reporting it, or were they just not doing it at all? Well, in those early days, the people at the COVID Tracking Project certainly thought that the government had its own comprehensive numbers. They thought that someone somewhere within the vast CDC bureaucracy had stats, and you know, better stats than they did. I think all of us really thought that the data did exist,
Starting point is 00:22:59 and we just couldn't see it. So the work that we were all doing was a stopgap. And presumably, we would need to do it for a little while until the federal government released the numbers they had. But then they started noticing some weird coincidences. So this would have been in March and April 2020. And back then, there were these press conferences that Vice President Mike Pence was doing. Thank you, Mr. President. And let me echo your words about all the dedicated men and women on the White House coronavirus task force.
Starting point is 00:23:34 And the COVID tracking people would watch those press conferences to see what kind of numbers the government shared and how they compared to the numbers that they were gathering. It was reported to us that at this moment more than 746,000 Americans have tested positive for the coronavirus. Fortunately, more than 68,000 Americans have fully recovered, but sadly, more than 41,000 Americans have lost their lives to the coronavirus. And we were like, hey, you know, his numbers are really close to ours. We must be
Starting point is 00:24:13 doing a good job, because he's getting those federal numbers, clearly, and, you know, here we are just scraping from public data. Despite the fact that there have been more than 843,000 Americans who've contracted the coronavirus, and we grieve the loss of more than 47,000 of our countrymen. And then we realized they were tracking really closely, and I think a
Starting point is 00:24:38 few of us started to suspect at that point that he was actually reading our numbers just rounded. Wow. Wow. Okay. And then their suspicions were later confirmed when the Trump administration published a report that clearly used the project's data and charts and cited them in the footnotes. And it was like, oh, they're just looking at our site. Like, we are the process. We are the process. We are the ones who are making this data. We were waiting for the cavalry, and then it turned out,
Starting point is 00:25:09 like, we were the cavalry, and we were like, no, no, no. We don't even have horses. We can't be the cavalry. And it's kind of darkly funny, but it is also scary. You know, you kind of think sometimes, like, well, you know, if disaster X were to happen, well, you know, somebody's thinking about that. You know what I mean?
Starting point is 00:25:31 Like, well, somebody's gonna do that. And the truth is, no, nobody's gonna f**king do it sometimes. We were not ready. We did not have a system in place. And so I can imagine them feeling panic about like, oh my God, okay. So we thought we were just messing around.
Starting point is 00:25:45 We're doing our best. And we thought that the government was going to save us, and it turned out they're not. But was part of them kind of proud that the government was using their numbers? I don't think so. No. Okay. Like, I'm not sure how everyone on the project felt, but at least some of them were pretty shaken by this realization, like Erin.
Starting point is 00:26:09 That's an immensely stressful position to be in for a bunch of volunteers because on one hand, yeah, it's great that our numbers are actually really useful. And on the other hand, are you kidding me? That's the best you can do with the entire resources of the federal government is get the data that we make every day by looking at websites.
Starting point is 00:26:35 I mean, it's really hard for me to understand at this point, like what was going on at the CDC? Like what is going on within the agency at this time? Well, I'll start by saying that we came into the pandemic with data systems that just were not designed to gather and process the kind of fast, high-resolution data that people wanted. Like the demand for data just went way beyond
Starting point is 00:26:58 what public health officials had ever encountered before and they were caught flat-footed. There was also the general organizational chaos of the Trump administration, which certainly didn't help. But there were a number of ways the CDC tried to compensate. So as an example, in April 2020, the agency started working on a new electronic reporting system that would collect detailed testing data from every state. It took a long time to get all the states onboarded to that system, like more than a year.
Starting point is 00:27:35 As that was happening, the agency was also using some of its existing surveillance systems and methods to track this new disease. Finally, CDC quietly launched a new website. It's, you go to, so for example, this is another press conference held by the coronavirus task force in early April, where they talk about data that the CDC has started to release. This surveillance data is bringing together our influenza-like illnesses with their syndromic management databases so that you can track- And just to parse that for you. Yeah, that would be helpful.
Starting point is 00:28:13 So what they were doing is that the agency had adapted a couple of their existing systems for this new task of tracking COVID. Their system that tracks unusual levels of disease in places like emergency rooms and urgent care centers. That's the syndromic surveillance data. And their flu reporting systems. And the way that the CDC tracks flu is that they sample the population
Starting point is 00:28:40 and then model a broader picture. So the data isn't comprehensive. And there are definitely some reasons for using the older system. For one thing, the states were already used to it. The United States, the states are used to using this system. It's in emergency rooms, it's in hospitals, it's in doctors' offices, and it gives you insight and you can see very clearly.
Starting point is 00:29:05 But the fact was these existing systems just weren't working that well. The agency was struggling to keep track of testing and case rates across the country. It was struggling to update hospital data, which includes really critical stuff like bed availability and ventilator supply. And with the hospital data, this is like a whole other story.
Starting point is 00:29:30 But the CDC was moving so slowly that eventually the agency that oversees them, HHS, just took over gathering those stats and built a much better and faster system. But the CDC still seems to be at the center of all this today. So does that mean that they eventually started gathering their stats themselves in a different way or updated their systems? Yeah, they eventually pivoted, but it took months before they started aggregating and sharing more of their own data. So they released their own data tracker in early May,
Starting point is 00:30:06 which was 15 weeks after the first reported case of COVID in the US, and more than eight weeks after the launch of the COVID Tracking Project. And even when they did that, there continued to be problems with the data and big discrepancies between their testing numbers and state numbers. And is there a sense of why it would take so long? I mean, like, a few enterprising journalists and a bunch of volunteers had something very quickly.
Starting point is 00:30:33 Why do you think it took the CDC so long? Well, I have reached out to the CDC a number of times to try and get their take on all this, and they haven't responded. But I think a lot of critics of the CDC think there is something in the structure and culture of the agency that keeps them from moving fast and breaking protocol in an emergency. You know, I think there was an attitude among a lot of people in the CDC about not overreacting to COVID, basically like, oh, well, you know, if we design all these systems around the disease du jour. They even have, like, a comment like this on the CDC's data modernization
Starting point is 00:31:15 page for one of like the conferences they had, you know, basically like quoting someone at the conference saying like, we can't, like, over-respond to the disease du jour. And I'm like, oh, I'm sorry. Did your other diseases du jour kill 600,000 people? Like, it's closer to a million now. Maybe we should be overreacting to this one. Seems reasonable to design things around this, you know? And I think that was really a big piece of it.
Starting point is 00:31:41 It was like they didn't want to custom design systems just for COVID. That was not what CDC wanted to do. But in an unprecedented situation with a new disease we had never encountered before, moving through the population in ways that we were only beginning to understand, we needed new systems and we needed the public health establishment to be as deep in the data as the COVID tracking project was. If you're not in the data every day or every few days, if you don't know how it's constructed,
Starting point is 00:32:12 you don't understand what's actually happening and like where the future hotspots are and the future places to be concerned, and you don't understand what it looks like when a state explodes with cases. So I know the COVID tracking project no longer exists. So how did they make that decision to end the project? Well, you know, about a year into the pandemic as Biden was coming into office, you know, vaccinations were happening, and the pandemic seemed to be on the wane. And so that was when they stopped it. And part of it was that it was taking a toll
Starting point is 00:32:54 on the people involved. Like they had never intended it to be a long-term project. And even something as mundane as data entry had a high cost. While this was happening, we had family members dying. We had people we knew who were in those statistics. But there was another reason as well, which is that the COVID tracking project founders really thought the government should be responsible for this work. You know, we did not think that the public health data in widest distribution for the United States and the COVID pandemic should come from volunteer labor. And the Biden administration had promised to create a pandemic dashboard, which the COVID
Starting point is 00:33:41 tracking project people were excited about. They even helped advise on a framework for how to do it. But now, more than a year into Biden's term, that still has not materialized. Even though COVID-related data remains both very critical and quite confusing to understand. And you know, what made the COVID tracking project unique as an organization was that they were dedicated to researching and explaining all these various flaws and inconsistencies in the data.
Starting point is 00:34:16 There were other data trackers and other volunteer data groups, but it was the COVID tracking project that was really dedicated to that kind of analysis, which we still badly need. Data can't talk. Data can't explain itself, particularly when you're speaking either to, you know, this idea of a general public, but also to reporters, to anybody in media, to people, even in government agencies, the data has to be contextualized and explained, and that's still largely not happening. I should say, all the folks I spoke with at the COVID tracking project recognize that there are
Starting point is 00:35:00 some things the CDC does really, really well. There are incredible scientists at CDC and NIH doing this remarkable world-changing work on vaccines and all these other things. And the CDC did have some successes with data as well. You know, the electronic lab reporting system the agency helped build has apparently really increased the speed and accuracy of state data coming into the agency. But we're still nowhere near having the kind of surveillance systems we'll need the next time a pandemic happens.
Starting point is 00:35:40 If you talk to pandemic people, you know, like this was like a starter pandemic. I mean, it just could have been so much worse and it will be so much worse. You know, we know we are going to face worse threats. And the thing that I have never seen is any real reckoning in the federal government with what we didn't do, with the failures to build a real surveillance system. What did you do to make your life a better place? While I was working on this story, I ended up thinking a lot about this thing
Starting point is 00:36:17 that Steven Johnson told me. He was one of the historians of public health. I asked him if he had followed the work of the COVID tracking project and what he made of it. I had two reactions in a way to the COVID data project, which was on the one hand, it seemed scandalous that they had to do the work that they had to do, that that should have already been
Starting point is 00:36:37 underway. Yeah, that makes sense. That's my reaction, too. Yeah. But two was, there was part of me that was like, okay, these are the heirs to John Graunt. The amateur data collector who does it because they perceive something is missing in the system and there's not a lot of time to lose and they need to get in there and fill in this missing piece.
Starting point is 00:36:58 That's a beautiful tradition in the history of health, and so part of me was kind of moved to see it kick into gear. I mean, I get that. I love that story, and I'm struck by the fact that John Graunt is a romantic figure of a person like jumping in and filling in a need, and inventing whole new fields of science at the same time. But, you know, the story of the COVID tracking project, I'm really impressed by the people who did it, but it doesn't seem like a romantic story at all. Like, it really feels like a tragedy to me.
Starting point is 00:37:28 Mm-hmm. Yeah, I think it is a tragedy. Like, we shouldn't have to John Graunt the pandemic. Not after hundreds of years of public health developments. Yeah, I mean, what we really want is a boring story where a bureaucracy just does its job. Yeah, competent bureaucrats. Yeah, here's to the competent bureaucrats.
Starting point is 00:37:58 Coming up next, we'll talk about some potential fixes for our public health data systems. We'll hear from a former CDC director and someone who has thought a lot about who gets erased from our current data and how to make it better. Stay with us. Support for this four-part series exploring the future of health and wellbeing comes from the Robert Wood Johnson Foundation, which is committed to improving health and health equity in the United States.
Starting point is 00:38:30 Knowing that the healthy, equitable future we all deserve won't simply arrive, RWJF is exploring how new technologies, scientific discoveries, cultural shifts, and unforeseen events like those in today's story may shape our lives in years to come. Through these explorations, they're learning what it will take to build a future that provides every individual with a fair and just opportunity to thrive no matter who they are, where they live, or how much money they make. Learn more about their efforts at rwjf.org.
Starting point is 00:39:02 If you like thinking about the future of things and have a hunch about the future, share it at shareyourhunch.org. I'm going there now. Okay, I'm selecting the prompt, I have a hunch. I have a hunch that the increasing misery of air travel will cause people to reconsider train travel in the US and it will be more popular than it has been for decades. Check out other hunches and share your own hunch
Starting point is 00:39:25 at shareyourhunch.org. All right, I'm back with Delaney Hall and we're gonna be talking about how to fix our public health surveillance systems, which that sounds really ominous when I say it that way but that's really what we're trying to fix. Yeah, this is the good kind of surveillance systems. This is the surveillance that we want.
Starting point is 00:39:45 So how do we fix them? I guess, spoiler alert, I don't think there is one clear answer to that massive problem. Yeah, I can't say I'm surprised to hear that. But I did speak with people during my reporting who had a range of interesting ideas about how to make things work better, both within the system and with data
Starting point is 00:40:05 in particular, and especially with race and ethnicity data, which represents one of the biggest failures in our current system. Oh, that's really interesting. Okay. So tell me more about that. So I will get to the race and ethnicity data a little bit later. I wanted to start with one immediate fix that came up in conversation with Alexis Madrigal of the COVID tracking project. And he said it really would have helped if the federal government had just been extremely clear with states about what information they wanted reported and how. Because then the data coming from the states would have been more standardized and easier to compare and analyze.
Starting point is 00:40:50 Probably the thing that would have made the biggest single difference on a data level would be if the federal government had said basically, on an ultra, ultra, ultra precise level, this is what we need. Like we need it to come like from this system and answer all those small questions. And this was actually something that the COVID tracking project ended up doing in the absence of really clear guidance from the government.
Starting point is 00:41:16 They pulled together their own guidelines and distributed them to the states. Oh, that's interesting, but I'm guessing that if that guidance was coming straight from the government, it would probably be more successful. Yeah, totally. I mean, the government, believe it or not, has more authority than a volunteer effort, however impressive that effort was. The other thing is, even if the government had been really clear about how it wanted the data reported,
Starting point is 00:41:46 that would have made things better for sure. But it wouldn't have solved the underlying issues with the surveillance system. Those are much bigger and more complex. You can't fix the data system without fixing the broader public health system. This is Dr. Thomas Frieden. He was the director of the CDC from 2009 to 2017. He also worked as the health commissioner for New York City for seven years. And he testified before Congress in March 2021.
Starting point is 00:42:17 So this was around the time the COVID tracking project shut down. And he said in that testimony that, quote, lack of accurate real time information was one of the greatest failures of the US response to the COVID-19 pandemic. Wow. Okay. That's definitive. Yeah. I mean, he said that our data systems are broken basically from the top to the bottom. And he said that it is not just the CDC's fault here. So I think saying, well, CDC couldn't get the data together, CDC was dealing with local and state health departments that were overwhelmed and couldn't collect the data,
Starting point is 00:42:58 hospitals that didn't have standardized data, and laboratory testing that was insufficient and the contact tracing system that never really worked effectively in most places. Wow, I mean, that is a wide range of failures. I mean, I imagine this all goes back to both the fractured nature of our public health system and the way the states and the federal government really don't work hand in hand when it comes to these things. And then we also talked about the hollowing out of these institutions and systems that has been going on for decades, really.
Starting point is 00:43:29 Right. Our system got to this very bad point, thanks to underfunding it for decades. And we talked about this a little bit in the piece, but we're already seeing the cycle of panic and neglect as it's known, kick in yet again. So a crisis happens, money and resources pour in, the crisis fades and the money goes away. And this is just not a good way to fund a system that needs to rebuild some of its critical infrastructure from the bottom up. You can't build a sustainable system with one-time dollars.
Starting point is 00:44:06 And he really wants to see us build that sustainable system. He knows everyone in public health knows that this kind of real-time data collection is important. But Dr. Frieden says it's going to take years of investment to fix it. There needs to be an agreement about a national architecture for data gathering and sharing. The government needs to be able to hire really talented programmers.
Starting point is 00:44:33 We need to find workarounds for some very tricky problems, including the fact that we don't have national health identifiers in this country, which means that tracking people across different systems is a big challenge. There's just, there's a lot to sort out. We need a multi-year investment to modernize it. It's not just a matter of replacing fax machines with an electronic secure interchange. But as we also heard in the main story, this is not just about money or technology. Those
Starting point is 00:45:08 things are definitely important. But there are elements of the CDC's current culture and how it interacts with local health departments that also need to change. Right. So my impression of the CDC is that it's a very scientific and academic organization in terms of its outlook. They do very careful analysis before they release anything or recommend anything, and that's not always what the public demands. Yeah, I think it's safe to say that the agency tends to move slow. It's also, Frieden pointed out, sometimes subject to political oversight and vetting that can contribute to that slowness.
Starting point is 00:45:50 But for whatever reasons, I think it moves at a different pace than people in local health departments who are frontline responders. They need to move fast, sometimes with just the best data available at that moment. And so, Frieden thinks there should be more movement back and forth between the CDC and local
Starting point is 00:46:11 and state level health departments, so they can understand each other's needs. There are too few people working at CDC headquarters in Atlanta who have worked for two or five or 10 years at a county or city or state or global health department embedded to understand that if you need an answer sometimes it's in the next four or five minutes, not in the next four or five hours and certainly not next four or five days. And so what Dr. Frieden has proposed is having thousands of staff on the CDC payroll who are actually
Starting point is 00:46:46 embedded in county and city health departments for a few years and who then rotate back to CDC headquarters. That's an interesting suggestion. I mean, are there any indications that the CDC is seriously looking at this or changing its culture in any way? Well, I mean, I have not heard of anything like what Dr. Frieden is proposing, like a cultural exchange between the CDC and local health departments. I also think that the COVID tracking project founders would say there is nothing close
Starting point is 00:47:19 to the level of soul searching that they would like to see happening at the CDC right now. Like a real reckoning with what went wrong during the pandemic. But you know, there are some new efforts at the agency. Like for example, a center within the CDC called the Center for Forecasting and Outbreak Analytics, the CFA. Okay. So what's the CFA supposed to do? It's being built as a weather service for disease, a group that can forecast outbreaks, which is interesting and challenging work. And how it relates to data is that the quality of any given model and its resulting forecast
Starting point is 00:48:02 depends very heavily on the quality of data that goes into it. So in our current system, where even simple metrics like test positivity rates or hospitalizations are ambiguous, that is going to be a problem for the pandemic modelers. Totally. I mean, if the CFA is going to be successful, they've got to sort out that data stuff from the get-go because you can have the greatest model in the world and all these people willing to do it and they could predict amazing things. But if the data is not there, then it does not matter. Yeah, the data stuff has to come first. It's foundational. Yeah. And then finally, there's one other aspect of the data that we should talk about because
Starting point is 00:48:45 it represents a huge gap in our current knowledge. And that is how we collect or rather do not collect data related to race and ethnicity. Like if we're going to be rebuilding our data systems in the way that Frieden is describing, it's worth thinking through this question in particular. Yeah, I remember when I was following the work at the COVID tracking project, like this was an issue that they really focused on. Yeah, it definitely was.
Starting point is 00:49:15 The COVID tracking project ended up developing a whole wing of their project devoted specifically to race and ethnicity data. They did that work in collaboration with Dr. Ibram X. Kendi, who runs the Boston University Center for Antiracist Research. And early in the pandemic, Dr. Kendi wrote a series of essays in the Atlantic where he argued that we really do not know who's being most impacted by COVID-19 because the data around race is just so limited. And so why is that? Is it that race and ethnicity information is just not gathered? Is it just not
Starting point is 00:49:54 shared? Like, what is the breakdown here? The data is insufficient in a number of ways. And to help explain how, I'd like to introduce you to Abigail Echo Hawk. We all know somebody. We all know somebody who is impacted by COVID-19. We all know somebody who is impacted by a death even if we weren't ourselves. Echo Hawk is a citizen of the Pawnee Nation of Oklahoma and she's the director of the Urban Indian Health Institute in Seattle, Washington. It's one of 12 tribal epidemiology centers in the country. And Echo Hawk says that it's clear native people were disproportionately affected by COVID, even just anecdotally, like she said.
Starting point is 00:50:38 Everybody knows somebody, but that it is impossible to know just how much. Even when data was collected, they weren't collecting the race and ethnicity of American Indians and Alaska Natives and many other people of color. So while we know the impact in our people was great with the scarce data we had, we know it's a gross underreporting. And what's interesting is that the COVID pandemic has recently focused people's attention on this issue. Like, this is something people are now talking about, the fact that the pandemic disproportionately affected people of color.
Starting point is 00:51:14 But Echo Hawk has been interested in the issue for much longer. Because ever since she started her career in public health, she has seen the ways that native people and other people of color are made invisible in the data. And it happens through a couple mechanisms. One is by virtue of being a small population that can be difficult to gather, quote, statistically significant data about. And I would be in meeting after meeting after meeting where we would be a little asterisk down at the bottom that would say not statistically significant or not
Starting point is 00:51:51 reported on. And so what it was is we were invisible and we were invisible in conversations that policymakers were having. We were invisible when they were allocating resources and what I saw was incredible health disparities and the deaths of my community members, of my family members as a direct result. So that's one way the data around ethnicity is lacking. Another thing that happens is that people will sometimes be given options on a form, like maybe black, white, and other. It's like a limited range.
Starting point is 00:52:21 And other might be the only option that applies, so they check that box. That could include Japanese people. It could include American Indians and Alaska Natives. It could include other races or ethnicities. And even when you fill that in, they never disaggregate it. That means that they kind of put it all together and they just count that other. What that does is it effectively hides what's happening to a particular population of people. And this issue of aggregation and disaggregation is important. So, disaggregating the data just means breaking the data
Starting point is 00:52:57 down into smaller units or segments, instead of bundling a bunch of it up together, which is what happens in the other category. It also sometimes happens with people who are multiple races. Say somebody like my children who are Mexican-American and also American-Indian, and they mark on a form Hispanic, they mark American-Indian, and they mark Filipino, because they're also Filipino.
Starting point is 00:53:25 And when the data is calculated instead, they put them into a category that says multi-race. But they don't disaggregate it in a way that says they are both Hispanic, they are both Filipino, and they are also American Indian and Alaska Native. I mean, anyone could look at this and know that multi-race is a meaningless category that probably doesn't yield very much information at all. That's right. It is not a very useful category.
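To make the aggregation-versus-disaggregation distinction concrete, here is a minimal sketch in Python. The responses, category labels, and function names are all hypothetical, made up purely for illustration; nothing here reflects how any real health department actually tabulates its forms. It just contrasts collapsing everyone who marks more than one box into "Multi-race" with counting a person under every category they marked.

```python
from collections import Counter

# Hypothetical form responses: each person may mark more than one category.
responses = [
    {"Hispanic", "American Indian/Alaska Native", "Filipino"},
    {"White"},
    {"American Indian/Alaska Native"},
    {"Black", "White"},
]

def aggregated_counts(responses):
    """Collapse anyone who marked more than one category into 'Multi-race'."""
    counts = Counter()
    for marked in responses:
        label = next(iter(marked)) if len(marked) == 1 else "Multi-race"
        counts[label] += 1
    return counts

def disaggregated_counts(responses):
    """Count a person under every category they marked."""
    counts = Counter()
    for marked in responses:
        counts.update(marked)  # adds 1 for each marked category
    return counts

aggregated = aggregated_counts(responses)
disaggregated = disaggregated_counts(responses)
```

In this toy data, the aggregated tally shows only one American Indian/Alaska Native person, because the second is hidden inside "Multi-race"; the disaggregated tally shows both. The trade-off is that disaggregated counts can add up to more than the number of people, so they have to be read as "alone or in combination" counts rather than shares of a whole.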
Starting point is 00:53:51 And what's interesting is that these other broad categories that we use a lot, like Asian American as an example, when that's used in public health data, it hides the fact that Asian Americans are an incredibly diverse group, with very diverse experiences related to health and illness. Finally, Echo Hawk described yet another way that people of color get erased from the data, which is racial misclassification. Racial misclassification is when you go in and instead of asking you what race or ethnicity you are, somebody might look at you and instead check white based on visual appearance. Check black based on visual appearance. Oh wow, so she
Starting point is 00:54:41 means that whoever is filling out the form doesn't even ask. Like they just make an assumption based on appearance and they just fill it in. Yeah, that's right. And Echo Hawk says it happens all the time. And it disproportionately hurts people of color by making the data around their existence and their health issues just fuzzy and incomplete. So does Echo Hawk have ideas about how to change the way we collect data so this type of stuff doesn't happen? Yeah, she does.
Starting point is 00:55:10 She talks about decolonizing data. And in addition to basic stuff, like disaggregating data and allowing for greater nuance in race and ethnicity categories. She wants to see communities be more involved in deciding what gets gathered and shared about them. So she talks about how there's a deficit-based framework in public health, where in the case of her community, the data always shows high rates of obesity, high rates of diabetes, you know, health challenges. But she also sees a lot of strengths in her community, strengths that can actually
Starting point is 00:55:51 measurably improve health. And so she'd like to see data gathered around that too. Yes, we need to know the gaps, but we also need to know how do our youth see themselves in the future. If you can see yourself in the future, that's a protective factor against suicidality. We also want to know where their cultural ties are. Are they culturally engaged? Do they have the access to the resources for their cultural engagement? We want to use the strengths of our community, our cultural protective factors. All of those things are things that
Starting point is 00:56:25 can be measured and they can be weighted against the gaps. And her bigger point is just that data should serve the community and the needs of the community. It shouldn't just be to study the community and write academic papers about it. It should be actionable and lead to better health for the communities it comes from. Well, this is really fascinating and interesting, Delaney. Thank you so much. And full disclosure, like a lot of people who work at the intersection of health and justice,
Starting point is 00:56:56 Abigail Echo Hawk has received funding from the Robert Wood Johnson Foundation, the group that also funded this episode. 99% Invisible was produced this week by Delaney Hall, music by Swan Real, sound mix by Ameeta Ganatra, fact checking by Graham Hacia. Kurt Kohlstedt is our digital director. The rest of the team includes Vivian Le, Joe Rosenberg, Christopher Johnson, Emmett FitzGerald, Lasha Madan, Jayson De Leon, Martín Gonzalez, Sofia Klatzker, and me, Roman Mars.
Starting point is 00:57:32 We are part of the Stitcher and SiriusXM podcast family, now headquartered six blocks north in the Pandora building, in beautiful uptown Oakland, California. You can find the show and join discussions about the show on Facebook. You can tweet at me at Roman Mars and the show at 99PI org.
Starting point is 00:57:49 We're on Instagram and Reddit too. You can find links to other Stitcher shows I love, as well as every past episode of 99PI at 99PI.org. Thanks again to the Robert Wood Johnson Foundation for underwriting support of this special episode. Keep an eye out for each episode in this four-part series, The Future of Dot Dot Dot, and remember, if you have a hunch about the future, share it at shareyourhunch.org.
