Microsoft Research Podcast - 123 - Econ3: Understanding the media ecosystem and how it informs public opinion in the internet age featuring Hunt Allcott and David Rothschild

Starting point is 00:00:00 The bottom line is that there's an infinite number of stories that any news publication can produce in a given day. Tons of stuff happens. And outside of very, very few select things, almost every story that even the big ones that permeate the national consciousness are choices that news media makes. What is more important? The selection of topics? What do you choose to talk about? Or how you frame the topics? And both are important. Don't get me wrong that there are different ways to frame any article or any concept. And there's a lot of really interesting stuff for that. But ultimately, the way that we perceive bias, the way that these

Starting point is 00:00:41 stations, these news sources are very different to us and the way that they really are different is mainly driven by their selection of stories. Welcome to the Microsoft Research Podcast, where you get a front row seat to cutting edge conversations. I'm Hunt Allcott, an economist at Microsoft Research in Cambridge, Massachusetts. I'll be your host today as we talk about the rise of political polarization in the United States and the role of the news media. The background for our conversation is this. There's been rising political polarization in the United States over the last 60 years. In 1960, about 5% of people said in polls that they'd be upset if their child married someone from the other political party. Now that number is 38%. Striking numbers of people say that people from the other political party

Starting point is 00:01:32 are closed-minded, unintelligent, immoral, lazy, and unpatriotic. And one recent study showed that when selecting people for a scholarship, people discriminate more against the other political party than they do against people of other races and ethnicities. So many factors have contributed to this in some way. Since 1960, the political parties have realigned so that conservatives are reliably Republican and liberals are reliably Democrats. And our political identity is becoming increasingly salient because it's connected with other identities such as race, religion, whether you live in a city, and your personality. But another factor that people talk about a lot is the role of

Starting point is 00:02:15 the media. Have changes in media technology generated echo chambers and filter bubbles that make us less informed and more polarized? In the 1990s, we had cable news that allowed us more choice. In the 2000s, the internet allowed us still more choice. And in the 2010s, social media continued this trend and has generated tremendous personalization in the information that we receive. There have also been rising concerns about misinformation. Thus, it's against this backdrop that I'm so excited to talk with David Rothschild. David is one of the world's leading experts on the media and on public opinion. And in the past couple of years, he's put together and begun to analyze perhaps the most comprehensive data set

Starting point is 00:03:00 on Americans' media consumption. David is an economist at Microsoft Research in New York City. So I wanted to talk with David about all these topics. What are the basic facts on where Americans get their news and information? How worried should we be about echo chambers and filter bubbles? Is fake news actually a problem? And as we look forward into the next five, 10 years, what factors could change the media ecosystem for the better or for the worse? David Rothschild, welcome to the Microsoft Research Podcast. Hunt, thank you for having me. And that was an amazing introduction. Well, let's put the introduction aside for a second, because I want to talk about you and

Starting point is 00:03:43 your background. You went to Brown and you majored in civil engineering and in history. What were you interested in at the time and how did you end up transitioning into economics? So when I went to Brown, I was pretty open to the type of work that I wanted to do. And I went in the fall of 1998. And I think if I'd gone a few years later, I may have taken a very different path because at the time, computer science was on the rise, but wasn't as dominant as it is now. And data science and kind of the outgrowth of computational data science where data science meets social science and computer science also was several

Starting point is 00:04:23 decades on the horizon. So I took a more traditional path to trying to get both my kind of social science and more technical skills. And so I went into engineering where I learned a lot of the thought processes and similar kind of optimization strategies that we do here in economics, as well as working out my social science brain in history with a good deal of classes in both political science and philosophy to boot. So I can't say enough about Brown's open curriculum that allows you to take classes in a bunch of different fields. And then post-graduating from Brown, I actually worked on some political campaigns, had a great experience, learned a lot,

Starting point is 00:05:01 and then went back to grad school in a topic that I took no classes in as an undergraduate. I'd become fascinated by economics by the way in which at the time, economics was at the forefront of using computational methods to solve larger social science problems. Economists were everywhere, butting into other people's fields, and it seemed like a great place to be. And at the University of Pennsylvania, and the Wharton School in particular, I worked with a bunch of very empirically minded, very focused on solving real world problems economists, such as Justin Wolfers, my advisor, Betsey Stevenson, Joel Waldfogel, and a bunch of other economists who were in other departments like Marc Meredith and others

Starting point is 00:05:46 out of political science. And so it was a great experience and really led me on a great pathway. So I want to come back to your first political campaign. What was your first campaign and what did you learn? So when I graduated college, I went down to Texas and I worked on the Texas coordinated campaign for the Democrats. The incredible thing about that experience is that what you see right off the bat is that people right out of college with very limited experience have incredible roles to play in forming our democracy and making things go forward. So I show up, I turn into essentially the assistant field director for merchandise. So anything that showed up at rallies or is distributed around to supporters, on the phone

Starting point is 00:06:31 with vendors and thinking about colors and distributing logistics around the state of Texas. And it was quite an experience, the responsibility. And then that parlayed into work on both the Dean campaign and Kerry campaign and others in 2004. But ultimately, it really taught me a lot about logistics and other work directly. But on a larger scope, it really taught me a lot about the way in which our political system rests on people with a lot of energy, low pay. And it raised a lot of thoughts about the effect of campaigns and a lot of other work that

Starting point is 00:07:06 I ended up thinking a lot about in grad school and beyond. And so what was the moment when you decided then that you needed to do a PhD in applied economics, having come from the civil engineering and history background? So I decided that I wanted to do kind of applied economics work because we were getting confronted by a lot of questions in the early 2000s about public policy and the impact. And I wanted to think about the best mechanisms to actually be able to answer them in a meaningful way. And so at the time, things that were kind of flying around were questions about war and peace, Iraq and whatnot, questions about the impact of massive civil changes. We were looking at gay marriage and other questions that were on the forefront. And always, always on the horizon as well are massive shifts in tax policy that we were

Starting point is 00:07:59 seeing. There was a lot to think about, about both the impact of this type of policy and how people thought about it. But one of the kind of key things for people who are my age and who kind of came to political awareness in the early 2000s especially, is this idea of how things were communicated to us and how people understood them. I didn't know a single person who thought that the Iraq war was a good idea, per se. And the media did not reflect that. And it was very confusing. What kind of economy produces a media that is so not reflective of what felt like the truth or what felt like at least the

Starting point is 00:08:40 prevailing opinion in large circles? But more important, how was the media sitting there as a conveyor of this public policy? How was it in the middle between public policy that's being created and how people understood it and what was going on? A lot of this time kind of made me focus on how policy itself, it definitely also had me thinking a lot about the creation of how people perceive things, how things go from public policy to the media, to the people and how that goes back. And it was a very difficult time for a lot of people to kind of understand that route. And it's definitely led to a lifelong pursuit of understanding that loop and the economies behind it, how we ended up where we are and where we're going to be moving forward.

Starting point is 00:09:39 I want to take that exact motivation and then fast forward David Rothschild's life another 10 years or so. And so now it's 2016-ish. And I remember you starting to talk more and more and very excitedly about Project Ratio and that collection of data. So tell us about Project Ratio and the data you're collecting. Yes. So coming out of the 2016 election, the main chatter, the main conversation in academic and popular circles was fake news and social media. And I went to a lot of conferences in early 2017 where people in my space were discussing how do we fix things? Should we talk more about news literacy?

Starting point is 00:10:24 Should we ban certain actors? Should we take up other types of ways in which we protect the democracy from this problem? And working with Duncan Watts, who is a former colleague of ours and is now at University of Pennsylvania, Markus Mobius, who is also a colleague of ours in your lab up in Cambridge, we started to question whether or not there was a real hunt for a solution without understanding the problem. Because there were a lot of basic facts out there that we realized we just didn't really know. Like really, really basic facts. How much news do people consume? What type of news do people consume? What type of news is really being produced? If you wanted to get into social media, what type of news is on social media?

Starting point is 00:11:12 How big are these fake news questions? All of this type of things are very simple, very descriptive, and seemed very necessary. And so we launched a program to work with various groups, some things in Microsoft, but mainly actually third parties with academic and non-academic groups to try to bring data together. And on the production side, that meant supercharging our version of getting all of the news articles that are put into the internet in a given day, especially in English language, but also some foreign language, working with third-party vendors to get transcripts of television and radio. So to get a pretty good view about what's being produced in the media ecosystem. And then also then thinking of the consumption,

Starting point is 00:11:59 working with various third parties that track news consumption to understand what people are consuming on TV, radio, and increasingly tricky on mobile phones. And then kind of on the third level of that is thinking about what's being absorbed. So working with various groups and directly surveying people to understand what they know and trying to think of this pattern of production to consumption to absorption, and it's led to a great deal of reasonably impactful, I think at this point, academic work that has been somewhat descriptive in nature and is increasingly causal. But even just those big descriptive questions seem to run counter to a lot of what was common sense coming out of the 2016

Starting point is 00:12:41 election. And I think it's led people to focus their direction where problems may lie. So as part of Project Ratio, David, you have collected a really nice array of different data on media consumption and production. But as you're doing that, tell us how you made sure to preserve people's privacy. Privacy and security is always forefront in the research that we do and in the larger computational social science community at Microsoft. Basically, we're looking at production data. This is generally things that are produced and widely distributed. And so it's not much there. The key part is on the consumption metrics.

Starting point is 00:13:26 We are using opt-in panels with very, very clear rules on the use of the data. More importantly, in many ways, we are stripping any personal identification data. And when I say we, I say it's stripped before I even see any of this data. So I, as the researcher, am never seeing anything that could be identified back to a human being. And basically, these are people who are volunteering to have things tracked so that we can better understand the world. And we have seen a lot of research on aggregate usage compared to these usage. You would think, oh, that would change the way that people behave. But no, no, no, it's pretty good. People forget about it. They consume news and other media in the same way as they would generally, but it's a really safe and secure way in which we are able to take a peek at media consumption and do so while preserving people's

Starting point is 00:14:15 privacy and security. So you have a database of all of the text of all the news articles that have been written in the last four years? For the most part, yeah. And it's a very tricky question, what has been written. Obviously, as new publications come and go, articles get rewritten during the course of the day as various mediums pop up and down. So one thing is what shows up on desktop or shows up in things we're naturally creating versus what gets written in a more fleeting nature. But for the most part, I think we do a pretty good job in covering what's being written with a high probability of being consumed. And similarly, looking at video in the same context. Audio is a lot trickier when it comes to radio. We have a little less understanding about what's actually

Starting point is 00:15:05 being consumed and of course, podcasts, which we're increasingly interested in. But for the most part, our attempt really is, at least in terms of what people are consuming, to get a really good understanding of what's out there. And you see, and this is one of the interesting things about the newspaper data. It's not just that you have a record of all the content of the news articles, but you see at, I think it's 15 minute intervals, where the article is on the homepage. And so, whereas I can go to LexisNexis and maybe pull down some text of some articles, you know how long it was on the homepage and other aspects of how that was prioritized. I think that's pretty cool. That's exactly right. And I think it does paint a very different portrait because if you pull from an RSS feed or you go to LexisNexis,

Starting point is 00:15:49 what you'll see is there's a lot of stuff that's written out there that's really not very well consumed. The example I'd like to point out is Fox News' website. While the nature of the site has changed a bit over the years, the design has been pretty stable. Five really big articles up top, and then essentially an endless scroll of articles underneath it. Those five articles get the vast majority of the consumption when people go to the page. And they're different in nature, because what's being highlighted, but also a lot of the vast scroll is really just reprinting of wire service articles to kind of fill out the fact that they cover everything. But does it really matter? At the end of the day, I think we care a lot about what's

Starting point is 00:16:30 produced for consumption, not what's produced to go and fill that empty void, especially when production in some sense is so cheap because it could just be a reproduction of a wire service article, never really meant to get much consumption. So one of the real challenges in this data collection space is looking inside of social media apps, especially inside of Facebook. So how much are you seeing there? So we have explored Facebook in two different ways. First, working with Duncan, who's now at University of Pennsylvania, Duncan Watts, we've looked at the Social Science One dataset, which is a dataset that Facebook put out in conjunction with Gary King at Harvard. And it basically is a list of all URLs that are shared,

Starting point is 00:17:19 consumed above a certain threshold on Facebook. And it gives a pretty interesting portrait. And it's definitely something that provides a good idea about what's being used on Facebook. And then the second thing is that working with a third party that's provided us some browsing data, what we're able to see is where people go when they leave Facebook. So it's not a perfect proxy for where people go when they're on Facebook, but it does show us something about the news consumption proxy for when they click on something and then go to an external link. And the one thing that I really want to emphasize with this is that Facebook is super, super interesting for so many reasons and has tons of consumption.

Starting point is 00:18:07 It has changed dramatically during the course of the last four years as it has basically moved around its thirst for news. It keeps on changing the way that it displays news and how much news it has, but generally it's lowered the amount of news content in its feed over time. One really interesting finding that we see from the Social Science One data is that basically people share news at very different rates than they engage with news (likes and whatnot), relative to how much they consume news. And so what Facebook makes publicly available to people is the rate that various news URLs are shared. And this comes out of the CrowdTangle data, which is publicly available. And you'll see this on Twitter. A lot of people talk about it. And basically every day,

Starting point is 00:18:56 the top shared stories on Facebook, the top shared URLs on Facebook are generally filled with pretty hardcore right-wing or, depending on how you describe it, anti-left-wing publications. This reflects the fact that right-wing content and also fake news content, conditional on being consumed, is shared at a much higher rate than mainstream content. And so while most news content on Facebook that is consumed is actually mainstream, the top URLs that are shared are actually not. And so it paints a very different portrait. And when you get into the data, you get to see these various quirks that lead to very different understanding of what may or may not be happening on these various important locations.

Starting point is 00:19:46 So with that backdrop of what you have in your data set and your secure underground lair... Well, I want to cut you off right there. Secure underground lairs are not necessary. Most of this data that I've talked to you about, the production data, is just generally public. It's about pulling it together, right? And then the data sets that we do have on consumption are stripped of any personal identifiable information so that there is no need for an underground secure lair. Okay, fair enough. It was cooler in my mind when you had it all in like a bunker somewhere in the middle of the country. We can talk at another time about identity resolution and tracking people across various

Starting point is 00:20:28 things, which is ultimately one of the greatest hurdles to social science research, computational social science on a larger scale, and ultimately all sorts of behavioral research for companies and people in the future as various identification changes and underground lairs become more or less safe than they are now. Fair. So bracketing that with the data that is not stored in an underground lair, how much news do Americans actually consume? So one of the things I like to emphasize first and foremost is that the vast majority of Americans consume extremely little news. So anyone listening to this podcast is super, super abnormal in their news consumption, I would assume, on average at least, due to the

Starting point is 00:21:13 fact that you're listening to a podcast with a bunch of academics talking about news consumption. The median American consumes zero news web links a day. So the median American goes to zero websites that are dedicated to news each day. The most highly consumed news websites by far are the news aggregators. So either coming out of search or going to MSN, Yahoo, or AOL. And yes, AOL is still one of the most dominant news sites in the country. And one of the interesting things, of course, to think about with those aggregators is that this is all mainstream news. So for whatever you believe about mainstream news or not, you're not getting fake news websites showing up inside of MSN or Yahoo or AOL, or for that matter, ending up on top search results. And then people consume incredible amounts of television. Let's forget news for a second, but one of the most shocking things about looking at behavioral data sets, media consumption in general, is just how much television people consume each day.

Starting point is 00:22:14 Hours and hours and hours. Of course, a lot of this is driven by long-tail consumption of people who leave TVs on all day long. And then conditional on news is a much more dominant feature on TV than it is online. And so for those people who spend a lot of time on TV, news is a non-trivial portion of it, whereas it is a much smaller percentage of online consumption. And ultimately, the vast majority of news consumption comes from television. It comes in three really important ways. Number one, local news is still super popular. It's diminishing a bit and it's definitely concentrated in older folks. Number two,

Starting point is 00:22:55 people still watch the network nightly news. And so while this is something that seems kind of abstract to a lot of people who especially study and don't think too much about what the average American does, tens of millions of people watch that 6.30 PM nightly news that Walter Cronkite used to run. And then finally, of course, cable news is on the rise and is an increasing portion of people's news diet. And so one thing to take away is that the vast majority of people consume very little news. And so when it comes to academic research, when it comes to the popular questions, people obsess over this margin of folks who are consuming crazy shit on social media. And this is very important for radicalization. It's very important for idea formation. There's a lot of reasons why we should care about fake news or radical news on social media. But if you care about the average American,

Starting point is 00:23:46 if you care about the marginal voter, the margin that you need to be worried about is just how little news they're consuming and where are they getting their tidbits of information from. Whether or not it's just simply from seeing some stuff on a home screen on their mobile phone or on their desktop as they go about their day, or whether or not they're just getting segments of news that may show up on their morning show or may show up as they just kind of casually scroll through the local news or the evening news on their way to watch another sitcom at night. And so this is definitely not as sexy as studying Russian hackers. But if the thing you care about is kind of the marginal American, the median American, a lot of it is about low news consumption and where do they get their random tidbits of

Starting point is 00:24:30 information from. So if the problem is in your mind that people just aren't getting enough news, how do you get people to watch more news? I don't want to take a strong normative stance there in the sense that it may be perfectly reasonable for people to consume very little news. I'll reflect on Bryan Caplan's rationality concept, this idea that it's perfectly rational for most people who are busy to actually not learn that much about a lot of things they need to make decisions around because maybe they just take social cues and maybe it's just not that big a deal. Maybe it's better. I think it's important to remind people that even in kind of the golden age of newspaper consumption in the early part of the 20th century into the kind of golden age of

Starting point is 00:25:20 nightly news consumption, literacy was still reasonably low in some places. A lot of people didn't have TVs to watch Walter Cronkite. There was a question about a shared reality because conditional on watching news, a lot of people were getting it from the same sources, but there's always been a large portion of Americans that consumed relatively little news. I don't think that's something we should normatively freak out about, as much as study and understand where they do get the information they do have, because we care about it, not necessarily saying that they need to get more or less, but just kind of understand what they do get and think about what that means. That being said, if we want to increase news, one of the key things is to think

Starting point is 00:25:59 about where news is getting to people who have a low propensity to seek it out. And so this kind of theory around news consumption that says that what we should really be focusing on is places where news has a high marginal impact. Just think about the amount of news any individual gets, look at those news sources, and then look and say, what is the news that has this really high return? And where do you kind of squeeze additional news in there, squeeze additional good news, whatever that may be? And how do you motivate it? I guess it's your question as well. It's a tough question. So now this is a market in which people are building or not building various news into the somewhat limited resources they have. And I'll emphasize this question, which is, you know, as much as we think of the internet as a limitless landscape, there's still a finite amount of kind of front page real estate on things that

Starting point is 00:26:51 people go to, right? So as much as I could create tons and tons of content and throw it up everywhere, there's still a limited amount of real estate in the places that people show up and randomly get exposed to things. And so this goes back to the question we were talking about before. So Fox News can publish a billion articles, but there's only one section at the top where they have five articles where basically everyone's consuming. Similarly, landing pages exist in both audio and video content as well. And so there's a limited amount of front page real estate and how people are incentivized to share meaningful news content.

Starting point is 00:27:26 And is it still a continuous tricky question depending on who is asking and how much people actually want to talk about the portals, the Yahoo, Microsoft Network, AOL, because those are sites that presumably collect a lot of casual news consumers that are just getting on MSN, for example, because it's the Edge homepage. This is a real opportunity then to allocate that real estate in different ways. And I'm curious, how would you do it? How would you prioritize different types of content? One of the biggest problems that we have

Starting point is 00:28:20 when thinking about how to allocate limited resources to get people to consume things that we may think are normatively good for them is that there is a bit of a really interesting game between what people want to consume in the news space and what maybe we think may be marginally most beneficial in the sense that, you know, when an algorithm's turned on and you do a bunch of A/B testing, what you'll see is that lighter entertainment stuff does very well. We see that possibly catchier and clickbait per se by definition also has a lot of popularity. And so it's a really tough question. Let me address it this way. One of the really interesting changes that we've seen over the last few decades is the idea of moving content

Starting point is 00:29:05 away from a large collection, which is meant to be consumed in concert with each other, to segments of information that are meant to be consumed or purchased separately. And so as we move into a world of news aggregators away from, say, a single homepage of an actual news source, you move from a world in which you have basically many different stories about many different things that are meant to kind of inform and create a complete picture versus a lot of repetitive stuff, which is meant to grab people one at a time. And so the type of stuff that's produced becomes different, as well as the distribution of the type of stuff that is put up onto the site for people to consume.

Starting point is 00:29:48 Once you're into the realm of being an aggregator, what would I optimally hope that you would want to do? Well, you need to both on one hand inform people, but you also want to convince them to keep staying. And so you do need a mix of various goods up there in order to do that. And in a way, the optimal space, and sometimes in my mind, may conform back to a healthy diet of goods. But of course, the more you go in that way, the more you lose people because you don't have the type of stuff that draws people in one at a time, in which maybe one more story about the same thing, even if it doesn't add much

Starting point is 00:30:26 marginal benefit to the people who are consuming it. So I think it's a very tricky question. Then the second question is not just the distribution of the type of goods, but what type of goods to put up there. There's a tendency to want to provide a safe space that provides a balance of various points of view. That becomes a very tricky thing from the aggregator's perspective because trying just to mix various publications because of their points of view may lead to false equivalencies. I think that the most important thing for news aggregators is to think a lot about the quality of the pieces that they have, and not just on a publisher level, but on the article level themselves,

Starting point is 00:31:06 and accept the fact that even mainstream news has a lot of issues there, and focus on providing things that balance what people enjoy with what normatively we think is important at any given time. And so it's a very tricky choice, and I don't envy the types of tradeoffs that need to be made in order to keep everyone happy. So having laid out the data that you have at Project Ratio, we've talked about sort of the overall descriptives on news consumption. I want to transition into another area, which is ideological segregation. So, you know, one of the big concerns over the last 20 years has been that the Internet will allow us to segregate ourselves into different silos where liberals read liberal news, conservatives read conservative news, and so on. And you can actually provide some data on that in ways that the extensive previous research hasn't been able to do. So tell us about the work that you're doing on understanding ideological segregation and how that sort of builds on and extends what's been done in the past. Yeah, so there's been a lot of really interesting research in trying to understand

Starting point is 00:32:23 ideological segregation. A lot of this is built out of filter bubble discussions that came out maybe a decade and a half ago now with Cass Sunstein and Eli Pariser, worried about where algorithms were driving people. Were they driving people into kind of questions, I think, about larger echo chambers, which I'll say are more kind of just where you're in a place in which people are reverberating back the things you're saying. So one is filter bubbles, kind of this concern about literally the algorithms pushing you into a corner, and then this echo chambers concept that you may be in this corner where you're just hearing back what you want to hear or what you were already saying. And then this larger question is just ideological separation. I think that the prevailing research is pretty clear at this point that there isn't a huge amount of ideological segregation in online news consumption simply because news aggregators are such a dominant force. This is really great work by Andy Guess

Starting point is 00:33:19 and a few others recently have really taken a great look to see that when you look at news aggregators, which almost by definition have a mix of mainstream news sources, centrist news sources, it's hard to find that too many people are in ideological segregation. The really interesting stuff coming out of that work as well, too, is to show that the few people who do consume hard right or hard left news sources, they also tend to consume a lot of mainstream news sources. And they also tend to even consume things on the other side, because these are news junkies, right? So there just aren't that many people who are consuming Huffington Post a couple hours a day who aren't also looking at other places. And similarly, Breitbart on the other

Starting point is 00:33:58 side. And I think this is not surprising when you start thinking about it. You're pretty into the system. You're pretty into news if you get that far. And so you're going to be looking across various spectrums. Television, a different story. And so a lot of the work that we've been doing is trying to understand increasing segregation on television. And it really comes down to how you define various stations or groups of stations. But at the end of the day, people are abandoning network

Starting point is 00:34:26 news. So even though local news and the national broadcasts are still relatively popular, people are abandoning them and they're moving to two things, cable or nothing. And as people move into cable, you're getting increasingly segregated pockets who are looking at Fox News all day or MSNBC all day, or maybe MSNBC and CNN together, if you describe that as a bubble. And I think the other aspect of that is that one of the things we're seeing is that it's not just consumption. On the production side, you can see where these news channels are dividing more. So the three network news sources cover relatively similar things to each other. Whether or not that's centrist, however you define it, is up to the reader to decide. But what you can tell is that Fox News covers different stuff

Starting point is 00:35:16 than MSNBC. And so it's not just that people are separated, it's that they're separated into worlds which are describing different things. They're having a very different news diet, a very different understanding of what's important in the day. And this is meaningful. And it is a non-trivial portion of the population that is out there. And it is definitely something that I think people should be considering, what is the

Starting point is 00:35:43 impact of this and how much we should care about it? So you're painting a picture where the mainstream three networks are losing market share to cable news. Now, in Gentzkow and Shapiro's 2011 paper, they had self-reported data on what news outlets different people were watching and what their political ideology was. And the sense you get from that paper is that although cable news is more ideologically segregated than the networks, it's still less ideologically segregated than the Internet and less segregated than social media has been shown to be in other papers. And is that still broadly true in the actual metered Nielsen data as well? Or are you seeing higher levels of ideological segregation on cable? We're seeing higher levels of ideological segregation on cable. So we are seeing that there's almost an order of magnitude more people who are in what we would define as an

Starting point is 00:36:51 ideologically segregated group of content on TV, especially focusing on cable, than what you'll see among folks who are online. It's a little tricky, just to be clear. There's a lot more people consuming news on TV. So this is unconditional. So just there aren't that many people who consume regular news online. So it makes it even harder to be in an echo chamber or in an ideologically segregated bubble online because there's a lot more people consuming TV news. It's not that surprising, but also it depends a lot on how you define these echo chambers. So what we try to do is to not make hard calls and try to not make normative calls, but try to leave that to readers to decide what they're worried about or what they're not worried about. So for instance, it's a big difference

Starting point is 00:37:38 whether or not we say that you consume mostly MSNBC, is that the critical thing? Or can you bunch MSNBC and CNN together and say, well, if you switch between the two of them, it's roughly equivalent? We show this in various different ways, but depending on how you view this, TV has become an incredibly important part of that ideological segregation. I want to talk about that normative question. I don't want to make a hard stance on whether or not this is bad per se. If you go back in time, it's really important to remind people that this idea of the voice of God, the national consensus news is like a relatively tiny blip in news consumption and news production.

Starting point is 00:38:23 Everything was partisan news up until the 1910s, 1920s. And that was when national advertisers were like, oh, we can advertise on news as long as they don't scare anyone and become kind of centrist. So then we have this rise of centrist, objective, quote unquote, news sources. And it had a good run. It's a solid amount of time, 100 years where that was like the norm for mainstream news to try to not scare people and kind of give both sides and put equal weight to both sides. You know, that may not be the future. And I'm okay with that as we kind of look towards what may be the future. I think it's important to accept the possibility that overtly partisan news may play a larger and larger part

Starting point is 00:39:05 of it. And maybe that's fine. I guess the argument there is, you know, we've made it through the 1800s somehow as a country, despite the many challenges and terrible things that happened during that period and have continued to happen since. But we survived. And so if the media ecosystem becomes more ideologically segregated, maybe we'll survive again. And I think if the bar is survival, then that makes sense. Oh, I don't think it's just survival.

Starting point is 00:39:40 Maybe two partisan news battling it out may give us a better view of the world than one mainstream source, which is kind of giving us both sides internally. I think there's a strong debate for that. And maybe also truth in advertisement is good, right? So it's really super weird that we have this need to have all of our news sources pretend to be objective. To get back to more of the science on this, the distribution of content that we produced and the distribution of content that people would consume if we had a larger place for partisan news. Again, looking at the economics of this, how we have a shifting way in which we're

Starting point is 00:40:20 paying for news. We're moving from an advertisement-based to a subscription-based, and that is going to make a big difference in what's produced and most likely will almost definitely lead to more overtly partisan. At this point, we're still seeing how that's going to lead to the distribution of people's consumption habits. So presumably part of the thinking on why it might be okay to have a left-wing and a right-wing ecosystem fighting it out is that there are a lot of readers who are then reading content from both sides. And as you were alluding to earlier, on average, the research has shown that a lot of the news hounds who read extreme partisan content also read content from throughout the political spectrum.

Starting point is 00:41:06 But there's one thing that I want to touch on here, which is that in some of the research you've done and in Andy Guess's work with others, you can point to a smaller set of people who have particularly ideologically segregated news diets. And so how concerned should one be if there are, you know, 1 million people or 10 million people or 20 million people out of the full 200 million adults that have very segregated news diets? One view is, listen, these are not swing voters. And so who cares what information they have? They've already decided who they're voting for. Another view is that you might be worried about feeding extreme content to people who already feel extreme. And I'm curious where you come down on this. It's important to take a quick step back. And I will say that sometimes I think we're all a little careless in thinking about the definitions of who is the target

Starting point is 00:42:10 population for what we're discussing at any given time. And so as you noted, a lot of this discussion has been really focused on the general US population, the median voter, the average American, the marginal Americans we're describing in various ways. And a lot of the stuff that we're discussing then applies to that context very, very differently than when you're thinking about radicalization, when you're thinking about the small group of people who are causing large disruption in our free democratic society. And there's absolutely no question that we have a problem right now. And so I do want to emphasize to your point that it is a concern, a very big concern about where people are radicalized and the echo

Starting point is 00:42:52 chambers that they're in that can lead to radicalization. Regardless of what changes people make, what things are created, radical content always exists and has always existed. And people will find ways to produce this and distribute this. And a lot of the fixes that we do now have massive short-term disruption to radicalization networks. So deplatforming per se, politicians or others who engaged in radicalization certainly does have a massive impact. But in the long run, people do find new platforms, people do find ways to do this. And so while kind of this cat and mouse game of de-platforming or pushing on it is very important, it's also a matter of, you know, thinking about

Starting point is 00:43:36 larger educational and socioeconomic impact that leads to us having a non-negligible but small portion of the population that is so radicalized and understanding who these people are and in different ways in which society should be engaging. That's very interesting. I want to come back to this question of how different outlets portray news. So I think there's a general sense that CNN is to the left of Fox and Fox is to the right of CNN. But how do they actually implement that? And I'm curious specifically, in your data, you've been looking at what share of partisan presentation of news is topic selection, like which topics you cover at all or don't cover, versus the framing of the topics that you choose to cover. Can you tell us a little bit about this

Starting point is 00:44:33 work? Yeah. So this is a great question. I think one that is super important to be thinking about in terms of what bias is or perceived bias is and think of it as a technical term and as a colloquial term, the bottom line is that there's an infinite number of stories that any news publication can produce in any given day. Tons of stuff happens. And outside of very, very few select things, such as the outbreak of coronavirus and a few other things, almost every story that even the big ones that permeate the national consciousness are choices that news media makes. And this goes to the heart of the question that you asked me, which is, what is more important? The selection of topics? What do you choose to talk about?

Starting point is 00:45:23 Or how you frame the topics. And both are important. Don't get me wrong that there are different ways to frame any article or any concept. And there's a lot of really interesting stuff for that. But ultimately, the way that we perceive bias, the way that these stations, these news sources are very different to us, and the way that they really are different is mainly driven by their selection of stories. There's a lot less overlap in what is leading the various cable companies and the various major news sources at any given time than I think most people get. And what it demonstrates to me is that there's a lot of leeway, a lot of degrees of freedom of what you choose and publications

Starting point is 00:46:00 choose things, choose to discuss topics that fit their ideological bent or the story they want to tell more than they should really shift what we know by how they cover the topics. And I'll leave one other point, which I think is really key to understanding this, is how pervasive wire services are in our news ecosystem. They're huge, especially online. Wire services dominate a huge percentage of the content that exists and is consumed. And so wire services basically make articles about everything and they do so in a very neutral light. But there are publications which feel very different simply because of which wire service articles they pick and choose to display on their sites. And so it has nothing

Starting point is 00:46:41 to do with framing in that point. It's really all driven by which stories they choose to cover. I see. So the wire service example makes sense because you're looking at the same articles that are available to different outlets. And then you're looking at, you know, Fox News chooses one article from that set and CNN.com chooses another article from that set. So that example makes sense. I wanted to actually ask in the data analysis how you do this, because setting aside the wire service articles, it seems clear that you can group news articles into topics using text analysis tools. And then you can say, well, what is the overlap in topics? And, you know, Fox News and CNN are different by some metric. You can also look at particular articles

Starting point is 00:47:33 on a topic and measure the slant of those articles in various ways, again, using text analysis. Yep. I don't understand how you could compare those two analyses and say that topic selection is, in some sense, more important than framing. So tell us how you do that.

Starting point is 00:47:52 One of the key ways in which we see that selection is a dominant force over framing is that it literally dominates the real estate or time on various news sources. So if you look at any given night on Fox News and CNN, and you compare the major lineups and look at what topics are covered, you'll see very little overlap sometimes. And if there's very little overlap, then by definition, selection is dominant because we don't even get to compare the framing on the topics that they both cover if they don't even ever cover the same things. And this is frequently the case. And then conditional on them covering the same things, we have coders that are looking at what they believe to be the partisan bias or the partisan framing of various content. But this is really tough. And so we can

Starting point is 00:48:35 look at how extreme various networks and various websites come out. It's a very tough thing, but a lot of it is actually wire service. A lot of it's very plain. A lot of it's very clear that there isn't much framing from it. That being said, framing is an important thing. It does exist. And especially interesting if you look to early 2020, when essentially networks were forced to cover the same thing during the coronavirus outbreaks or other times in which we basically took selection off the table. I want to switch gears to think about fake news. So misinformation, hoaxes, and conspiracy theories have been around

Starting point is 00:49:14 since the beginning of news. But after the 2016 election, there was a real explosion of concern and interest, as you alluded to earlier. There have been thousands of academic articles and God knows how many newspaper articles and editorials and a lot of ink spilled and a lot of brainpower spent thinking about fake news. In your data, is fake news a problem? Fake news is not a major problem insofar as it's consumed not very much and by people who also consume a lot of other news. That being said, misinformation is super important. What I'm invested in and interested in is how does misinformation, how does disinformation work its way into mainstream news, work its way into reputable news sources where people actually consume a huge amount of. And this continues to be a huge problem. And it continues to be a huge problem because mainstream news continues to amplify and repeat falsehoods,

Starting point is 00:50:27 deliberate falsehoods. And it's a tricky problem for them because a lot of those falsehoods are coming from people or institutions that they feel that they need to engage with, that they need to highlight. And these have been really tricky and difficult questions for the mainstream media to make decisions on. I want to conclude by looking into the future. So as you look into the next decade, do you think our media ecosystem is going to become better or worse? And when I say better or worse, I mean more or less useful and informative to people in making important decisions in their lives, including whom to vote for? I think the best way to look forward and think about whether or not I think it will be better or worse is to look backwards and think about if we were having this conversation 10, 15 years ago on the cusp of the internet revolution to news and how bullish we would be on killing

Starting point is 00:51:46 the gatekeepers and the masses would rise up and create great news sources and everyone will get just the information they need. And ultimately we end up with like a lot less news being produced because we have a lot of consolidation. A lot of people just turned away from news completely. Other people end up in these radicalized corners of the news. Clearly, we didn't live up to the fantasy that we had hoped it would be. And when I look forward, I do so with enough trepidation, but also with a feeling that today is super broken and that we can make it better. Ultimately, I look towards a future in which there continues to be several conglomerates that produce a lot of the news that most people consume, simply because they end up breaking through and good content or powerful content or big content has a tendency to kind of break through regardless of what the

Starting point is 00:52:45 distribution channel it is. And we're continuing to see growth in various distribution channels. Television and radio and newspapers have given way to desktop and mobile. It's given way to TV along with other streaming video services. It's given way from radio into podcasts and other groups. And so there continues to be a flourishing of different ways in which people can consume things in the way that they want to. Ultimately, I think that's a good thing. The thing that I fear most is that,

Starting point is 00:53:16 especially when it comes to video content, kind of very low quality first person commentary is becoming a very popular thing on some persons and has always been on radio. So this is not unique, but you know, you do have to worry about what this ultimately means in the long run. And I think that's something that we need to keep an eye out for. It's a continuously shaping landscape. David Rothschild,

Starting point is 00:53:41 thank you for joining us on the Microsoft Research Podcast. For more information on David's research, head to www.researchdmr.com. And to the listeners, thank you for joining us on the Microsoft Research Podcast. For more information on Microsoft Research, check out microsoft.com slash research.
