Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas - 151 | Jordan Ellenberg on the Mathematics of Political Boundaries

Starting point is 00:00:33 Hello everyone. Welcome to the Mindscape podcast. I'm your host, Sean Carroll. There's something that is very common in physics, as well as in other areas of science, namely coarse graining. This is something that you do when you have a situation where there are many things going on, perhaps at a microscopic level,

Starting point is 00:01:19 and you're not keeping track of all those microscopic things, right? You don't know exactly where the molecules of air are in the room or in a cup of coffee or something like that. So you take many different possible arrangements of the microscopic things, and you group them together and you say, like, this set of arrangements is going to count as one big macroscopic configuration, this different set's going to count as a different thing. Now it turns out that in representative democracy, something very similar happens, right? You have many, many people, many citizens of the democracy, and it's usually impractical to imagine that all those people will directly vote on every single issue that comes up in the

Starting point is 00:01:59 country. That's called direct democracy. It doesn't really work very well. It hasn't been tried that often. But instead, what you do is you make a republic, a representative democracy, right? You have the individual people vote for representatives. The representatives are supposed to represent their interests in the legislature or as the president or whatever. This raises an interesting problem. How do you go from the set of all the different opinions and feelings that the citizens have,

Starting point is 00:02:28 which might be a huge, heterogeneous set of ideas and goals and desires, and coarse-grain them into a small number of people who will actually travel to the Capitol and sit in the legislature or the Parliament or what have you? So today's guest is Jordan Ellenberg. Jordan is a well-known mathematician who specializes in algebraic geometry. He's the author previously of How Not to Be Wrong, a New York Times best-selling book, and he has a new book coming out called Shape: The Hidden Geometry of Information, Biology, Strategy, Democracy, and Everything Else. So what Jordan and I are going to talk about,

Starting point is 00:03:05 he has many different things going on in the book, but we've picked out this theme of how you coarse-grain your democracy. And in particular, how do you draw boundaries for congressional districts, right? This is the problem that we face in the subject of gerrymandering. Gerrymandering is the idea

Starting point is 00:03:22 that a party in power can fiddle with the shape of different districts to guarantee that it gets more seats in Congress than it deserves in some sense. Now, as soon as you say that, you say, well, what do you mean deserves? And it's not clear. We have to actually think about the politics of this, the political philosophy. How should we represent the goals, desires, interests of different people in the government? That's fine. We're not going to talk about that today. That's a very important politics and political

Starting point is 00:03:52 philosophy question. But there's also a math question. Given some particular goals, how do you make them happen mathematically, and even more importantly, at a legal level, how do you know when someone is bending the rules? How do you know when someone is cheating? Turns out, you can say, well, consider the space of all possible shapes of districts in a state. Can you say that the particular one that the legislature has drawn is unfairly biased? That's a good, rigorous math problem. And talking about it gets us into interesting questions in geometry,

Starting point is 00:04:26 but even more interesting questions in randomness. Even though the question we just asked seems pretty straightforward, you're comparing the specific map that people have to draw the districts to the set of all maps. You can't really analyze the set of all maps. It's too big. Instead, what you do is a random walk through the set of all maps. That gets us into questions of how you do random walks and Markov chains and different algorithms for searching the space.
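As a rough illustration of the idea of a random walk through the set of maps, here is a minimal Python sketch. It is a toy under loud assumptions, not the method used in real redistricting analyses: the wards are imagined to sit on a line, every district is a run of consecutive wards, a map is just a list of cut positions, and no equal-population rule is enforced. The ward vote counts and the function names `count_seats` and `random_walk_seats` are hypothetical.

```python
import random

def count_seats(ward_votes, cuts):
    """Count districts won by party A; each district is a run of consecutive wards."""
    bounds = [0] + list(cuts) + [len(ward_votes)]
    seats = 0
    for start, end in zip(bounds, bounds[1:]):
        a = sum(v[0] for v in ward_votes[start:end])
        b = sum(v[1] for v in ward_votes[start:end])
        if a > b:
            seats += 1
    return seats

def random_walk_seats(ward_votes, n_districts, n_steps, seed=0):
    """Random walk over toy maps (needs n_districts >= 2); returns A's seat count at each step."""
    rng = random.Random(seed)
    n = len(ward_votes)
    # Start from evenly spaced cuts between districts.
    cuts = [i * n // n_districts for i in range(1, n_districts)]
    samples = []
    for _ in range(n_steps):
        # Propose nudging one randomly chosen boundary by a single ward.
        i = rng.randrange(len(cuts))
        proposal = cuts[i] + rng.choice((-1, 1))
        lower = cuts[i - 1] if i > 0 else 0
        upper = cuts[i + 1] if i < len(cuts) - 1 else n
        # Accept only if every district still contains at least one ward.
        if lower < proposal < upper:
            cuts[i] = proposal
        samples.append(count_seats(ward_votes, cuts))
    return samples

# Hypothetical wards: (party A votes, party B votes).
wards = [(70, 30)] * 6 + [(45, 55)] * 14
samples = random_walk_seats(wards, n_districts=4, n_steps=5000)
for seats in sorted(set(samples)):
    share = samples.count(seats) / len(samples)
    print(f"{seats} seats for A: {share:.1%} of sampled maps")
```

The point of the sketch is only the shape of the procedure: wander from map to neighboring map, record the seat outcome each time, and then compare a proposed map against the resulting spread of outcomes.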

Starting point is 00:04:53 So it's a wonderful example of how, not just physics, but other areas of human inquiry bring up these very intricate, very interesting math problems. We're using politics here really as an excuse to talk about math. We're not talking about politics that much in this episode, but the math that you need to do politics well turns out to be fascinating in its own right. So with that, let's go. Jordan Ellenberg, welcome to the Mindscape podcast. Hi, thanks for having me on. So we're going to be talking about gerrymandering and the role of geometry

Starting point is 00:05:41 and math in gerrymandering. It's a slightly political topic, but you're a mathematician as a professional here. So why don't you just, for the sake of the listeners, say what you do for a living? Like, what kind of mathematician are you? So I'm primarily a number theorist, which means that I study these questions that mathematicians have been wrestling with, really since there was such a thing as formal mathematics, these questions of, if you write down an equation, does it have solutions in whole numbers? And if so, what are they? That is something we somehow still don't know that much about all these many millennia later, although we certainly know more than we did at the outset. But maybe more specifically, I'm what's called an arithmetic

Starting point is 00:06:21 geometer. So, especially in the last hundred years or so, we've come to see that these questions, which seem to be purely numerical or algebraic, actually have like a very, very serious geometric understructure. And that's very much the way that the subject is seen now in 2021. Well, that was one of the interesting things that I learned from talking to Emily Riehl, who we just had on a few weeks ago, and we haven't had that many mathematicians on. Yeah, we haven't had too much math, but now there's like a surfeit of mathematics. I don't know what's going on here, but she comes, I guess, from the opposite side. She's a topologist, and she keeps finding, as topologists do, these algebraic structures in the study of how you can deform things

Starting point is 00:07:03 from one to another. Yeah, that's so interesting because she has so fully manifested herself as a category theorist that I think of her as that, and I actually didn't know that she started in topology, although that's certainly the origin of that subject. Yeah, no, that's exactly right. So despite all this and despite the fact that, you know, math is one of the most pure, abstract, philosophically elevated things we can think about, I don't think that you testified in front of the Supreme Court, but you at least signed a brief that went in front of the Supreme Court, right?

Starting point is 00:07:35 Exactly, right. And so sort of when the Supreme Court is considering a big case, especially a case with a lot of technical content of various kinds, they get briefs from all kinds of specialized groups. So typically, and in this gerrymandering case, this was the case, there's briefs from historians, briefs from political scientists, briefs from lawyers, briefs from community groups with an interest in the case. I think this is, as far as I know, the first case in the history of the Supreme Court where there was a mathematicians' brief. And you're 0 for 1. The mathematicians did not get the answer that they were hoping for, right? So we'll have to wait another 200 years for the next case this mathematical to come before the court. Okay.

Starting point is 00:08:15 So, we might have listeners who are not familiar either with politics or the United States or lingo or anything like that. What do we mean when we use the word gerrymandering? Right. So the fundamental idea is this, that in the United States, in both the U.S. Congress and in state legislatures, we elect the people who make the laws from geographic districts. We cut the state up into a set of districts. In Wisconsin, there happen to be 99 of them.

Starting point is 00:08:44 It's sort of a weird number, I know, but it's in the Constitution that every Senate district has to be three assembly districts. So it has to be a multiple of three. That's sort of constitutionally mandated. And you know, you break the state up into these districts, and then each district elects a legislator by majority vote. What we have in some sense known for a long time, but which has really come to a kind of apex now, is that this process of choosing how you execute this division, how you divide the state up into districts, which seems kind of technical and boring, and I guess you just draw some lines and do it, turns out to have an absolutely huge effect on who sits in the legislature, or rather

Starting point is 00:09:26 it has a huge effect if you work hard to give it a huge effect. And that process is called gerrymandering, that process of partisan actors motivatedly drawing the lines so as to ensure that their party is advantaged in the election. And I think, you know, it's sort of best summed up by a popular description of the practice: the legislators are choosing their voters instead of the other way around. Yeah, I mean, and it's all based on the fact or it arises from the fact that for whatever reason, we choose to allocate our representatives geographically in some sense, right? The starting point, which we're not even going to question in this discussion,

Starting point is 00:10:07 is you divide the state up into regions of geography, and then within each region you vote for who you want to represent you. Right. And one thing I learned writing about this is that there's lots of places that don't do it this way. You know, in Hong Kong, there's a constituency like just for teachers. To run for that seat, you've got to be a teacher. There's a teacher's seat.

Starting point is 00:10:26 There are these so-called functional constituencies. You know, in Iran, there's a seat just for Jews. In Ireland, there's a seat just for graduates of the University of Dublin. There's lots of ways. You want to hear my craziest idea about this? Sure. I thought it would be, I mean, I thought it would be cool if, like, maybe we shouldn't really do this. But, like, imagine if there was a seat.

Starting point is 00:10:43 Imagine if you broke the age ranges up into bands of equal population. It was just like, okay, there's like a congressperson, there's a representative of people aged 40 through 51. You know, you could argue that those people across the state have a lot more in common with each other politically in terms of what their priorities are, what their needs are, than people at very different stages of life who happen to live 50 miles apart and thus in the same congressional district.

Starting point is 00:11:06 Well, it's interesting to think about the future because I can imagine that 200 years ago, geography was just the way that you could easily slice this pie. But these days, we can do better. Yeah, so I'm interested to see whether or not, I know that there are a lot of clever, maybe overly clever voting schemes on the market, but we'll see whether any of them actually catch on in real elections.

Starting point is 00:11:27 But one of the instant worries you get to, and we'll probably get to this in more detail, but when you have this scheme where you divide things up geographically and then just have a majority vote, is that so if you have a state with 99 representatives, let's say, and 60% of the state, as you very nicely put it, you know, there's the orange party and the purple party. So you're not associating these with any real parties here in the United States. So 60% are orange voters, 40% are purple voters. And in this weird idealized situation where every district is exactly 60% orange voters and 40% purple voters, all of the representatives would be orange, even though 40% of the voters are purple. So there's sort of a built-in amplification of tiny edges in the voting, right?
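The orange and purple thought experiment above can be checked with a few lines of Python. This is a minimal sketch, assuming a hypothetical state of 99 winner-take-all districts with 1,000 voters each, all split exactly 60/40; the numbers and the function name `seats_won` are illustrative, not taken from the episode or the book.

```python
def seats_won(district_votes):
    """Count seats per party under winner-take-all districts.

    district_votes is a list of (orange_votes, purple_votes) pairs, one per district.
    """
    orange_seats = sum(1 for o, p in district_votes if o > p)
    purple_seats = sum(1 for o, p in district_votes if p > o)
    return orange_seats, purple_seats

# Hypothetical state: 99 districts of 1,000 voters, every one split 60/40.
uniform_state = [(600, 400)] * 99

orange, purple = seats_won(uniform_state)
total_orange = sum(o for o, _ in uniform_state)
total_votes = sum(o + p for o, p in uniform_state)

print(f"Orange vote share: {total_orange / total_votes:.0%}")
print(f"Orange seat share: {orange / len(uniform_state):.0%} ({orange} of {len(uniform_state)} seats)")
```

With identical districts, a 60% vote share turns into every seat, which is the amplification being described.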

Starting point is 00:12:19 Yeah, and this is, you know, I sort of set this up as a thought experiment in the book, but, you know, it's also called Massachusetts. That's another name for this phenomenon, right? There are Republicans in Massachusetts, and not even that few. I think it's about a third to 35 percent of the state, so that's pretty significant. If I remember correctly, there hasn't been a Republican member of Congress from Massachusetts since 1996. Right. I mean, it's been nine Democrats for a long time.

Starting point is 00:12:45 And that's actually not because of gerrymandering. That's because the very nature of geographic districting, as you say, in a state that's two-thirds Democratic, there's not going to be very many Republican neighborhoods, certainly not like entire Republican patches of the state big enough to be a congressional district. Right. So I guess what I wanted to get on the table there was just that even, like you say, even forgetting about gerrymandering, this whole idea of just having geographic districts and voting in them can lead to big distortions if you compare it to just proportional representation from inside the state. Right. But this is a, this is one of the reasons this question is so interesting, is that

Starting point is 00:13:25 you really find yourself saying, what does it mean to be representative? Right. So one, I mean, it's very natural to think that the right notion of representativeness is, as you say, what's called proportional representation, which means that the proportion of legislators from each party is roughly the same as the proportion of voters for that party. Maybe not on the nose, but that a deviation from that is understood to be some kind of failure of representativeness. That's a system a lot of countries have.

Starting point is 00:13:57 It is definitely not the American system, and it never has been. So, you know, look, I always say, like, people write about this in terms of Democrats and Republicans, but, like, you know, sympathy for the libertarians, man. Like, year in, year out, about 1% of American voters vote for a libertarian congressional candidate. Now, there's 435 members of the House. If you believe in proportional representation, you'd think there ought to be a few libertarians sprinkled in there. And how many are there? Zero, because there's no such thing as a

Starting point is 00:14:25 libertarian congressional district or even a libertarian neighborhood. It's kind of fun to imagine what it would look like, isn't it? But as far as I know, it doesn't exist. Well, yeah, because of this effect that our voting system or our system of apportioning representatives turns up the contrast knob, right? So a slight advantage in the electorate gets you a huge advantage in the chamber. And if you're down there, 1%, you'll be normalized down to zero very, very quickly. But one thing, I mean, if we're sort of imagining this is sort of some kind of like system with noise in it. If you're in a state like Wisconsin where I live, now you're in California, which is a very different story.

Starting point is 00:15:02 But in Wisconsin, where the proportion of voters who vote Democrat or vote Republican is pretty close to 50-50. But because of that, you know, the political conditions are quite different when the mood of the state is more Democratic and the mood of the state is more Republican. So under what you might call a normal system, even with single-member geographic districts, there would be swings. There would be years when Democrats held the majority and years where Republicans held the majority. So at any given time, that majority might be pretty big in favor of one party or the other,

Starting point is 00:15:39 but it would switch back and forth. And in some sense, you know, there is some political justification for that. There's some political justification for saying, like, you know, it's good for the legislature some of the time to sort of have like a healthy majority of one party so that things can get done. Yeah. And then if the voters don't like those things that are done, then it switches, right? And then... Exactly.

Starting point is 00:16:02 I mean, you're bringing up the fact that we don't agree ahead of time on what we want a fair outcome to be in any of these situations, right? There's competing interests that we have to get to. But so just to be super clear, I think that you said it already in some sense. But how gerrymandering works is sort of to advantage one party over the other, taking advantage of the fact that the representation within each district need not be reflective of the state as a whole? Exactly.

Starting point is 00:16:32 And if you're the person drawing the maps, and in many states, Wisconsin being one of them, the people who draw the maps are the legislators, the very people whose electoral fortunes are at issue. So it's like hard to imagine a worse feedback loop from the point of view of, you know, conflict of interest. Right. You know, to put it, and you can draw lots of charts and pictures and graphs, but the fundamental way it works is pretty simple. The party that you don't favor, you try to cram all of their voters into just a few districts. You try to create districts which are absolutely dominated by the opposing party, because by doing so, you're sort of using up their voters.

Starting point is 00:17:12 Yeah. And it's like wasting their votes on a few districts that they'll win by a lot, leaving the rest of the state composed of more districts where you win by a more modest amount. And that's exactly what we see in Wisconsin. And just to be clear, this happens everywhere, well, at least in the United States. Like this is this procedure of drawing weirdly shaped districts, and you can show pictures and sort of laugh at how weird they are. This is really, really common, as far as I can tell.
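The packing strategy just described can be seen in a small worked example. The following Python sketch assumes a hypothetical state of 10 districts with 100 voters each, split exactly 50/50 statewide between an orange and a purple party; the two maps and the helper `seats` are made up for illustration.

```python
def seats(plan):
    """Return (orange_seats, purple_seats) for a list of (orange, purple) district vote pairs."""
    orange = sum(1 for o, p in plan if o > p)
    purple = sum(1 for o, p in plan if p > o)
    return orange, purple

# Map drawn to favor purple: orange voters are crammed into two lopsided
# districts, and purple wins the other eight by a modest margin.
packed = [(90, 10)] * 2 + [(40, 60)] * 8

# A more even-handed map built from the very same statewide vote totals.
balanced = [(55, 45)] * 5 + [(45, 55)] * 5

for name, plan in [("packed", packed), ("balanced", balanced)]:
    orange_votes = sum(o for o, _ in plan)
    purple_votes = sum(p for _, p in plan)
    print(name, "votes:", (orange_votes, purple_votes), "seats:", seats(plan))
```

Both maps contain 500 orange and 500 purple voters. Only the lines differ, and the packed map hands purple eight of the ten seats.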

Starting point is 00:17:40 Well, a couple of things I want to say about that. One, it doesn't happen everywhere because not every state has the procedure where the legislators themselves are the ones drawing the map to determine their own fate. So there are states, like Iowa is one, where forever they've had sort of some panel of retired judges draw the maps. And nobody's going to say they're perfect, but they're not people whose like literal job is to be political operatives for a political party. And by the way, when I say that the legislators draw the maps, that's not even quite correct. It's like literally people who work for the state party. Yeah. Like not even elected officials,

Starting point is 00:18:15 but just people whose entire job is not to represent the public, but to represent the interest of their party. That's who's actually sitting in the room, drawing the maps. So, you know, one thing I like about this as a sort of place where there's a possibility for reform is the system we have is like literally almost the worst possible thing we can have. So any change from it, like, you know, is the system of just asking some retired judges to draw maps perfect? No, but it's way better than what we're doing in a lot of states.

Starting point is 00:18:43 So this then raises the question of how, and now the math begins to come in. One question, I guess the first question we might address to ourselves is, is there a way of characterizing how unfair a particular set of maps is? And so what do people say about that? Yeah, and actually that's a great question

Starting point is 00:19:02 because what it turns out, if you try to say what's fair, like what's the perfect map, what's the map that's truly representative? I don't think that's a question that has a math answer. Actually, I don't think it has a philosophical or a legal answer either. And then you might say, well, so what the hell are you going to do? Well, I think the question of what's fair is very hard,

Starting point is 00:19:22 but I think the question of what's unfair is much easier. And that's a crucial distinction. We're not looking for perfection here. I think eliminating gerrymandering, saying we're going to turn the gerrymander knob to zero, that's impossible. That's an unrealistic hope. And it's not even clear it makes sense, right? It's like saying, I'm going to eliminate all trace of bias from my thinking and be like perfectly rational. No, but you can look for the worst offenses, right? Yep. Again, which we can see right here on the ground in Wisconsin, and there's a few other states. But one thing I want to say that's really important is that a very common and historically grounded myth about gerrymandering is the thing that you said, that it has to do

Starting point is 00:20:01 with funny-shaped districts. In fact, it's right there in the name: the word gerrymander comes from this famous political cartoon about a map drawn by Elbridge Gerry, whose name I am told was actually pronounced Gary, but that ship has sailed. Everybody calls it gerrymandering. So sorry to the descendants of Elbridge Gerry, but we're calling him Jerry. You know, this kind of crazy districting of Massachusetts that he engineered as governor, which they said looked like a salamander. And so it was the Gerry-mander. This was the origin of the name. And, you know, now, too, there's like, there was a famously crazy looking district in Pennsylvania that was warmly called Goofy Kicking Donald Duck because of the way that it looks. But here's the thing. Nowadays, when these maps are not being drawn by some

Starting point is 00:20:47 kind of political savant, like sitting with a giant map rolled out on a big table, but are being done by machine, if you say, I want a map that advantages my party by a huge amount, and it looks fine, the districts are shaped like, you know, roughly rectangular and nice, you can do that. And in Wisconsin, we have exactly that. If you looked at the map we had before 2011, which was drawn by a court, and the map we had after 2011, which was a highly gerrymandered map drawn by political operatives, you wouldn't be able to say one of these has funnier-looking districts than the other. They look about the same.

Starting point is 00:21:22 So if that was ever a good measure of gerrymandering, it no longer is. We can't detect it with the naked eye. We have to do something more sophisticated.

Starting point is 00:22:24 That's very interesting, actually, because I hadn't quite appreciated that. So you're saying that, you know, the use of computers our modern data-driven techniques allow us to gerrymander in a way that we don't know it when we see it.

Starting point is 00:22:44 We have to be a little bit more sophisticated about what it means for the map to be tilting in one direction or another. I mean, you don't know it when you see it when looking at the map. You may know it when you see it when you're like, why do I live in like a swing state that like has lots and lots of members of each party, but year in and year out, the legislature has like a huge, near-supermajority of one party? Like there you can see it, right? You can see the effect.

Starting point is 00:23:07 But you can't see it on the map. So there was one effort to characterize the unfairness of maps that you talk about before sort of discarding it as not the right way to do it, which was, I guess, the efficiency gap: how many people's votes are wasted? And that does have a superficial plausibility to me. Yeah, and I don't, discarding is too strong. I think there's been a sort of, as in any area of scientific progress, there's going to be an iterative thing where we sort of develop like better and better measures. You think of a measure and then you're like, well, here's some of the,

Starting point is 00:23:39 the problems with it. In some sense, here you see the differences between scientists and politicians, because politically, you should probably sort of choose one thing and be like, this is it. If you argue against this, you're wrong. But as scientists, you know, we're not inclined to do that. We're always inclined to look at whatever it is we're using and be like, could this tool be made better? What are the issues with it?

Starting point is 00:23:58 What are the edge cases where the tool doesn't work so well? So I think it's a bit, it can be a bit frustrating to the lawyers who are like, I thought we were arguing on this. Why are we arguing on that measure? I very well understand. But again, because my framework is not, let's make it perfect, but let's root out the absolute most grievous gerrymanders, the good news is that we have a bunch of different measures,

Starting point is 00:24:21 and on a map like Wisconsin, they all agree. Right. It's not hard to see that something like very dirty is going on. And literally any reasonable measure you can think of will light the big red blinking light. But let's look at that. Why don't you explain what the efficiency gap is? Because I think it's at least a good example of sort of an attempt to make the voting fair.

Starting point is 00:24:45 Yes, the efficiency gap, what it measures is it says it relies on this notion of a wasted vote, where a wasted vote is one of two kinds. It's either a vote for the party that loses in a given district, so it's a vote that somehow doesn't go towards getting you a seat, or it's a vote for the winning candidate above that 50% threshold that you need. So it's any votes above or beyond 50%. And that would capture something very, very true

Starting point is 00:25:16 about the way gerrymandering works, which is that if you're trying to be efficient with allocating your votes, like if you could just draw the districts completely not geographically and just choose the voters, you'd be like, especially if you like knew how they voted, you would say, after the fact,

Starting point is 00:25:31 you would say like, oh, okay, I'm going to make a bunch of districts where 50.001% of the voters are mine, 49.999% of the voters are yours, and I'm going to win them all. And if the political nature of the state means that actually there must be a lot of your voters somewhere, I'm going to put them all in one district. I'll put them all in one district, and then I win the rest. That would be very efficient, where every time you win, you win by only a little, you're kind of getting a lot of seats with very few extra voters at the margin. And so I think that that notion of efficiency gap

Starting point is 00:26:04 is very simple to compute, and it captures something very real about how gerrymandering works. And you can very clearly see empirically that it's a lot bigger in states where the legislature is drawing the maps for their own benefit than it is in states where there's some other system in place, whether a citizens' commission or a judges' commission or whether it was drawn by courts after a battle or what have you. And then one of the arguments against it, though, if I'm remembering correctly, is that if you did have this ideal situation where every part of the state had exactly the same representation, 60% orange, 40% purple, that would not, even if you just did everything completely evenly, that would not get a good score on the efficiency gap measure,
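Since the efficiency gap is described here as simple to compute, a short Python sketch may help. It follows the definition given in the conversation: a wasted vote is either a vote for the district's loser or a winner's vote beyond half of the district total. The sign convention (positive means orange wastes fewer votes), the choice to reject tied districts, and the three hypothetical 10-district maps are assumptions made for this illustration.

```python
def wasted_votes(orange, purple):
    """Return (orange_wasted, purple_wasted) for one district."""
    if orange == purple:
        raise ValueError("tied districts are not handled in this sketch")
    threshold = (orange + purple) / 2  # the 50% mark for this district
    if orange > purple:
        return orange - threshold, purple  # winner's surplus, all of loser's votes
    return orange, purple - threshold

def efficiency_gap(plan):
    """(purple wasted - orange wasted) / total votes; positive favors orange."""
    wasted_orange = wasted_purple = 0.0
    total_votes = 0
    for orange, purple in plan:
        w_orange, w_purple = wasted_votes(orange, purple)
        wasted_orange += w_orange
        wasted_purple += w_purple
        total_votes += orange + purple
    return (wasted_purple - wasted_orange) / total_votes

# Hypothetical 10-district maps, 100 voters per district.
packed = [(90, 10)] * 2 + [(40, 60)] * 8    # orange packed into two districts
balanced = [(55, 45)] * 5 + [(45, 55)] * 5  # same 50/50 statewide vote
uniform = [(60, 40)] * 10                   # every district split 60/40

for name, plan in [("packed", packed), ("balanced", balanced), ("uniform", uniform)]:
    print(f"{name}: efficiency gap = {efficiency_gap(plan):+.0%}")
```

The uniform 60/40 map, where nobody manipulated anything, scores as far from zero as the deliberately packed map, which is the objection being raised at this point in the conversation.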

Starting point is 00:26:50 but there was no gerrymandering going on. Yes, exactly. I think the one, the flaw of that measure is that in the end, it does kind of seem to posit that there's some right answer to how many legislative seats there should be given a certain popular vote. It's not proportional representation, but in some sense, it's in the same genre as proportional representation. And I think, and, you know, by the way, this is something that there is definitely argument about, so I don't want to sort of say that my view on this is the only one. But it is the view that, you know, plaintiffs went forward with in the Supreme Court case of Rucho v. Common Cause, is that, you know, the opposite of gerrymandering is not hewing to some standard that you think of in advance about what the number of seats should be. The opposite of gerrymandering is not

Starting point is 00:27:38 gerrymandering. The opposite of gerrymandering is what would have happened if those maps had been drawn by a neutral arbiter of some kind. Good. So now the math is going to start coming fast and furious because your question now has shifted to the realm of the ensemble of all possible maps we could imagine drawing of congressional districts and asking questions about that. And this is what gets mathematicians very excited. Right. I mean, this is where there starts to be some like really interesting math and where I think a lot of mathematicians have gotten involved. You know, I can name a lot of names: Moon Duchin at Tufts, Justin Solomon at MIT, Jonathan Mattingly and Greg Herschlag at Duke, Wes Pegden at Carnegie Mellon. There's lots of people

Starting point is 00:28:24 whose day job is like proving theorems who have gotten interested in this because, you know, wouldn't it be great if we were all so civic-minded that we like worked on problems just because they're important? But no, I mean, there's lots of important things in the world. We work on them when they're important and mathematically interesting. I mean, so that this speaks to like our special skills, which to be honest are not usually called for in most political questions. So exactly right. What you'd like to say is, well, what if the map had just been drawn by somebody who didn't care which party came out ahead. Let's say a map plucked at random from all the possibilities that there are. That would certainly be like an

Starting point is 00:29:06 unbiased, neutral, nonpartisan way to do it. That sounds very appealing, but then you have, like, a big problem, which is that the set of all possible districtings of the state is, like, absolutely unfeasibly, uncomputably huge. I mean, I couldn't even tell you, like, how many there are. The ways to cut Wisconsin up into 99 pieces. Right. That's a lot. Well, you're slightly aided by the fact that there are these voting wards, thousands of them, but still, they're discrete units. So at least it's not literally an infinite number, right? But the question is, how would you divide up the voting wards into a set of congressional districts? Right. I think there's about 7,000 wards in the state of Wisconsin. So if, right, so if you think about how many ways there are to sort of

Starting point is 00:29:54 partition 7,000 things into 99 pieces, that is a very, very, very big number. Even if you start to impose requirements that each piece should be connected (in Wisconsin, constitutionally, you're not allowed to have districts that break up into several pieces, although in some other states that is allowed), even so, you simply can't write them all down on a sheet of paper or store them all on a hard drive and then pick one at random. You have to do something, you have to find some proxy for that. And the way we typically do that is by this mechanism of what's called a random walk.
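To get a rough feel for "very, very, very big," here is a minimal Python sketch. It rests on a loud simplifying assumption: it counts every way of labeling 7,000 wards with one of 99 district numbers, ignoring contiguity and population balance, so it is only a crude upper bound on the number of legal maps. The function name `digits_in_labeling_count` is made up for this illustration.

```python
import math


def digits_in_labeling_count(num_wards: int, num_districts: int) -> int:
    """Decimal digits in num_districts ** num_wards.

    This is the number of ways to stamp each ward with a district label,
    with no contiguity or equal-population rule (an assumption made only
    to keep the arithmetic simple).
    """
    # Work with logarithms so the huge integer is never converted to text.
    return math.floor(num_wards * math.log10(num_districts)) + 1


digits = digits_in_labeling_count(7000, 99)
print(f"99 ** 7000 has {digits} decimal digits")
```

Under these assumptions the count has roughly 14,000 digits, which is why nobody lists the maps and picks one at random.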

Starting point is 00:30:28 Right. I mean, it's something that people in physics do all the time, right? So a Monte Carlo simulation is something that happens all the time in physics. Yeah, no, we... You probably know better than me, what's a context where that would be? Random walks are surprisingly important for many reasons. But I just want to, before we get to there, I want to home in on one question that I had. So, I mean, I can imagine the math problem of given a state with a certain shape,

Starting point is 00:30:54 divide it into geographically pretty regions with approximately the same number of people in them. That sounds both easier and very different than imagining all the ways to partition the state into geographic regions and find a typical one. I mean, maybe the typical one looks kind of like weird and fractal. Is there a reason why we don't just pick the geographically pretty ones? Yeah.

Starting point is 00:31:20 Minimum area curves or something? And by the way, I want to say that's not a hypothesis, it's good intuition. If you actually could choose a uniformly random one, the boundaries probably would look really bad and fractal. You can sort of see that in experiments on much smaller cases. Mike Francis and Lorenzo Knight knows a lot about this, actually. So, so, right, you know, one thing I like about this problem is that it is a math problem mixed with a political problem and a legal problem, and you can't really separate the strands from each other.

Starting point is 00:31:52 So there is, like, a history of work that treats it as just a mathematical problem. In other words, like, let's mathematically sort of define what it means for districts to be nice and try to figure out, like, what would be the nicest division of a state into equipopulous regions. And you get, you know, you get sort of nice polygons, you know, with, like, straight line boundaries. And then you, like, show this to, like, a politician or a lawyer, and they will, like, laugh until they pee their pants. Right. Because this has no bearing on anything that's politically feasible or anything that respects actual political divisions in the state. And actually, you know, I mean, these are people who live in real communities. And actually, you know, we have in Wisconsin, we have a people's districting commission, which has no legal powers, but the governor made one anyway, just to sort of say, like, this is what it would look like if we drew that. And, you know, I sat in on some of their hearings and it's pretty cool. Like just, you know, they hold a hearing and, like, people call in and just like regular people from these places in Wisconsin. It's like, you know, my community is like split down the line between two assembly districts. Why?

Starting point is 00:32:52 It makes sense for us. You know, we have these specific needs having to do with like this particular lake and its health. And like, why are we not, why do we not have like one assembly member who like cares about this? I mean, so those kinds of considerations if you're actually doing it, you know, I want to say, I think the system we have is bad. But if you want to know why there would even be such a system in the first place, it's because in principle, a representative is actually supposed to be listening to the people that they purportedly represent, working for those people, like, not working for a political party they happen to be a member of.

Starting point is 00:33:23 So there is like a reason this is thought of as part of the political process. So if you talk to political science people or elected officials, they'll say there's no way we're going to let a computer draw the map. Nobody wants that. Voters don't want it. Politicians don't want it. Judges don't want it. That's a non-starter. And that's what happens if you try to treat it like a math problem without treating it also like a political and legal problem. On the other hand, having now sort of crapped on my own tribe, if you try to treat it as just a political or legal problem and not a math or geometry or quantitative problem, your results are going to be just as bad. And traditionally, that is what we have done. Yeah, that's fair enough.

Starting point is 00:34:03 But the nice thing about the ensemble approach, right, is that these other considerations, like you're not picking a completely random partition of the state. You want things to be geographically contiguous, for example, but also, you know, you want to give representation to minority groups and, you know, maybe have some sensibleness to the way that people are divided up. And in principle, that can be included in how you define your ensemble. Is that right? Exactly. So, you know, the ensemble, that's just what we call, like, a set of things when the set is kind of big, to, like, dress it up and make it sound fancy. You just have to say the same thing, but in French.

Starting point is 00:34:46 Yeah. And then that sounds like bigger and more technical. So yeah, you have this big ensemble of randomly generated maps. And it's certainly not all possible maps, but it's a set of maps which for reasons we can talk about if we want are seen to be fairly representative of the set of all possible maps. And we can build in certain biases that we want. We can build in, hey, I want the districts to be like roughly

Starting point is 00:35:12 if you want that. You can build in, hey, I want the districts not to cut the county lines if they don't have to, which in Wisconsin is a constitutional ask. I mean, different states have different criteria in what the districts should look like. Basically, you build in everything the districts are supposed to be like, except you don't build in, oh, and by the way, I want my party to have like five more seats than would be fair. And then you look at your, like, whatever, your 20,000 maps or 100,000, it's easy to make lots and lots.

Starting point is 00:35:39 And you see, you look at actual election results and say, like, okay, if the wards were arranged into districts in this way, what would the outcome have been? How many seats would each party have in the legislature? And invariably, you just get a very nice, roughly Gaussian curve, a bell curve, just like you always get when you do any kind of experiment like this, right? You sort of see that there's very typical outcomes and then there's very atypical outcomes. And in states like Wisconsin or North Carolina or Maryland where they have heavily gerrymandered maps, the outcomes from the actual maps are way off in the tail of the distribution. They're very visible outliers. And so it gives you a really nice, clean signal of this is not something that happened by chance.

Starting point is 00:36:26 Much as the people who drew the map may want to tell you that. No, this is an absolutely, like, grievous thumb on the scale in favor of one party and against the other. I think this is very, very important because probably if people hear that we're talking about gerrymandering and math and geometry, they might leap to the idea that you're using geometry to draw the map. But that's not what you're doing. You're suggesting that, yeah, no, human beings will draw the map. We're going to use math to judge how fair, how typical or how non-distorted that map actually is in terms of comparing its actual results to what typical results would be, is that right? Exactly. Math is, the geometry is the referee here, not the player.
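The "referee" comparison can be sketched in a few lines of Python. Everything numeric here is hypothetical toy data invented for illustration (a 100-map ensemble and two made-up enacted-map outcomes), not results from any real state; the function name `share_at_least` is likewise an assumption of this sketch.

```python
from collections import Counter


def share_at_least(ensemble_seats, enacted_seats):
    """Fraction of ensemble maps that give the party at least as many
    seats as the enacted map does."""
    hits = sum(1 for seats in ensemble_seats if seats >= enacted_seats)
    return hits / len(ensemble_seats)


# Hypothetical seat counts for one party across 100 neutrally drawn maps.
ensemble = [53] * 5 + [54] * 20 + [55] * 50 + [56] * 20 + [57] * 5

typical_seats, how_many = Counter(ensemble).most_common(1)[0]
print("most common outcome:", typical_seats, "seats in", how_many, "maps")

# A map whose outcome sits in the middle of the ensemble looks unremarkable.
print("share of ensemble >= 55 seats:", share_at_least(ensemble, 55))

# A map whose outcome no ensemble map reaches is a visible outlier.
print("share of ensemble >= 64 seats:", share_at_least(ensemble, 64))
```

The math only scores a map that people drew; it does not draw one.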

Starting point is 00:37:11 Good. And so we need to construct, or at least deal with, this idea of an ensemble of maps to figure out what would be fair in the world of all the maps. And by the way, is it, is this one of the, I hypothesize that this is one of the gaps between professional mathematicians and civilians, that when a regular person talks about, you know, a random number, if you say you have a random number between zero and one. Instantly in their mind, in my mind, a number appears. 0.71835, whereas I think mathematicians actually just instantly go to the ensemble or to the probability distribution or something like that.

Starting point is 00:37:49 Like converting the phrase random element into really what you mean is an ensemble or sort of a random variable is really a whole ensemble. Is that a helpful step into thinking more like a mathematician? Yeah. I think, you know, the word random is of course like one of those heavily overloaded words, you know, to like my kids. It means a song I don't like, you know, to, uh, to, uh, and often, you know, to a mathematician random, something that's completely deterministic could be random.

Starting point is 00:38:19 It's just random and all the probability is concentrated at one point, right? So it literally just means something, uh, it just means the outcome. How would I even describe it in words, Sean? This is hard, actually. Um, maybe it means something in which chance could play some kind of a role. But it doesn't mean, for instance, like, you know, if you flip a coin and it's weighted

Starting point is 00:38:42 so that two-thirds of the times it comes heads and one-third of the time it comes tails, a civilian, as you put it, would probably say, well, that's not random. It's biased in favor of heads. A mathematician would say, oh, of course it's random. It just has a different probability distribution. So we would say sort of like,

Starting point is 00:38:57 it doesn't have to be even anything that is subject to chance in whatever way we would consider a random variable. That's right. Okay. And so, but the particular issue that we're facing here, as you already made very clear, is if you're trying to compare the actual map to an expectation over all the possible maps you could reasonably draw, that ensemble of maps you could reasonably draw is far too big. So rather than, and sorry, I guess maybe something to get right here is there's an NP-hard problem sneaking in here. We did have Scott Aaronson on the podcast a while back and we talked a little bit about NP-hard problems. But remind us what those are and what the specific one is. Yeah.

Starting point is 00:39:38 So if you were to say, I want a way of provably drawing, and now I'll just, like, use a term of art, a uniformly random map, which means that every possible map is equally likely. Yeah. You have to formalize it a little bit more than that to sort of get it into, like, literal things you can prove theorems about in complexity theory.

Starting point is 00:40:01 But essentially, that problem is known to be NP-hard. What that means is, you know, complexity theory is this very funny subject where there's this whole class of problems. And the kind of theorem people prove is like, problem A is at least as hard as problem B. And problem B is at least as hard as problem C. And problem C is at least as hard as problem F, et cetera, which in turn is at least as hard as problem A. Okay. But proving that any individual problem actually is hard and doesn't have a fast algorithm to do it is like completely out of reach. So the whole edifice could come crashing down tomorrow, right?

Starting point is 00:40:33 Somebody could be like, I found a fast algorithm for, like, one of these problems, and now they're all easy. Okay, nobody thinks that's actually going to happen, but it's a very funny subject in that way. So NP-hard kind of refers to this class of problems that are, so to speak, the hardest problems. And we think the case is that they're all hard rather than that they're all easy. Yeah. But we don't actually know.

Starting point is 00:42:00 And so, but we can approximate them often. So an NP-hard problem takes a lot of steps to solve, but then there's a somewhat independent question, which is how efficient could an approximate solution be? Is there a general theory of that? I don't think so, and I've been to a lot of seminars about it. And I mean, this is definitely, and this is a little bit outside my field. I'm not a complexity theorist, but like I said, there's so many interesting talks in like the sort of machine learning and optimization group at Wisconsin, which is very closely tied to the math department. It does seem like the way the

Starting point is 00:42:36 subject is going is to say, like, I'll give you an example, traveling salesman problem, very famous NP-hard problem. You want to find the fastest route. You have like 100 cities or something and you want to sort of draw a path through all of them and you want to find the fastest one. Now, that is an NP-hard problem to do that provably and find the very fastest one. And you can say, like, okay, great, we proved a theorem. Like, it's really hard. Meanwhile, while you prove that, people are just doing it. Right? Because, you know, the engineers while you do that, they're like, yeah, but I'm content to write an algorithm which with probability 99.99% finds a solution that's at least 99.99% as good as the optimal one. So while you're over there proving your theorem, I'm like, I'm just making the back. Half the people used Google Maps to get to the conference where you show that you cannot solve this problem, right?
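To make the "people are just doing it" point concrete, here is a minimal Python sketch comparing an exact brute-force search with a fast heuristic on a tiny instance. The heuristic is the simple nearest-neighbor rule, swapped in for illustration; it carries no 99.99% guarantee of the kind mentioned above. The eight city coordinates are random toy data, and all function names are assumptions of this sketch.

```python
import itertools
import math
import random


def tour_length(points, order):
    """Length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def nearest_neighbor_tour(points, start=0):
    """Greedy heuristic: always travel to the closest unvisited city."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        here = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(here, points[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def brute_force_tour(points):
    """Exact answer by trying every ordering; only viable for tiny inputs."""
    rest = range(1, len(points))
    best = min(
        itertools.permutations(rest),
        key=lambda perm: tour_length(points, (0,) + perm),
    )
    return [0, *best]


rng = random.Random(0)
cities = [(rng.random(), rng.random()) for _ in range(8)]  # toy data

heuristic = nearest_neighbor_tour(cities)
optimal = brute_force_tour(cities)
print("heuristic length:", round(tour_length(cities, heuristic), 3))
print("optimal length:  ", round(tour_length(cities, optimal), 3))
```

With 8 cities the exact search checks 5,040 orderings; with 100 cities that approach is hopeless, while the greedy rule still finishes almost instantly, at the cost of a possibly longer tour.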

Starting point is 00:43:26 Although people don't usually plan a hundred conferences in a row, their entire itinerary to the entire thing. I actually have no idea how Google Maps performs if you give it like 100 destinations and ask it to chart a path through them all. But so I do think the way that field is moving is towards like, you know, what problems can we solve approximately? And ideally provably with high probability, but maybe probably with high probability,

Starting point is 00:43:56 it starts to get a little philosophically fuzzy what it is that you're doing. And so we want to talk about random walks now? Is that the time? Well, I think that's exactly where we are. So just to sort of catch our breath because we should pat ourselves on the back for getting this far. So we have this problem of we want to compare a given map that some people in a smoke-filled back room came up with to what to expect in the ensemble of all the fair maps.

Starting point is 00:44:21 But the ensemble of all the fair maps is way too big. So what you're going to tell me is the way to sample from it, unbelievably in my mind, is to pick one and then start changing it in little random ways. Exactly. So what you do is you start with some given map, and it could be the map that we have right now. It could be this gerrymandered map and say, let's do something to it. But something pretty simple. Like the set of things, again, the set of possible things we can do to it is pretty big. So I'm going to be very restrictive about what I'm allowed to do. I'm going to say you're allowed to take two districts that are adjacent and move a ward on the boundary from one to the other. Okay, I'm going to be honest with you, the best possible algorithms are a little bit more complicated than this in an interesting way that makes them work better, but I can't do it in audio. Okay. You've got to look at the book and look at the picture. They should buy the book.

Starting point is 00:45:12 They should buy the book anyway. Yeah. And the other boards we come out with me. But the basic idea would look kind of like this. Rather than sample from every single way you could possibly change it, you restrict to just these very natural local changes where I take a ward that's on the boundary and shift it to the district on the other side of the boundary. And doing moves like that, and then you say, okay, I'm going to do like 10,000 of those moves, each time choosing a random, some ward on the boundary. Remember, the number of wards is not very big.

Starting point is 00:45:46 The number of wards is about 7,000. So choosing a random thing out of 7,000 is something you can definitely do. And it turns out that just doing a few moves like that actually can sort of get you around the space pretty well and can sort of get you a lot of very different-looking maps. You know, the analogy I always like to bring up is the number of ways a deck of cards can be ordered is pretty big. It's a number called 52 factorial. You are not going to be able to list them all and pick one at random.
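The boundary-ward move described above can be sketched in Python on a toy state. Everything concrete here is an assumption made for illustration: a 6-by-6 grid of equal-population "wards," three districts that start as vertical strips, a size band of 10 to 14 wards per district, and the function names. As the speaker notes, the algorithms used in practice are more sophisticated, and this sketch makes no claim to sample maps uniformly.

```python
import random
from collections import deque

ROWS, COLS = 6, 6  # toy grid of wards (hypothetical)


def neighbors(cell):
    """Yield wards sharing an edge with this ward."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < ROWS and 0 <= cc < COLS:
            yield (rr, cc)


def is_connected(cells):
    """Breadth-first search: do these wards form one connected piece?"""
    cells = set(cells)
    if not cells:
        return False
    start = next(iter(cells))
    seen, queue = {start}, deque([start])
    while queue:
        for nb in neighbors(queue.popleft()):
            if nb in cells and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(cells)


def flip_step(plan, rng, min_size, max_size):
    """Try to move one boundary ward into an adjacent district.

    Returns True if the plan changed. The move is rejected when it would
    break the donor district into pieces or push either district outside
    the allowed size band.
    """
    moves = [
        (cell, plan[nb])
        for cell in plan
        for nb in neighbors(cell)
        if plan[nb] != plan[cell]
    ]
    cell, new_district = rng.choice(moves)
    old_district = plan[cell]
    donor_rest = [c for c, d in plan.items() if d == old_district and c != cell]
    receiver_size = sum(1 for d in plan.values() if d == new_district) + 1
    if len(donor_rest) < min_size or receiver_size > max_size:
        return False
    if not is_connected(donor_rest):
        return False
    # The receiving district stays connected because the ward touches it.
    plan[cell] = new_district
    return True


rng = random.Random(1)
plan = {(r, c): c // 2 for r in range(ROWS) for c in range(COLS)}
accepted = sum(flip_step(plan, rng, 10, 14) for _ in range(2000))
sizes = [sum(1 for d in plan.values() if d == k) for k in range(3)]
print("accepted moves:", accepted, "district sizes:", sizes)
```

Each step looks only at the current boundary, yet after many steps the plan can look quite different from the strips it started as.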

Starting point is 00:46:18 So then you might say, well, I guess the problem of, like, putting a deck of cards in like a pretty random order must be an unfeasible problem. But it's not infeasible, right? I mean, you do it all the time. What do you do? You just shuffle the deck. We know that there's an algorithm for that. So how does that work? Because a shuffle is a particular kind of change that you're allowed to execute.

Starting point is 00:46:36 But unless you're some kind of sleight-of-hand genius, the shuffle that you do is not going to be exactly the same every time. There's some amount of randomness where there's sort of some small set of possible shuffles that you can execute, depending on exactly what order, like, when you flip the cards down. And it turns out, and this is something you can prove theorems about. So my undergraduate advisor, Persi Diaconis, together with Dave Bayer, has a very, very famous theorem called the seven shuffle theorem. It's one of those theorems that does just what it says on the can. It says that seven shuffles, which is a very small number,

Starting point is 00:47:11 is essentially enough to put a deck in extremely close to uniformly random order. So this is the magic of the random walk, that choosing from a very limited menu of possible changes and choosing one at random and just repeating that has a wonderful power to allow you to explore a space that you simply can't see from afar. You can't see it globally. It's like wandering around in a landscape where in order to have a map of it, you would have to sort of have a balloon and be like 10,000 feet in a moment.
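As a rough illustration of shuffling as a random walk, here is a Python sketch of one riffle shuffle under a standard mathematical model (often called the Gilbert-Shannon-Reeds model): cut the deck at a binomially distributed point, then drop cards from the two packets with probability proportional to packet size. Using this model here is an assumption of the sketch, and the helper `rising_sequences` is just one simple way to see how much order survives; the code does not prove or measure the seven shuffle theorem itself.

```python
import random


def riffle_shuffle(deck, rng):
    """One riffle shuffle under the Gilbert-Shannon-Reeds model."""
    # Cut point: number of heads in len(deck) fair coin flips.
    cut = sum(rng.random() < 0.5 for _ in deck)
    left, right = deck[:cut], deck[cut:]
    out, i, j = [], 0, 0
    while i < len(left) or j < len(right):
        left_remaining = len(left) - i
        right_remaining = len(right) - j
        # Drop from a packet with probability proportional to its size.
        if rng.random() * (left_remaining + right_remaining) < left_remaining:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out


def rising_sequences(deck):
    """Count maximal runs of consecutive card values that appear in order.

    Assumes the deck holds the values 0 .. len(deck) - 1. A sorted deck
    has 1 rising sequence; a single riffle of it can have at most 2.
    """
    position = {card: idx for idx, card in enumerate(deck)}
    breaks = sum(
        1 for v in range(len(deck) - 1) if position[v + 1] < position[v]
    )
    return 1 + breaks


rng = random.Random(2021)
deck = list(range(52))
for shuffle_number in range(1, 8):
    deck = riffle_shuffle(deck, rng)
    print(shuffle_number, "shuffles ->", rising_sequences(deck), "rising sequences")
```

Each shuffle draws from a small menu of possible rearrangements, and repeating it is what mixes the deck.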

Starting point is 00:47:45 Well, what if you don't have a balloon? All you can do is kind of like wander around at each moment sort of choosing whether to walk north, west, east or south. North, west, did I say them all? Yes. Okay, good. But if you do that, that's actually like not a bad way to explore a whole space. If you do that for a long time, you actually probably will get everywhere, given enough time. So I don't know if you know, but if you go to Vegas and play poker, many of the good

Starting point is 00:48:13 poker tables will have automatic shufflers, so the human being is not involved. But if they don't, the casinos do not let the dealers actually shuffle the cards in an ordinary way, because they are good enough to more or less put it in whatever order they want it to be. So the way that they shuffle is to just scatter the cards down on the table and swirl them around with their hands before gathering them back up, because that's at least a little bit less controllable. But I would think that doing that once would not be enough. Is that right? My intuition would be that you'd want to do that several times. Is that right?

Starting point is 00:48:46 Or is once sufficient? Well, if they're really scattered all over the place, right? And then it's not just like a single shuffle where you pick two and move them back in. And then you cut the cards, et cetera. So I think it's good enough for government work or for Vegas work. Yeah. Okay, good. So with that footnote on the table, so just to be super duper clear, we're doing a random walk, but we're not doing a random walk through the state, assigning things to different congressional districts. We're doing a random walk through the

Starting point is 00:49:16 space of all possible maps. And the nice thing is there's sort of an obvious local way to do that. You have a map and you pick one boundary on the map and you sort of mess with that, right? Right, except as I said, it turns out that there's like a better local way to do that. There's a little more specific. But it's something like that, yeah. Slightly more sophisticated way. Yeah. And this is a move that happens again and again in the history of geometry and it happens again

Starting point is 00:49:39 and again in the book, is that you think you're dealing with one geometry, but then to really solve the problem, you have to kind of go one level up. And it's kind of meta and think about the geometry of geometry. It's like the geometry of the space of all possible ways to cut the state up. So that is definitely a leap into abstraction. And in a book like this, which is not for professional mathematicians or Supreme Court justices

Starting point is 00:50:03 or professional gerrymanderers. You know, we kind of lead up to it. I mean, this is at the end of the book. But it is a move that we constantly make going from the sort of very concrete geometry of two and three dimensional objects that we can understand, kind of building our intuition there

Starting point is 00:50:19 and then lifting that into geometrizing entities that are much more abstract and yet somehow can be approached using the same techniques. Well, one obvious question is, I mean, convince me or at least assert very strongly that this kind of procedure will be a fair sample of the space of all the maps. I mean, isn't it possible there's just some maps we never get to by starting with one map and making little local deformations? So that is absolutely possible. In fact, we don't even know that the space of all districtings is what's called connected, which means it's possible. There's some other complete undiscovered country of ways to district the state that you never get to.

Starting point is 00:51:04 You can choose for yourself how much of a problem you think of that as being. For me, it's not, and I'll explain why. Because if you start with the map that the legislators drew and you mess with it, and basically every single thing you do to it rapidly dissipates the partisan advantage that it has, that's a pretty strong indication that that particular map was cooked. In other words, you're right that we may not provably know whether it's an outlier among all possible maps. What we do know is that it's an outlier in its own neighborhood.

Starting point is 00:51:34 Like, even things that are very much like it still are not as incredibly favorable to the party that drew the map as the map we actually use. So that to me is a very strong signal. Yeah, and I think that this makes sense to me as a physicist in sort of entropy terms in some sense, right? I mean, there are a lot more equilibrated configurations of a set of molecules

Starting point is 00:51:57 than there are low entropy out of equilibrium configurations. So even if you start with a weird configuration and mess with it a little bit, you'll push it much closer to an equilibrium one because there's just so many more of them, right? Right, I mean, right,

Starting point is 00:52:10 the headphone cord in your pocket, it's got to get tangled up because the set of untangled states is like pretty small. You know, I actually mentioned this example in my book and I'm like, man, even for like readers now who are under 30, do they know what a headphone cord is? This already. And then like, you know, 10 years from now, people will be like,

Starting point is 00:52:27 what is that, what does he mean? Like, what is the thing that you would happen to? Outdated. I'm not. Yeah. But okay. So, yeah, I am, I don't want to let this go by without taking advantage of some of the work you did for the book, digging into the history of the idea of the random walk, because it's fascinating. Like, all these history things are always very, very fascinating. So when you say, I start at a point in some space and I walk randomly in some direction, that doesn't sound so hard. But, you know, in fact, it crept up on us quite slowly. Like mosquitoes were a big player in the first random walks? Yeah, I knew almost nothing about the

Starting point is 00:53:04 history of this. So I had an incredibly fun time learning about it. And it's one of these things that is surprisingly common in the history of science where somehow the world is ready for a certain idea. So this idea of the random walk kind of comes up in a bunch of different contexts at once. And it first comes up, well, maybe not quite chronologically first, but as you say, Ronald Ross, who is the fellow who discovers that mosquitoes are driving the spread of malaria. He's the person who discovers the transmission mechanism of malaria. And so he becomes very interested in like, well, what's the right mitigation? Like, you can't kill all the mosquitoes, right? That's another good example of a problem that's infeasible because there's too many, right?

Starting point is 00:53:48 But you can kind of clear some small area of mosquitoes. And then what you want to know is like, well, how long does the area stay mosquito-free? Like, how good of an intervention is that? Because, of course, the mosquitoes are going to get back in from elsewhere after you cleared the area. So to understand that, you want to know if a mosquito starts in sort of some stagnant pond where it's born. Like, how far is that mosquito typically going to migrate in a certain amount of time? And because mosquitoes are not really very goal-directed beings, that's a random walk problem because each day the mosquito is going to do something.
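The mosquito question ("how far does it typically get in a given time?") can be sketched as a simulation in Python. The model is a deliberate simplification and an assumption of this sketch: each day the mosquito moves exactly one unit in a uniformly random direction, independent of previous days. For that model the average squared distance from the start after n days is n, so typical distance grows like the square root of elapsed time. The function name `displacement_after` is made up for illustration.

```python
import math
import random


def displacement_after(days, rng):
    """Distance from the start after `days` unit steps in random directions."""
    x = y = 0.0
    for _ in range(days):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += math.cos(theta)
        y += math.sin(theta)
    return math.hypot(x, y)


rng = random.Random(0)
days = 100
trials = 2000

# Average the squared distance over many simulated mosquitoes.
mean_sq = sum(displacement_after(days, rng) ** 2 for _ in range(trials)) / trials
print("days:", days)
print("mean squared distance:", round(mean_sq, 1))
print("root mean squared distance:", round(math.sqrt(mean_sq), 2))
```

In this toy model, quadrupling the number of days only doubles the typical distance, which bears on how long a cleared area might stay mosquito-free.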

Starting point is 00:54:20 But as far as we know, the mosquito doesn't have particular aims. Like it just kind of flits where it will, like day by day. And so its progress is sort of a random walk through the landscape. And, you know, Ross, I mean, he's a fascinating character who I literally had never heard of before I started writing this book. I mean, he's a very famous figure in the history of medicine. He actually kind of secretly really wanted to be a mathematician and also a poet and was kind of like, you know, bitter about his life in medicine and sort of like felt like it wasn't really like that great of a thing to do,

Starting point is 00:54:49 even though he was the sort of super famous Nobel Prize winning doctor. So he would like write poems about how unappreciated he was. Anyway, it's a whole thing. It's a whole thing. But he recognizes that this mathematical problem is of critical import to understanding the spread of disease. And actually, he comes to the St. Louis Exposition of 1904, which I think among physicists is famous because it's where Poincaré gives this sort of famous speech about like maybe we're going to have to change physics, maybe things look different near the speed of light

Starting point is 00:55:18 and things are going to have to be like modified so that the laws we know are only approximations. I mean, this is, uh, um, but he's also there. Um, I think he sort of talks about how, like, well, they expected me to give a speech about medicine, but what's really important is the math, so I gave a talk about math and no one understood. Like, he was that kind of guy. Like, he was sort of a difficult character. So he asks Karl Pearson, the statistician Karl Pearson, if he knows anything about it. Because he recognizes he can't really figure out analytically how to solve this problem. And then Pearson writes a letter in the letters column of Nature.

Starting point is 00:55:55 And Pearson coins the term random walk. He tells Ross, like, I'll put it in Nature, but I can't mention the problem is from biology because none of the mathematicians will work on it because they'll be like, we don't do biology problems. This is considered very disreputable, right? So he made it an abstract problem and put it in Nature. And rather quickly, a physicist, Lord Rayleigh,

Starting point is 00:56:16 sort of explained, like, oh, I actually know how to do this, and I did it like 20 years ago for an application in acoustics. So this is sort of like in England, how this notion of a random walk was first analyzed, the two-dimensional random walk of a mosquito moving around in the landscape. But what I discovered is that at the same time, almost at the same time, you know, I think if, as you say,

Starting point is 00:56:36 civilians, if people who are not mathematicians actually know the phrase random walk, they probably know it from the title of Burton Malkiel's book, A Random Walk Down Wall Street, a book I really love. Because it's one of the most, it's probably like the giant bestseller with the most math in it of any book. And so I discovered that actually this idea is old too.

Starting point is 00:56:57 There's a graduate student of Henri Poincaré, Louis Bachelier, who comes from this non-traditional background. He worked in the Bourse. He worked in the bond market in Paris before coming to study math, and he was like, look, everybody's like buying and selling and trying to understand the market, but I think it's just a random walk.

Starting point is 00:57:16 I think things are just like bumping up and down randomly. Let's try to understand how that random walk would behave and see if it matches the behavior of the market. Well, again, as with the biology, this was considered a little disreputable, and Poincaré is kind of like, I get that what you're doing is good math, but it's not the kind of math we do here in Paris.

Starting point is 00:57:33 Like, we're supposed to be studying the three-body problem in celestial mechanics, not like bond prices. But of course, now it's the dominant point of view, right? I mean, if you're going to sort of do finance, if you're going to get a PhD in finance, you're going to be studying stochastic differential equations and you're going to be like random walking like how. Well, and by the way, this goes back to the point you made earlier

Starting point is 00:57:51 that random doesn't mean absolutely equal probability of everything happening, right? So the stock market can trend upward while still having a random component. Absolutely, right. It's like it's noise plus drift. There's like Danny Kahneman's new book. It's just about this, right? Noise. I like to plug other people's books, too.
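The "noise plus drift" picture can be made concrete with a short simulation. This is only an illustrative sketch: the function name `drifting_walk`, the drift of 0.05 per step, and the Gaussian noise are assumptions chosen for the example, not anything stated in the conversation.

```python
import random

def drifting_walk(steps, drift=0.05, noise=1.0, seed=0):
    """Return the positions of a walk whose every step is drift plus Gaussian noise."""
    rng = random.Random(seed)
    position = 0.0
    path = [position]
    for _ in range(steps):
        # Each step is a small constant push (drift) plus a random shove (noise).
        position += drift + rng.gauss(0.0, noise)
        path.append(position)
    return path

path = drifting_walk(10_000)
up_steps = sum(1 for a, b in zip(path, path[1:]) if b > a)
print(f"final position: {path[-1]:.1f}")
print(f"fraction of up-steps: {up_steps / (len(path) - 1):.3f}")
```

Individual steps go up only slightly more often than they go down, yet over many steps the small drift accumulates while the noise largely cancels.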

Starting point is 00:58:12 I mean, I can't even talk about it at all because it's too long, and then there's the whole Einstein story, and sort of Einstein is sort of like, you know, ratifying Boltzmann's theory of what the understructure of matter is by sort of treating Brownian motion like a random walk, which I didn't really know that much about. But then, you know, what we call such a thing today, what we mostly call it in math is not a random walk, but a Markov chain.

Starting point is 00:58:33 So Markov is among mathematicians, the person who really first develops this theory. And here's a, this is a story that blew me away that I'd never heard. He invents the Markov chain because there's this huge, like, theological war between the arch-conservative, super Russian Orthodox, Russian mathematicians, and the angry atheist, Russian mathematicians, of whom Markov was the leader. And so his opposite number, a guy called Nekrasov, believed that he had found a mathematical proof of free will kind of ratifying the views of the Russian Orthodox Church in this matter.

Starting point is 00:59:09 And Markov was like, no way. You can believe what you want, but don't bring math into it. Like that, now I'm offended. Like you would sort of say that math proves that your church stuff is correct. So Markov develops the Markov chain as a counterexample to

Starting point is 00:59:22 Nekrasov's claimed proof that free will is implied by probability theory. I mean, you know, I always say like, you're on the internet, right? I mean, you know, is there anything more intellectually sterile and unuseful than like a long argument

Starting point is 00:59:40 between like a movement atheist and like a movement Christian, right? Nothing good ever comes out of that. No, I think I've got to push back there. I think that there's a sampling problem when you say that. There are loud, noisy exemplars of both sides

Starting point is 00:59:57 that have very, not very helpful arguments between each other. But there can be useful, intellectually productive discussions between people who hold the same opinions, but hold them in more productive ways, I would argue. I guess you are absolutely right, but I would counter that by saying that, in fact, in terms of their personalities, Markov and Nekrasov were both exactly the kind of loud and obstreperous,

Starting point is 01:00:20 like people, the hopeless people. So it's amazing that out of their dispute, like, came this like absolutely fundamental mathematical idea. But then it's amazing that like none of them knew about each other. I mean, this was math, right? I mean, now sort of math is global. But then stuff is happening in England, stuff is happening in France, stuff is happening in Switzerland, with Einstein and stuff is happening in Russia with Markov. And none of them do. It's all happening

Starting point is 01:00:41 between 1900 and 1905 and none of them knew about each other. It's incredible. Right. So this is pre-Russian revolution, Nekrasov and Markov. And I do, I think that the audience deserves just a little bit of a glimpse into what Nekrasov thought he was doing with the free will business. Like he was trying to reconcile the idea that individuals are allowed to have free will, even though whole societies act in predictable ways. Yeah, exactly. So one thing, so you start with the law of large numbers, which has been known for a long time,

Starting point is 01:01:13 which essentially put in terms of coin flips, because the only thing we have are coins and urns, right? Everything is about one of those two things. So it's a, but this one's about coins, right? You flip coins for a long time, and with a very high degree of probability, the proportion of heads you're going to get,

Starting point is 01:01:29 it's going to get very close to 50%. For any fixed deviation, from 50%, the chance that you're going to get, say, 51%, you certainly might get 51% heads if you flipped 100 coins, but if you flipped a million coins, that's incredibly unlikely. And what people started to see, and this is long after Bernoulli when sort of people started to study demographic statistics, is that just as coin flips kind of settled down to predictable averages, if there was enough coins, so social measures, like tended to converge to predictable averages if there was enough people in your aggregate.
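The law of large numbers as described here is easy to check numerically. The sketch below is an illustration only; the function name `heads_fraction` and the sample sizes are assumptions made for the example.

```python
import random

def heads_fraction(n_flips, seed=0):
    """Flip a fair coin n_flips times and return the fraction that came up heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

for n in (100, 10_000, 1_000_000):
    frac = heads_fraction(n, seed=n)
    print(f"{n:>9} flips: {frac:.4f} heads")
```

With 100 flips a result like 51% heads is unremarkable; with a million flips the fraction typically sits within a tenth of a percentage point of 50%.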

Starting point is 01:02:07 And so there was a lot of kind of philosophical unease about what that meant. Did that mean we're just all like mindless coins, we're just all like sort of like little balls on the roulette wheel, like sort of with, you know, fated in some sense essentially deterministically to sort of like, at least as a society, like acting a certain way. Well, as you can imagine, that is not a happy thought for people of a certain philosophical bent. So, Nekrasov thought he had found a way out of this. Nekrasov said, hey, look, this theorem of Bernoulli's, this law of large numbers, it requires that the coins be independent from each other.

Starting point is 01:02:45 If your coins are not independent, like let's say if each, in the simplest case, let's say each coin is constrained to fall the same way as the previous one, well, then you'd either get all heads or all tails. You wouldn't converge to 50%. So Nekrasov said, ah, this law of large numbers, these statistical regularities we see of like, you know, age of first child, age of first marriage, like proportion of various crimes, like all these things you see. The fact that those seem to converge to stable averages might have made you think we're just mindless coins, but no, what it really shows is that we must all be independent from one another. So, aha, I win. Yeah, but there was a basic logical

Starting point is 01:03:22 flaw there. Yeah, so what Markov observed is that he was mistaking the theorem for its converse. I mean, Bernoulli's theorem says that if the coin flips are independent, then you get this convergence to stable averages. But it didn't show that if you have convergence to stable averages, then the flips must have been independent. So what Markov showed is that any kind of what we now call a Markov chain or a Markov process or a random walk, that the behavior of such a thing also converges to predictable averages, even though if you're sort of walking on a random walk, where you are today is anything but independent from where you were yesterday, because you're very close to where you're only one step away

Starting point is 01:04:04 from where you were yesterday. So they're very, very, very tightly connected. But then, you know, having won this battle, he did all kinds of crazy things with it. He analyzed Eugene Onegin, the famous poem by Pushkin, and sort of studied, like, I mean, if you treat this as, it seems crazy to think about a poem as a random walk. And he certainly knew that a poem was not just a random walk,

Starting point is 01:04:27 But he nonetheless showed that you could just treat it as a string of consonants and vowels. That's a very modern kind of computational style approach, reducing it to a string of bits, zeros and ones, and show that you can calculate what's the probability of transitioning from if a letter is a consonant, what's the probability the next letter is a vowel or the next letter is a consonant. That's a very simple Markov chain. You might say, like, that cannot tell you very much about this famous Russian poem. But you know what? he was able to show that it distinguished it from a different book by a different Russian author,

Starting point is 01:05:00 that there were like measurably different patterns in the Markov chain, just in the sequence of consonants and vowels. Because different authors have different favorite words or different sequences of words, so whether a consonant follows a vowel will be different likelihoods for different authors. Right. Okay.
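The consonant-and-vowel analysis described above can be sketched in a few lines. This is an illustration under stated assumptions: it uses English vowels rather than Russian ones, the function name `cv_transitions` is made up for the example, and the two sample strings are placeholder text, not excerpts from Pushkin or any other author discussed.

```python
from collections import Counter

VOWELS = set("aeiou")  # assumption: English vowels; Markov worked with Russian text

def cv_transitions(text):
    """Estimate P(next class | current class) over the letters of text.

    Every letter is reduced to 'V' (vowel) or 'C' (consonant); everything
    that is not a letter is dropped before counting adjacent pairs.
    """
    classes = ["V" if ch in VOWELS else "C" for ch in text.lower() if ch.isalpha()]
    pair_counts = Counter(zip(classes, classes[1:]))
    probs = {}
    for current in "VC":
        total = pair_counts[(current, "V")] + pair_counts[(current, "C")]
        for nxt in "VC":
            probs[(current, nxt)] = pair_counts[(current, nxt)] / total if total else 0.0
    return probs

# Placeholder samples, chosen only to show that different texts give different numbers.
sample_a = "the quick brown fox jumps over the lazy dog"
sample_b = "an idea i adore: aloe, oboe, audio, and an aria"

for name, sample in (("sample_a", sample_a), ("sample_b", sample_b)):
    probs = cv_transitions(sample)
    print(name, {f"{a}->{b}": round(p, 2) for (a, b), p in probs.items()})
```

The four numbers per text are the whole model: the probability of moving from a vowel or a consonant to a vowel or a consonant.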

Starting point is 01:05:42 And that's in some sense the precursor to like so much of like the natural language processing we do today. Of course, we do it in a much more sophisticated way than him writing down these kind of like, grids on pencil and paper of consonants and vowels. Well, right. So all the way up to GPT3, right? I mean, this purported artificial intelligence is actually just looking for these kinds of patterns or these likelihoods, these probabilities of different things happening in texts and spitting them back at us. Right. It's in some sense trying to find the Markov chain that best matches the language production it is able to see.

Starting point is 01:06:21 And it's able to see a lot because that has access to this sort of gigantic corpus of English language text that exists on the internet and the people out store. Okay, wait a minute. What is the definition of a Markov chain? How should we be thinking about that or visualizing it in our brains? Yeah, you should think of it as any process that explores a space where you're in some state and then, I mean, we can be getting even more complicated than this, but let's say like the simplest thing would be to say you're in some state and then where you go next only depends on where you are right now. And there might be some randomness to it.

Starting point is 01:06:54 It might be that, like, you know, if I'm at Fifth Avenue and 29th Street, I'm going to walk either a block north or block south, a block east or a block west. So there's four places I can be. And it's not deterministic. Maybe I, you know, flip a four-sided coin and decide which of the four ways to go. But the point is that if I choose to, like, walk north to Fifth Avenue and 30th Street, then my next choice doesn't depend on the choice I just made. All I know is that it's kind of like having, it's kind of like the movie Memento where he has amnesia. Every moment, you don't remember anything except where you are right now and you have to decide what to do next based on that. That's the characteristic feature that makes a Markov chain a Markov chain.

Starting point is 01:07:35 Good. But so it's very much like a random walk. It's a generalization of the idea of a random walk. But we're walking in some very abstract space, which for Markov himself was just vowel or consonant when he was looking at Pushkin. Yeah. Right, a two-point space, vowel and consonant, yeah. And so when you're in either one, you're either in vowel or you're in consonant, the Markov chain is characterized by the probability that you'll either just stay at vowel or jump to consonant.
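A two-state chain like this also illustrates Markov's reply to Nekrasov: each letter depends on the previous one, yet the long-run share of vowels still settles to a predictable value. The sketch below is illustrative only; the stay probabilities (0.2 for vowels, 0.6 for consonants) and the function names are made-up assumptions, not figures from Markov's analysis.

```python
import random

def vowel_fraction(n_steps, stay_vowel=0.2, stay_consonant=0.6, seed=0):
    """Simulate a vowel/consonant chain and return the fraction of steps spent at 'V'."""
    rng = random.Random(seed)
    state = "C"
    vowel_steps = 0
    for _ in range(n_steps):
        # The next state depends only on the current state: stay or switch.
        stay = stay_vowel if state == "V" else stay_consonant
        if rng.random() >= stay:
            state = "C" if state == "V" else "V"
        vowel_steps += state == "V"
    return vowel_steps / n_steps

def stationary_vowel_share(stay_vowel=0.2, stay_consonant=0.6):
    """Long-run vowel share of the two-state chain, from balancing the two flows."""
    leave_vowel = 1 - stay_vowel
    leave_consonant = 1 - stay_consonant
    return leave_consonant / (leave_vowel + leave_consonant)

for n in (100, 10_000, 200_000):
    print(f"{n:>7} steps: simulated vowel share {vowel_fraction(n, seed=n):.4f}")
print(f"predicted long-run share: {stationary_vowel_share():.4f}")
```

The prediction comes from requiring that, in the long run, the flow from vowel to consonant equals the flow from consonant to vowel.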

Starting point is 01:08:02 Right, these are so-called transition probabilities. And you can already see that this model is not complicated enough to really capture how language works, because, you know, maybe you're going to have two vowels in a row, but you're probably not going to have five. Yeah. And this simple model cannot see that because at each stage when you're at a vowel, there's some chance of staying at a vowel. So you want to dress it up a little bit more and make it a little more complicated. Yeah.

Starting point is 01:08:24 So the whole point of the Markov chain, the Markov process, is this memoryless quality. It only depends on where you are now. It doesn't remember where you've come from, which makes the GPT something you can't really have a conversation with that has no recollection of what you've already said. Well, no, because, okay, I'll say this. There's a trick to this. It's very beautiful, so I'm going to tell it to you, which is that let's say you're doing a Markov chain on letters.

Starting point is 01:08:46 But what you do, to be clever, is you say, actually, my Markov chain is on strings of three letters. So let's say if I'm like, if I see THA, if that's my current state, probably the most likely next letter is T, right? Because I'm probably not saying like Thanos; I'm probably saying "that", right? That's a very common word I might be in the middle of. But then your next state is not T. It's H-A-T because you're walking on the space of three-letter sequences, and you're only allowed to lop off the first letter of your sequence and replace it with the new letter at the end. So this business of memory, you can kind of build in a certain amount of memory just by enlarging your idea of what the state is.
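The three-letter-state trick can be written down directly. In this hedged sketch the helper names `build_chain` and `generate` and the tiny training string are assumptions for illustration; a real system would learn from a far larger corpus.

```python
import random
from collections import defaultdict

def build_chain(text, order=3):
    """Map each `order`-letter state to the list of letters seen right after it."""
    followers = defaultdict(list)
    for i in range(len(text) - order):
        followers[text[i:i + order]].append(text[i + order])
    return followers

def generate(followers, start, length, seed=0):
    """Walk the chain: pick a next letter, then slide the state window by one."""
    rng = random.Random(seed)
    state = start
    out = start
    for _ in range(length):
        options = followers.get(state)
        if not options:
            break  # dead end: this state was never followed by anything
        letter = rng.choice(options)
        out += letter
        state = state[1:] + letter  # lop off the first letter, append the new one
    return out

corpus = "that cat sat on the mat that the other cat had sat on"
chain = build_chain(corpus)
print(sorted(set(chain["tha"])))   # letters that ever followed "tha" in the corpus
print(generate(chain, "tha", 40))
```

The chain is still memoryless in the formal sense; the memory lives inside the enlarged state.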

Starting point is 01:09:25 This is a very clever idea. And so GPT3 or any language generation system, like the auto-complete in your Gmail, is a little bit more than that. And like, you know, look, I'm not competent to say what's inside the guts of GPT3 because despite the company being called OpenAI, like we don't really know what's in it. But it definitely has, they've definitely worked hard to give it more long-range memory. Yeah. Than a sort of very basic Markov chain-based system, and that's part of what makes it cool. And the important thing for the audience is that when they're texting and there's autocorrect or when they're doing Gmail and they're suggesting answers, that's a Markov chain at work

Starting point is 01:10:03 proposing those possible solutions to what comes next. Right. It's saying, here's where you are now. What's the most likely place you're going next? Now, I don't know about you, but I always type something else. When it starts to suggest something, I'm like, well, I'm certainly not going to say that now that you suggested it. I'm my own man.

Starting point is 01:10:19 Yeah, but you're still part of the overall ensemble, you know, like you're going to contribute to the expectation values at the end of the day. But I'm disrupting it because it's like, oh, I guess I was wrong about like, you know, I'm reducing its level of certainty of what it's doing. I think the system takes you into account. That's my prediction. But okay, so, and then with these Markov chains, we have a way of generalizing the random walk and you can see how we're coming back to the gerrymandering, right? Because we're going to

Starting point is 01:10:43 have a Markov chain where every place that you are represents a map and we're going to hop to a different map, right? And then you can sort of, once again, the space of different places you might hop to is enormously big, but then you can start talking about the ensemble and its properties and there will be in general a favorite equilibrium distribution, right? There'll be, you know, I don't know, a set of probabilities. Tell me what the equilibrium distribution is. What is the simplest way to characterize that? Ideally, if you run the random walk for a long time, you will get to sort of some consistent answer to how much of your walk you spend

Starting point is 01:11:25 in each possible place on the map. Now, in the case of the gerrymanders, you're probably not going to approach that distribution because you would have to run it so long that you could conceivably see like every possible distribution, which we are probably not doing. But a very well-known place that does that is Google PageRank, which I think, you know, both of us are old enough to remember the incredible difference in how the internet worked. There were search engines before Google, but they did not work the way Google works. And what Google did was build the Markov chain into the process. Basically, they did a random walk on the internet, a very good abstract space that we like a lot. You're on a page, you follow

Starting point is 01:12:02 a random link from it. That's the walk. We do that for a long time. And, you know, you start to converge to a probability distribution. There are sites you're hitting again and again. Which sites do you hit? Well, you might say, well, it must be sites that have a lot of links to them, so you're likely to get there. But it's not just that. There's something recursive about it,

Starting point is 01:12:22 because you want the sites that link to it, to themselves be sites that have a lot of links to them. If I just have like two random sites that have like a billion links to each other, I'm never going to get into that little closed ecosystem unless other people link to it. So this process of random walk and then keeping track of what's this limiting distribution of what proportion of your time you spend at each place. That's the basis of what Google called PageRank back in the mid-90s when they developed

Starting point is 01:12:49 this. That is the order in which they serve you your search results. And that absolutely revolutionized people's ability to find things because this limiting distribution, what the Markov chain finds for you and computes for you is it's somehow a global property of the system. It's kind of an emergent property that it's very, very hard to sort of see in any local way. You just got to do the walk and see what happened.
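The "do the walk and see what happened" idea can be sketched on a toy web. Everything here is an illustrative assumption: the page names, the link graph, and the function name `walk_rank` are made up, and the occasional jump to a random page (`teleport`) is a standard ingredient of published descriptions of PageRank that the conversation does not mention. It is included so the walk cannot get stuck on pages with no outgoing links.

```python
import random
from collections import Counter

def walk_rank(links, steps=200_000, teleport=0.15, seed=0):
    """Estimate how much of a long random walk is spent on each page."""
    rng = random.Random(seed)
    pages = sorted(links)
    page = pages[0]
    visits = Counter()
    for _ in range(steps):
        visits[page] += 1
        outgoing = links[page]
        if not outgoing or rng.random() < teleport:
            page = rng.choice(pages)      # jump to a uniformly random page
        else:
            page = rng.choice(outgoing)   # follow a random link on this page
    return {p: visits[p] / steps for p in pages}

# Hypothetical toy web: "wiki" and "fanclub" each have exactly one inbound link,
# but wiki's comes from the heavily linked "news" page.
links = {
    "news": ["wiki"],
    "wiki": ["news"],
    "blog1": ["news"],
    "blog2": ["news"],
    "blog3": ["news"],
    "fan": ["fanclub"],
    "fanclub": ["news"],
}

ranks = walk_rank(links)
for page, share in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page:8s} {share:.3f}")
```

Counting inbound links alone would tie `wiki` with `fanclub`; the walk separates them because it matters who is doing the linking.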

Starting point is 01:13:15 And what they saw is that it does an incredibly good job of sort of capturing this notion, this human notion of importance or centrality or the search results that I actually want. But for something like our maps, our gerrymandered maps, or the maps that we were trying to show are not gerrymandered, I think like you said, we don't have the computational power to sample every single map, much less keep coming back to the same map over and over again. There's just too many of them. So what is the thought that makes us say that some Markov chain is going to help us figure out

Starting point is 01:13:52 what a typical map looks like? Well, it helps us. Well, as I say, we can't know for certain what a typical map looks like, but we can know what a typical map looks like, you know, within a neighborhood of the starting map. So I mean, imagine a random walk is in some sense like sort of particles diffusing from a given source. Like if you've seen any of the many newspaper articles with these incredibly terrifying COVID pictures of like what the spray of COVID particles from somebody's nose and mouth looks like who's infected, I mean, it's much denser near the person's

Starting point is 01:14:24 head and then it rapidly gets less dense as you go farther away from the person. So you should think of it like that. You should think of it as like you may not be seeing like the entire universe but you're getting a pretty good sample from the random walk. It's sort of like what's going on in the neighborhood of the map that you started with. But again, the fact that we're not sort of converging to some limiting distribution, it means we're not going to do what Google does. We're not going to say, I'm going to rank all maps for you and tell you which map is best, the most random, the most likely to be hit.

Starting point is 01:14:56 But that's not the goal. The goal is not to tell you what to do. The goal is to sort of tell you what not to do. The goal is to sort of identify when an existing map is completely an outlier. So maybe the analog of that, I'm just making this stuff up as I go. Maybe the sort of Google analog would be that if I claim to you, hey, my homepage is the most important site on the internet. And Google sort of started its random walk from there and walked like a billion steps and never found its way back there. That would be a pretty good, it wouldn't necessarily tell you like what was the most important site of the internet.

Starting point is 01:15:31 but it would definitely rule out that my homepage is the most important site of the internet. Because it would be like, come on. Like I walked around for like a billion steps and I never found my way back there. It can't be that important. So what is the modern mathematical research level problem in the gerrymandering and Markov chain game? Is it coming up with algorithms to do these random walks? Or is it, are we ever going to get ambitious enough to say, well, yeah, we think we actually can draw the maps as mathematicians? Get out of the way, politicians.
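The outlier test described a moment ago, walking through maps near a starting map and asking how unusual that starting map is, can be illustrated with a deliberately tiny model. All of it is assumption: precincts sit in a single row, districts are contiguous runs of precincts, the only local move shifts one boundary by one precinct, and the vote shares and names (`seats_for_a`, `sample_seat_counts`) are invented. Real redistricting ensembles work on two-dimensional maps with many more constraints.

```python
import random
from collections import Counter

def seats_for_a(cuts, votes):
    """Count districts where party A holds more than half the vote.

    votes[i] is A's vote share (0-100) in precinct i; precincts have equal size.
    cuts are the indices where one contiguous district ends and the next begins.
    """
    bounds = [0] + list(cuts) + [len(votes)]
    return sum(
        sum(votes[lo:hi]) > 50 * (hi - lo)
        for lo, hi in zip(bounds, bounds[1:])
    )

def is_valid(cuts, n, min_size, max_size):
    """A plan is valid if every district size is within the allowed range."""
    bounds = [0] + list(cuts) + [n]
    return all(min_size <= hi - lo <= max_size for lo, hi in zip(bounds, bounds[1:]))

def sample_seat_counts(votes, start_cuts, steps, min_size, max_size, seed=0):
    """Random walk over plans: nudge one boundary by one precinct per step."""
    rng = random.Random(seed)
    cuts = list(start_cuts)
    counts = Counter()
    for _ in range(steps):
        proposal = list(cuts)
        i = rng.randrange(len(cuts))
        proposal[i] += rng.choice((-1, 1))
        if is_valid(proposal, len(votes), min_size, max_size):
            cuts = proposal  # otherwise stay put and count the current plan again
        counts[seats_for_a(cuts, votes)] += 1
    return counts

votes = [45, 45, 45, 70] * 4          # hypothetical precinct-level shares for party A
enacted = [4, 8, 12]                  # hypothetical plan under scrutiny
steps = 50_000
counts = sample_seat_counts(votes, enacted, steps, min_size=2, max_size=6)
enacted_seats = seats_for_a(enacted, votes)
share = sum(c for s, c in counts.items() if s >= enacted_seats) / steps

print("enacted plan gives A", enacted_seats, "of 4 seats")
print("seat counts seen on the walk:", dict(sorted(counts.items())))
print(f"share of sampled plans at least as good for A: {share:.3f}")
```

A small share would suggest the scrutinized plan is unusual among its neighbors; the sketch does not rank plans or say which one is best.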

Starting point is 01:16:03 But why would they agree to that? Again, I'm not a politician either, so I'm not going to say that you should agree to it. So what are the mathematicians trying to do? So I would say, I mean, there's a lot of questions and research in this area proceeds apace. And I should emphasize, I am not one of the people doing this research. I'm a popularizer of this research. And I sort of, I'm a conduit between, you know, between mathematicians and various, like, you know, lawyers and commissions that I've testified before, et cetera, et cetera.

Starting point is 01:16:33 So I would say some of the things people are doing are trying to get like, you know, better provable guarantees about like, maybe you don't know it's like 100% random or like sort of like can you get some kind of provable guarantees about how close to uniformly random you are. Studying the properties of different choices of what the local moves are. This is like very granular stuff, but it turns out to actually make a difference to how rapidly you converge to something that looks kind of uniform. So this is kind of, you might probably call it an engineering problem, just seeing in practice, like what versions of the many local things

Starting point is 01:17:09 you can do actually work. I mean, something that I'm particularly interested in, and I actually don't know the work, except that people are working on this, but as you probably know, like, around the country, there is a lot of appetite for and interest in reform of the basic voting mechanisms themselves. Like people have introduced like ranked choice voting, you know, in Maine, in New York City. I think in Austin today or something like that in Austin, Texas. I think they've just adopted something like this. There's a really interesting proposal with some bipartisan support actually here in Wisconsin where the legislature hasn't done anything for like two years,

Starting point is 01:17:42 but maybe they'll do something, and it has some support from both parties. So I think a lot of the work in gerrymandering has really focused on conditions as they are, conditions where we have the voting system we have, and we have the two parties we have, and we sort of even ignore the existence of like candidates or voters who don't identify with one of those two parties. Introducing fundamental changes in how the vote works, like ranked choice voting, is going to scramble this a lot, both for the people drawing gerrymandered maps and for the people trying to detect gerrymandered maps. So one thing that just personally I'm interested in is trying to understand how this landscape changes with more variation in possible voting systems, which, as I say, we're already starting to see, and there are a state, well, there's one state, Maine,

Starting point is 01:18:31 that has really gone all in on this mechanism. And it'll be really interesting to see how things play out there. But again, doing the work, it's going to be both math and political science and law, like, all mixed together. If you try to do it with one discipline by itself, you can't do it. We've said it before in the podcast, but why don't you just say what ranked choice voting is, so the folks get an idea for why it would be an attractive thing to shoot for. Oh, yeah. Sorry, sometimes I get immersed in this voting stuff, but I forget that, like,

Starting point is 01:18:55 most people probably to their psychic benefit, like don't think about democracy maintenance all the time. So ranked choice voting means that you don't just vote for one candidate and say, this is the one I like best. You rank all the candidates who are on the ballot in order of how much you would like them to hold the office. And what that allows you to do is if you are somebody who doesn't identify strongly as a Democrat or Republican, and maybe even identifies with a third party, it allows you to vote for that party's candidate while not giving up your right to express a preference between the Democrat and the Republican,

Starting point is 01:19:34 which you might well really have. In fact, in the proposal in Wisconsin, they're talking about ditching partisan primaries and just having an open primary where all the Democrats and all the Republicans and all the Libertarians and everybody and the unabillions, all are in one primary. The top five finishers go to the general election,

Starting point is 01:19:54 and then it's ranked choice from there. So in a situation like that, you probably would have a general election with multiple candidates. You might have two Democrats and two Republicans and a fifth person, for example. And then I think, again, I can't do this for myself as a mathematician. We need political scientists to weigh in. But intuitively, you feel like it scrambles things up and gives people more choices in a political arena, which, frankly, just listening to people talk when you go to these commissions, a lot of people are pretty frustrated with the choices that they have.
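The conversation describes the ranked ballot but not how the rankings are counted. One common counting rule is instant runoff: repeatedly eliminate the candidate with the fewest top-choice votes and move those ballots to their next surviving choice. The sketch below assumes that rule; the function name `instant_runoff`, the alphabetical tie-breaking, and the ballots are hypothetical illustrations, not a description of any particular jurisdiction's statute.

```python
from collections import Counter

def instant_runoff(ballots):
    """Return the winner under instant-runoff counting.

    Each ballot is a non-empty list of candidate names, most preferred first.
    Ties are broken alphabetically, which is an arbitrary choice for this sketch.
    """
    remaining = {name for ballot in ballots for name in ballot}
    while True:
        tally = Counter({name: 0 for name in remaining})
        for ballot in ballots:
            # A ballot counts for its highest-ranked candidate still in the race.
            top = next((name for name in ballot if name in remaining), None)
            if top is not None:
                tally[top] += 1
        total = sum(tally.values())
        leader, leader_votes = max(tally.items(), key=lambda kv: (kv[1], kv[0]))
        if leader_votes * 2 > total or len(remaining) == 1:
            return leader
        loser = min(tally.items(), key=lambda kv: (kv[1], kv[0]))[0]
        remaining.discard(loser)

ballots = (
    [["Republican", "Third", "Democrat"]] * 4
    + [["Democrat", "Third", "Republican"]] * 3
    + [["Third", "Democrat", "Republican"]] * 2
)
print(instant_runoff(ballots))
```

In this made-up election nobody has a first-round majority, the third-party candidate is eliminated, and those two ballots then count toward their second choice, so ranking the third party first costs those voters nothing.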

Starting point is 01:20:26 Do you think that there is some future collaboration between political scientists and mathematicians where you are very ambitious and say, look, let's try to think about what are the goals that we have in electing people, in sort of coarse-graining the individuals in our society into a legislature and try to figure out a priori, or, you know, ab initio, I should say, what is the best voting system that would reflect the actual will of the people? Is that too utopian? I mean, people are going to work on it whether you think it's too utopian or not, right? There's nothing too utopian for people to think about. No, I think those kinds of discussions already are happening and have been happening. Look, like, go all the way back to Arrow's theorem. There's, like,

Starting point is 01:21:12 lots of interest in kind of, like, mathematical formalisms about not just how voting, not just analysis of existing voting, but like how voting could work, right? So that's like a pretty old idea. So what I would say, maybe this is a good opportunity, because one thing I just thought of that certainly Moon Duchin has told me is that, you know, they see these techniques not just as a way to sort of like give meaningful testimony in court about a particular political question that's happening in the moment, but as a sort of new kind of tech for doing social science research in the same way that in physics, now I'm going to talk about physics, so please tell me if I say something like completely off-base. I mean, people use Monte Carlo simulations all the time to sort of simulate and try to understand what the effect of doing a certain thing would be in a situation where for various reasons they can't actually experiment.

Starting point is 01:22:04 Isn't that a thing that's a thing they do? I mean, I went to a great seminar of like a guy who's like, well, we're sort of trying to figure out things about like atomic weapons, but we are not, you know, we don't just like drop atomic bombs on places where we think not too many people live or not too many people who are important to us live anymore. Like that's not how we test anymore. Like a lot of it is like simulations based on like pretty hardcore numerical analysis, but also Monte Carlo. So I think that at least some of the people in this area envision a world in which, you know, also in political science like experiments are like hard to do and often ethically questionable, right? So maybe this is too utopian, but I mean there are,

Starting point is 01:22:45 there is an idea of like we can at least get some good ideas about what the results of sort of different kinds of voting systems would look like from simulations of this kind. And maybe to wrap up on a less utopian note, because I know that, you know, as scientists and mathematicians, we like to be utopian. But as you emphasize very, very properly, politicians with feet on the ground have a different set of values and a different set of cares. So one of the things you mentioned very briefly, which intrigued me was the idea of inventing schemes by which, you can admit that the map will be drawn by a legislature, not by an independent commission. And yet inventing schemes where sort of like for two kids trying to slice up the pie, they are driven toward a more or less equitable outcome. Yeah.

Starting point is 01:23:34 I mean, so this is sort of like a famous, there's a very famous problem called the cake cutting problem, which has a very elegant answer. Like, how do you cut a cake fairly into two pieces between two people, who both want to eat and both like cake and want a lot of cake. And the way you do this is you have one kid cut and the other kid chooses. So the kid who, so somehow this iterative process, where you separate it into steps,

Starting point is 01:23:58 somehow sort of creates an incentive to like make the division roughly fair. And then there's like an entire literature of like how do you do this if there's three kids and three pieces, four kids and four pieces like, you know, multiple kids who may be able to choose a subset of the pieces, whatever. Like, mathematicians, once we have something,

Starting point is 01:24:15 we like generalizing that. But so, yeah, so one thing that people have discussed is, could there be a protocol where you make a game out of gerrymandering? Like, you know, you start with some map, but then the parties take turns. I'll change it this way. Now I'll change it. Now, I'll change it. Each team sort of tries to make it, maybe presumably tries to make it more favorable to their party.

Starting point is 01:24:41 Now, actually, you have to be a little careful, and the people who write about this, there's more complicated protocols than what I just said to make it even possibly work. But the idea is that the reason this might be useful, I haven't decided if I think it's practically useful, but the reason that might be useful is that because these, it's precisely because these highly gerrymandered maps are so special and they're so unusual in their own neighborhood.

Starting point is 01:25:03 They represent this kind of local apex of partisan advantage. So, like, imagine if there's, like, two kids wrestling on top of the mountain, and one kid is trying to get you to the top of the mountain, and one kid is trying to get you off the peak. Well, there's a lot more non-peak than peak. So if the two kids are going to take turns pushing each other, they're not going to stay at the peak. They're going to be like somewhere else that's like much farther down.

Starting point is 01:25:24 So I think this is like some, but I listen, here we come back to something we said at the very beginning, that the current mechanism makes it so easy to do so much to cement your partisan advantage that almost any disruptive change, I think would be much more likely to improve the situation than to make it worse. So this is a rare example where entropy is working in our favor. Kick the TV. That's what I say. Kick the TV and it's going to start working again. Or at least it'll work better than it works now. I like that message. Thanks so much. Jordan Ellenberg, thanks very much for being on the Mindscape podcast.

Starting point is 01:25:55 Oh, thanks for having me. This was great.
