Microsoft Research Podcast - 065 - Securing the vote with Dr. Josh Benaloh

Starting point is 00:00:00 Elections in the U.S. are conducted mostly at the county level. There are over 4,000 counties in the U.S. There are over 8,000 separate election jurisdictions in the U.S. And the thought that a small county somewhere, maybe in an important swing state, could have enough cybersecurity in place to withstand an attack from state-level attackers is just not realistic. Even though we can't defend an election from being tampered with, we can institute good auditing. And one method of good auditing is this end-to-end verifiability that allows you as a voter to see whether your vote has been changed. You're listening to the Microsoft Research Podcast,

Starting point is 00:00:52 a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizenga. If you've ever wondered why, in the age of the internet, we still don't hold our elections online, you need to spend more time with Dr. Josh Benelow, Senior Cryptographer at Microsoft Research in Redmond. Josh knows a lot about elections, and even more about homomorphic encryption, the mathematical foundation behind the end-to-end verifiable election systems that can dramatically improve election integrity today,

Starting point is 00:01:23 and perhaps move us toward wide-scale online voting in the future. Today, Dr. Benelow gives us a brief but fascinating history of elections, explains how the trade-offs among privacy, security, and verifiability make the relatively easy math of elections such a hard problem for the internet, and tells the story of how the University of Michigan fight song forced the cancellation of an internet voting pilot. That and much more on this episode of the Microsoft Research Podcast. Josh Benelow, welcome to the podcast. Thank you so much. Your bio begins by saying that you're a senior cryptographer at Microsoft Research.

Starting point is 00:02:11 What does a senior cryptographer do for a living? We'll get you up in the morning. It varies quite a bit. I tend to have a foot in a few different places. I do research on elections in particularly, but cryptographic protocols more generally, multi-party protocols. I do policy work of various sorts, and I spend a lot of my time doing internal consulting with Microsoft product groups. So what do you do when you consult with a product group? Do you give them insights into why they're not as safe as they think they are? Well, there are a few aspects to this. There's a benefit to having one part of me on the research side and one part of me in the practitioner side, because I know what is being developed. I know what's coming down the pike. So I can talk to product groups about how they should do things. Cryptography is very subtle.

Starting point is 00:03:06 It seems easy. There are some books out there that are very popular that people read and think, OK, I understand this all. And when I go into somebody's office and I see this on the shelf, I know I have a hard time ahead of me. What, cryptography for dummies? There are books with titles like Applied Cryptography or Practical Cryptography that often oversimplify things. And just the order in which you do certain steps can make a critical difference.

Starting point is 00:03:35 So if I can go to a product team and they show me what they're doing and I can say, no, if you do A before B, you're going to have a problem. You're going to regret it. Do B before A. So do they seek you out, the product groups, as help us out here, Josh? Or do you go around and say, hey, you guys need this? It's almost always seeking out. But there are a lot of different ways in which I can work with product groups. For about the past, I think, 14 years, we've had something called a crypto board at Microsoft. And the crypto board

Starting point is 00:04:14 is responsible for multiple tasks. One of them is setting corporate policy for how cryptography is used. Another is working with product groups to try to consult and help them. Let's talk about this concept of security. We're doing so much on the internet these days, and arguably we're going to do more and more. And yet the internet was not designed with security in mind. So what are folks like you doing about that? Yeah, it's certainly true that the Internet was not designed for security. The Internet was designed by a few people who knew each other very well, and they were trying to set things up so that they could communicate with each other. And there was no sense in the initial designs of the Internet

Starting point is 00:05:02 of having to secure it against malicious forces. More recently, we have had to build security on top of the Internet. If we had designed it initially with security at the base, it would have been a lot easier. Cryptography is an important tool in trying to secure it in a variety of ways, trying to keep data private, trying to make sure that the people you're talking to or the people who you think you're talking to, trying to maintain integrity of data, which is distinct from privacy of data. And basically, I should have access to the data that I'm entitled to have access to, and I should not have access to the data that I'm not entitled to.

Starting point is 00:05:51 Right. In practice, although cryptography is an important tool there, and it's very widely used now, most breaks of the internet are not about the cryptography. Most breaks are maybe the use of the cryptography or the implementation of the cryptography or things that even go around the cryptography. We've essentially got a very good front door, but the windows and the back door and other parts of the house that we're trying to protect are more vulnerable. All right. We're going to come back to that what freaks me out at night kind of thing. I was going to say what keeps you out at night. That freaks me out at night. All right, Josh, let's talk about securing the vote. Okay. First, could you give us a little election history? I mean, I don't know that people think about how we're voting now and how it used to be and how it's changed, but I think that would be a good level set for launching into what you're doing with election security now? Sure. Well, voting certainly goes back for millennia and attempts at voting secretly

Starting point is 00:06:49 go back for millennia. Public votes are difficult to tamper with in the sense of tampering with the votes themselves. But of course, when votes are public, then votes are coercible and people can sell their votes or have their arms twisted into voting particular ways. The history is not a pretty one. People have been stealing elections for centuries. As long as they've been voting. Yes. Vote for me. It's interesting to note that most U.S. presidents were elected without the benefit of the secret ballot. The secret ballot in large elections actually was a technological innovation when it came about in the mid-19th century. So the way that people voted for president or many offices was typically you would go to a public space and you would announce in some public way what your vote was. And that makes it easy to tell who's voting and that the only authorized people are voting and that the votes are being counted properly, but it enables coercion and vote selling.

Starting point is 00:07:59 The way that we vote in a poll site, in-person voting, was a technological advance in the 1850s. It's typically called the Australian ballot, the process wherein people are voting in private within a public space so that the fact that they are voting privately is publicly monitored and enforced. So this was an innovation at the time, and it took a while to move around the world. In the U.S., the number of U.S. presidents is different from the number of the U.S. president because Grover Cleveland was president number 22 and number 24. And that actually came about because of the lack of a secret ballot. So in 1884, Grover Cleveland was elected without a secret ballot. In 1888, his competitor, Benjamin Harrison, defeated him again without a secret ballot in an election that was utterly rife with vote selling, with coercion. And it was so bad at that time that the U.S. moved very quickly to secret ballots.

Starting point is 00:09:10 And in the election of 1892, there was a rematch between Grover Cleveland and Benjamin Harrison. And with the benefit of the secret ballot, Grover Cleveland won again and became the 24th president. Okay, so we've got the secret ballot now. Since the mid-19th century, lots of stuff has happened technologically and the size of the country. And so where are we now? Give us a lay of the land of what the unique problems are and what you're doing with technology to work on these big problems.

Starting point is 00:09:58 Well, if we look at elections broadly, they're much harder than most other problems. The thing about banking or e-commerce or other fields is that if something goes wrong, I know about it. I can have an opportunity to correct it. If my vote is changed, I don't know it. There's no way for me to discover that my vote has been altered, much less correct it. That's, of course, within the context of a secret ballot. Of course, if we gave up on the secret ballot, then it would be very easy. We could have a public vote in which

Starting point is 00:10:38 people cast their votes in person, by mail, over the internet, even by carrier pigeon. They just make their intentions known to the voting office, and the voting office publishes a list on a webpage somewhere, digitally signed for integrity purposes, so that everybody can look at the list and see, yep, my vote is there. It's correct. I can see all the other people who are voting. I can check that they are legitimate voters, at least according to the registration lists. Of course, the accuracy of registration lists is another question, but that's a little bit out of scope here. Right. That's a public process.

Starting point is 00:11:14 Scope creep. Yes. But I can check that everybody else who's listed there is supposedly a legitimate voter and that they have had an opportunity to check the list as well because the list is public and I can add up the votes on the list myself. But that's not going to happen. Yes. So it turns out it's possible to achieve all of these things with a secret ballot. And this is where cryptography comes in. With just a little bit of cryptographic sauce in just the right places, it's not that hard to achieve the same things. And the basic trick is we can still have a public list, but the public list is not of my vote directly, but it's of my encrypted vote and everybody else's encrypted votes. So you can still check that you're on the list. You can still check who else has voted. But now you need means for checking that the encrypted votes really

Starting point is 00:12:15 do correspond to the tally. That's all math. And cryptographically, we can do that. We can compute unencrypted data in such a way that we can process the data in encrypted form, turn it into an encrypted tally, and then provably decrypt that tally so that anybody can check and confirm the whole process and say, yeah, this all looks good. We also need to provide tools that allow voters to be certain that the encrypted votes next to their names, which are opaque, really correspond to their selections. And that seems hard, but it turns out there are some pretty good ways of doing that that are quite effective, and we can make that happen as well. And I should try to be clear. I'm describing one process,

Starting point is 00:13:06 but there's this notion that's called end-to-end verifiability that enables this kind of election verification to take place. Talk a little bit about end-to-end verification. It's a definition, but there's many ways to get to it. Exactly. Could you break that down? Sure. The definition of end-to-end verifiability is the properties that I just described. Voters should be able to check that their own votes have been properly recorded, and voters should be able to check that all of the recorded votes have been properly tallied.

Starting point is 00:13:37 You might think that's missing a little bit that, well, but I can't check that your vote has been properly recorded. But in some sense, that's not really possible in any means without my knowing what your vote is. Right. So this is really the most we can hope to have. And this is what we achieve in a public election. The challenge now is to achieve the same thing with a secret ballot election. And that's where

Starting point is 00:14:03 the cryptography comes in. And there turn out to be multiple ways of achieving these goals. Some very creative ways have come up for both parts, for the checking that votes have been recorded properly and for the tallying. But the tallying tends to be highly cryptographic, either by using homomorphic tallying methods, which compute on encrypted data and transform these individual encrypted votes into encrypted tallies. Or there's another approach that's commonly used called a mixnet, which is basically a cryptographic shuffle of the ballots, which has some benefits. It also has some privacy concerns.

Starting point is 00:14:46 So there are trade-offs, but it's good to have both techniques available. Well, and in fact, we're going to have trade-offs no matter what. I mean, it's just a matter of what is the sweet spot for what we're willing to reveal versus what we want to keep private. Yes. And that manifests itself very distinctly in the election context, because you would think that it would be a good thing to have all the ballots be public in the end. But if the ballots are public, then it allows voters to sell their votes once again by various little tricks. For instance, if I work for a small company, say, with a couple hundred employees, and the owner of the company is running for mayor, the owner can say, I want

Starting point is 00:15:36 everybody in the company to vote for me as mayor, and there's some minor office down at the bottom of the ballot, you know, dog catcher type of office, write in your own name there. And I will now look at all the public ballots that have been published and make sure that for each of you, there is a ballot of that form. And if I don't see a ballot with your name for dog catcher and my name for mayor, don't bother coming to work tomorrow. And that is sort of an obvious way of revealing things,

Starting point is 00:16:06 but there are less obvious ways that can be used that are collectively called pattern voting, where a different pattern is assigned to each voter of the low-level contests. And that pattern had better appear. If a town council has 10 different seats, then there are 10 factorial ways of ordering the names. And the local controlling entities could just go around

Starting point is 00:16:39 and assign a different permutation to each voter and say, if this permutation doesn't appear on a ballot, then whatever consequences might occur. Suddenly, Josh don't feel so good. Exactly. You know, there's another side of it, too, in the 21st century, where if your vote was made public, it wouldn't just be the vote-buying thing. It would be the Twitter mobbing afterwards. If somebody decided they didn't like how you voted, they can ruin your life after the fact.

Starting point is 00:17:13 Yeah. Well, I mean, these are all ruining your life after the fact kind of things. Sure. But there could be vote-shaming of various kinds. Vote-shaming. What a fascinating word. We've invented a new term.

Starting point is 00:17:23 I love it. Okay, so cryptography in general being a way to get closer to that sweet spot. What's the math and the theory behind this? Our listeners are pretty technical, and I've seen some of your slides on this. I didn't understand any of the slides, but I saw them. That's because you didn't hear me talking about them. They're easy. It's busy. So homomorphic encryption has existed for over 30 years. When I was an undergraduate, I took a cryptography class from Ron Rivest at MIT. And Ron is great. And Ron does in many of his classes, and I do this very often when I teach, whenever I can, is have students do projects at the end of the semester or quarter or whatever it is. And in this case, this was in the spring of 1981.

Starting point is 00:18:38 There had just been an election in 1980. It was a three-way election. And I was sort of naively thinking, maybe if we could computerize this somehow, we could have ways that would allow people to express their preferences in some way. I didn't know at the time that it doesn't really work very well. There's Arrow's theorem and related theorems that say that there's nothing good that we can do with preference lists. But I was thinking about this and I decided I would do my project on elections. And Ron handed me a paper that he'd written with a few colleagues, Edelman and Dertouzos, called Privacy Homomorphisms, which

Starting point is 00:19:18 talked about how you can manipulate data that had been not encrypted, but protected in some way. Mathematically. Yes. And I had been a math major. I jumped on the word homomorphism. I knew what that meant. It meant that you can apply an operation that preserves the structure of something. So I thought, yes, maybe I can do that in this context. And I did that I could actually do something that went much further to solving that problem and started using homomorphic encryption as a way of enabling an election that really could be verified. Elections really come down to doing addition. It's just ones and zeros, who you voted for, and the tallies are just adding them up. And there's really not much more than that.

Starting point is 00:20:32 So the trick is basically if we have an additively homomorphic function, then we can take encrypted votes and apply this additive homomorphism, and we get an encryption of the tally. And that's pretty much all there is to it. There was a whole thesis in that. Well, there was a little bit more in the thesis, but that was the core idea. That's awesome. And it turns out, could be done very efficiently. So I invented a little additive homomorphic encryption system of my own. The general form of RSA, at least, typically uses a large exponent. RSA in its practical usage today typically uses a small exponent. And my encryption system used two small exponentiation,

Starting point is 00:21:22 which is a lot more efficient than one large one. So it was an efficient way of doing things. People have been interested in homomorphic encryption for years, saying, well, there are additive homomorphic encryption systems, there are multiplicative homomorphic encryption systems. I wonder if we could do something that would do both at the same time. And it turns out that if you can do addition and multiplication together, then you can do everything. You can do out that if you can do addition and multiplication together, then you can do everything. You can do any computation because you can break computations down into addition and multiplication. It's really the equivalent of ands and ors. But people thought

Starting point is 00:21:55 for many, many years that it would be nice, but there are good reasons to think that this just can't be done. And then about 10 years ago, it turned out it could be done. It was Craig Gentry that found a way to do this that was wildly inefficient. It slowed things down by, I think, about 25 orders of magnitude. It was about 10 to the 25th slower than doing the computation directly.

Starting point is 00:22:20 Yes, that's of course ridiculous. You can't do the simplest computation with that. And there's been a lot of research for computation on data in encrypted form. So once we have that, then we really enable very powerful cloud data services. And we can do some of that today. We can't do all of it because the general computation is still nowhere close to efficient enough. The reason I want to distinguish it is I want to make clear that even though we're using homomorphic encryption for elections, we're using what's now come to be called simple homomorphic encryption, which is very efficient. We've heard the pronouncements. People aren't going to put public elections on the Internet until it is end-to-end verifiable.

Starting point is 00:23:19 And we can't do that yet. So we're not doing this yet. Right? Well, I would say a little bit more. That's what I want you to do. OK, then I will. Isn't this speculative technology now? Where is it being used? Is it being used? No, it is not speculative. Yes, it is being used. So end to end verifiable election systems do exist and have been fielded in a variety of contexts. They tend to be in small, localized contexts. There was an end-to-end verifiable system used in Tacoma Park, Maryland for a couple of public elections, a few hundred voters, but they used a very nice end-to-end verifiable system there called Scantegrity.

Starting point is 00:24:05 And the voters liked it very much. But it was a lot of work. And after doing elections in, I believe it was 2009 and 2011, Tacoma Park was very happy to continue it. But the researchers who had put a lot of time and effort into it were saying at that point, find a way to do it yourself, please. And, of course, there was nobody to pick it up. So it hasn't been done so much since. There's an internet-based end-to-end verifiable system called Helios that's used in a variety of places.

Starting point is 00:24:35 It's used commonly by professional societies, like the ACM uses this, the IACR, the International Association for Cryptologic Research, and many professionals. Of course they do. Well, actually, it was very hard to bring this in because cryptographers are very suspicious people. That's funny right there. So it was quite a challenge to move the community to this. What the community had been doing

Starting point is 00:25:02 was mailing out paper ballots and double envelopes and such. And many people thought we should continue doing that, even though it costs tens of thousands of dollars. You don't even eat your own dog food. Well, we weren't. But eventually, the economics, together with the import of the election, moved us to, you know, we can really do this. Many of our offices, there's only one candidate. There aren't a lot of people spending a lot of money trying to steal this election.

Starting point is 00:25:33 It's okay. I would not recommend this for public elections. Not today. End-to-end verifiability is a very valuable tool, and it mitigates a lot of the concerns of internet voting. I was a part of a U.S. Vote Foundation study that was published three years ago that looked at using end-to-end verifiability for internet voting. And some of the conclusions were that it's absolutely unconscionable to do internet voting without end-to-end verifiability.

Starting point is 00:26:06 It mitigates many of the problems in a way that nothing else can. And without it, it's just so vulnerable, you shouldn't even touch it. However, there are still problems even with end-to-end verifiability that have not been adequately mitigated. And therefore, we should not be doing that today, even with end-to-end verifiability. What the report recommends is we should be using end-to-end verifiability in personal elections, in poll sites, in traditional ways first, getting more experience with it before we contemplate the next step of moving it into the internet realm. We should definitely be doing things carefully. Anytime we try something new,

Starting point is 00:26:53 it's good to put it out to the public to have dry runs and trials. There was an internet voting system that was built for Washington, D.C. a few years ago. And it was actually built very well, although it wasn't end-to-end verifiable. But it was open source. It did many of the right things. The code was actually quite good. And they did the right thing of putting it out for a public challenge in a mock election before it was actually used. And a professor at the University of Michigan, Alex Halderman, got a few students to look at this during the trial, and they eviscerated the system. They were watching the internal cameras in the voting center, watching the workers look at what was going on. They were able to compromise all of the votes,

Starting point is 00:27:47 change all the votes to anything they wanted. They changed the actual voting mechanism and put the University of Michigan fight song at the end of the voting process. The first that the election officials in D.C. got wind of what was going on was when they started getting complaints from voters saying, I really like this internet voting process, but that music at the end is kind of annoying. And that's when they discovered that this trial had been hacked and they did cancel the project. I always ask my guests on this podcast, What Keeps You Up at Night. I want this end-to-end verifiability to be there. It's not. But what ought we to be thinking as we now know these things, I think, that were made clearer? Maybe we had our head in the sand a little bit? There's a lot that we can do.

Starting point is 00:28:39 Okay. So I was an author of a report that was just released in September. It was released by the National Academy of Science, Engineering and Medicine called Securing the Vote, Protecting American Democracy. There's a lot of good stuff in the report. A few highlights I can give now. Feel free to download it and read it yourself. I think it's a very readable report. But some of the highlights include that, of course, the 2016 election was infiltrated. But we saw no evidence of actual tampering with votes themselves. Even though we know now that it's very possible, we didn't see any evidence of that. In fact, I was on a phone call with election officials from all over the country two days after the election. And the consensus was that as far as the casting and counting of votes go, this was the cleanest election that people had seen in a long time. There were certainly some incidents, but they were relatively small compared to what is typically seen in an

Starting point is 00:29:45 election. So we know there are vulnerabilities and they really do need to be addressed. So there are a few steps that can be taken. Certainly, we can apply best practices, things that are not being done today, to secure registration lists, to better secure the actual voting equipment. And there are things that we can do in terms of basic general security practices that most industry does today. The problem we have is that it's an asymmetric battle. Elections in the U.S. are conducted mostly at the county level. There are over 4,000 counties in the U.S. are conducted mostly at the county level. There are over 4,000 counties in the U.S. There are over 8,000 separate election jurisdictions in the U.S. And the thought that a small county somewhere, maybe in an important swing state, could have enough cybersecurity in place to withstand an attack from state-level attackers is just not realistic. Even though we can't defend an election from being tampered with, we can institute good auditing. And one method of good auditing is this end-to-end verifiability that allows you as a voter to see

Starting point is 00:31:02 whether your vote has been changed. We have other methods of auditing, and there are better methods of administrative auditing that can be used. But end-to-end verifiability offers a public audit, allows voters themselves to become part of the process and to check for themselves and candidates to check for themselves and interest groups or news media or others to do the checks. So that's something that I very much hope that we'll be moving towards. So in conclusion, we need more research? Classic last line of a dissertation. I'm actually reluctant to say that. Yes, it's true, but. And the reason I want to say but is that end-to-end verifiability is ready to go now.

Starting point is 00:31:47 And many people are putting it off and saying, well, that sounds good. You should do more research on it and wait. And sure, there are ways that we could improve it. Better usability studies would be helpful. There's a lot of opportunity for good usability research. There are some opportunities for other areas, but usability applies in traditional elections as well. And we're not saying, well, we haven't done enough usability studies, so we can't have an election. Let's do that research. But that doesn't mean we should wait on deploying end-to-end verifiability

Starting point is 00:32:22 and the best tools that we have now. And then also, let's do some research on how to make them better. Josh Benelow, thank you for coming on the podcast. It's been delightful having you as a guest. Oh, thank you so much, Gretchen. This has been really a lot of fun. To learn more about Dr. Josh Benelow and how cryptographers are working to secure the vote,

Starting point is 00:32:48 visit Microsoft.com slash research.

Microsoft Research Podcast - 065 - Securing the vote with Dr. Josh Benaloh

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.