Epicenter - Learn about Crypto, Blockchain, Ethereum, Bitcoin and Distributed Technologies - Eli Ben-Sasson: Zero Knowledge Proofs

Episode Date: February 1, 2016

Zero Knowledge Proofs are methods of providing cryptographic proofs to another party while keeping some information secret. The simple concept of ZKP offer tantalizing possibilities: Banks could prove... solvency without revealing depositors. Governments could prove the fairness of an election without compromising privacy. Computer science professor Eli Ben-Sasson joined us to discuss where blockchains and cryptocurrencies intersect with Zero Knowledge Proofs and related technologies such as zkSNARKs. It offered a fascinating view into what will surely become a core part of blockchain tech in the future. Topics covered in this episode: What are proof systems? Zero Knowledge Proofs (ZKP) and other terminology such as SNARKs and zkSNARKs The mechanics of Zero Knowledge Proofs The role of performance in Zero Knowledge Proofs Applications of ZKPs The widespread potential impact of ZKP to verify processes Episode links: Eli Ben-Sasson's Website SNARKs for C talk by Madars Virza Stackexchange: What are SNARKs SNARKs for C paper [PDF] Zerocash Talk This episode is hosted by Brian Fabian Crain and Meher Roy. Show notes and listening options: epicenter.tv/116

Transcript
Discussion (0)
Starting point is 00:00:00 This is Epicenter Bitcoin, Episode 116 with guest Eli Ben Sassan. This episode of Epicenter Bitcoin is brought you by Hi. Dot Me. Protect yourself against hackers and safeguard your identity online with a first class VPN. Go to high.me slash Epicenter and sign up for a free account today. Hello and welcome to Epicenter Bitcoin, the show which talks about the technologies, projects, and startups driving decentralization in the global cryptocurrency revolution. My name is Brian Fabian Crane. And I'm Meher Roy.
Starting point is 00:01:04 Today we have got a very special episode for all of the crypto nerds out there. When I started with Bitcoin, I was reminded of Arthur C. Clark's statement that any sufficiently advanced technology is essentially indistinguishable from magic. I saw Bitcoin and I thought this is technology that fits that bill. It's a huge organization without an hierarchy. It's a global financial system and it's an immutable. ledger. That was something very special that I found in Bitcoin when I started. About a year into my journey in understanding Bitcoin, I came across this other technology
Starting point is 00:01:43 that, which I also, for which I also had the feeling that this is sufficiently advanced, that is indistinguishable from magic. And that was the technology of ZK Snarks. Now, you might have heard of this name. It is the upcoming currency Zed Cash or Zero Cash is based on this. and it's going to be a currency that preserves fungibility by being completely private and anonymous. So today we are joined by Professor Eli Ben Sasson from Technion. He's a professor there. And we are going to talk about some of the basic principles of zero knowledge proofs and cryptography that allows zero cash to be built.
Starting point is 00:02:26 So we are going to talk about the fundamental constructs today, followed by an episode on zero cash in the future. But before we get started, let's first have an introduction from you, Eli. Could you give us your introduction? Yes. So, hello. My name is Ellie Ben Sasson. I'm a professor at Technion, Israel Institute of Technology at Haifa Israel. And I do my research there mostly on math and the theory of computation. And I also try to apply it in the context of various proof systems, such as ZK Snarks. So how did you become interested in this area? And what is the kind of importance of ZK Snarks?
Starting point is 00:03:23 Okay. So proof systems. in general are well known within the theoretical computer science community to be, as Meher just described it, some of the most amazing magic that came about in the past 30 or 40 years in computer science. And like many other colleagues of mine, I got attracted to this field and wanted to understand the mathematics and the implications and how do you build these things, how do you analyze them, and what are their attributes and limitations. That's how I got into it some 15 years ago. So what is a proof system?
Starting point is 00:04:03 Okay, so a mathematical proof as you might learn it in a course in mathematical logic is just a string of bits that describes a proof. A little bit just like a computer program is a string of bits that describes a computation. Now, the amazing discovery that started in the late 1980s called interactive proofs is that you can get almost the same mathematical rigor in a much more efficient way if instead of asking for a full proof as in a textbook in mathematics, you are willing to undergo the sort of proof mechanism that we see in courts of law, where, you know, there's a defendant or a prover in this case that wants to prove a statement, and there's a verifier that you could think of him as the counselor, what's called, the advocate,
Starting point is 00:05:03 that tries to query and question the prover, and the goal is to find out whether this statement is true. So in an interactive proof, there is back and forth between prover and verifier, and it's all in the purpose of proving or checking whether a statement is true. And the amazing thing is that you can get proofs of any statements in a very efficient manner that's much shorter than the classical mathematical way in this sort of courtroom-style interactive proof system. Okay, that's very interesting. So normally one would have to provide a lot of data, for example, to prove that a certain thing computation was executed, for example, whereas with an interactive proof, you could have someone
Starting point is 00:05:53 questioned you. And does that mean it's like probabilistic? Does that mean like I'm going to ask, you know, a random set of questions and, you know, I'm going to get a right 99.9% of the time or do you get the same kind of certainty as if the other party gives you the entire proof? Great question. So you don't have absolute mathematical. certainty with an interactive proof. That's something you have to, you know, throw away. So you don't have absolute certainty. You only have probabilistic certainty, but you could get that probability to be as close as you want to one. And you do use randomness in order to facilitate this whole process. And also it's very important that there's interaction between the two parties.
Starting point is 00:06:46 So those are the three ingredients. You use randomness, you have some probability of error. It's not 100% proof, you know, mathematical proof, but it's as high probability as you are willing to accept. And you need to use randomness, interaction, and there is a probability of error. What you gain is amazing efficiency. So for our listeners, like right here, when we use the word proof, what you can think of mentally is kind of some of the proofs that you used to do when you were in school, which are mathematical proofs. But you must also remember that a lot of financial transactions or financial, data can be also written in the form of proofs.
Starting point is 00:07:43 For example, just bear it is in mind that the word proof is a very broad thing, I think, here, and it could include something like proof of owning 100 Bitcoins, for example. That is also a proof, right, Professor? Yes. Whenever a computation is executed and reaches a result, any computation, any computer program that you run that got an internet. input, ran, ran, ran, and then produced an output. You could think of it as a statement.
Starting point is 00:08:15 The statement is, from the input, I ran this computer program and got that output. That is a mathematical statement. It's a very general mathematical statement, and one that captures many, many things. These computations could be any computer program. And each one of them forms a statement. And now we can prove these statements about any computer
Starting point is 00:08:39 program being executed. correctly. Exactly. So you can apply it to financial statements. You can apply it to statements about healthcare. You can apply it to statements or computations that are related to law enforcement. You name it. Anything that a computer executes, you could get a statement out of it that you can now prove using all kinds of cryptographic proof systems. So in this case, to verify a proof, does that mean that the party that's verifying it has to have, you know, all the input data that went into that function? The answer is yes, but you could sometimes have a cryptographic reference to that input data. For instance, maybe there's a large input file that needs to be, you know, used by the computation, but to make things.
Starting point is 00:09:35 succinct, you could run a different program that gets only a hash of that data file. And now the execution is relevant to the hash of the file. So you could cryptographically compress inputs to make them very succinct. But at the end of the day, one has to start with a particular explicit input and reach an explicit output. And then there's all sorts of auxiliary inputs that maybe you want to hide them using things like zero knowledge proofs. So you start with some input, you reach some output, and in the middle you may use all kinds of auxiliary inputs that you don't want to be revealed.
Starting point is 00:10:19 So revealing inputs and outputs is an option. It's not something that you must do, but it depends on the statement. For instance, let me give an example. I can prove, you know, someone owns a Bitcoin. Okay, you know, I won't tell you anything about, let's say any transaction or any pointer to any Bitcoin blockchain, I could just prove to you there exists some Bitcoin owned by someone.
Starting point is 00:10:46 Well, okay, that's a statement that I can prove. But in the context of a financial transaction, that probably is something you don't care about. What you care is you want to know that I own a Bitcoin, that I know of a Bitcoin transaction. And when you want me to say that to you, want to see what is, you know, indicator of that Bitcoin transaction. So you would want to see some input. So for example, like, let's take this one example and then go through what kinds of proofs there are.
Starting point is 00:11:20 So the simple example I want to take is like in 2014, BitStamp the exchange did a proof of solvency. So they wanted to prove to the world that they owned around, I don't know, it was around 100,000 bitcoins or something. The way they did that is they called Mike Hearn. Yes, Mike Hearn. And Mike He came, he physically went to BitStamp
Starting point is 00:11:44 and he actually verified that what, which outputs BitStamp is claiming as their own and he had BitStamp sign messages with private keys corresponding to those unspent
Starting point is 00:12:02 outputs. So you know, maybe there's like 100 outputs in the Bitcoin blockchain. BitStyme was claiming, okay, I own these outputs and I have the private keys. So if you imagine me as BitStime, I'm trying to say that. And then there's Brian who is like Mike Hearn. And he comes to me and he says, okay, if you want to prove that you own these 100,000 bitcoins, this 100 outputs, then get those private keys, take this message from me and sign all of, sign this message by all of these different.
Starting point is 00:12:34 private keys. Then I go and sign and I send those signatures to Brian and Brian can check those signatures and say okay I agree that you have a hundred thousand Bitcoin spread over these hundred different outputs right. So this was done in 2014. Now for this what's what's relevant is we needed a human like we needed Brian to be there verifying it. Now what you're saying professor essentially is this this whole process could be automated by having, let's imagining an ideal program. What that ideal program does is it takes inputs, the private keys that BitStamp has, goes to the Bitcoin blockchain, checks how many Bitcoins corresponding to these outputs it has,
Starting point is 00:13:23 and then it outputs the value of the total number of Bitcoins controlled by BitStamp. So the input is the private keys, there's a program, and then the output is, is how many Bitcoins does the person who gave the input have? Right. Now, what you're saying is, BitStamp can run this program on their end and generate a proof that somebody else could verify on their side. And he doesn't need to be physically with BitStamp or communicating with BitStam.
Starting point is 00:13:56 Precisely. Yes. And these private keys would be what we. would call in our papers and talks auxiliary inputs or a non-deterministic witness to take a more technical term. So the private keys that are used to sign these various messages are data that you do not want revealed. You just want to prove that you know its existence. So this would be auxiliary input to our computation, this non-deterministic witness that proves knowledge of of the keys that control these 100 coins.
Starting point is 00:14:36 And the statement then is, the statement being proved is there exist 100 keys. There exist 100 keys. No one is revealing them such that, you know, they are plugged into the signing and verification of signatures. Algorithm, all of them, you know, are correct. And the total sum of them is, whatever, a thousand coins. So the statement is someone except.
Starting point is 00:15:02 execute the program and this program output the value a thousand coins and that program is the one we just described, the one that checks signatures and then sums up the amount of value in the coins. Perfect. Yeah, that's very interesting. Now, just to take a little step back, the term snarks, so there's the term snarks, right? There's the term ZK snarks, like zero knowledge snarks. And then there's also the term zero knowledge proofs. Can you briefly explain what exactly each of those are and how they're related? Yeah. I think there's an abundance of terms that talk about very similar things. And so zero knowledge, there are a lot of zero knowledge protocols out there. Okay.
Starting point is 00:15:59 theoretical and a few are implemented. And there are several different names for them. Sometimes some of them are called computationally sound proofs, and sometimes it's non-interactive, zero knowledge, and there's snarks and snarks, and I'm sure there are a few other proofs of delegation of computation, various different names for things that are somewhat similar. So zero knowledge means that informally means that the auxiliary inputs that are
Starting point is 00:16:38 involved in the computation, all these keys and stuff that is not explicitly, you know, produced as an output of the computer program, all this information is hidden and kept private. That's the informal description. There is also a formal definition that is a little bit more involved than I don't want to get into. So there are many, many different ways to get zero knowledge proofs. And then there are many, many flavors of zero knowledge proofs according to what cryptographic assumptions are used and what kind of zero knowledge comes in several variants.
Starting point is 00:17:15 They're statistical and computational and perfect and in all kinds of models. And I think that as you go into different models and different set up assumptions for these systems, you get various terms that. I'm not sure. I mean, the distinction between the various flavors is not something that the general public, you know, would want to go into necessarily. Okay, no, no, that's perfectly fine. And that's also a good answer. So we can know, we can sort of all lump them together and we can, we can know what we're talking about.
Starting point is 00:17:52 Yeah, non-cryptographers, I think, you know, calling them either ZK approved. or snarks or, you know, there are other names for them. I like to call them skip, succinct, computational integrity and privacy proofs. But, you know, there are similar variants of the same big phenomenon of an interactive proof that is zero knowledge and, you know, proves that a computation executed correctly.
Starting point is 00:18:27 Let's take a short break and talk about about hi.me. Hi.m. me is a VPN provider and if you don't know yet why you should need a VPN provider, let us help you. I'm sure you were like me and when all the crazy revelations came out during the Snowden time of all the spying that is being done by the NSA and other government agencies, you were shocked and you said, not with me, not with my own rights. Now, the way government agencies can spy on you, there's many of them, but the most easiest way is by simply, going to your ISP and getting all your traffic capturing all your traffic. And the VPN can protect you from that.
Starting point is 00:19:06 It can give you a secure tunnel from your computer to any of the exit nodes all over the world so that all your traffic goes to this secure pipe that's encrypted and cannot be intruded on. And with Haight.Me, you can choose any of their 30 exit nodes all over the world so you can enter the internet in a secure location. The best thing about Hyde.me is that they have a free plan, which includes two gigabytes of unthrottled bandwidth per month. So you can go to Hyde.me slash Epicenter to create your free account. And when you use that URL, you'll automatically get 35% off if ever you decide to go premium. Now the premium plans are really great.
Starting point is 00:19:43 They include unlimited bandwidth, access to all of the 30 exit nodes that Hyde.Me provides, and you can install it on up to five devices at a time so you can have this running on your phone, your tablet, your computer at work, your personal computer, and just be completely protected all the time. And of course, hi.me accepts Bitcoin. So we'd like to thank high. dot me for the support of Epicenter Bitcoin. So in this case, like when you're saying zero knowledge,
Starting point is 00:20:10 if you map it to the BitStamp example, the zero knowledge part is that when BitStamp is trying to prove that it owns 100,000 coins, it has some knowledge. The knowledge is basically private keys that unlock these different coins on the Bitcoin network, that is BitStamps knowledge. And then if I am a verifier of BitStamp,
Starting point is 00:20:36 I want to verify that they actually have this thing. Then it is in BitStamp's interest to not reveal any of these keys to me. They want to prove that they own 100,000 coins, but they don't want to reveal any of these keys, because I can steal those funds if I knew the keys. So, zero knowledge here means that they can send me a proof that they own 100,000 coins, while also not revealing their knowledge, which is the private key. So zero knowledge basically is a reflection of that, that they can keep their private keys and their knowledge secret,
Starting point is 00:21:16 but yet prove something to me about the nature of the knowledge that they have. Yeah. So I try to explain it this way. If I make a statement, like I own a thousand bitcoins, you're already getting knowledge. Now you know that I own a thousand bitcoins. Okay. Now we, and you gain some knowledge from that. Now we start interacting where I send you various strings of bits and you ask me certain queries and I send you information back. And the question now is, at the end of this extra process, after you heard the statement, what have you learned? And in a zero knowledge proof, there's a formal way of stating that you learned nothing. You learned as much as if you would have asked me your questions and I would just answer with, you know, 0,000,000, 0,000, like, you know, some completely random string. That's the informal interpretation of a zero knowledge proof where you learn what you learn from the statement. But the whole interactive or non-interactive, everything extra,
Starting point is 00:22:24 everything appearing in the proof or in the snark and this extra information is meaningless to you from a cryptographical point of view. It's just as tossing a few coins and using that thing. What do you learn from that extra about, you know, do you learn anything? If I just toss a few coins and tell you the answers, will you learn something about the keys that control my 100 bitcoins? No. So that's the informal explanation of.
Starting point is 00:22:49 what it means for a proof to be zero knowledge. It conveys no information beyond the veracity of the statement. So anything you can learn, you will have learned from the statement itself, not from the proof. So if you look at the bit stamp example like there, so who are exactly the actors and the participants in these things? So is there just bit stamp on the one hand that would, you know, do this proof and then supply it to some verifier who would,
Starting point is 00:23:19 then I guess either run the proof entirely or they would be interacting with them. So those would be two different cases. Is that right? Okay. So a lot of work has gone into. So we started talking about interactive proofs that there is a back and forth. And it's really crucial to get these savings in running time and all the amazing properties. You really need the interaction.
Starting point is 00:23:47 But then people, you know, ask. okay, can we sort of minimize this interaction or make it really, really simple and the amazing answer is yes, you can do various things that will make the interaction very minimal to the point of only needing a setup phase. So it's like a three message protocol or actually just a two message protocol. there's a setup phase where some queries, if you will, are broadcasted. And then after that, everything is non-interactive. So after that, there's just a prover using the information that came from the setup phase
Starting point is 00:24:35 and using it to cook up proofs and sending them or broadcasting them outside to the outside world. and then anyone can at their leisure, at their leisure, you know, check these proofs. So there still is interaction, but it's a very minimal kind of interaction. I mean, formally, there's a very small number of rounds of interaction. There's one publishing of, you know, set up parameters. And then beyond that, there's only the proofer sending proofs
Starting point is 00:25:07 along with the statements being proved. So it still is interactive, but, you know, very minimal interaction. So like in this podcast, before we jumped into this example, you were going through this, you're telling this story about in the 80s, you had interactive proofs and now you have these ZK Snarks. And the key takeaway, like I'd like to ask if this guy, the key takeaway is in the 80s if there existed bitch stamp and it wanted to prove me
Starting point is 00:25:42 this thing in the 80s, then we would have to exchange a lot of messages, right? I send him a query, it replies, I send him again, it replies, and maybe after a thousand rounds, I'm convinced that, okay, this guy actually has a thousand bitcoins. What has happened over the last 25 years is I don't need to keep on sending these queries receiving replies in order to know, I can basically, the bitstime can just generate a string, send it to me and I can read that string and verify that this is true. That is the advance. Not exactly.
Starting point is 00:26:20 So I think it's just, it's more that theory progresses very quickly. And there's usually in advanced science a few decades go by between an initial idea and even its implications and various improvements on it. and, you know, technology making its way into the real world. This happens all the time in computer science. For instance, error correcting codes were started to be, you know, devised in the 40s and, you know, various other advanced codes were then devised in the 60s.
Starting point is 00:26:56 And it takes really like two to three decades for those things to show up inside actual systems. Even if, you know, the things that show up in the end are pretty much known at the start. So now to answer this particular question, if you asked a theoretician in the early 90s, you know, if we wanted to do this computation that you just did, do I need a lot of rounds of interaction and so on and so forth, the answer would have been theoretically not. You know, you have very, very, very good systems, theoretical ones that work really extremely well, have an even better setup. We'll get to the setup process later. And then, you know, the true. trust assumptions. So even better things were known in the 1980s, in 1990s, but converting them
Starting point is 00:27:44 into practice, I mean, a lot of the initial theoretical constructions were extremely inefficient from a practical point of view. So again, to sum up, in the early 90s, there were very good answers that gave already non-interactive, zero-knowledge proof systems that had, asymptotically and theoretically pretty much the same performance as the snarks that showed up only recently, but the constructions were not efficient enough to be done in practice. Now, I should say it's not that research is not advanced. I want to, so the currently published version of snarks that's out there that Zero Cash uses and others did benefit from several beautiful recent results that said something like this.
Starting point is 00:28:41 Because these early constructions are a little bit too inefficient, these constructions based on probabilistically checkable proofs, let's use additively homomorphic encryption to bypass the problematic parts. And additively homomorphic encryption is something that's been around for some time, but this idea to use additively homomorphic encryption to speed up the way you use probabilistically checkable proofs is a relatively new idea. It started in 2007 and a work of two of my peers at Technion, Yuvali Shai and Nail Kusiliwitz, and a researcher from UCLA, Rafael Ostrovsky, and in this beautiful paper, they said, you know, if we use additively homomorphic encryption, we could speed up and circumvent various technicalities in previous attempts.
Starting point is 00:29:38 And this led to a long sequence of works that culminated in a beautiful construction by Gentry, Gennaro, Parano, and Reikova that uses something that introduced something called quadratic span programs and quadratic arithmetic programs that really became quite efficient. and they are the theoretical basis of the currently implemented systems. The two, I think, most adopted ones are Pinocchio and Lib Snark. So again, in the 90s, they knew to do really amazing things theoretically. Constructions were inefficient. 2007, marvelous breakthrough said, let's use additively homomorphic encryption to bypass certain technicalities. That thing has led to very rapid progress,
Starting point is 00:30:26 that led to these systems that we use today, I should mention that the old school kind of systems are still very much interesting and we're still working on getting them off the ground as well. Perhaps we could, so that's a very interesting history that theoretically it was there for 20 years and now only it has started to become practical as an engineering solution to certain problems.
Starting point is 00:30:55 So we will want to go into what is the current state of implementation because a lot of our audiences are like programmers who might, you know, someday want to use these systems. So they might want to know what's going on right now. But before that, perhaps we should go into the setup phase. Like in your scenario, you said that there's this set up phase that is needed before in our case, like BitStamp can actually generate the proof. So what is the setup phase like? Okay, so the setup phase, I should say, and this goes back to the old systems versus the new systems. So in the old systems, the old theoretical systems that have not yet been reduced to practice,
Starting point is 00:31:42 in these old systems, the setup phase consisted of picking a very short random string and posting it. So for instance, practically speaking, you could take the first two, 256 digits of pie and say, this is, you know, this is the system set up. And from here on, it's non-interactive and everyone uses it. Okay. But again, these systems have not yet been implemented. The newer systems that you would use additively homomorphic encryption have a more involved and susceptible setup phase that you have to do really carefully.
Starting point is 00:32:21 In this setup phase, the parties running it are going to use asymmetric encryption, so some forms of public key cryptography, to take some secret and cook up from it various queries. Remember, we talked about an interactive proof where the advocate is going to ask, you know, the proof of various questions. So with the use of additively homomorphic encryption, you take those queries that you would want to ask later on and somehow encrypt them in advance in a way that the proofer, the honest prover can answer them very easily and efficiently, but a prover who wants to prove a false statement will not efficiently be able to answer this question. So this setup phase involves some secret that if it is somehow leaked or revealed, you know, can ruin the system in the sense that knowing that secret, that trapdoor, would allow whoever knows it to, you know, forge pseudof proofs of any statement, true or false. So in the new systems, the ones implemented, like Libsnarc and Pinocchio and Hawk and Zero Cash, the setup phase is a really, really critical one, a really critical phase. It has to be carried out right. And carried out right means that the party that runs it has to basically be unobserved
Starting point is 00:33:54 and then destroy the computer or laptop or whatever it used to generate the initial part. in the interactive proof. Okay, so, so to rephrase that, like, let's, let's take the bit stamp example again. And now let's say that Mike Kern instead of going by there and looking at their computers and all, you know, he was going to play that role, but using, you know, zero knowledge proofs. So let's say he, he was participating in that setup phase to, to, as the party interacting with BitStamp. So that means he, he was, he was, he was, he was, he was, he, he was, he was, he, he would have to have some secret key for the setup phase,
Starting point is 00:34:40 but if he leaked that to anyone, then they could forge the proof. Is that correct? That would work, but it's a little bit better. So suppose everyone knows the computation for checking solvency. It's a well-defined computation. So someone, it doesn't have to be my current, but someone trusted or someone trusted by everyone, should run the setup phase on behalf of everyone else,
Starting point is 00:35:10 and then publish to the world, this is the key that everyone uses, or the set of keys that everyone uses, and then that person could disappear. Now, there's no need for Mike Hearn or anyone else. I mean, everyone knows the computation. Everyone has the key to proving and the key to verifying it. Anyone can run this computation,
Starting point is 00:35:32 and this company can run the computation and just generate the output along with the proof. Right, but you still need that trusted party, except that there's a different role for the trusted party, and the trusted party doesn't have to verify the process, doesn't have to verify the keys, they just have to produce some secret and then not leak it, right? Yeah, they have, they consume a certain secret,
Starting point is 00:35:58 which is, you know, just some random string. And that random string is used to generate a key that is revealed to everyone. And the important thing is that the random string should be just destroyed. Because if it is leaked, it could be used to forge. It doesn't break zero knowledge, but it could be used to forge statements, sorry, proofs of any statement. I'm sure the researchers, including you, must have developed some of hardware or other systems that kind of give high certainty that the secret has been destroyed.
Starting point is 00:36:39 For example, let's say I want to do the setup phase professionally, right? Like I want to make my profession just setting up these zero knowledge systems and then destroying their secrets and then that's it. And everyone can use it and that's what I want to do. right now now the world doesn't trust me obviously because to the world i'm just just a single guy in in a in a strange city they don't know of why would they trust me to to set these systems up so have have the has the community developed some kinds of methods by which a gen set up person like me could ensure the world that the way i've set it up is actually right and i've
Starting point is 00:37:24 destroyed what needs to be destroyed Unfortunately, to the best of my knowledge, no, I don't have a good answer for that. The best answer I have for that is so my colleagues and I, in particular, Professor Matt Green, Alessandro Kiezer, Eran Trom, Eran Madal Zvirza. Last year, we published a paper showing how you can do multi-party, secure multi-party computation, to amortize this harmful secret, this trapdoor among several parties. So, for instance, we could all participate in this protocol. And what we proved in that paper is that if even one of us is honest in the sense that we actually do destroy our computer and don't reveal the secret,
Starting point is 00:38:22 then even if the rest collude, the whole system is still secure. But this isn't the ultimate answer that should convince everyone that things are okay, because some central agency maybe could ease drop on all our computers, or maybe we will have a large enough incentive to actually reveal these, keep these secrets and combine them together, even if we do this multi-party computation protocol. So from a cryptographic point of view, I'm not aware yet of any research that shows how to, in this additively homomorphic encryption system, make sure that there is no key that can be used.
Starting point is 00:39:09 Now, you can do this. The good news is that the good old systems, the one that have not been implemented yet, don't have this problem at all. there is no trusted setup, period. The first message is, practically speaking, I already can tell it to you. It's the first 256 digits of pie. Or pick any other random looking string that's probably good enough for practical purposes. And there's nothing that any of us can do that I'm aware of that could break such a system.
Starting point is 00:39:42 So there is hope beyond the horizon. It's just not yet ready for. practical use like zero cash. Okay, that's interesting because I was going to say before, well, with the BidSamp, example, you know, if you say like we don't want to trust Mike Earn to destroy that thing, well, maybe what one could do is just run it, you know, five times, right? Five different people do it. And then Bidstamp has to prove it for each one.
Starting point is 00:40:08 But then what you're saying, right, is the even simpler way would be that sort of the five people together generate that secret. and then if even one of them is honest, then Bidstamp only has to verify it once and you have that same sort of security. Exactly. Yeah. So what we proved is that if the secure multi-party computation protocol
Starting point is 00:40:33 that we suggested is used, then it's enough that just one party used, destroyed the randomness that it put into the baking of this first message, these parameters that everyone uses afterwards. And again, ultimately, when it comes to systems like zero cache or or whatever other systems are built using the encryption-based snarks, you would have to, I mean, users and customers would have to use their own discretion and decision about whether they trust the party or the process that generated
Starting point is 00:41:07 the keys of the system to be trustworthy. because there is no cryptography in that flavor of systems that assures you that the key is okay. Today's magic word is proof. P-R-O-O-F. Head over to let's talk bitcoin.com to sign in, enter the magic word, and claim your part of the listener reward. So in this world of the trusted, like, let's call it a little. trusted party snark what are the kind of the performance implications of of doing something like this like let us go back to the initial definition which is
Starting point is 00:41:56 like there's a program there's some input to it and there's some output and I have a computer that's running this program and I get an output and I want to prove to the world that the output actually matches this input and program so I generate a snark so what what what performance penalty do I get and then there's And if I send you this proof and you want to verify this, what performance penalty do you have? So the tradeoffs, very roughly speaking, are that the prover pays a lot and the verifier pays very little or even benefits. Because verification and the published snarks is around 9 milliseconds is really, really fast for any length of computation. So theoretically, you could take a computation that would take.
Starting point is 00:42:45 even without proving it, you know, a year or something really ridiculously large and cook up its snark and still it would take only nine milliseconds to verify. Okay. So verifier really, really gains a lot in or saves a lot. Now, Prover pays a lot. It's much more time consuming to generate a proof than just to run a computation. But of course, you have to pay for, you know, being able to. to prove something. And I would say that currently, you know, within a few hours, you could get a computation that's roughly something like, let's say, up to a million cycles of a simple machine.
Starting point is 00:43:34 That's roughly where we get stuck in terms of RAM consumption. So roughly a million cycles, or maybe 100,000 cycles, some very simple machine that has small word size, some early version of an X-286. I mean, that's the kind of systems that we actually built and tested. So, for instance, zero cache, which puts our system pretty much uses it to its limit,
Starting point is 00:44:05 involves something like 128 invocations of shot 256, plus a little bit of logic. So the big thing there is the number of invocations of SHA-256, and it will take you roughly an hour to produce a proof that you ran 128 invocations of Shah and did some minimal logic, you know, integer addition and stuff like that. Now, I mean, I'm sure that our Bitcoin audience viewers know that you can do Sha-50s, Shah 256 pretty quickly on modern computers. And I'm telling you that to prove that you executed it,
Starting point is 00:44:47 with a snark would be one hour for roughly 100 Shah. So that's right. What's the rate of that? It's pretty small, but you know, you get zero knowledge and you get proofs. So isn't that a constraint? Like if I want to spend a zero cash in the future, I want to spend a Zed cash coin, does it mean I need to generate the proof that I have the coin and I am spending it correctly for, I need one hour to do all of these computations?
Starting point is 00:45:21 So in the academic paper, as an academic, I would say, yes, you spend one hour. I know that the engineering team is looking at various things, like getting this number down. And I think that when you'll talk to Zucco, who's the CEO of Zcash, he'll probably be better suited to answer these kinds of questions. But everyone who knows computers knows that anything that runs X now in an academic version, probably when it gets to production grade and after optimizations, is going to be much faster than X. So can you walk us through the landscape of landscape of systems?
Starting point is 00:46:03 systems that are available today and can be used by very good programmers to experiment with these technologies. Like you mentioned Pinocchio and Libsnark. Could you explain what else is there and what's the difference? So I think the basic systems that I am aware of are pretty much these two. Pinocchio and Libsnarc. There may be differences in the licensing strategies, which probably matter. to a lot of programmers. I don't know about Pinocchio's,
Starting point is 00:46:37 you know, the license that's used there, but I'm sure it's posted there somewhere. Ours is an MIT, open source, MIT standard license. So it's, I think, very welcoming to programmers. And it's available at LibSnark, the GitHub repository. Additionally, those interested in playing with other systems could look at Zcash that has
Starting point is 00:47:03 has its own GitHub that uses LibSnark and also already has an Alpha TestNet running. So, and that's another place to use, to look at code and use it and play with it. I am not aware enough of other basic systems that other than Pinocchio and LibSnart, but, you know, there's there's Google. I know of Pinocke and Lip-Snark and I know of Z-Cash. I'm sure there must be others that already improved or forked, but I'm not the right person to ask. So we mentioned zero cash, right? So that would be one application to make basically something like Bitcoin, but truly anonymous. What other applications are there for this kind of technology? And then what are the ones that you are most interested in?
Starting point is 00:48:00 It's a very good question. And it's one that's one that's It's a little bit hard for me to answer because I'll explain why. The things that interest me most are getting these things to be universal, which means it's a little bit like building a compiler. So whoever builds a compiler, if you ask them, you know, what is the program that you would like most to compile? I'm guessing they'll say, well, you know, all of them. And, you know, I just want to build the compiler to be as good as possible. So the mindset that we have is often more or less like this. We first try to build universal systems that apply to any computer or to a large class of computer programs,
Starting point is 00:48:37 and then you could look at applications. That's actually what happened with Lib Snark. First, we published a bunch of papers about how you could get arbitrary C programs to compile into zero-knowledge snarks, and then we optimized it and downscale the functionality, but also optimize the speed and efficiency to build zero-cats. So any computer program as long as it's not too many cycles, let's say a million cycles of some simple machine is probably where we top out right now is a candidate. And I think the first things would be things within the financial domain. So smart contracts, you know, not just proving things about statements, but maybe, you know, that you pay taxes, that you, you know, that you comply to certain regulations. So certainly financial applications are really, really good for that.
Starting point is 00:49:35 Let me say why, because they're very simple as a computer program. This is not loading up bootstrapping the Linux kernel. It's just addition and subtraction of integers, right? That's what money is about. And these computations are really important. They have a lot of economic incentive to them. That's another good thing about them. And yeah, so they're really simple and as computations and they're really important.
Starting point is 00:50:05 I think other domains where it may be interesting is, of course, security and cybersecurity, all kinds of things related to, you know, gateways and entry points to systems and proving system health. Other things would be in the healthcare area. And as I said, legal and government areas where, you know, privacy is really valued and important. Those are areas. But, you know, when I go to the lab and talk to my peers, what do we talk about? Not so much about the applications about getting the systems more efficient, new functionalities and so on. So if you look at this from a higher level, would you say that zero knowledge proofs are basically always valuable when someone wants to verify that the process is done in a certain way and that it's still.
Starting point is 00:50:57 done correctly without having access to some private data. Yes, that's why I like to call, you know, that's what we call our labs, Skipper Lab, succinct, proofs are short, computational integrity, you want, you expect integrity from your computations, you want the output to be correct, and privacy, succinct computational, integrity and privacy research, Skipper Lab, that's, yeah, that's how I would describe it. You want all computations to have their outputs produced with integrity without cheating correctly. Okay, that's great. And so that sort of ties into some of the areas you mentioned, right?
Starting point is 00:51:38 So if you talk about finance or you mentioned that like, for example, let's say some bank could prove that before every payment, they checked a certain database and the result, they get back was, you know, the person is not a terrorist. or, for example, with the tax authorities that they could prove certain things about maybe how the money was spent, like some sort of accountability to the public or auditing function could be interesting, right? So that... So, okay. And what about the realities of this? I mean, are you seeing adoption? Yes.
Starting point is 00:52:17 Okay. for my seat as a professor doing research who before this only did, you know, only did math improved theorems, I see a lot of interest from, you know, people like you. I've never been interviewed about my work on propositional proof complexity and resolution. And, you know, I've never been asked to be interviewed about it. And, you know, those are results I'm very happy with. So I'm joking here. But what I mean is I think there is adoption, but I'm also not the right person to ask
Starting point is 00:52:51 because I'm not really a startup entrepreneur. I don't really know what the trends are. I think that as a scientist, I'm very excited to see that people outside, you know, the mathematical, theoretical, theoretical domain of computer scientists is actually interested in talking about this and hearing more. So yeah, that's the sign of adoption.
Starting point is 00:53:14 Yeah, that's really a sign of adoption. perhaps it's just the nature of like distributed ledgers, like, you know, whether they be Bitcoin or the ledgers that people are building for banks. And we kind of see that this whole, our whole space like A to Z, we are going to hit this big wall where none of us have an answer for financial privacy. Like with Bitcoin, what happens is generally Bitcoiners are the people who care enough about having something independent of government to say, okay, I'm willing to potentially sacrifice my financial privacy to have this, to transact in this system that's not
Starting point is 00:53:57 controlled by a government. But I'll be immune to things like the Greek crisis or the Brazilian crisis or whatever, right? And that's kind of the person I am. But whenever you try to go and say that, okay, a bank should use a system like this, then, you know, you get into a system of situation. where the banks already have give you financial privacy and why why should they move to another system which doesn't so this is kind of our this is the reason i think there's a lot of interest into this line of research because it it seems one thing that can tie untie the guardian not
Starting point is 00:54:35 really and allow this tech tech to go mainstream and you know careers to be made etc yeah i think that this this hits upon a point that's common to things like ZK Snarks and to Bitcoin, which is that these are technologies that are very much for decentralization and for, you know, the little guy out there and anti-establishment. Whereas a lot of the world and a lot of business is about trying to be that trusted party, the portal, you know, the portal where everyone goes through the hub, the central place. That's how business succeeds.
Starting point is 00:55:16 And now the question is, does business want or needs something like this decentralized thing? I would say that past experience is the Internet. You know, the Internet is something very decentralized, very distributed, and it did spawn a lot of business, good business. It is always constantly under these pressures that make it more centralized. I'm sure some people would say it is already completely centralized and it's just one big scam. I don't know. I don't think that's the common view. So there are these forces of centralization and decentralization always going on in society.
Starting point is 00:55:55 And Bitcoin, as I understand, undergoing one of those spasms as we speak, right? You know, is it really decentralized anymore? Is it controlled by mining pools? I don't know. I'm not a Bitcoin expert. I'm sure similar things can also apply and will apply to systems, built on zero knowledge proofs and things like that. So, so a few, a few applications, perhaps, perhaps you're not the perfect person for this
Starting point is 00:56:24 conversation, but I'd still want to pose a question anyway, because I don't know who is, to be honest. Like, I have this feeling that zero knowledge proofs could, could help in scaling up things like Bitcoin and Ethereum. Like, let me walk you through my thought process and tell me if it's right or wrong. For example, in Ethereum, you have these smart contracts, right? And, like, if I and Brian do a smart contract, so there's a program in the middle of us that's mediating our financial relationship, that program needs to be checked by all of
Starting point is 00:57:01 the nodes on the Ethereum network. So there might be like 6,000 of them. So it's one single computation repeated like 6,000 times across the network. Now, the simple question I kind of get in my mind is, this is a technology where computation can be repeated once, done once, and then checked 6,000 times that the result was correct or not. So it naturally seems something that could help, like Ethereum scale, that you could delegate some of your computation to a specialized node, and that node does the computation and it generates a proof, and then all of the rest of the network just verifies that proof. that allows massive scalability. Would you say this view is right? I would say that it's completely right, theoretically.
Starting point is 00:57:48 And now practically, it remains to be seen. It's very challenging practically. Because, you know, taking this ballpark number of roughly we max out at a 128 shot 256 invocation. So if you take just even a single block with thousands of shot 256 and, you know, thousands of, ECDSA signatures, you know, we're not up to scale right now with that. So theoretically, yes, it could compress, you know, again, theoretically you could take the whole history of the blockchain from the Genesis till today and have a single 288 byte snark that attests to the fact that the, you know, you don't need to know anything just that all these transactions till the very latest state is okay. theoretically. Practically, I think we're heading that way, but it will be a non-trivial scientific
Starting point is 00:58:44 and then engineering challenge to actually compress practical things that could help in the scaling problem. But yes, I'm very optimistic in the midterm and long term. In the short term, it's challenging scientific work that we're thinking about. So what's midterm, but in your definition, in your dictionary, what is midterm? Well, you know, a few years as opposed to many, many years.
Starting point is 00:59:15 And short term would be, you know, a few weeks and then midterm is, I don't want to give any, you know, dates or, you know, any prophecies about when something may happen or not. It's sometime in the future. So perhaps like before this show, me and Brian were talking about this technology and we were talking about this technology. and we were having this discussion about how, about how, like, like with Ethereum, you see that they have, you have smart contracts and with zero knowledge, you could delegate this computation to a specialized node. But for Bitcoin also, we were, we were thinking of this other example where,
Starting point is 00:59:53 let's say like a miner creates a block. And now the other nodes, other miners in the network need to download this, block, verify it, and then start building on top of it. Right. And like if the block is extremely huge, like if the block is like 100 mb, so let's say a minor in like Xinjiang province China generates this block, that's 100 mbs. Now for that block if and I'm, I'm sitting in Iceland and I'm another minor for that 100 mb block to come to me is going to take 10 minutes or 12 minutes. And in that time that minor in Xinjiang has an advantage over me because he knows the block and can build on it, but I don't know the block and I don't build on it and
Starting point is 01:00:39 I kind of lose out. So if you wanted to scale Bitcoin to a massive block size today, you would have mining centralization. Like if Chinese miners are dominating, like they would communicate all the blocks to each other and then, you know, they have an advantage over a minor like me in Iceland. Now with ZK snacks, what we were thinking was like the miner in Xinjiang generates a block and then generates a ZK Snuck proof that says, okay, the accounting in this block is correct. And he just sends the block header, like, you know, with the numerical route
Starting point is 01:01:14 and the new UTX output, says the block header and the proof to me in Iceland, and that's just maybe, you know, one MB or two MBs. So that comes very quickly, like in 500 milliseconds. And I can also start building on it. So it allows the miner in Iceland to stay on par with the miners in China. and then actually Bitcoin can scale the block size using this technology.
Starting point is 01:01:39 Does that make sense? Again, yes, I think it does. Theoretically, practically, I don't think we're, you know, we're not yet in the hundreds of megabytes or hundreds of millions of Bitcoin transactions. But yes, theoretically, I think that's, you know, something we would like to do at some point when the technology is sufficiently efficient for that. The question here, I guess, would also be, how long does it take you to generate that proof, right?
Starting point is 01:02:10 Because, of course, like, that also delays, how long it takes until you can send it out to the rest of the network. Yeah, precisely. So that's actually the only, that's the only bottleneck. Because length of proofs is very small. It's going to stay around 300 bytes for these ZK snarks. and verifying them as nine milliseconds, irrespective of this 900 megabyte.
Starting point is 01:02:39 So it's really the only bottleneck has and always will be on the prover side. And right now you can't prove, you know, feasibly, right? Within our lifetime, a block of 100 million transactions. And then I guess another application could be, when you talk about and I guess this ties a little bit into the Hawk project, I think, to have smart contracts run off-chain, right?
Starting point is 01:03:10 Could it also be used for that? Yeah, similar. It's the same thing. So you basically say, or for two-way pegging, you say, you know, here's a proof that, you know, some sequence of transactions was conducted offline someplace. And, you know, all of them verified because the program, that you're proving to me is the program that checks transactions and just reports, you know, the final state is that whatever, that side chain now has a partition of coins
Starting point is 01:03:42 along, you know, this small file or a proof that Bitcoin will accept. It says you can now redeem a coin back in Bitcoin because something good happened in a side chain without knowing what went on. Yeah. So one, one, One other interesting thing that I see in the literature around cryptocurrencies and zero knowledge proofs is, like, many people say that this technology is something that is too complicated to be implemented by the normal developer. Right. Like there have been like posts across Reddit in Ethereum's blog that says that it is significantly involved. And until you get to the point where a normal programmer can just use this thing and not worry about it, just like we use our operating system today and don't think about it, it's hard to kind of spread this. Do you see that as another challenge for the technology?
Starting point is 01:04:46 Not in the long run. I mean, you just mentioned operating systems. We're talking over the internet with myriad complexities of networking. And, I mean, that's all really complicated. never wrap my head around any of these things. And yet it works seamlessly, right? I open up the computer. I talk to you guys.
Starting point is 01:05:06 I have no idea what's going on, you know, vision compression, you name it. So, no, I don't think this kind of cryptography is any different. It's advanced technology that just is emerging from, you know, the science labs and it will take some time. But the same thing happened to lasers and rockets and the Internet and computers in general. So why would this be different? The only thing would be if this is not useful, then not enough smart people will try to use it. And then, yeah, it will rot on the shelves. I really, really hope that's not going to be the case.
Starting point is 01:05:41 And then, you know, you'll have a whole industry and people will use it. And it's going to be all over the place. That's my hope. I'd say, like, if we as the blockchain industry survive, you will find your use. Okay. So, yeah, I mean, the use is dependent on us. What happens to the whole industry that's there today? Yeah, no, I totally agree.
Starting point is 01:06:08 I think it seems necessary, right? If you look at blockchain, you look at smart contracts, then it's just like, well, this has to be sort of an integral part of the technology stack in the future, right? And it's going to be a challenge, too, because the blockchain. blockchain stuff on its own is a challenge. And then you add that, it makes even more challenging. But I think you're totally right.
Starting point is 01:06:33 If the incentive is big enough and there's enough at stake and enough interest in it, people will make it happen. Yes, I agree. So, Ellie, thanks so much for coming on. That was really enjoyable and super interesting to dive into this topic. Thank you so much for having me. I really enjoyed it. And I'm glad you've chosen a field that is generating such enormous interest. It chose me.
Starting point is 01:07:06 Okay, so for those who want to dive a little bit deeper into it, of course, we'll link in the show notes to Ellie's papers. There's a paper on Snarks and also one on Zero Cash. And I think we'll also be doing a podcast episode on Zero Cash in the near future. with Zucco. And there's also some links to, I think, a talk he gave it at a Bitcoin event. And we'll have some of the links in there also to his website where you can find the rest of his research papers. So with that, thanks so much for listening. So we're part of the Let's Start Bitcoin Network. So you can check out lots of great podcasts about cryptocurrencies, blockchain, et cetera, on lessop Bitcoin.com. And of course, you can get eps and a Bitcoin on any
Starting point is 01:07:53 podcast application or you can watch the videos on youtube.com slash epicenter bitcoin and if you're a loyal listener if you want to participate we have this t-shirt contest if you leave us an iTunes review you can send us an email at show at epicenticicon.com and we can send you a t-shirt so thanks so much and we look forward to being back next week

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.