ACM ByteCast - Yael Tauman Kalai - Episode 47

Starting point is 00:00:00 This is ACM ByteCast, a podcast series from the Association for Computing Machinery, the world's largest education and scientific computing society. We talk to researchers, practitioners, and innovators who are at the intersection of computing research and practice. They share their experiences, the lessons they've learned, and their own visions for the future of computing. I am your host, Brooke Kifle. In an era marked by ubiquitous computing and continuously evolving threats to data security,

Starting point is 00:00:34 the importance of cryptography in our digital lives cannot be overstated. It serves as the bedrock of our technological systems, preserving data privacy and ensuring the secure transmission of information. From encrypting and decrypting messages to verifying transactions and authenticating identities, cryptography encompasses a wide array of applications crucial to our digital age. And as technology advances, new challenges and open problems continue to emerge, requiring innovative solutions. Today, we are joined by Dr. Yael Kalai, whose research has not only advanced the theoretical foundations of cryptography, but has also influenced practical applications with many

Starting point is 00:01:15 implications. Dr. Yael Kalai is a senior principal researcher at Microsoft Research and an adjunct professor at MIT. Her main research interests are cryptography, the theory of computation, and security and privacy. She's especially known for her work in verifiable delegation of computation, where she has developed succinct proofs that certify the correctness of any computation. In addition to making breakthroughs in the mathematical foundations of cryptography, her proofs have been useful in areas such as blockchain and cryptocurrency.

Starting point is 00:01:50 Dr. L earned her Bachelor's of Science in Mathematics from the Hebrew University of Jerusalem, an MS in Computer Science and Applied Mathematics from the Weizmann Institute of Science, and a PhD in Computer Science from MIT, and is the recipient of numerous awards, including the 2022 ACM Prize in Computing. Dr. Yael Kalai, welcome to ByteCast. Thank you. Nice to be here. You know, you have such a remarkable background. You've studied under some of the most renowned cryptographers, and you yourself have made major contributions to the field. But what I find interesting is most people have a pretty unique journey or story that has led them to where they are today. So can you highlight some of the key inflection points

Starting point is 00:02:37 within your personal and professional career that have ultimately led you to where you are today and into the field of computing and cryptography as your field of research? Yeah, sure. So, you know, growing up and as a young adult, I really loved mathematics. Actually, I have to admit I was not good at anything else. So I felt like that's the only path that makes sense for me. And I studied, as you mentioned, my undergrad was in mathematics. I loved it deeply. I really fell in love with the subject. And I actually started doing my PhD, like my master's, in mathematics, also in the Hebrew University. I left after one semester.

Starting point is 00:03:25 But I think the reason I moved to computer science was, it's interesting, I still feel like I'm a mathematician in disguise. What really interests me is the beauty of mathematics. However, when you go to just pure math, it's very hard to find questions that are, I felt like, that are not incremental, that I felt are groundbreaking. I mean, there are, of course, groundbreaking questions, but they seem almost impossible to solve. So when I tried to, when I just stepped foot trying to do research in mathematics, it felt like things are either close to impossible or very incremental. And I couldn't find a problem or a direction that excited me.

Starting point is 00:04:12 So studying the field was exciting for me. But doing research was, I struggled to find something that excited me. And that's when I kind of took a step back and started thinking about, you know, where maybe I can move a little bit to the left or a little bit to the right and kind of still do mathematics, but in a way that's more not so incremental and something that will may have some influence and maybe, you know, more than one person will read my paper, you know, a paper that I write or something that will have more significance. So, you know, I really wanted to have real world impact and do base kind of fundamental mathematics at the same time. And

Starting point is 00:04:56 these two seemed a bit contradictory. And I kind of looked around to see whether there's kind of a field, a subfield of mathematics that I can converge to. And that's when I found theoretical computer science. So computer science is a much younger field, of course, and there were still a lot of open, kind of seem fundamental open problems that, you know, did not seem impossible to solve. It was young enough that, you know, open, like new interesting problems came about often. And underneath the actual problems that we're thinking of is really problems in discrete math. It's really mathematical problems.

Starting point is 00:05:37 So that was kind of the journey that took me to the Weizmann Institute. In particular, I went there because actually, I don't even, okay, to be frank, I don't actually know really how to program. I'm not interested in programming. You know, I have to say when I got the ACM Prize in Computing, I told my kids, all three of my kids, even my young girl knows how to program. And my older two kids are really amazing programmers. And the reaction when I told them, it was like, really? Is this a joke? I mean, do they know that you know nothing about computing? So really, my interest is in basic math, but theoretical computer science and cryptography in particular have a lot of that to offer. And I think the reason I went to cryptography within kind of theoretical computer science is many factors. I'm not actually 100% sure. Definitely one main reason is I took a class, but you mentioned my

Starting point is 00:06:40 renowned mentors. I took a class by Adi Shamil in Weizmann Institute, who was a Turing Award winner. And I was just blown away. I was blown away by the subject. I was blown away by him. I was just, I remember my eyes lit every time I walked into his class. And so, you know, I don't know if, you know, how much of it was based on just his personality and how much he captivated me and how much was the subject at hand. My research is not solely in cryptography. So I do also research in other areas of theoretical computer science. So I can, you know, I feel like I'm a bit malleable within theoretical computer science. But one thing I really love about cryptography is the questions that we ask are so fundamental and so philosophically interesting.

Starting point is 00:07:36 What does it mean to know something? What does it mean not to give information? How do we define that something gives no information? Like it's very philosophical terms that we need to state mathematically and rigorously. And I find that really interesting to kind of, and I think that's partly why, you know, so we're actually dealing with things that we want to use and apply. And, you know, so we're actually dealing with things that we want to use and apply. And, you know, like we want to give proofs and we want these to be, not to give information. We want these, we want to be able to verify correctness of computations.

Starting point is 00:08:15 What does that mean? And how do, so I like, I really enjoy the fact that we're dealing with real world problems that are just interesting problems. I can explain them to my mother. I would say my father, but my father happened to be a scientist, so that's not meaningful. But yeah, so I can, you know, and so I feel like I'm working on problems that are interesting, and I can explain them to the public. Of course, how we solve them, I can't explain.

Starting point is 00:08:46 Of course, how we solve them, I can't explain. But at least the problem itself has general interest, in my opinion. You know, it seems like you have a strong excitement and passion for open questions and solving problems. And I love how it seems like every scientist always starts with this love or passion for math and then, you know, coming across an instructor or a mentor or a professor that really helps you further solidify that interest. So, you know, I found it funny when you said you felt like you were a computer scientist in disguise because I remember taking a lot of math courses in undergrad and feeling like a math student in disguise. So I think it goes both ways. So all that has ultimately led you to, and you said, amongst many other research areas, cryptography is one of the segments within the field of computing that really struck your interest. But I want to start off high level because not a lot of people might actually know what cryptography really is. It's a fundamental pillar of modern security, but like you said,

Starting point is 00:09:51 the technical aspects can be very complex. So could you provide a high level explanation of what cryptography is and why it's important in our digital age? Yes, definitely. So most people think of cryptography or what cryptography used to be is a way of securing communication. So in other words, if I want to send a message to someone, I want to make sure, A, nobody else can read this message, that when I send this message over some network, an adversary cannot kind of see what I'm sending. So I want to be able to communicate privately. Another thing I want is to be able to make sure that the message I sent was indeed received without being changed.

Starting point is 00:10:38 So I want to make sure that we have some form of what we call authenticity, that the message that was received, maybe it was dropped. That may be, you know, that just it didn't, you know, an adversary can just perhaps drop a package, a packet. But I want to make sure that if it's altered, then the receiver will be able to know that. So if something happened to it,

Starting point is 00:11:00 it was like, oh, wait, something's wrong. That's not Yael's message. So what I want to ensure is, oh, wait, something's wrong. That's not Yael's message. So what I want to ensure is A, nothing about the message is leaked. And B, if it's tampered with, the receiver will know that it was tampered with. And so that's what mostly kind of what cryptography was like for many, many, many years. Today, as you're saying, with the way the digital age is changing, cryptography is much broader than that. And a lot of things that we deal with today actually have to do with securing computation and not only communication. So what do I mean by securing computation? So today, a lot of large-scale computation is happening.

Starting point is 00:11:50 And, for example, our digital medical data is sitting somewhere. A lot of our private data is sitting in various places, various servers. And we want to make sure it's stored correctly. But moreover, we want to be able to do some computation on this data. For example, maybe we store our data somewhere in some hospital, but we want to allow a researcher to run some computation on this data. How do we run a computation on an encrypted data? So we don't want to give the researcher all the private data.

Starting point is 00:12:36 We want to respect the privacy of the patients. Yet we want to allow the researcher to do these computations and then crypto data. So how do we deal with a lot of data that's private, yet we want to get some utility from this data? Another thing that's happening is that because there's large-scale data, for example, think of, you mentioned blockchains. Today, we have public ledgers, you know, used by Bitcoin, for example, or many other cryptocurrency companies. They have huge amount of data sitting on public ledgers.

Starting point is 00:13:16 And for example, in the case of cryptocurrency, to verify a transaction is a huge computational burden. One needs to make sure that this coin was given to the owner and was not double spent. And whoever gave him the coin got this coin from someone else who did not double spend. And that person got the coin from someone else who did not double spend. It's a huge computational burden. How do we know that things were done correctly? How do we know that it's indeed a valid coin? So ideally, what we want is some proof, some proof that says this computation was correct. So here I'm saying privacy is one aspect. Another aspect is

Starting point is 00:14:01 verification. How do we, integrity of the computation. Someone is doing this computation. Okay, someone told me in the blockchain, yes, this is a valid computation. How do I know that that's the case? So sometimes you manage to incentivize, you know, you do some game theory, use game theory to incentivize kind of the users, to be honest.

Starting point is 00:14:22 But sometimes you want to just have a little proof that tells you, oh, yeah, this computation is correct. Now, here it seems to be unrelated to cryptography, because usually when we think about cryptography, we think secrets. Exactly. I'm saying no secrets. We just want a little proof. You think, oh, proof, that's math. You know, what's cryptography about it? But it turns out that we want, of course, a proof that's very succinct, like a little certificate that certifies the correctness. It turns out that in order to get these succinct certificates, we must rely on cryptographic assumptions. So we must rely on some hardness like that. It's very

Starting point is 00:15:03 hard to factor very large numbers or these kind of hardness assumptions that we use every day in cryptography. And we need to rely on these assumptions to generate kind of these succinct proofs that certify correctness of computation. So just going back to your question, my answer is, you know, what is cryptography about? It used to be only about securing communication, both secrecy and integrity. But today it's much more about securing computation, both secrecy and integrity. You just described that so perfectly. And I think I just could not imagine a better way to capture this idea of securing communication and the advancements that we've seen in the field to now moving towards securing computation. So I think what

Starting point is 00:16:00 you described in the end is this idea of the verifiable computing work, the delegation of computation, which, as I understand, is the breakthroughs that you've pioneered in this space have been a big factor for your 2022 ACM Prize in Computing. So how is this approach different from traditional approaches? So traditional approaches, as I understand, all those computations would have to be done and the system is inefficient. Is there some other sort of traditional approach to computation? Yeah. So, okay. So here's the problem we want to solve, right? Someone that we may not trust did some computation. It ran some program for a very long time and it got an output. Now we want a little proof, a little. It's short. I need it to be short because I need to verify it

Starting point is 00:16:54 efficiently. I want a short proof that indeed this is the outcome of running the program. Now the thing is, most, I don't know most, but many natural programs, there does not exist a succinct proof, a short proof. So for most or many natural kind of functions or programs, how do I prove to you that I did something, that something is, that, you know, this is the outcome? Let me give you an example. For example, take a chess. This is kind of a bit mathematical maybe, but take a chessboard. Okay. Let's say you have a chessboard. You have some pieces on the board. Now I want to prove to you that the black player has a winning strategy. Namely, I want to prove to you, no matter what the white player does, the black player has a move such that no matter what the white player does, the black player has a move such that no matter what white player does, at the end, he will win.

Starting point is 00:17:50 How do I do that? I really don't know. How do I give you a short proof? The only proof I can think of is I'm going to tell you, well, first, the black player will make this move. Then, for each and every possible move of the white, this is what the black player will make this move. Then for each and every possible move of the white, this is what the black player will do. Then for each and every possible move of the white player, it's a huge proof. It's like an exponentially sized proof. So that's an example of a computation I don't have a succinct proof for. And many natural computations, I don't have a succinct proof for. So what we do is we rely on cryptography to do it. And there's a price. The price is, we actually don't offer the guarantee which mathematical proofs give you, which is,

Starting point is 00:18:43 oh, there is no fake proof. Like either the statement is true, in which case you can prove it, or it's false, in which case you simply cannot prove it. Our guarantee is weaker. We have a computational guarantee. We say, if it's true, you can prove. If it's not true, it's hard to prove. It's not impossible. Maybe someone can prove it. But if someone can prove a false claim, then he can break a very hard cryptographic assumption, such as he can factor a very large number. And that we believe you cannot do, because if you could do, then probably the economy would have already crashed, because all our transactions depend on the hardness of this problem. So our proofs are not really proofs. They're what we call computational proofs. It's not like it's impossible to fake a certificate

Starting point is 00:19:33 of correctness. It's just very, very, very hard. And we believe nobody can do it today. I see. I think we'll circle back on this later, but I'd love how some of these underlying assumptions break down in the context of some of the advancements with quantum computing. But I'd love to table that discussion for the end. Okay, yeah. Happy to talk about that. It's very interesting. deep dive a bit more on this. So, you know, you described this verifiable computing essentially as a way of ensuring the correctness of those computations performed by, you know, a server or, you know, some blockchain nodes. So as I understand it, one of the primary objectives here is efficiency, right? You want to ensure that the verification process doesn't, you know, introduce too much overhead or become computationally impractical. Exactly. You do this by minimizing the resources of verifying the proofs, you know, you reduce the proof size, you reduce the computational complexity. On the other hand, there's this

Starting point is 00:20:35 concern of privacy or security, right? So I'm assuming that, you know, in cases where you want to verify these proofs, there's some data or some sensitive information about the computation that has to be disclosed during the verification process. And correct me if that assumption is wrong. Actually, that is incorrect. Let me explain. Yeah. So it's very interesting. There's beautiful, beautiful works on, getting zero knowledge proofs.

Starting point is 00:21:05 Actually, proofs that reveal no information. It's interesting because I'm circling back to your first question, which is, you know, how I got to cryptography. I mentioned Adi Shamir. But another person that I definitely should mention is my most amazing PhD advisor, Shafi Goldwasser. And the reason I'm mentioning her is not because probably she's the person that, you know, because of her, I stayed in the field, but also she invented a, together with a Silvio Michali and Charlie Rakoff, the notion of zero knowledge proofs. And what they showed is they can convert, or what was shown actually after that,

Starting point is 00:21:49 is that one can convert any proof into a zero-knowledge proof. And so one can convert our little succinct proofs into zero-knowledge ones. So one can take, this is kind of a technology that we know from the 80s, really. One can take any proof, long, short, and you can convert it into a zero-knowledge one with actually placing pretty minimal overhead on top. And zero-knowledge refers to? Good.

Starting point is 00:22:22 Zero-knowledge means that proof reveals no information beyond the fact that the statement is true. Oh, wow. It's really no information is revealed. It's pretty amazing. I remember when I studied this notion, I practically couldn't fall asleep at night because I felt like it's like magic. How can you convince someone that something is true while revealing literally no information beyond the validity of the statement? And you're saying, wait, but a proof is information. What do you mean no information? It's a proof you can read and verify. Yeah. Okay. So it's something to think about. It's

Starting point is 00:22:59 actually, I'm hoping maybe I'll get some of my listeners to join us in cryptography. So essentially, you can achieve this verification process while preserving privacy, without, like you said, without revealing any information about the computation or about the input data to actually enable the verification. Exactly. Wow, that's beautiful. It's amazing. Yeah, I agree. privacy. How do you come up with these kinds of ideas? Like, in what context? Is it in lab meetings? Is it, you know, on walks? Sometimes I'm so fascinated by the type of innovative solutions that researchers come up with. So I'm curious, how do you come up with this idea? Yeah. So, you know, there's two types of ideas. There are ideas of which problem you want to study.

Starting point is 00:24:02 And then there's ideas of how you come up with a solution. The reason I'm kind of partitioning the two is because ideas for which problems to study, they're usually much more high level. You know, you kind of, you don't necessarily need to be in a research mindset. You walk around, you just, you know, you can read the paper and up comes an interesting question, you know, just by listening to what's going on in the world. You know, I don't know. Now there's these large language models.

Starting point is 00:24:36 Oh, guess what? It brings a lot of interesting questions to photographers. So just kind of living in, you know, and looking around what kind of problems the world throws at you. The world throws problems at us all the time because it's changing. And like blockchains, wow, it's all this technology, brought with it a lot of challenges. And that's fantastic for us because that's kind of, I felt like the difference between theoretical computer science and mathematics, that we get new challenges all the time. And these are really, really interesting challenges because it's kind of how our world is shaped. So that's about kind of which problems to solve. How to solve the problem. Wow. That is really usually a, you know,

Starting point is 00:25:27 we really need to dive. At least that's the way I work. I think different people work a little differently. But for me, I usually dive, I don't know, like 100 feet deep. And I get so obsessed. You know, I can, I think I'm probably, you know, I remember a problem that I worked on that I literally abused my students, like emailing them every hour. I mean, they abused me back, so it's okay. But like, you know, constantly, you know, here's an email. Oh, wait, this doesn't work.

Starting point is 00:25:55 Oh, it doesn't work. You know, this intense thinking and then waking up in the middle of the night because, you know, I think I have an idea. And then I get up and then, oh, it doesn't work. And then I go back to sleep. And then after two hours, I wake up again because actually, you know, I think I have an idea and then I get up and then, oh, it doesn't work. And then I go back to sleep. And then after two hours, I wake up again because actually, you know, I think it should work. And this kind of, usually the solutions for me come when I'm really obsessed. I breathe the problem. I sleep the problem. I am very, very much obsessed about the problem in a way that I feel like sometimes, oh my God, either I need to solve it or someone else needs to solve it because I need to get

Starting point is 00:26:31 out of this misery, you know? I need to live again. You know, I need to breathe a little bit and not be in, I feel like my head goes so, I'm so intense in thoughts. But that's for me. I know that different people think differently also I I really enjoy collaboration collaborating and so you know when I get intense about like now I'm working very hard on a problem and again you know it's like we meet and then I go home and

Starting point is 00:27:00 like two hours later oh can you hop on zoom again Zoom again? You know, it's like constantly. It's like, you know, my student who's working with me is constantly on the project as well. We're both obsessed. It's like an obsession, but an obsession that I really enjoy. It's really, really fun. But I think, again, different people work differently. And I think it depends also on the type of problem you're working on. So I work really usually on problems that are quite technical.

Starting point is 00:27:32 You know, the solutions are usually quite complex. There are other problems that other people work on that are just kind of a moment of kind of brilliance. You know, it's simple in hindsight. But, and so there's many different ways to do research, you know, even within theoretical computer science, of course, outside of theory, it's very different because a lot of it just requires actually work. You need to sit and do, you know, you need to do the, to write the program.

Starting point is 00:27:56 A lot of it is kind of, you know, that you start the day, you finish the day and you see progress in theory. Often you start the day, you end the day, you just feel like you made negative progress no progress yeah or or you feel like negative because the ideas you thought should work you realize they don't you know which actually is progress but it doesn't feel that way when you so yeah so that's usually my style but again i i want to just make sure you know technology they're very they're a lot of different styles certainly different people work differently yeah so certainly obsession is i i would say probably a key theme if you're

Starting point is 00:28:32 obsessed in love and are up at night thinking about it yeah but to your point i i have met folks like you said uh maybe you know it's a spur of the moment type of thing where you're walking your dog or you're taking a shower and the solution comes to your mind. But in areas where, you know, maybe you're working on very deeply technical or mathematical pursuits, you know, the approach to problem solving can be different. Yeah, I would say I have some of those as well, but it's a small piece. So sometimes, you know, I'll be in the shower and I'm like, oh, I think I can solve this piece. You know, there's some idea, but usually it's one idea out of many. And then the question is, can you piece everything together?

Starting point is 00:29:11 Does it, you know, it's usually, yeah. So you touched on one piece, which was actually the next topic that I wanted to discuss, which is AI and large-scale language models. So some of the stuff that you discussed with this verifiable distributed or delegation of computing, at least I think, has some practical applications to some of the exciting progress that we're seeing with LLMs. Definitely. I'm curious, more generally, how do you see this line of work or broadly your research aligning with some of the current trends with generative AI?

Starting point is 00:29:49 Yeah, so this is something I'm thinking about now, both with my students, with Shafi, who was my PhD advisor, with a bunch of people. I think probably many people in my community are thinking about this. And there's many questions. So, okay, of course, you know, there's these large language models, and we have no idea what they're doing. And we want to make sure they're doing the answers they're giving us. We want to be able to trust them. Okay, we want to be able to trust that they're giving us the answers that they should. The thing is, it's not even clear what this means. And it's not clear what this means in many, many, many levels. So level one, you may not

Starting point is 00:30:35 trust the company that generated these LLMs. I know OpenAI. Maybe you don't trust OpenAI. Okay. In that case, you tell OpenAI, oh, in that case, you can use my technology. You tell OpenAI. Maybe you don't trust OpenAI. Okay. In that case, you tell OpenAI, oh, in that case, you can use my technology. You tell OpenAI, oh, you gave this model. Now, every time, I don't trust the model, but every time the model spits something, just add a little certificate that, oh, sorry, let me start again. I ran ahead of myself. Let's say you don't trust OpenAI. OpenAI gives you a very large LLM. You're saying, what is this thing? How do I know that it's good? Well, in that case, OpenAI can tell you, look, what did I use? I used some neural net. You see, it's very small. The neural net is very small. I'm going to prove to you that this huge LLM is the output of applying this neural net to this huge amount of data. Okay, let's say

Starting point is 00:31:32 you can use my technology to add a little proof that says that this LLM is indeed the result of running the neural net on this data. Okay, question. How do you know the data is good? Where does this data come from? How do you know that it's valid data? That's a question. How do you prove that the data is good? What does it mean for data to be good? Or how do we prove that some data was sampled from the correct distribution? Even simpler. Let's say you want to sample a bunch of random bits. Okay, I sample. How do I prove to you that it's random? What does it even mean? After you sampled, it's, you know, is 0, 1, 0, 0, 0, 1 random? Is 0, 0, 0, 0, 0 random? Is one

Starting point is 00:32:20 more random than the other? I don't know. Each one of them have probability, you know, one over two to the length. So they're all equally likely. Why is one random? Defining what it means even to be sampled from the right distribution is a really interesting question that we're actually currently thinking about. But even suppose you trust the data, you trust OpenAI. We solved all these problems using our technology, let's say. We still have a problem because, okay, OpenAI told you, you trust them. Yeah, I ran this little neural net on all the data from the internet. I got some LLM. Yeah, we believe you.

Starting point is 00:33:01 Still, how do we know that LLM is doing what it's supposed to do? Nobody understands what this thing generated. So now what does it mean? What does it mean to trust this thing? Do we trust it? Well, it depends. What does it mean to trust? You know, when we started the podcast, I told you that cryptography, one thing I like about it is that it deals with kind of philosophical questions and it puts it on mathematical grounds.

Starting point is 00:33:34 And this is one example. I want to say we all know, like just speaking in English, yeah, we all are concerned because we don't trust the LLMs. How do we know that they won't convince us? We'll ask them a question and they'll give us the wrong answer in order to be kind of for the sake of maliciousness. Because they want to take over the world or whatever. How do we know? So we want to be able to trust. What does it mean to trust?

Starting point is 00:34:02 When do you trust? If he gives you the right answer. The right answer is not well-defined. You know, you ask him, you know, should I, what's the best thing I should do to prevent global warming? Is there right? Do we know what's right? So what is it? It's not, and questions where the answer is clear and you can check, fact check.

Starting point is 00:34:24 Okay, he can give you the check, he can certify. He can tell you, okay, this I got from Wikipedia. You see? Okay, fine. That we can fact check. But there are some questions or answers that he'll give us that we can't fact check. So what does it mean that he did it correctly? How is correctly defined in this case?

Starting point is 00:34:41 How do we define trust? So there's a lot of super, super interesting questions that we're now dealing with. And I think it's a very, very exciting era for cryptography. Yeah, these are very large, bold, I think you described them as philosophical questions. And I guess at this point, it's hard for me to even imagine how you can ground these in a mathematical or scientific foundation. But I am curious what lines of research or work emerge as a result of some of these open questions. Yeah, we should talk in the air from now.

Starting point is 00:35:17 ACM ByteCast is available on Apple Podcasts, Google Podcasts, Podbean, Spotify, Stitcher, and TuneIn. If you're enjoying this episode, please subscribe and leave us a review on your favorite platform. Awesome. So one other sort of innovation that you have been attributed for co-inventing is ring signatures, right? So can you provide an overview of what ring signatures are and what exactly was their motivation when you pursued or developed? Yeah, yeah. So actually, this I did when I was in Weizmann under the guidance of Adi Shamir.

Starting point is 00:35:59 I co-invented with Adi Shamir and with Ron Rivest, who's also a Tring Award winner, also one of my mentors throughout the years. So the goal there was the following. So as I said, a lot of cryptography is about authentication. We want to make sure, you know, so for example, when I send you a message, I'm going to digitally sign it so that now you can verify that it's me who sent the message. And the question we were asking is the following.

Starting point is 00:36:34 What if I want to be able to sign a message, but I want to keep myself private? So it seems like anti, it seems, what do you mean sign, keep myself private? So I want to say, oh, it's me, but I want to keep me private. So it seems like anti, it seems, what do you mean keep myself private? So I want to say, oh, it's me, but I want to keep me private. What does it mean? So an example, what I meant, what we want to say, for example, we actually called the paper, How to Leak a Secret. So the idea we had in mind is, let's say I want to tell, you know, my professor in class that, you know, some of the students cheated in the test,

Starting point is 00:37:08 and therefore it's not fair. Now, I don't want to be a tattletale and tell him, no, it's me, and now everybody maybe will leak to someone that I was the one who tattletaled. So what I'm going to do, I'm going to write him, the professor, a message and tell him exactly why I believe that the cheat happened. Okay. So we'll be convinced that there was a cheat and I'm going to sign it. Okay. That's not the best example, but the idea that I will sign it on behalf of someone in the class. So I'm going to say someone in the class sent this message and this is what it says. So in other words, think about, you know, people in the NSA want to leak a secret.

Starting point is 00:37:48 They don't want people to know it's them. So they say, okay, this is what I learned while I was in the NSA. I'm signing it by someone from the NSA. But nobody knows who from the NSA signed it. So we were inspired. One of the reasons we were inspired because there is such a notion of group signature where you can sign,

Starting point is 00:38:09 but that's on behalf of an entire group. It's like a group that you got together and you decided that you are in a group, you agree together and kind of a key and a secret key for you guys. And you think of yourself as an entity. So now you're one group and you're one entity and you can sign on behalf of the group. But in ring signatures, we don't have nobody,

Starting point is 00:38:29 I don't need to, I can just sign on behalf, you or me. Of an individual. Of an individual. I don't need his consent. Actually, you know, when we did it, one of the things I remembered is I was a bit scared by it. Because I was thinking, oh, that's a little scary that we have these signature schemes and these signature schemes allow us, they give us the technology. That's kind of what we did in the ring signatures to allow me to send a message on behalf of someone of, you know, me or someone else. So I can send a message, you know, me or you murdered someone. You wouldn't want a message like that being sent because nobody knows if it's me or you. So I can kind of, I can't really frame someone, but I can, you know, I can say, I can put them in a group that one person in the group is framed.

Starting point is 00:39:19 So, you know, we kind of, we put this, this paper was more of a, like a, is it a good thing, is it a bad thing? It was an interesting... So I am curious, now that you've raised that, beyond the use case for, as you described it, tattletales, or more nicely for whistleblowers, are there practical applications, or are there practical scenarios, real-world world scenarios where you think that this might be useful or relevant? So, okay, let me say, I know that there were company, blockchain companies that used these ring signatures. And I'm pretty sure it wasn't, you know, for whistleblowing. I think Monero used it wasn't for whistleblowing. I think Monero used it and maybe there were more.

Starting point is 00:40:13 I'm now blinking and I don't really know exactly their use case for it. Like how, for what sake did they use it? My guess is that probably they used it to get some kind of privacy on the chain. So currently, when you do, well, there are various now, of course, currencies, but for example, in Bitcoin, there's no privacy. So every transaction, you have some public key and your transaction, you say, okay, I'm, you know, your name is not given, but your public key is given.

Starting point is 00:40:40 Your public key is like associated with your name. So you say, okay, I public key is like associated with your name so you say okay i public key this gave money to whatever you know whole foods to whatever whoever takes the and it's written there on the chain so that's a bit worrisome you know that all your private transactions are written in publicly on the public ledger now it's not that easy to see because, as I said, it's all public keys. It's not written, you know, what the public key corresponds to. But it's very easy to de-anonymize this. So, you know, if someone really wants to de-anonymize and understand where, you know, what your

Starting point is 00:41:20 transactions are, they can, unless you work really, really, really hard to keep anonymous by each time using a different name, a different public key. It's actually quite difficult to do correctly. But it gives this... So I think they used it to enhance privacy.

Starting point is 00:41:41 They didn't get privacy, but to enhance. So you're not really telling, am I giving. So kind of, you're not really telling, you know, am I giving it to, I'm giving it to one of them or there's some, I think it was used to increase privacy, but I'm not a hundred percent sure. I see. I see. So I think even before we started the call, you mentioned something earlier, which is you develop this technology and, you know, once it's out there, there are many, you know, practical applications, some of which you may not even know of or be aware of, right? And I think this kind of

Starting point is 00:42:13 underlies a lot of foundational research where some of the work that you put out into the world can be, whether it be research papers, whether it be technologies, can be adopted and used in many settings, oftentimes that you may not be, you know, fully aware of. How does that make you feel as a researcher? Does it make you excited that some of your work is out there and, you know, individuals are finding or organizations are finding practical applications? Does it cause some concern for you? What are your thoughts as you see the adoption, the public adoption of some of your work as a researcher? Yeah. So let me say first that it's interesting. My research, as you said, is very fundamental. When I got the ACM prize and they told me for my work,

Starting point is 00:43:01 that has had so many applications. It takes a village to do this application. You know, it's not really by the time it's actually applied, it's quite far removed from my work. They use my basic ideas, but there's so much more ideas and kind of making things more efficient. There's a lot more that go in until things are actually adopted. You know, tons of work, more work. So it really takes a village to go from kind of fundamental ideas all the way to deployment.

Starting point is 00:43:34 How does it make me feel? I'm very, very excited. I think, you know, if I needed to say, how do I value my research? I think I would say I value it if it has an impact. Now, it may not have an impact today. It may take time. But if at the end of the day, it's in some drawer and nobody uses it, what's the point?

Starting point is 00:43:59 So I would love, I love it when I see it, you know, used in, for example, with this verification delegation. Now we have a big community around it. We actually have this effort called ZK Proof. ZK stands for zero knowledge. Because as I said at the end, we always put kind of zero knowledge on top. But these kind of succinct proofs. And this effort is led by fundamental researchers all the way to people. It's really a collaborative effort. And once a year, we have workshops. And sometimes it's so diverse. We have bankers coming to this workshop asking us questions. It's really fun to see because they want to use it in their banks.

Starting point is 00:44:47 And how do they use it? And so I really, really love it. Though I have to say it's funny. I used to joke when I was younger that people asked me, so you sit all day, but what do you do? Nobody will ever use this stuff anyway because you're a mathematician so why why are you doing it and i remember i always joked said well at least i'm not doing any harm you know there's so many people are doing harm i'm at least sitting quietly and doing my research

Starting point is 00:45:15 uh so going back to your question you know is sometimes i'm you know i have this feeling worry kind of oh i hope my work is not used for my... Or even, I don't know, think of it, there were physicists sitting and doing physics, and now there's atomic bombs. I don't know how they feel about that. Or nuclear weapon, it's thanks to physics research. So sometimes you do fundamental research, and you're like, I hope it will not make things worse.

Starting point is 00:45:45 I know, I think today around the LLM, there's a lot of confusion, you know, is this a good thing? Is it a bad thing? So I have nothing to do with LLMs. My research is not related to that. But, you know, I can imagine that if, you know, I contributed to that space, I would be today, I don't know, I don't know if I'd be very happy or concerned or, you know, probably both. I don't think that's too much of a concern,

Starting point is 00:46:12 maybe with the exception of ring signatures where we'll have more tattletales in the world. But, you know, you touched on this idea of your, like, working with different stakeholders and, you know know collaborations with fundamental research but also practical applications so you have this very unique role where you have a post at MSR you know where you're a researcher but you also have an appointment as adjunct faculty at MIT right yes and I'm sure that comes with engagements at CSAIL at the computer science and AI laboratory so how do you balance your role, this interesting role, this dual role between academia and industry at an organization specifically like Microsoft?

Starting point is 00:46:53 And I'm specifically curious, what are some of the benefits and challenges of working in both settings? Yeah, so let me start by saying I love working in both settings. So I've been at Microsoft for 15 years. Microsoft has been an amazing place for me to work. It was throughout this time, I felt a lot of support for basic research, for my, in particular, my research. I don't know to talk about Microsoft research as a whole, because I'm not that familiar, but at least in our lab, I can say there was, in our lab, there was always from day one, a huge support for basic research. And I feel like I could not have done better work anywhere else. So that's fantastic for me. It's a great

Starting point is 00:47:35 place to do, you know, it's been very good for me. I do love working with students. Students is what kind of gets me excited to see the spark in their eyes. It's really fun for me. And they're so vibrant and tons of energy and I love it. So that and I'm really proud of the, you know, I look at them, all my interns throughout these last 15 years and I look at where they are, you know, top academic institutions and I'm very proud. But I also really enjoy having a longer, intern is a three months engagement and PhD students, you're there, you know, it's a real strong relationship for five years. And that's something that I really, really enjoy as well. So I love being at MIT. Working with the students is just so, so, so much fun. They're so vibrant and energetic and brilliant.

Starting point is 00:48:38 And I also really love the group, the theory group at MIT, just on a personal level. So that is great for me. And in terms of the relationship between them, I think I actually get a lot of being in both places because MIT is large enough that the theory group is kind of an entire floor in the building. So where, you know, I come to MIT, I go to the sixth floor, I go up in the elevator, I go to sixth floor, and I'm just surrounded by theory people. So we can all play in our little sandbox together with our little toys, but we're all just theorists. In Microsoft, on the other hand, it's an extremely diverse lab.

Starting point is 00:49:20 In my floor, right next to me, I have Mary Gray, who is an ethnographer. I have Henry Cohn, who's a mathematician. I have, you know, I have economists. I have social scientists. I have everything, you know, like a game theorist. From all kind of different disciplines. And now I need to explain to them actually what I work on. You know, we have conversations, we talk. And that's when I kind of, it keeps me true to myself.

Starting point is 00:49:55 Am I really working on problems that I'm interested in? Because it's very easy to convince a fellow theorist, oh, this is super interesting. It was an open problem in the previous paper. We all buy the same, it's very easy to sell to us why our poems are interesting. Much harder to sell to an ethnographer. So it really keeps you on your toes. And I think that has been really great for me, kind of having both, having, you know, to interact very broadly with people in computer science and outside of computer science, you know, economics, social science. And at the same time, I have my little group of friends, you know, my playmates at MIT. So I think having both was really fantastic for me. So I'm curious, beyond the dialogue and conversation with some of your colleagues who are in different areas of research, you described earlier how cryptography or theory of computation has a lot of philosophical questions or a lot of game theory,

Starting point is 00:51:07 do you actually find yourself collaborating on research initiatives to address some of these complex problems? Or is it primarily open discussions, coffee chats that help you frame or rethink some of the way you approach your work? You're talking Microsoft right now, right? That's correct. Actually. So usually it's the latter.

Starting point is 00:51:30 I talk to people and these kinds of conversations, sometimes just water cooler conversations, but sometimes it's actually, we give talks to one another, you know, it's a more, you know, more formal than that. And, and that's usually where kind of i gain a lot of insight i i even though our lab is very interdisciplinary and people collaborate you know across various disciplines together i'm less of an interdisciplinary person just because i tend to dive kind of so deep into thought that it's hard for me to get up and like when you collaborate with someone outside your field you need to stay

Starting point is 00:52:12 in the surface a little bit because they don't know you know it's um yeah so my research is typically not is less collaborative in that with people outside my field of course all my papers except for one during my phd is in calibration with collaboration with other people in my area but not so much actually i have one paper with my husband and and machine learning and cryptography with shafi as well but uh most of my papers are collaboration only within theory. I see. Yeah. Actually, the machine learning paper is also in theory,

Starting point is 00:52:51 but I guess my husband is not, he used to be a theorist, but now he's more applied. So he's my only example of someone that, but it was also happened during quarantine where, you know, collaborating with anyone else was a bit difficult. Hard.

Starting point is 00:53:05 So it was out of necessity. I was like, okay, I guess it's you. I see. And then, so being at a company like Microsoft, organizationally, I know MSR sort of has a different charter and sort of a different organizational structure compared to sort of the consumer or product arm of the company. But do you find opportunities for engagements with the product side of the consumer or product arm of the company, but do you find opportunities for engagements with the product side of the company? Do you find product use cases driving some of

Starting point is 00:53:33 your research directions? So I do have some engagement occasionally, but typically the way they end is by me pointing them to someone more applied. Like they'll ask me a question and, you know, I'll hear what they have to say. And I'm like, okay, so I think the expert you really want is so-and-so because again, my work is so far, I'm not an expert in the application regime, but you know, I feel like often when I, my involvement is more as a middle name, kind of, they hear my name, but I actually just kind of point them to the correct person they want within research. I see.

Starting point is 00:54:11 But at the end of the day, the fundamentals are core to the application, right? Yes, the fundamentals are core. You're right. You're right. That's awesome. So I want to wrap with a couple questions around future directions. I know I tabled this question earlier, and it was because it was probably one of the more interesting

Starting point is 00:54:32 questions that I wanted to get your thoughts on. You know, earlier you talked about with the verifiable delegation of computation, the idea that you do not offer a guarantee. Rather, this whole thing is based on the assumption that the problem is hard or it's computationally hard, right? However, we're seeing a lot of, over the years, over the past decade, and I'm sure in the coming decade, a lot of rapid advancements in quantum computing. So with some of these advancements, there's growing concern about how it'll impact traditional cryptographic systems. But even in cases like the one that you described with the verifiable delegation of computation, these attacks on encryption methods that would normally take years because they're so hard could now theoretically be

Starting point is 00:55:25 done in days with quantum computers. So how do you envision the future of cryptography in the face of quantum computing? Yes, okay, that's a very, very good question. We're now really working hard, actually, and upgrading cryptography to be what we call post-quantum secure. So traditionally, the assumptions, the hardness assumptions that we used are all known to be broken using quantum computers. So if we will have large-scale quantum computers, these will be able to break our cryptographic assumptions. So that's a huge concern. Actually, there's a big initiative by NIST, which is the standardized National Institute of Standardization. And they have a huge call on trying to upgrade all the standards to be post-quantum secure. So that's something we're working on a lot,

Starting point is 00:56:19 is getting everything to be secure under assumptions that we believe quantum computers cannot break. But I want to say that with that, I think quantum computers, if they will exist, indeed, I mean, large scale quantum computers, then it brings with it a lot of promise. So another type of research that's happening a lot now in cryptography is what could we do? How can we use not just so your, your question was the attacker is more powerful. The attacker can use a quantum computer. Oh, we better watch out. Yes. So indeed that's a concern and we were working hard to address it.

Starting point is 00:56:59 However, we, the honest people, namely the people who generate all this cryptography, can also use now quantum computers. And that's a big, you know, it's a big hammer. So what can we do with it that will make our life easier? So that's also something that a lot of people are thinking about, including myself, you know. But I want to mention one last thing about, again, the attacker having quantum power. It's much harder than actually, some of our schemes, let's say, are secure, we prove our secure against assumptions that we do not know how to break, let's say, by quantum

Starting point is 00:57:38 computer. The way we prove security assumes the adversary is classical. And it's not, in many cases, it's not clear that the proof goes through, that we can upgrade the proof. So when I say proof, I mean, let's say I have some scheme, some signature scheme. And I prove to you it's secure. Namely, if the adversary cannot break the assumption, then he cannot fake a signature. The way I prove this to you, my proof strategy, the way I argue security, so far, assumes that the adversary is classical. If the adversary is quantum, some of our proof techniques fail. So for example, sometimes the way we argue, we say, well, if there's an adversary that succeeds in generating signatures, I will run

Starting point is 00:58:31 him again and again and again and again to generate many signatures, and then I will use that to break the assumption. But a quantum adversary, you can't run more than once. He measures a state, a state collapses. These weird quantum phenomenas happen in the quantum world that do not happen in the classical world. So the situation is much more difficult than just upgrading the assumption to be post-quantum secure. We need to upgrade our proof, our guarantees, like how we prove security to be post-quantum secure it's quite the challenge and we myself and many many others are working really hard to get there

Starting point is 00:59:14 but there's a lot of progress in the space right now in cryptography and we're making fast progress so um well yeah as quantum computing continues to rapidly advance, like you said, it seems like there will be a core requirement for some of the underlying assumptions, proofs and cryptographic systems to also equally make the same progress or else we might be in trouble. Exactly. So just as a closing question, I'm curious, what are some emerging or exciting areas that you believe will shape the future, both in cryptography, inMs are just unbelievable. And we need to, this raises so many challenges, as I mentioned before. And I think, thinking forward, we need to think of verification in this setting of LLMs, large language models. I think that's kind of the new kid in the block that's creating a lot of noise. And it's really, it's a revolution. I'm, you know, I really, sometimes I feel like I can't believe I'm alive to see this. It's super exciting. And I think there's a lot of challenges that we as a community

Starting point is 01:00:45 need to solve. Of course, I have my own, I have a lot of problems that I'm interested in that I was kind of, that I was obsessed with and still obsessed with before the LLM came along. So I'm not dropping them, but this is kind of, you know, if you ask me to step back from my own obsession and kind of look at the world and see where it's going, that's definitely, I think, where we need to focus our attention on trying to ensure security of these large language model. How do we ensure that they're doing what they're supposed to? How do we generate? How do we get some verification from them? How do we instill some trust in these systems?

Starting point is 01:01:27 Certainly, I think it's imperative. It's groundbreaking technology and it's changing the way everything is done in this world. So researchers, organizations are pivoting their strategy in many ways to be AI or LLM focused in many ways. And I think it's the right thing to do because the technology is here and it's here to stay. Exactly. So, you know, just to wrap up, we have a lot of listeners who may be early in career, who may be students or who may be, you know,

Starting point is 01:01:57 computing professionals who are looking to explore, you know, a new area of computing. So, you know, generally what nugget or bite or advice would you give to these folks interested in pursuing a career in computing and research more broadly? Yeah. So the truth is my advice is find something that keeps you up at night. I think, you know, I've worked with so many students over the years, and I see the difference between those who are successful, very successful, and those who are less. And the difference, people think, oh, it's because they're so much smarter. Actually, I'm not sure that's the case.

Starting point is 01:02:37 I think the successful people are those who found their passion, who found something they really want to solve. They're so excited by it. And what I would encourage, you know, anyone, my kids, students, anyone, any human being is really find something, find your passion inside your work. First, it's fun. It'll make your life so colorful and fun and engaging. And second, I believe that that's, you know, the way to success, to do something that you're really, really interested in. And yes, you know, I pivoted a little bit. I'm now in computer science. So I'm not saying like, just don't, no matter what you want something and you're going to just, but don't lose your passion. Find something that you enjoy, that

Starting point is 01:03:38 you're passionate about, that you wake up in the morning excited to do. That would be my, and I think if you find that, you're golden. From here, success is just, it will follow. I think that's a very great piece of advice and certainly a great principle to live by. Find what you love, do what you love, and you'll never have to work a day in your life. Exactly. Well, Dr. Kalai, thank you so, so much for joining us on ByteCast and looking forward to some of the amazing work you will continue to do. Thank you so much. Thank you for having me. ACM ByteCast is a production of the Association for Computing Machinery's Practitioner Board.

Starting point is 01:04:25 To learn more about ACM and its activities, visit acm.org. For more information about this and other episodes, please visit our website at learning.acm.org. That's learning.acm.org slash ByteCast.

Pet Camera - EBO Air 2

ACM ByteCast - Yael Tauman Kalai - Episode 47

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.

Your Ad Here

Pet Camera - EBO Air 2

ACM ByteCast - Yael Tauman Kalai - Episode 47

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.