Computer Architecture Podcast - Ep 8: Durable Security and Privacy-enhanced Computing with Dr. Todd Austin, University of Michigan

Episode Date: May 8, 2022

Dr. Todd Austin is a Professor of Electrical Engineering and Computer Science at the University of Michigan in Ann Arbor. His research interests include robust and secure system design, hardware and software verification, and performance analysis tools and techniques. Todd has donned multiple hats, being a senior processor architect at Intel’s Microprocessor Research Labs, a professor at the University of Michigan, serving as the director of research centers like C-FAR, and more recently serving as the CEO and co-founder of the startup Agita Labs. He is also an IEEE Fellow and received the ACM Maurice Wilkes Award for his work on SimpleScalar, and the DIVA and Razor architectures.

Transcript
Starting point is 00:00:00 Hi and welcome to the Computer Architecture Podcast, a show that brings you closer to cutting-edge work in computer architecture and the remarkable people behind it. We are your hosts. I'm Suvinay Subramanian. And I'm Lisa Hsu. Today we have with us Professor Todd Austin, who is a professor of electrical engineering and computer science at the University of Michigan in Ann Arbor. His research interests include robust and secure system design, hardware and software verification, and performance analysis tools and techniques. Todd has donned multiple hats, being a senior processor architect at Intel's Microprocessor Research Labs, a professor at the University of Michigan, serving as the director of research centers like C-FAR,
Starting point is 00:00:40 and more recently serving as the CEO and co-founder of the startup Agita Labs. He is also an IEEE Fellow and received the ACM Maurice Wilkes Award for his work on SimpleScalar and the DIVA and Razor architectures. Today, he's here to talk with us about durable security and privacy-enhanced computing. A quick disclaimer that all views shared on the show are the opinions of individuals and do not reflect the views of the organizations they work for. Todd, welcome to the podcast. We're so happy to have you. Thank you. It's good to be here. I enjoy this podcast. I've been listening to it since episode two. Great, great. That's great to hear.
Starting point is 00:01:26 Thanks for your efforts on it, by the way. It's fun for us. Hopefully it's fun for the listeners too. So if you've been an early listener, you probably know that we always kick things off with a question of, you know, what's getting you up in the morning these days. Yeah. I'm all about security and privacy today. In fact, more privacy more recently. I'm really interested in building systems that have more durable security than existing systems. How do we get rid of the zero-day bugs and the vulnerabilities and all that stuff? And that's led me on a long path of building systems, working with DARPA to put those systems into red teaming environments, learning what their vulnerabilities are.
Starting point is 00:02:07 And this has led to some more recent work that I've got in my startup, which is building a very durable security system in the Azure cloud. That sounds good to me as a Microsoft person. Maybe for our listeners who are maybe not the most familiar with this particular field, you've used the word durable now twice. Is there a specific meaning for what it means to be durable? So, non-durable security: a great example of non-durable security is Patch Tuesday for Microsoft Windows. Every second Tuesday of the month, I believe, the patches come rolling in. And what they're doing is they're fixing bugs in the software that they've either found themselves or that people have exploited in the software.
Starting point is 00:02:52 So that's not a very durable way to build security. In fact, relying on your software to not have bugs in it is probably the least durable form of security because anyone who's ever written software before knows that it's basically impossible to get rid of those bugs. In fact, I am of the belief that there will never be a program that cannot be hacked. It just doesn't exist. And if you tell me one exists,
Starting point is 00:03:18 I will say, we're going to have to wait till the end of time to see if any new vulnerabilities are discovered in the future that will hack this particular piece of software. And then the other aspect you have to worry about in security is side channels, which are all the rage in the computer architecture world today. And these are when a system has observable properties that reveal secrets about what's inside the system. And building durable defenses against those is very challenging.
Starting point is 00:03:48 To give you an example, Intel has this environment they call a trusted execution environment, SGX, Software Guard Extensions. And it's designed to basically take a process and encapsulate it inside a shell inside the CPU that nobody else can penetrate because of all the cryptography and other stuff that they have. But they put it on the same microarchitecture as all the untrusted software. And people find lots and lots of ways to manipulate the cache, the BTBs, the floating point, power on, power off, et cetera, to basically sneak information in and out of that very protected execution environment. So by durable, I want to solve two problems. One, I want to make sure that if the software gets hacked, my security doesn't fall,
Starting point is 00:04:37 because somebody will always hack my software. And two, I want to get rid of side channels. And I think if I can solve those two problems, then I can build durable security. With durable security, you don't get a call in the middle of the night saying, you've got to fix this bug. You know, like when Log4j happened and then the government told all the employees, you are not allowed to go home for Christmas until you fix this problem. They're still trying to fix it, by the way. It's way past Christmas. In this particular case, it's like, you know, RSA key size is a durable mechanism, right? Because we were all using 1024 bits for RSA keys. And then
Starting point is 00:05:13 we knew, like within two to three years, we should probably be upgrading to 2048 bits, which is where most of us are today. So that's durable security. Yeah, thanks for the really quick overview on the anatomy of some of these security attacks and how they intersect with the hardware and the execution model. Maybe we can expand on some of the themes that you brought up. You talked about these channels that reveal information. And also, software keeps evolving and bugs are always going to be there. Or people will find new ways to introduce bugs or exploit vulnerabilities and so on.
Starting point is 00:05:47 So what are the properties or mechanisms or model that you need to have to tackle both of these challenges? So it's really interesting. Let's talk about side channels. There's three things you need for a side channel. One, you need sharing of resources. Two, you need to optimize common cases. And three, you need fine-grained
Starting point is 00:06:07 timing analysis. Okay. And there's three things you need for high-performance architecture. It's the same three things. So there's this fundamental tension in high-performance design: optimizing the common case, sharing resources so you don't use more transistors than you need, and then giving your programmers and your developers really fine-grained understanding of how much time you're spending in various pieces. When you have those features, and an untrusted and a trusted entity share the same piece of hardware, it's possible for the untrusted entity to infer information about what the trusted entity is doing. And just to give you sort of the classic case, it's called Kocher's RSA attack. RSA is a crypto algorithm, and it's based on exponentiation of large numbers.
Starting point is 00:06:58 And inside the kernel of the algorithm, in older versions of the algorithm, there's a loop. And inside that loop, it looks at the private key and the message you want to send; it treats the message as a number and takes it to the power of the private key. And so there's an algorithm inside there that's basically asking, is this private key bit a 1 or is it a 0? And it uses this classic algorithm from like a million years ago called square and multiply.
Starting point is 00:07:23 When the bit is a zero, you square the running result. And then when it's a one, you square and then multiply. So you do two things instead of one. You can either measure the timing of that, right? Because it takes longer to do a square and multiply than it does to do a square. And so the time dilation will be proportional to the number of ones and zeros within that part of the key.
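The square-and-multiply loop Todd describes can be sketched in a few lines. This is a minimal, illustrative version (the names and the operation counter are not from any production RSA code); counting operations stands in for measuring wall-clock time.

```python
# Toy left-to-right square-and-multiply modular exponentiation.
# A 0-bit costs one squaring; a 1-bit costs a squaring plus an
# extra multiply -- that data-dependent extra work is exactly what
# a Kocher-style timing attack measures.
def square_and_multiply(base, exponent, modulus, counter=None):
    result = 1
    for bit in bin(exponent)[2:]:              # scan key bits, MSB first
        result = (result * result) % modulus   # always square
        if counter is not None:
            counter["square"] += 1
        if bit == "1":
            result = (result * base) % modulus  # extra multiply on 1-bits
            if counter is not None:
                counter["multiply"] += 1
    return result

ops = {"square": 0, "multiply": 0}
assert square_and_multiply(7, 0b1011, 101, ops) == pow(7, 0b1011, 101)
# The multiply count equals the number of 1-bits in the exponent,
# so total runtime varies with the key's Hamming weight.
```

A real attack would time many runs rather than count operations, but the leak is the same: work done depends on secret key bits.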
Starting point is 00:07:45 So that leaks information. Alternatively, if you're sharing the same instruction cache, the adversary can displace the code that runs as the true part of the if statement, is this key bit a one? And then it can run the algorithm repeatedly, kicking that code out of the cache. And then the amount of extra time it takes to run will be proportional to the number of iCache misses that you forced that other algorithm to do. So those interactions, the sharing of the cache and the optimizing of latency when you hit in the cache, are revealing lots of information about what that other algorithm is doing. One thing we've figured out in the last few years is it's really hard to stop those things. You can focus on the cache, and people move to the BTB. Focus on the BTB, and people focus on the store buffers.
Starting point is 00:08:35 There's a paper, I think it's in the next ASPLOS. It's by Shuwen Deng from Yale University. It's a great paper. I recommend it to everybody. It's titled Leaky Frontends. I tell people that this paper should be the 50th exclamation point on the back end of the sentence: you will never, ever, ever stop microarchitectural side channels. Because it looks at the front end of x86 processors, probing the front ends to try and figure out what code another application is running.
Starting point is 00:09:06 And it does this in three different ways in the front end on many different processors, and it does it for power and timing. It's hard. And all the defenses people are proposing are based on cryptography. But the problem is there's just not enough state in the microarchitecture to really hide stuff well. So for example, we've seen a lot of papers on what are called randomized caches, where you basically take the index into a cache and you encrypt it before you go into the cache. So now I don't know where any of my data is in the cache. Well, it's in one of 256 places. So with a little bit of patience, you're going to actually find that information. So it's really hard to eliminate those microarchitectural side channels. In terms of side
Starting point is 00:09:51 channels, the most durable thing we have is either cryptography or isolation. So if you want something to be secure, it makes a lot of sense to me to move it out into a coprocessor. And then that coprocessor is built in a way where it doesn't share caches with untrusted code and stuff. And it can be built to be performant, but it has to be built to not have any of these vulnerabilities. And I think we see this happening in projects like OpenTitan from Google, an open source trusted platform
Starting point is 00:10:20 module. When you choose to put your trusted execution environment on the same microarchitecture as all the other code, I think it's really a challenge to stop those side channels. So you either have to live with that or have a system design approach that is cognizant of the fact that you always have leaky channels no matter what you do. And I think you mentioned two things about mechanisms: isolation and encryption are two effective ways or mechanisms to build durable security into your systems. And the last thing was expanding on the isolation theme, sort of having a dedicated or specialized engine that is responsible for providing the security.
Starting point is 00:11:05 Maybe it also overlaps with a common theme in security, which is security verification. All of these have this common theme, which is: what is the root of trust that you have? Even with OpenTitan at Google, the objective is to sort of build this root of trust. Maybe you can expand on one of these themes, including the root of trust and how that plays into the philosophy of designing secure systems and the intersection with hardware as well. I'll do that by way of talking about how you deal with the software vulnerability issues. So a few years ago, I had a project with Mohit Tiwari at University of Texas and Sharad Malik
Starting point is 00:11:41 at Princeton and Valeria Bertacco here at Michigan. And we designed this processor called Morpheus. And Morpheus was trying to stop software hacking. And the way it did it was by encrypting pointers. And the idea was, it's really interesting: when you look at security attacks, it's really, really hard to find a security attack where someone doesn't need to know a pointer value or manipulate a pointer value. So if you can encrypt pointers, then you've really thrown a roadblock in the face of any security attack that needs access to pointers. And so we designed this Morpheus machine and we started to build it.
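As a rough sketch of the pointer-encryption idea (a deliberate simplification of Morpheus, whose real design does much more, including periodic re-keying; every name below is hypothetical):

```python
import secrets

# Toy sketch of pointer encryption: the "hardware" holds a per-run
# key and XORs it into every pointer it hands to software. A leaked
# or forged raw pointer value is meaningless without the key.
POINTER_KEY = secrets.randbits(64) | 1   # per-boot secret, never zero

memory = {0x1000: "patient record"}      # stand-in for physical memory

def make_pointer(addr):
    # hardware encrypts the address when the pointer is created
    return addr ^ POINTER_KEY

def deref(encrypted_ptr):
    # only the hardware can map the pointer back to a real address
    return memory[encrypted_ptr ^ POINTER_KEY]

p = make_pointer(0x1000)
assert deref(p) == "patient record"
assert p != 0x1000   # the raw address never appears in software
```

An attacker who steals `p` learns nothing about the layout of memory, which is the "roadblock" Todd describes; side-channeling the key out is the remaining avenue, as he notes below.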
Starting point is 00:12:26 When you work for DARPA, you have to build stuff and deploy it. And this particular program I was in, called SSITH, it was a program at DARPA. We had to build a RISC-V version of this processor, and then we had to deploy it into a commercial red teaming environment. And what red teaming is, is where you just basically put programs up on these systems, you make security claims about them, and then they put these large bounties, you know, $50,000 bounties, where if anybody can penetrate your security defenses, they get that bounty, they get that money. And so people come from all over the place to break those things. And so we put up this server and we put this medical database on it. We said, come get all the juicy bits inside the medical database, and nobody penetrated. It was incredible. It was a real great success.
Starting point is 00:13:10 In fact, I had talked to the people at the company that had done that red teaming. They said it's incredibly rare that they'll... It was a three-month red teaming effort and there were no penetrations. That's amazing. Very cool. It was exciting. And I've been talking to the RISC-V Foundation about possibly adopting it. It's a very simple little optimization to the system. But that said, Morpheus is definitely hackable. It's definitely hackable. It's just really, really hard to hack
Starting point is 00:13:39 because you've got to side channel out information about these pointers. And there's a lot of ways to side channel information out of these pointers. But I really liked the idea of cryptography as a mechanism to provide some durable security. And so where we went after Morpheus was, we just said, let's not encrypt the pointers.
Starting point is 00:13:57 Let's just encrypt the social security numbers and the strings and the things we care about. And that's where we are today. So today, the way we're building really durable defenses, and these are the roots of trust inside the system, are we are developing compute engines that execute directly on encrypted data. And what's really interesting about this approach is if somebody penetrates your system, then they have not penetrated your data defenses. Once they
Starting point is 00:14:25 penetrate your system, they can steal your ciphertext and then they can try to break that ciphertext. And this is so incredibly powerful that our threat model is no longer, attacker, do not penetrate my system. Our threat model is: here's a pile of data, it's encrypted under a key that only I know, and you, the programmer, can write any program you want to try to tell me what that data is. But the way the system works is, as you compute on the data, the inputs that you're computing on are encrypted and the results of the computation are encrypted as well.
Starting point is 00:14:58 And so you can run whatever algorithm you want on that data, but only I, the holder of the key that encrypted the original data, can see the end results of that. For me, what's incredibly exciting about that, and the reason why I'm in a startup right now, is that's not really security. That's a privacy technology because it becomes possible for the encryptor of that data to have no relation to the person that is running that CPU on the encrypted data. So now I can take your encrypted genome. I can run my algorithm on it, which is doing disease analysis, for example, and the results of that computation are encrypted. I send it back to
Starting point is 00:15:40 you and you can see it. What the startup is focused on now is commercializing that technology, initially in Azure and then eventually in Amazon AWS. Yeah, that does sound super powerful. Because right now, and this kind of came up a little bit in the episode with Jim Larus, we have a heavy reliance on software now for everything. And even, you know, for me, for certain things I'm like, oh, I want to have a NAS at home because I don't want to put it in any cloud, not even my own company's cloud, even though I trust Microsoft. But like, I'm just that kind of person, like I want
Starting point is 00:16:14 certain things in my house only. And then you buy this NAS and there's software on the NAS, and you're like, I don't know what this thing is doing. And I bought this specifically so I could have certain things in the house. And so it almost feels like you have nowhere to turn. And what you've just kind of described there is enabling any software anywhere to do its thing. Because right now it's a sort of twofold choice. If you want the benefit of the software, you have to give up the info.
Starting point is 00:16:44 And so then in many ways, you know, some people may make choices one way or another, but you're kind of splitting that choice, to being able to get the benefits of the software without having to give up the info. Yes. And in addition, you know, the whole internet is built on this idea that I give you my data, you monetize my data and provide me a free service for that data. Right. And maybe companies haven't always been the best behaved in how they utilize our data. Maybe they lose our data. I read a very interesting statistic the other day. What fraction of American adults have had their Social Security numbers stolen in the last 10 years? It's an NPR stat. Any guesses? More than half. Yeah, it's almost 80%. And half of us had our Social Security number stolen in a single breach
Starting point is 00:17:40 in 2017, the Equifax breach, which was a runtime code injection, you know, standard kind of thing. For 76 days, they were able to exfiltrate data from that. I mean, it's bad. So if we build privacy-oriented architectures, the permissions are no longer with the hardware. The hardware just processes stuff. The permission is actually embedded in the data. The example I always like to give is a voting machine. If I build a voting machine with privacy technology, there's a lot of things you've got to solve. First, you want it to be confidential, right? So I want to be able to put my vote into the machine, but the machine cannot see my vote. There's technologies that do that: multiparty computation, homomorphic encryption, Agita Labs' sequestered encryption technology. We do this kind of stuff,
Starting point is 00:18:18 but that alone is a useless technology for voting, right? Because how do you know that I added your vote to your candidate or added your vote to my candidate? You have no idea, right? Nobody solves it. SGX doesn't solve that. Homomorphic encryption doesn't solve that. That's an integrity problem. That's a computational integrity problem. So our technology solves that. It gives you an end-to-end receipt. You actually put a secret in with your data, you then put it through the algorithm, and then it comes back with a computational result. We call it a fingerprint, which actually tells you, did you execute the algorithm I expected with the data that I passed to you?
Starting point is 00:18:58 And that's an end-to-end receipt. So now you know that you ran the approved algorithm. In fact, we can even have other people sign those fingerprints. And then there's one more problem you've got to solve, and I believe you need to solve it to actually make the technology useful in a commercial sense: you need the ability to render unencrypted results off that data in a safe manner. And let's go back to that voting machine example. If you vote, you want your vote to be private. You want me to run my algorithm
Starting point is 00:19:31 without it being perturbed. And you want a receipt that it was run on your vote. But what if your vote didn't match any of the candidates? You'd sure like the machine to tell you, you need to cure your ballot, I didn't match any of these candidates. So in that case, you need the ability to decrypt a Boolean value and have the machine tell you: I don't know who you voted for, but I know this much about your vote, it wasn't any of these candidates. Can you please fix your vote? That problem is the hardest of all the problems, because once you give a system the ability to decrypt a piece of information, now you have to prevent the clever attacker from putting other pieces of information into that decryption capability. So ultimately, it's an integrity problem. So if you're really good at
Starting point is 00:20:16 solving integrity problems, you can now build this system. If you have all those features, now it's possible for companies that take your data and then harvest information off of it to do it in a way where they can't actually see your data. We call this form of computation secret computation. And there's a benchmark suite that we made. It was Michigan, Agita Labs, NYU, and Addis Ababa University in Ethiopia; we've been working together to build this benchmark suite. It's called VIP Bench. You can go to Bitbucket and download it. Inside there, we have this benchmark which is one of the Netflix challenge candidates. But what we've done is we privatized what movies you've watched and how much you liked them for all the users. And it runs this matrix
Starting point is 00:21:02 factorization algorithm across the entire database, completely encrypted, and then produces a set of encrypted results, which are the movies that we think that you would like to watch. But never does the CPU ever get to see what movies you like. Never does it get to see what movies it recommends. But it gets to accomplish the goal of doing recommendations. Basically, it creates an opportunity to do computation and computation services, and then even render statistics off that data in a way that's safe. I haven't seen any other systems able to do that yet. So it's super exciting to be able to sort of be first movers in that space. So this may be because I'm by no means a cryptography expert at all. In fact, I would say I know very little about it.
Starting point is 00:21:51 So it seems hard for me to imagine how you can potentially aggregate and reason about encrypted data. So for example, like on a voting machine, like if you have a whole bunch of encrypted stuff passed through, like how are you actually then able to say on the inside, say like, okay, this guy actually got 32 votes when all I got was a bunch of garbles. Yeah, yeah, yeah, yeah. How does that work? So there's really three ways to do this today. The first is homomorphic encryption, where they actually design a cryptographic algorithm where you can encrypt
Starting point is 00:22:25 data, you can decrypt data. And then there's also an algorithm to add two ciphertexts together and multiply two ciphertexts together. So you get add and multiply. You don't get less than, so good luck sorting your data. You don't get, you know, it's really limited because you have to be able to come up with these homomorphisms in the ciphertext space. And they're also very expensive, these algorithms. Microsoft SEAL is an example of actually pretty high-performance implementation of this. But still, you're talking 10,000 times slower than native execution. But it is possible to do that.
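The add-two-ciphertexts homomorphism is easy to see with a toy Paillier scheme. This is additively homomorphic only, not FHE, and the tiny primes make it utterly insecure; it's purely an illustration of the property Todd describes.

```python
import math
import random

# Toy Paillier cryptosystem: multiplying two ciphertexts adds the
# underlying plaintexts.
p, q = 17, 19                      # real deployments use huge primes
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = math.lcm(p - 1, q - 1)
L = lambda x: (x - 1) // n
mu = pow(L(pow(g, lam, n2)), -1, n)

def rand_unit():
    while True:                    # random r coprime to n
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            return r

def encrypt(m):
    # randomness means the same m encrypts differently every time
    return (pow(g, m, n2) * pow(rand_unit(), n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

a, b = encrypt(20), encrypt(22)
assert decrypt((a * b) % n2) == 42   # ciphertext multiply = plaintext add
```

Note what's missing: there is no comparison operation on ciphertexts, which is exactly Todd's "good luck sorting your data" point.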
Starting point is 00:22:59 The second approach is what's called multi-party computation, where you take a piece of information that you want to perform secure computation on, you bust it up into little chunks. So for example, if you want to add a bunch of values together, I give part of that value to different entities, and then everybody adds their components,
Starting point is 00:23:20 It's not decryption, but it's sort of the aggregation phase where I finally get the final results. And then the idea is, since all those systems are different, you've got to hack into all those systems at the same time to get that data, like Log4j, for example. It's certainly possible, but it's harder than hacking into one system today. Our approach is we just have a functional unit that's got a public/private key pair. It can expose a key inside the functional unit, and inside the functional unit,
Starting point is 00:23:50 it does AES decrypt, it does an op on an ALU, and then it does a re-encrypt with the high-entropy cipher and sends it out. Basically, there's no software in the box and there's no side channels in the box. And so it's just a piece of hardware that does that. As an architect, you have to contend with that latency. We've got a thousand ways to get rid of that latency; you've just got to do it without adding any more side channels. But it's certainly possible. I mean, it's just an architecture problem, but it's a bunch of latency. You know, AES takes about 40 ALU cycles to decrypt and 40 ALU cycles to encrypt. So you take your one-cycle add, and when you make it an encrypted add, it's now 81 cycles. You have to tolerate
Starting point is 00:24:31 those latencies. Fortunately, there's a lot of locality in what we do, so it's pretty easy to tolerate those latencies. Gotcha. Okay. Makes sense. So then for this kind of special engine that you have on the side, you've made it isolated, so now you don't have the potential speed advantages of, say, sharing with a bunch of other stuff. And then you've also essentially put like a chastity belt around the CPU or the ALU or whatever.
Starting point is 00:25:00 Exactly. Yeah. The CPU throws high-entropy ciphertext at it, and it comes back out of the functional unit as high-entropy ciphertext. So in fact, if you want a random number generator, you just assign one to an encrypted variable over and over again. The ciphertext of that variable will pass NIST randomness tests. Because what you can't do is have a one always be encrypted a certain way, right? Because if somebody discovers a value is true or false,
Starting point is 00:25:19 all of a sudden they'll know how to evaluate all the relationals. So you have to have what's called a high-entropy cipher, which is like, if I want to encrypt a one, there's two to the 64 ways or two to the 128 ways to encrypt a one. So you get a little bit of expansion in the data size, but in exchange for that, you get something very durable. In the systems we are building today, penetrating the defenses has nothing to do with hacking. Hacking is not even on our plate, because hacking just gets you to the ciphertext.
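The decrypt-operate-re-encrypt box and the high-entropy cipher can be modeled together in a short sketch. This is a software stand-in, not Agita's hardware: a hash-derived one-time keystream with a fresh random nonce substitutes for AES, and every name here is illustrative.

```python
import hashlib
import secrets

# Toy model of the "box": ciphertext in, ciphertext out, plaintext
# visible only inside. A fresh nonce per encryption means the same
# value yields a different ciphertext every time -- the high-entropy
# cipher property Todd describes.
KEY = secrets.token_bytes(16)          # known only to the data owner
MASK = 2**64 - 1                       # 64-bit values

def _pad(nonce):
    # keystream derived from key + nonce (stand-in for AES)
    return int.from_bytes(hashlib.sha256(KEY + nonce).digest()[:8], "big")

def encrypt(value):
    nonce = secrets.token_bytes(16)
    return nonce, (value ^ _pad(nonce)) & MASK

def decrypt(ct):
    nonce, body = ct
    return body ^ _pad(nonce)

def sequestered_add(ct_a, ct_b):
    # the whole functional unit: decrypt, one ALU op, re-encrypt
    return encrypt((decrypt(ct_a) + decrypt(ct_b)) & MASK)

a, b = encrypt(40), encrypt(2)
assert decrypt(sequestered_add(a, b)) == 42
assert encrypt(1) != encrypt(1)        # same plaintext, fresh ciphertext
```

The software on the host only ever handles the `(nonce, body)` pairs, which is why, in this model, "hacking just gets you to the ciphertext."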
Starting point is 00:25:52 In our hacking experience, you're already at the ciphertext. You're the programmer. So hacking is about side channels. Can you find a side channel either in the cryptography or in the timing? We don't do physical yet, but we will at some point.
Starting point is 00:26:08 Because, you know, physical is a little harder in the cloud because, you know, you can't take your oscilloscope into the Microsoft data center. They'd probably stop you at the door. But, you know, if you look at Jakub Szefer's research on FPGA-based side channels, I mean, he's been able to do some remote side channels by being co-resident on the same FPGA. So there are some ways to do physical measurement. That's where you look at power, voltage, leakage current, and then that can tell you some information about the computation that's going on.
Starting point is 00:26:41 So you've got to address that eventually. Once you get rid of the side channels and once you get rid of software hacking, then the strength of the system is the strength of the cryptography. So if you want to get this data, you've got to break the crypto or you've got to steal the key from whoever encrypted the data. And that's a way different game than we're playing today, right? That's not about looking for return-oriented programming hacks or who hasn't updated Log4j. That's like, can you break AES? Yes or no? To me, that's super exciting. That's really exciting.
Starting point is 00:27:17 Right. It's definitely a very interesting approach to use encryption in the computation pipeline. I just wanted to expand a little bit on that. I mean, in general, when you look at the trade-off space for any mechanism that you build for security or privacy, you have what's the functionality that it provides, what's the security characteristics
Starting point is 00:27:34 or what the security definition is, and then how efficient it is, like what are the overheads involved in enabling this particular mechanism. And you talked about a few different methods by which you could enable that. There was fully homomorphic encryption, or FHE. Then there was multi-party computation. And then you have your encryption-based methods, which are sort of in the processor pipeline. And I think you rightly pointed out that homomorphic encryption has a pretty strong security guarantee or
Starting point is 00:28:01 security definition, but it also turns out to be quite expensive from a compute standpoint. I'll push back on that a little. What is the oldest HE cipher? 2009. It can't be older than that, because that's when they figured out how to do FHE. An 11-year-old cipher? Who wants to use an 11-year-old cipher? That's a young cipher. Durability in cryptography comes with 20, 30 years of... No, I'm being somewhat facetious. These ciphers could be very strong, but there was, for example, an attack on BFV, which is a very common homomorphic encryption scheme.
Starting point is 00:28:35 There was an attack on the algorithm itself, I think in 2019. Part of durability for ciphers is that they get old and that they get a lot of exposure to cryptanalysis by the community. So one nice thing about the hardware-based approach is you don't invent your own ciphers. That's definitely a valid point,
Starting point is 00:28:55 but I did want to sort of expand on the overheads of the computation itself. Like with FHE, we know, regardless of the durability of the cipher itself, assuming it's durable and secure, it nevertheless is quite expensive to do homomorphic computation. I think there's been a bunch of accelerator papers as well, which propose accelerators for FHE and figure out the right way to speed up those computations. It looks like you're also doing encryption in the processor pipeline, or within your secure processor pipeline. And can you comment a little bit on the overheads of doing that?
Starting point is 00:29:26 You said it takes maybe 40 cycles. How do you fix that? Because normally in the processor pipeline, we are used to one cycle, five cycles, and once we start getting to 10 cycles, people are like, okay, that is too many cycles. So can you talk a little bit about the overheads of the encryption-based approaches themselves? Yeah, well, even FHE finds interesting applications. It kind of depends on the application what overheads you can tolerate. But if you're doing machine learning training,
Starting point is 00:29:53 you're going to want those overheads to be super low, because you don't want your two-week run to be a two-year run. So currently, we're deployed in FPGAs in the cloud, so our overheads are sort of artificially high, because our ops are going through PCIe. But even then, we can find some pretty interesting applications. We're about 50x off native in our initial product right now, which is literally thousands of times faster than other technologies. If you integrate this tech into the pipeline, and we've run gem5 simulations of this, you can get within 2x, no problem.
Starting point is 00:30:31 And here's what's really interesting. Of that 2x, only a quarter of that is exposed encryption latency. The rest is exposed algorithm changes. Because there's two things you have to do to algorithms when you basically go secret, you have to eliminate the memory and the control side channels. So that means you need to, you can't use a secret variable to influence an if condition in the program. So that means you need to do predication.
Starting point is 00:30:59 If everybody remembers predication from your architecture class or if conversion, you're going to, you're going to execute both the true and the false side of a, of a condition test. And then you're going to select the value using an encrypted primitive. That way there's no variation in the underlying algorithm based on the data that you're actually running on and doing decisions on. And the second thing is you,
Starting point is 00:31:24 you have to have oblivious memory accesses of some form. We have a kind of ORAM that we deploy in our framework and various optimizations around that as well. But what it basically means is that as you change the inputs of an algorithm, the way it accesses memory does not change in any statistical manner. It can be random, for example, or it can be fixed, but it cannot change based on the input changing of the data. Because if that's true, then you have a memory side channel and you can do all your specters and meltdowns and all those other things to try to figure out what's coming out of memory.
Starting point is 00:32:02 Now, of that 2x overhead for your most fastest private computation one quarter that is exposed ciphertext latency weren't able to get rid of and and three quarters that are those algorithmic changes but had a really cool result i had two summer students work for me last summer and in vip there was, we picked in VIP the benchmark that was hit hardest on its order complexity because of if conversion in ORAM. And this was an algorithm called flood fill. If you remember, if you ever played Candy Crush, when you, you know, crush a candy, flood fill algorithm goes to all the other candies that are connected.
Starting point is 00:32:41 And that algorithm's ordered N squared for an n by n array and it's completely recursive right so it's it's totally control driven and then i rewrote the algorithm to be what we call data oblivious and it was order n to the fourth which is pretty bad i'm not a great programmer okay and so i challenged them both to come up with their own algorithm that beat order end of the fourth and they both came up with their own algorithm that beat order and the fourth. And they both came up with an order and squared algorithm, which is super exciting to me because that means that if you, if there's an algorithm that's very inefficient when it's running privately,
Starting point is 00:33:15 just redesign the algorithm to not look at its own data. And it was so clever. Both of them had such a interesting, clever algorithm that he came up with. So I'm very bullish on even beating this 2x one day in terms of performance. And I think if we can get within 2x or 1.5 for fully encrypted computation, and it's really cheap, I got to say, it's really cheap to deploy as well. then you can create a world where people can start to have control over their data, but you're not hamstringing companies so that they can't render value off of your data. So I'm super excited about what these technologies can do for the state of privacy and security in the future. Yeah, that's really exciting. I think it's also a common theme in some of our other conversations with folks. Like for example, even in the case of accelerators for specific domains,
Starting point is 00:34:11 you often need to rewrite the algorithm or change the algorithm so that it's amenable to acceleration. And it sounds like even for security or privacy enhanced computing, you probably want to tweak the algorithm so that it's more amenable to providing you the properties that you want. That's definitely very interesting. And I think it's a good theme to take home. Yeah, I was just going to say, for this whole system to work, it sounds like all the pieces have to come together, right?
Starting point is 00:34:39 Because you've got this hardware layer that essentially is this isolated encrypted mechanism for computation. And then you've got the software layer, which does need to potentially be rewritten to sort of mask any sort of potential code-based side channels. And the two of them need to come together. So for the stuff that Souvenay was just talking about, where for domain-specific acceleration, you might want to rewrite your algorithms, there's usually
Starting point is 00:35:10 some high amount of motivation to do that rewriting, because it's like the person who is adding the value to the data is the person who is rewriting the code. But now in this case, the rewrite of the code sounds like it might come from potentially elsewhere than the company that's providing the value via the software. Is that the case?
Starting point is 00:35:32 And so then now there's potentially less motivation to actually do that rewrite. Do you have any thoughts on that? Is that true? It depends. So part of the commercialization of this tech is figuring out how do you get into people's hands with the least amount of friction. So today the technology is it's a C++ or Python library where you actually declare variables encrypted if you care about them not being visible.
Starting point is 00:35:57 And then you can port code into it or you can write fresh code. Longer term, I think we want to, you know to put this into databases. So I can have a cryptic columns and I can do all the things I want to do on my cryptic columns. I want to put this into the TensorFlow framework so I can have private models or I can have private inputs on my machine learning models. I think the key to sort of reducing that friction of adoption is to find frameworks that it lends itself to well, where privacy is important and there's enough information
Starting point is 00:36:37 in the compilation framework that you can automate the process. We don't automate the transformations of the programs today. We can at some point. Today, what we do is we, you define the variables you want to be encrypted in your program, and then we tell you what you can't do. So if you're going to express the computation, it is safe. It is safe.
Starting point is 00:36:57 But you know, if you use it, you know, if you're porting, I ported tons of code in this framework, when you port stuff over and it's an in a statement, it doesn't compile it and make it unsafe. It just says, I can't do this. So it's, unfortunately, if you want a system to be really safe, you have to eliminate the bad programming practices that are so prevalent in the world today. But fortunately, we can automate that process eventually.
Starting point is 00:37:23 And then can you automate that process without taking an order complexity hit on your algorithm? Probably not. So I just sort of the longer term vision is really to, you know, what are the important algorithms that we don't have that we took big order complexity hits on and how do we redesign those algorithms so that they can be data oblivious and, and don't create side channels and how they, how they do that. And then another thing, I'm just, I'm trying to figure out what else can we do with data oblivious algorithms? So I'm sorry, I'm on the, I'm now walking the earth, going to all my old places in computer architecture to figure out where I can use this technology.
Starting point is 00:37:59 Cause I want to create pull instead of push. You know, I want to, I want people to say, Oh, this is a very interesting property in an algorithm. How can I use this to, you know, solve whatever problem that I have that has nothing to do with security? And I think I can find a couple opportunities for that, just to sort of build more momentum around this idea of these very special algorithms
Starting point is 00:38:22 that don't have these side channels in them. Cool. That sounds exciting. So I think at this point, maybe we want to switch gears a little bit. So Todd, I've known you quite a while, and you're one of the most kind of like naturally enthusiastic people that I know. And you've done a whole lot of different things, you know, across different domains, both academically in terms of subject matter, as well as you know like leading cifar at this this startup and and i remember the last time we spoke you were talking
Starting point is 00:38:49 about this the work you were doing with this university in ethiopia maybe you can talk a little bit about how you move between these different domains always with such a smile on your faces too yeah well i i like having lots of stuff to work on, you know, especially when you're doing research, there's a lot of fits and starts in a project. And I find that I get my best ideas on project X when I'm working on project Y. So I like to have a lot of different things. You know, my PhD students all tend to be working on their own thing. And I like to take on even non-technical projects like the stuff in Ethiopia. Almost 13 years, I've been working with the university there to help them build their capacity for educating computer science students. Computer science is hot here at Michigan.
Starting point is 00:39:43 It's just as hot in Ethiopia too everybody wants to be a computer scientist and they need more more and more profs so I've been you know as part of that effort I've been taking on research scientists here from that institution and sort of helping them get over the hump on their PhDs and then help them build up capacity there. And then, and in the process, I mean, I've recruited so many amazing students from Ethiopia into my PhD group here. And now they're, Souvenay and I were talking about, there's one that's in his group at Google now. So it's, it's really exciting. And the way you keep excited about it is, you know, it's like, there's always a taller mountain to climb and interesting problems to work on. And you're, you know,
Starting point is 00:40:30 you're always surrounded by all these young people that are just trying to figure out how to do research and how to be successful in their careers. And, you know, really being a part of that process and helping mentor those people into that is super exciting. You know, I, I tell people I'm institutionalized, you know, don't make any offers below six figures, seven figures, uh, for industry because, uh, you know, I, I'm, I really like being an academic. I really, I really like the, the life that we have. I wish I could convince more of my students to go into academia. My senior PhD student, she is definitely interested in academia, so I have to keep nurturing that thought. But I love it,
Starting point is 00:41:13 and I love just bopping around to different projects and just having something interesting to think about all the time. Great. So how did you get into architecture at all? And then you talked about being institutionalized at Michigan. Maybe you can talk a little bit about your origin story, I suppose. Yeah, sure. I come from southern Wisconsin, a little tiny town. Early on, I discovered computing from a teacher I had at school who was really into computers, and I got into computers and then
Starting point is 00:41:46 I where I came from nobody really went to college either you went into farming or the military this is like mid-70s I thought computers were pretty cool so I thought I'm gonna go to college and so I went to University of Wisconsin and studied there to get my electrical engineering degree at the time, because that's where I was also interested in hardware too. I'm really interested in how things work, how do things function. When I got my electrical engineering degree from University of Wisconsin, I went to work. What was interesting, I didn't even know what grad school was at that time. And I feel like no one ever talked to me about grad school when I was an undergrad. I mean, it was, and I
Starting point is 00:42:30 see in my students today, a lot of them have parents that went to grad school, right? I think grad schools, a lot of people learn from their parents. I certainly didn't. I always, I make it a point now, whenever I teach an undergraduate class, to always spend at least a half a lecture every semester talking about what is grad school? Why would you want to go to grad school? And I'm very pro grad school. I mean, I always, when students come to me and say, should I go to grad school? I said, no, that's not the right question. The right question is, can I go to grad school? Because grad school is the best. And the master's degree, I think, is probably the very best degree, you know, in terms of how it can give you access to better jobs and such. And then students say, they always ask me, what about money? And I say, well,
Starting point is 00:43:15 if you're worried about money, you already made a bad decision, right? You're an engineer, right? You should have been a lawyer or a doctor. So don't worry about the money. You'll be fine. So I didn't know about grad school i went to work at at xerox was my first position after my bachelor's degree i was on this ginormous copier team uh we were building this a 3090 copier and it was this huge embedded system i was a real-time programmer on paper path which means that I was part of the programming team that was making the paper go around. And so you need to debug these systems.
Starting point is 00:43:51 We had one, for 200 programmers, we had one system that we could debug on, just one at the time. And so, of course, I'm a junior member of the team. I'm like debugging Tuesday night at 3 a.m. This is when i'm debugging my code and i i got this idea i'm going to build a simulator for this entire machine you know i built a z80 simulator z80 processors on it and i built network ports i built simulators for all
Starting point is 00:44:18 the you know basically emulators for all that stuff and i got to the point where you could literally boot the entire image of the system. And I built a little backend that would describe the physics of the, of the photocopier so that you could, you know, to a first order, you could tell if your stuff was actually going to run. Then you could just get onto the real hardware and do the tweaks that you need to, to make sure you were capturing everything. That was so successful. It started shrinking the schedule of this ginormous project and so my manager said yeah Todd's going to lead a team to develop that for other projects as well and they plunked me into what was called the Webster Research Center which was the dual it was the east coast dual to
Starting point is 00:44:59 the Palo Alto Research Center so now all of a sudden i'm like in meetings with mark weiser and butler lampson and like all these people i'd never heard of actually but now that i'm a computer scientist phd you know these are super famous people and i'm like looking around i'm like what do these people do you know you know what they do they do whatever they want and so that i figured okay so to do this they need a phd uh and so then i thought okay i'm gonna go back and get a phd and so i went back to wisconsin i i'd taken a class with guri sohi computer architecture class i really like hardware and software i like being at that boundary so i went back and studied with guri i was with guri for six years learned soed so much there.
Starting point is 00:45:46 That was a really cool experience. I don't know how many super awesome people come through Wisconsin, but it was just like, I mean, I was like Bobak Felsafi, Sarita Advay, Doug Berger, Scott Breach, and Vijay Kumar. I mean, it was just like, you were just like tripping over people that are really well known today. It was a really cool experience. And then as I ended towards my graduation, I thought like, oh, you know, I want to go into academia, but I haven't built a processor
Starting point is 00:46:17 or anything. Right. I'm going to be a computer architect that has never built a processor. And so I thought like, well, I should probably go to like industry for a little bit. At that point I went to Intel and that was for me, was really my postdoc. And I got on, I was on the Pentium 4 advanced technology development team. That was so interesting. What was so interesting to me at Intel was that it,
Starting point is 00:46:44 the challenges they had were not so much in architecture. They had all the solutions in architecture. Their challenges were in verification cost. It was incredibly expensive to verify their hardware. And so when I left Intel to take a position at Michigan, I wanted to work on verification costs. And I am not a mathematician, so I can't do formal verification. I'm just a, you know, hardware guy. I just build interesting architecture. So my first project at Michigan was this project called Diva, which was a processor that could tolerate its own design bugs, which was, it would, it had a front end processor, which was really big and capable and had a backend processor, which was very correct and simple. And then it used the front
Starting point is 00:47:32 end processor as a predictor for branches and stuff. And what was really interesting about that project and what really changed my career was it was so different than anything I'd ever had an opportunity to work on. And the thing that was most exciting about it is when I would go and present it, people thought it was either a really good idea or a really bad idea. I remember one time I gave a talk at Intel and somebody came up to me afterwards and says, gee, you're telling me I'm going to release a processor with bugs in it? I go, well, not in the back end just in the front end I can't do that and so I learned I learned my philosophy for research really got developed there and it it's what I call a rule-breaking approach to research which is very simple find a rule nobody breaks
Starting point is 00:48:22 and these rules are they're very easy to find they're like in the front of textbooks you know there's stuff that's always always taught to people and so one of the things you know you're taught in computer architecture is you can't release a processor until there's no bugs in it you know but here's the reality is they all have bugs in them i gotta tell you that they all have bugs in them. I got to tell you that they all have bugs in them. They just don't, hopefully you don't have bugs that are going to cause people a lot of real big problems. So, you know, I started working on designs that would try to break that rule. And sometimes when you break that rule, you find something really, really, really interesting. That was my first project. And now I'm working on these encrypted systems, which is like,
Starting point is 00:49:03 you know, there's a, there's a rule in security, right? You will never build a system that is unhackable. Now that may be true, but I'll do my best to come as close to possible as I possibly get. It is pure heresy to suggest you could build a system that cannot be hacked. So again, as a rule, I would love to break one day. It'd be really exciting. Yeah. And then, and then that's your, that's your way I would love to break one day. It'd be really exciting. Yeah, and then that's your way to live forever, to find out if it's true, right? Yeah, exactly. So you've done several hats in academia and industry, and you talked to us about the journey
Starting point is 00:49:37 in which you learned a lot of different things from different places, but a very high level based on your experience. What do you think the architecture community is doing well? What are things that we are not maybe focusing on as much, either in the teaching or in the research side? And how can we sort of improve that in the coming years, especially as the landscape of computing is also evolving, right? That's a really great question. Well, I'll say one thing I think we're doing pretty well is we're focusing on climate in the community,
Starting point is 00:50:07 which I really value those efforts and try to be a part of them whenever I can. Because I think climate is super important. We don't want to be places where we don't feel welcome. We don't feel valued. That's something I've been really involved with here at University of Michigan. It's absolutely something I've been involved with in my startup. And I'm really glad to see that the architecture community is involved in that as well. I think that's incredibly important.
Starting point is 00:50:35 Something that comes to mind here is I think I always say that I think I wish the hardware community was more like the software community. And I feel very much like we're starting to do that. I'm super excited about RISC-V and the open source movement in the computer architecture community. I think that's an incredibly powerful development. The work that I did on the Morpheus project that I talked about earlier was completely enabled by the CPU work that came out of Berkeley that, you know, rolled into the Sci-5 startup, et cetera. That was completely enabled by that. So I love to see this movement of open source hardware. People don't understand, you know, people in industry often have a difficulty understanding
Starting point is 00:51:24 why open source hardware is valuable. And I just want to take a moment to explain to that person who's listening to this podcast and says, how do I make money on open source hardware? When you create things in industry, you have a certain amount of money to spend on those things. Nobody has infinite money except maybe a few of the bigger companies. But when you only have a limited amount of money, you want that investment to be as valuable as it can be. So what the software community does is they say, there's certain things that everybody needs that don't provide value to the products we're building. So let's make them free. And they're not really free, but what they are is
Starting point is 00:52:02 they're maintained by everybody and everybody shares and nobody owns or charges for those things. And that makes a lot of sense. To me, that makes a lot of sense even for an instruction set, right? Like what instruction set does your phone use? You know, well, probably ARM. But if it used RIGS 5, would you even know it? No. Why?
Starting point is 00:52:22 Because instruction set provides no value to your phone. So why do we have to pay for it? When that happens, all of a sudden, you can take that fixed amount of resources that you would have spent building instruction sets and component layers and operating systems, all this stuff that provides no value to your end product, And you can start to spend it on the accelerators and the, you know, and the, and the new instructions and the, you know, the other things you're going to add that are going to provide lots of perceivable value to your project, to your end project. And so what open source really does is it provides a means for reducing the cost of producing innovation. And that is something that we definitely need in the hardware community.
Starting point is 00:53:06 We need to focus on making it less expensive to innovate. And I think if we could do that, that would be really beneficial. I've had this interesting experience where now I'm kind of in a software org in many ways. And the approach to building things between software and hardware is so so so different with hardware it's like you know you have to do things so far in advance get it all figured out and get it all set up and then it's very rigid and then you have to you know spend forever verifying it with software it's just like let's just you know deploy a little something and then we can fix that we can continuously be agile about it so that's like
Starting point is 00:53:44 one thing it's like the software development is very agile and the hardware development is pedantic and get it figured out along in advance do you feel like the open source ness so i get that that kind of like reduces the cost of innovation because it provides everybody like a substrate where it's just like okay the, the ground is, you know, not everybody has to dig their own like sewer pipes or whatever. They're just there. But does it also enable more agility similar to the software world, do you think? Or is it just that kind of providing everybody a base ground floor?
Starting point is 00:54:18 That's a great question. I would say, I think agility is less of a feature of the design, more of a feature of the design language. Most of the RISC-V stuff is specified in Chisel. And Chisel is, it's flexible once you build that flexibility into it, but building that flexibility into it is really, really challenging. It requires you to build a program, a generator for what it is you want. And if you want to change the things that it creates, you need to modify the generator.
Starting point is 00:54:51 And that takes a lot of background. You need to understand Scala. You need to understand Fertile. You need to understand Chisel. It's very, very challenging. And pushing myself and students over that hump was really, really difficult. But when we did, we got a lot of value out of that infrastructure. I, I think that the challenge there is really just, you know, what is the Python of HDLs? Well, Chris Batten would say it's PyMetal, which is his Python based HDL,
Starting point is 00:55:23 which is actually pretty awesome i must say i think the key there is really innovation in in the hardware description language so i look like you know chris batten cornell adrian samson at cornell they're those two folks are doing really interesting work on taking advanced language features from programming and putting them into hardware description languages. I've got a project here at Michigan called Twine, which enhances Chisel to support heterogeneous design in a more sort of Lego block fashion. So yeah, I think that's where those benefits will eventually come from. I would love to see, that's another reason why I'm really excited about FPGAs on everything,
Starting point is 00:56:11 because if FPGAs are on everything and everybody can benefit from programming hardware in consumer class CPUs, then it makes a lot of sense for the open source software community to really address the programmability of those systems. It's interesting to look at like Saman Amarasinghe's work at MIT, where he's kind of taking these Jonathan Reagan Kelly's halide and putting hardware backends on it. That's really interesting stuff because now you can learn a language that is in a particular domain and generate hardware from it you know it could be you know tensorflow or halide or any of these any of these domain specific languages and then generate hardware from the back end that's that'll probably be when things really get moving along so that i would say the cost of designing stuff is probably the worst thing that we do. But things like open source hardware are really helping in that particular case.
Starting point is 00:57:12 There's also a movement towards open source FPGAs and open source EDA tools. That's really incredibly exciting. And I think I see a lot more hope for the hardware community with this kind of stuff. As a hardware guy, every year, you know, the software world gets a little bit bigger and the hardware world gets a little bit smaller. Maybe that makes our talents more valuable, but it also makes it a little more lonely. And it makes them on an innovation on this side of the fence, the architecture fence, a little less interesting. So I really like the idea of all the open source stuff that's happening. I'm super excited about that.
Starting point is 00:57:50 Also, I'm really excited about Intel buying Xilinx and AMD buying, no, AMD buying Xilinx and Intel buying Altera. Thank you. I'm really excited about that because that, in a sense, if they're smart, which I hope, well, smart by my standards, they're going to slap FPGAs everywhere. I think that's super exciting for a number of reasons. One is now, you know, anybody that can write software can also design hardware as well.
Starting point is 00:58:16 And then the other thing that is really exciting to me is like, what are the accelerators of tomorrow, right? Today, everything is all about heterogeneity and you know how many uh you know neural net accelerators does the world need probably a couple more so if you're on a project like that it's okay yeah i want to know what are the what are the next accelerators and i i think that you know having fpgas everywhere will help answer that that question and i think that's really exciting. And again, that just really reduces the cost of design because I think design costs is really the biggest challenge for our community. And I want to see more startups, more open source projects, that kind of stuff.
Starting point is 00:59:01 I think we're moving in that direction. I feel very positive about that trend, even though the overall picture is one of shrinking hardware community, shrinking sense of what we can deliver, shrinking amount of value we can get out of our future silicon. So I do like the open source trend to counter that. I think that's a very valuable and timeless message on both on the technical side and the broader ecosystem or the people side, which is how do you enable innovation, both in terms of
Starting point is 00:59:38 ideas and the technical ecosystem, but also in terms of the people, the climate and the values in our community. So I think that's a message that's very well taken. And hopefully we can push forward on all of these themes and, you know, ensure that we continue innovating, continue bringing the best people into this very exciting space. Absolutely. And unfortunately, despite the fact that we've had so much fun talking to you, Todd, I think we're about out of time.
Starting point is 01:00:06 I want to close with saying, Professor Todd Austin, thank you so much for joining us today. It's been a total delight to speak with you as usual, and we're glad you were able to join us. Thank you. Thank you for having me. It was a pleasure. And if anybody has any more questions, feel free to reach out and contact me. I'm Austin at umich.edu. And thank you so much for the invitation. I appreciate it. Yeah, thank you, Todd. It was great speaking to you. And to our listeners, thank you for being with us on the Computer Architecture Podcast. Till next time, it's goodbye from us.
