Computer Architecture Podcast - Ep 8: Durable Security and Privacy-enhanced Computing with Dr. Todd Austin, University of Michigan
Episode Date: May 8, 2022
Dr. Todd Austin is a Professor of Electrical Engineering and Computer Science at the University of Michigan in Ann Arbor. His research interests include robust and secure system design, hardware and software verification, and performance analysis tools and techniques. Todd has donned multiple hats, being a senior processor architect at Intel's Microprocessor Research Labs, a professor at the University of Michigan, serving as the director of research centers like C-FAR, and more recently serving as the CEO and co-founder of the startup Agita Labs. He is also an IEEE Fellow and received the ACM Maurice Wilkes Award for his work on SimpleScalar, and the DIVA and Razor architectures.
Transcript
Hi and welcome to the Computer Architecture Podcast, a show that brings you closer to
cutting-edge work in computer architecture and the remarkable people behind it. We are your hosts,
I'm Suvinay Subramanian. And I'm Lisa Hsu. Today we have with us Professor Todd Austin,
who is a professor of electrical engineering and computer science at the University of Michigan in
Ann Arbor. His research interests include robust and secure system design, hardware and
software verification, and performance analysis tools and techniques. Todd has donned multiple
hats, being a senior processor architect at Intel's microprocessor research labs,
a professor at the University of Michigan, serving as the director of research centers like C-FAR,
and more recently serving as the CEO and co-founder of the startup Agita Labs.
He is also an IEEE Fellow and received the ACM Maurice Wilkes Award for his work on SimpleScalar and the DIVA and Razor architectures.
Today, he's here to talk with us about durable security and privacy enhanced computing.
A quick disclaimer that all views shared on the show are the opinions of individuals and
do not reflect the views of the organizations they work for.
Todd, welcome to the podcast. We're so happy to have you.
Thank you. It's good to be here. I enjoy this podcast. I've been listening to it since episode
two. Great, great. That's great to hear.
Thanks for your efforts on it, by the way.
It's fun for us. Hopefully it's fun for the listeners too. So if you've been an early
listener, you probably know that we always kick things off with a question of, you know,
what's getting you up in the morning these days.
Yeah. I'm all about security and privacy today. In fact, more privacy more recently. I'm really interested in building systems that have more durable security than existing systems.
How do we get rid of the zero-day bugs and the vulnerabilities and all that stuff?
And that's led me on a long path of building systems, working with DARPA to put those systems into red teaming environments,
learning what their vulnerabilities are.
And this has led to some more recent work that I've got in my startup,
which is building a very durable security system in the Azure cloud.
That sounds good to me as a Microsoft person.
Maybe for our listeners who are maybe not the most familiar with this particular
field, you've used the word durable now twice. Is there a specific meaning for what it means to be
durable? So, non-durable security. A great example of non-durable security is Patch Tuesday for Microsoft Windows. Every second Tuesday of the month, I believe, the patches come rolling in.
And what they're doing is they're fixing bugs in the software that they've either found themselves or that people have exploited.
So that's not a very durable way to build security.
In fact, relying on your software to not have bugs in it is probably the least durable form of security because anyone who's ever written software before
knows that it's basically impossible
to get rid of those bugs.
In fact, I am of the belief
that there will never be a program that cannot be hacked.
It just doesn't exist.
And if you tell me one exists,
I will say,
we're going to have to wait till the end of time
to see if any new vulnerabilities
are discovered in the future
that will hack this particular piece of software.
And then the other aspect you have to worry about in security is side channels, which are all the rage in the computer architecture world today.
And these are when a system has observable properties that reveal secrets about what's inside the system.
And building durable defenses against those are very challenging.
To give you an example, Intel has this environment,
they call it a trusted execution environment: SGX, Software Guard Extensions.
And it's designed to basically take a process and encapsulate it inside a shell inside the CPU that nobody else can
penetrate because of all the cryptography and other stuff that they have. But they put it on
the same microarchitecture as all the untrusted software. And people find lots and lots of ways
to manipulate the cache, the BTBs, the floating point, power on, power off, et cetera, to basically sneak information in and out
of that very protected execution environment. So by durable, I want to solve two problems.
I want to make sure that when the software gets hacked, my security doesn't fall, because your software is always going to get hacked. And two, I want to get rid of side channels.
And I think if I can solve those two problems, then I can build durable security.
With durable security, you don't get a call in the middle of the night saying, you got to fix this bug.
You know, like when Log4j happened and then the government told all the employees, you are not allowed to go home for Christmas until you fix this problem.
They're still trying to fix it, by the way.
It's way past Christmas.
In this particular case, it's like, you know, RSA
key size is a durable mechanism, right? Because we were all using 1024 bits for RSA keys. And then
we knew like within two to three years, we should probably be upgrading to 2048 bits where most of
us are today. So that's durable security.
Yeah, thanks for the really quick overview on the anatomy of some of these security attacks
and how they intersect with the hardware and the execution model.
Maybe we can expand on some of the themes that you brought up.
You talked about these channels that reveal information.
And also, software keeps evolving and bugs are always going to be there.
Or people will find new ways to introduce bugs
or exploit vulnerabilities and so on.
So what are the properties or mechanisms
or model that you need to have
to tackle both of these challenges?
So it's really interesting.
Let's talk about side channels.
There's three things you need for a side channel.
One, you need sharing of resources.
Two, you need to optimize common cases. And three, you need fine grain
timing analysis. Okay. And there's three things you need for high performance architecture. It's
the same three things. So there's this fundamental tension between high performance design,
optimizing the common case, sharing resources so you don't use more transistors than you need.
And then giving your programmers and your developers really fine grain understanding over how much time you're spending in various
pieces. When you have those features, it's possible when an untrusted and a trusted entity
share the same piece of hardware, the untrusted entity can infer information about what the
trusted entity is doing. And just to give you sort of the classic case, it's called Kocher's RSA attack.
RSA is a crypto algorithm, and it's based on exponentiation of large numbers.
And inside the kernel of the algorithm, in older versions of the algorithm, there's a loop.
And inside that loop, it looks at the private key,
the message you want to send, it treats it as a number,
and it takes it to the power of the private key.
And so there's an algorithm inside there that's basically looking,
is this private key bit a 1, or is it a 0?
And it uses this classic algorithm from like a million years ago
called square and multiply.
When the bit is a zero, you square the running result. And then when it's a one, you square and then multiply. So you do two things instead of one.
You can either measure the timing of that, right?
Cause it takes longer to do a square and multiply than it does to do a square.
And so the time dilation will be proportional to the number of ones and zeros
within that part of the key.
So that leaks information.
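To make that concrete, here's a minimal Python sketch of the textbook left-to-right square-and-multiply loop (the function name and framing are ours, purely for illustration, not any particular library's RSA code):

```python
def modexp_square_and_multiply(base, exponent, modulus):
    # Scan the exponent (the private key) bit by bit, most significant first.
    result = 1
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus    # every bit: square
        if bit == "1":
            result = (result * base) % modulus  # 1-bits only: multiply
    # A 1-bit costs two modular multiplies, a 0-bit costs one, so the total
    # runtime (and the code path touched) leaks the bits of the exponent.
    return result

# Sanity check against Python's built-in modular exponentiation:
assert modexp_square_and_multiply(7, 23, 143) == pow(7, 23, 143)
```

That key-dependent extra multiply is exactly the asymmetry the timing and cache attacks exploit.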
Alternatively, if you're sharing the same instruction cache, the adversary can displace the code that runs as the true part of the if statement.
Is this key bit a one?
And then it can run the algorithm repeatedly, kicking that code out of the cache. And then the amount of extra time it takes to run will be proportional to the number of iCache misses that you forced that other algorithm to take.
So those interactions, the sharing of the cache and the optimizing of latency when you hit in the cache, reveal lots of information about what that other algorithm is doing.
One thing we've figured out in the last few years is that it's really hard to stop those things. You can focus on the cache, and people move to the BTB. Focus on the BTB, and people move to the store buffers.
There's a paper, I think it's in the next ASPLOS. It's by Shuwen Deng from Yale University.
It's a great paper.
I recommend to everybody.
It's titled Leaky Frontends. I tell people that this paper should be the 50th exclamation point on the back end of the sentence: you will never, ever, ever stop microarchitectural side channels. It looks at the front end of x86 processors, probing the front end to try and figure out what code another application is running.
And it does this in three different ways in the front end on many different processors,
and it does it for power and timing. It's hard. And all the defenses people are proposing are
based on cryptography. But the problem is there's just not enough state in the microarchitecture
to really hide stuff well.
So for example, we've seen a lot of papers on what are called randomized caches, where you basically take the index into a cache and you encrypt it before you go into the cache. So now
I don't know where any of my data is in the cache. Well, it's in one of 256 places. So with a little
bit of patience, you're going to actually find that information.
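As a rough illustration of that limit, here's a toy Python sketch of a randomized-index cache; the XOR scrambler here is invented for illustration and is not any published design:

```python
import secrets

class ToyRandomizedCache:
    # Toy model: the set index is scrambled with a hidden per-boot key, so an
    # attacker can no longer predict which set a given address maps to.
    def __init__(self, num_sets=256):
        self.num_sets = num_sets
        self.key = secrets.randbelow(num_sets)  # hidden index key
        self.tags = [None] * num_sets

    def access(self, address):
        line = address // 64                       # 64-byte cache lines
        index = (line ^ self.key) % self.num_sets  # scrambled placement
        hit = self.tags[index] == line
        self.tags[index] = line
        # The index is hidden, but the line still lands in exactly one of
        # num_sets places, so a patient attacker can prime-and-probe every
        # set and locate it anyway.
        return hit
```

The randomization raises the attacker's work, but there just isn't enough state to hide in.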
So it's really hard to eliminate those microarchitectural side channels. In terms of side
channels, the most durable thing we have is either cryptography or isolation. So if you want
something to be secure, it makes a lot of sense to me to move it out into a coprocessor. And then
that coprocessor is built in a way where it doesn't share caches with untrusted code and stuff.
And it can be built to be performant,
but it has to be built to not have
any of these vulnerabilities.
And I think we see this happening in projects
like OpenTitan from Google, an open source trusted platform module.
When you choose to put your trusted execution
environment on the same microarchitecture as all the other code,
I think it's really a challenge to stop those side channels.
So it sounds like you either have to live with that or have a system design approach that is cognizant of the fact that you always have leaky channels no matter what you do.
And I think you mentioned two things about mechanisms, like isolation and encryption
are two effective ways or mechanisms to build durable security into your systems.
And the last thing was expanding on the isolation theme, sort of having a dedicated or specialized
engine that is responsible for providing the security.
Maybe it also overlaps with a common theme in security, which is security verification.
All of these have this common theme, which is what is the root of trust that you have?
Even with OpenTitan at Google, the objective is to sort of build this root of trust.
Maybe you can expand on maybe one of these themes, including the root of trust and how
that plays into the
philosophy of designing secure systems and the intersection with hardware as well.
I'll do that by way of talking about how do you deal with the software vulnerabilities issues.
So a few years ago, I had a project with Mohit Tiwari at the University of Texas and Sharad Malik
at Princeton and Valeria Bertacco here at Michigan. And we designed this
processor called Morpheus. And Morpheus was trying to stop software hacking. And the way it did it
was by encrypting pointers. And the idea was that it's really interesting when you look at security
attacks, it's really, really hard to find a security attack where someone doesn't need to know a pointer value or manipulate a pointer value. So if you can encrypt pointers, then you've really thrown a roadblock in the face of any security attack that needs access to pointers. And so we designed this Morpheus machine and we started to build it.
When you work for DARPA, you have to build stuff and deploy it. And this particular program I was
in called SSITH, it was a program at DARPA. We had to build a RISC-V version of this processor,
and then we had to deploy it into a commercial red teaming environment. And what red teaming is,
is where you just basically put programs up on these systems, you make security claims about them, and then they put these large bounties, you know, $50,000 bounties, where if anybody can penetrate your security defenses, they get that bounty, they get that money.
And so people come from all over the place to break those things.
And so we put up this server and we put this medical database on it.
We said, come get all the juicy bits inside
the medical database and nobody penetrated. It was incredible. It was a real great success.
In fact, I had talked to the people at the company that had done that red teaming. They said,
it's incredibly rare that they'll... It was a three month red teaming effort and there were
no penetrations. That's amazing. Very cool. It was exciting. And I've been talking to the RISC-V Foundation about possibly adopting it.
It's a very simple little optimization to the system.
But that said, Morpheus is definitely hackable.
It's definitely hackable.
It's just really, really hard to hack
because you've got to side channel out information
about these pointers.
And there's a lot of ways to side channel information
out of these pointers.
But I really liked the idea of cryptography
as a mechanism to provide some durable security.
And so where we went after Morpheus was,
we just said, let's not encrypt the pointers.
Let's just encrypt the social security numbers
and the strings and the things we care about.
And that's where we are today.
So today, the way we're building
really durable defenses, and these are the roots of trust inside the system,
are we are developing compute engines that execute directly on encrypted data.
And what's really interesting about this approach is if somebody penetrates your system,
then they have not penetrated your data defenses. Once they
penetrate your system, they can steal your ciphertext and then they can try to break that
ciphertext. And this is so incredibly powerful that our threat model is no longer attacker,
do not penetrate my system. Our threat model is here's a pile of data. It's encrypted under a key
that only I know. And you, the programmer, can write any program you want; tell me what that data is.
But the way the system works is as you compute on the data,
the inputs that you're computing on are encrypted
and the results of the computation are encrypted as well.
And so you can run whatever algorithm you want on that data,
but only I, the holder of the key
that encrypted the original data, can see the end
results of that. For me, what's incredibly exciting about that, and the reason why I'm in a startup
right now, is that's not really security. That's a privacy technology because it becomes possible
for the encryptor of that data to have no relation to the person that is running that CPU on the encrypted
data. So now I can take your encrypted genome. I can run my algorithm on it, which is doing
disease analysis, for example, and the results of that computation are encrypted. I send it back to
you and you can see it. What the startup is focused on now is commercializing that technology in
initially Azure and then eventually in Amazon AWS.
Yeah, that does sound super powerful. Because right now, and this kind of came up a little bit in an episode with Jim Larus, we have a heavy reliance on software for everything. Even for me, for certain things, I'm like, oh, I want to have a NAS at home because I don't want to put it in any cloud, not even my own company's cloud, even though I trust Microsoft. I'm just that kind of person; I want certain things in my house only. And then you buy this NAS and there's software on the NAS, and you're like, I don't know what this thing is doing. And I bought this specifically so I could have certain things in the house.
And so it almost feels like you have nowhere to turn.
And what you've just kind of described there is enabling any software anywhere to do its
thing.
Because right now it's a sort of twofold choice.
If you want the benefit of the software, you have to give up the info.
And so then in many ways, you know, some people may make choices one way or another, but you're kind of splitting that choice to being able to get the benefits of the software without having to give up the info.
Yes.
And in addition, you know, the whole internet is built on this idea that I give you my data, you monetize my data and provide me a free service for that data.
Right. And I, you know, and maybe companies haven't always been the best behaved in how
they utilize our data. Maybe they lose our data. I read a very interesting statistic the other day.
What fraction of American adults have had their social security numbers stolen in the last 10 years? It's an NPR stat. Any guesses?
More than half. Yeah. It's almost 80%.
And half of us had our social security number stolen in a single breach
in 2017, the Equifax breach, which was a, you know, runtime code injection, you know, standard kind of thing. For 76 days, they were able to exfiltrate data from that. I mean, it's bad.
So if we build privacy-oriented architectures, the permissions are no longer with the hardware. The hardware just processes stuff; the permission is actually embedded in the data. The example I always like to give is a voting machine. If I build a voting machine with privacy technology,
there's a lot of things you got to solve. First, you want to be confidential, right? So I want to be able to put my vote into the machine, but the machine cannot
see my vote. There's technologies that do that. Multiparty computation, homomorphic encryption,
Agita Labs' sequestered encryption technology. We do this kind of stuff, but that alone is a useless technology for voting, right? Because how do you know that I added your vote to your candidate or added your vote to my candidate? You have no idea, right? Nobody solves it. SGX doesn't solve that. Homomorphic encryption doesn't solve that. That's an integrity problem. That's a computational
integrity problem. So our technology solves that. So it gives you an end-to-end receipt.
You actually put a secret in with your data, you then put it through the algorithm,
and then it comes back with a computational result. We call it a fingerprint, which actually
tells you, did you execute the algorithm I expected with the data that I passed to you?
And that's an end-to-end receipt. So now you know that you ran the approved algorithm. In fact,
we can even have other people sign those fingerprints.
And then there's one more problem you got to solve.
And this problem, I believe, is what you need to solve to actually make the technology useful in a commercial sense:
you need the ability to render unencrypted results off that data in a safe manner.
And let's go back to that voting machine
example. If you vote, you want your vote to be private. You want the algorithm to run without being perturbed. And you want a receipt of that, that it was run on your vote. But what
if your vote didn't match any of the candidates? You'd sure like the machine to tell you,
you need to cure your ballot; it didn't match any of these candidates. So in that case, you need the ability to decrypt a Boolean value and for the machine to tell you,
I don't know who you voted for, but I know this much about your vote. It wasn't any of these
candidates. Can you please fix your vote? That problem is the hardest of all the problems,
because once you give a system the ability to decrypt a piece of information,
now you have to prevent the clever attacker from putting other pieces of information into that
decryption capability. So ultimately, it's an integrity problem. So if you're really good at
solving integrity problems, you can now build this system. If you have all those features,
now it's possible for companies that take your data and then harvest
information off of it to do it in a way where they can't actually see your data. We call this form of
computation, we call it secret computation. And there's a benchmark suite that we made.
It was Michigan, Agita Labs, NYU, and Addis Ababa University in Ethiopia. We've been working
together to build this benchmark suite. It's called VIP Bench.
You can go to Bitbucket and download it. Inside there, we have this benchmark, which is,
it's one of the Netflix challenge candidates. But what we've done is we privatized what movies you've watched and how much you liked them, for all the users. And it runs this matrix
factorization algorithm across the entire database, completely encrypted, and then produces a set of encrypted results, which are the movies that we
think that you would like to watch. But never does the CPU ever get to see what movies you like.
Never does it get to see what movies it recommends, but it gets to accomplish the goal of doing recommendations.
Basically, it creates an opportunity to do computation and computation services,
and then even render statistics off that data in a way that's safe. I haven't seen any systems
able to do that yet. So it's super exciting to be able to sort of be first movers in that space. So this may be because I'm by no means
a cryptography expert at all.
In fact, I would say I know very little about it.
So it seems hard for me to imagine
how you can potentially aggregate
and reason about encrypted data.
So for example, like on a voting machine,
like if you have a whole bunch of encrypted stuff passed through, like how are you actually then able to say on the inside, say like, okay, this guy actually got 32 votes when all I got was a bunch of garbles.
Yeah, yeah, yeah, yeah.
How does that work?
So there's really three ways to do this today. The first is homomorphic encryption, where they actually design a cryptographic algorithm where you can encrypt
data, you can decrypt data. And then there's also an algorithm to add two ciphertexts together and
multiply two ciphertexts together. So you get add and multiply. You don't get less than, so good luck
sorting your data. You don't get, you know, it's really limited because you have to be able to come
up with these homomorphisms in the ciphertext space.
And they're also very expensive, these algorithms.
Microsoft SEAL is an example of an actually pretty high-performance implementation of this.
But still, you're talking 10,000 times slower than native execution.
But it is possible to do that.
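For listeners who have never seen a homomorphic cipher in action, here's a toy Paillier example in Python. It's only additively homomorphic and uses tiny, insecure parameters, so it's far simpler than the FHE schemes being discussed, but it makes "add two ciphertexts together" concrete:

```python
import secrets
from math import gcd

# Toy Paillier cryptosystem with tiny primes -- illustration only, NOT secure.
# Paillier is additively homomorphic: multiplying two ciphertexts mod n^2
# adds the underlying plaintexts.
p, q = 293, 433
n = p * q
n2 = n * n
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p-1, q-1)
g = n + 1
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)   # decryption helper

def encrypt(m):
    r = secrets.randbelow(n - 2) + 2          # random blinding factor
    while gcd(r, n) != 1:
        r = secrets.randbelow(n - 2) + 2
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return ((pow(c, lam, n2) - 1) // n * mu) % n

a, b = encrypt(17), encrypt(25)
assert decrypt((a * b) % n2) == 42  # ciphertext multiply == plaintext add
# Fresh randomness each call: encrypting 17 again yields a new ciphertext.
```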
The second approach is what's called multi-party computation,
where you take a piece of information
that you want to perform secure computation on,
you bust it up into little chunks.
So for example, if you want to add
a bunch of values together,
I give part of that value to different entities,
and then everybody adds their components,
and then there's a final phase.
It's not decryption, but it's sort of the aggregation phase where I finally get the final results.
And then the idea is since all those systems are different,
you've got to hack into all those systems at the same time to get that data,
like Log4j, for example.
It's certainly possible, but it's harder than hacking into one system today.
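A minimal sketch of that additive secret-sharing idea, with three hypothetical parties and addition only (illustrative, not a hardened MPC protocol):

```python
import secrets

M = 2**61 - 1  # all arithmetic happens modulo a fixed public modulus

def share(value, parties=3):
    # Split a secret into additive shares: any subset smaller than all
    # the parties sees only uniformly random numbers.
    shares = [secrets.randbelow(M) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % M)
    return shares

a_shares = share(30)
b_shares = share(12)
# Each party locally adds the two shares it holds, never seeing the others:
sum_shares = [(a + b) % M for a, b in zip(a_shares, b_shares)]
# Aggregation phase: recombining all the shares reveals only the final sum.
assert sum(sum_shares) % M == 42
```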
Our approach is we just have a functional unit that's got a public-private key pair. It keeps a key inside the functional unit, and inside the functional unit it does an AES decrypt, it does an op on an ALU, and then it does a re-encrypt into the high-entropy cipher and sends it out. Basically, there's no software in the box and there's no
side channels in the box. And so it's just a piece of hardware that does that. As an architect, you have to contend with that latency. We've got a thousand ways to get rid of that latency, but you have to do it without adding any more side channels. It's certainly possible; it's just an architecture problem, but it's a bunch of latency. AES takes about 40 ALU cycles to decrypt and 40 ALU cycles to encrypt. So you take your one-cycle add, and when you make it an encrypted add, it's now 81 cycles. You have to tolerate those latencies. Fortunately, there's a lot of locality in what we do, so it's pretty easy to tolerate those latencies.
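To make that dataflow concrete, here's a conceptual Python model of the decrypt-op-re-encrypt unit. The AES-GCM construction and the API here are assumptions for illustration, not Agita Labs' actual design; in the real system the key and plaintext exist only inside the hardware unit:

```python
import os
import struct
from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # pip install cryptography

class EncryptedALU:
    # Software model of a decrypt-compute-re-encrypt functional unit.
    def __init__(self):
        self._aes = AESGCM(AESGCM.generate_key(bit_length=128))

    def encrypt(self, value: int) -> bytes:
        nonce = os.urandom(12)  # fresh nonce => high-entropy ciphertext
        return nonce + self._aes.encrypt(nonce, struct.pack("<q", value), None)

    def decrypt(self, blob: bytes) -> int:
        return struct.unpack("<q", self._aes.decrypt(blob[:12], blob[12:], None))[0]

    def enc_add(self, a: bytes, b: bytes) -> bytes:
        # ~40 cycles of AES decrypt, one ALU op, ~40 cycles of AES encrypt
        return self.encrypt(self.decrypt(a) + self.decrypt(b))

alu = EncryptedALU()
x, y = alu.encrypt(2), alu.encrypt(40)
assert alu.decrypt(alu.enc_add(x, y)) == 42
assert alu.encrypt(1) != alu.encrypt(1)  # same value, different ciphertext every time
```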
Gotcha. Okay. Makes sense. So then for this kind of like special
engine that you have on the side, you've made it isolated. So now you don't have potential speed advantages of say, sharing with a bunch of other stuff.
And then you've also essentially put like a chastity belt around the CPU or the ALU or
whatever.
Exactly.
Yeah.
The CPU throws high entropy ciphertext at it, and it comes back out of the function
unit high entropy ciphertext.
So in fact, you want a random number generator, you just assign one to an encrypted variable
over and over again.
The ciphertext of that variable,
it'll pass NIST randomness tests.
Because what you can't do is you can't,
one can't be encrypted a certain way, right?
Because if somebody discovers a value is true or false,
all of a sudden they'll know how to evaluate
all the relationals.
So you have to have what's called a high-entropy cipher, which is, like, if I want to encrypt a one, there's two to the 64 ways or two to the 128 ways to encrypt a one. So you get a little bit of expansion in the data size, but in exchange for that you get something very durable. In the systems we are building today, penetrating the defenses has nothing to do with hacking.
Hacking is not even on our plate
because hacking just gets you to the ciphertext.
In our threat model, you're already at the ciphertext.
You're the programmer.
So hacking is about side channels.
Can you find a side channel
either in the cryptography
or in the timing?
We don't do physical yet, but we will at some point.
Because, you know, physical is a little harder in the cloud
because, you know, you can't take your oscilloscope
into the Microsoft data center.
They'd probably stop you at the door.
But, you know, if you look at Jakub Szefer's research on FPGA-based side channels, I mean, he's been able to do some remote side channels by being co-resident on the same FPGA.
So there are some ways to do physical measurement.
That's where you look at power, voltage, leakage current, and then that can tell you some information about the computation that's going on.
So you got to address that eventually. Once you get rid of the side channels and once you get rid of software hacking, then the strength of the system is the
strength of the cryptography. So you want to get this data, you got to break the crypto or you got
to steal the key from whoever encrypted the data. And that's a way different game than we're playing
today, right? That's not about looking for return-oriented programming hacks or who has an out-of-date Log4j. That's like, can you break AES? Yes or no?
To me, that's super exciting.
That's really exciting.
Right.
It's definitely a very interesting approach
to use encryption in the computation pipeline.
I just wanted to expand a little bit on that.
I mean, in general, when you look at the trade-off space
for any mechanism that you build for security or privacy,
you have what's the functionality that it provides,
what's the security characteristics
or what are the security definition,
and then how efficient it is,
like what are the overheads involved
in enabling this particular mechanism.
And you talked about a few different methods by which you could
enable that. There was fully homomorphic encryption or FHE. Then there was multi-party computation.
And then you have your encryption-based methods, which are sort of in the pipeline. And I think
you rightly pointed out that homomorphic encryption has a pretty strong security guarantee or
security definition, but it also turns out to be quite expensive from a compute standpoint. I push back on that a little.
What is the oldest HE cipher? 2009. It can't be older than that because that's when they figured
out how to do FHE. An 11-year-old cipher? Who wants to use an 11-year-old cipher?
That's a young cipher. Durability in cryptography comes with 20, 30 years of...
No, I'm being somewhat facetious.
These ciphers could be very strong, but there was,
for example, there was an attack on BFV, which is a very common homomorphic encryption scheme. There was an attack on the algorithm itself, I think in 2019.
Part of durability for ciphers is that they get old and that they get a lot of exposure
to cryptanalysis by the community.
So one nice thing about hardware-based approach
is you don't invent your own ciphers.
That's definitely a valid point,
but I did want to sort of expand on the overheads
of the computation itself.
Like FHE also, we know,
like regardless of the durability of the cipher itself,
assuming it's durable and secure,
it nevertheless is quite expensive to do homomorphic computation. I think there's been a bunch of accelerator papers as well, which propose accelerators for FHE, figure out what is the right way to speed up those computations.
It looks like you're also doing encryption in the processor pipeline or within your secure processor pipeline.
And can you comment a little bit on the overheads of doing that?
You said it takes like maybe 40 cycles.
How do you sort of fix that?
Because normally in the processor pipeline, we are used to one cycle, five cycles, and
we start getting 10 cycles.
People are like, okay, that is too many cycles.
So can you talk a little bit about the overheads of the encryption-based approaches itself?
Yeah, well, even FHE finds interesting applications. It kind of depends on the
application to what overheads you can tolerate. But if you're doing machine learning training,
you're going to want those overheads to be super low because you know, you don't want your two week
run to be a two year run. So currently, we're deployed in FPGAs in the cloud. So our overheads are sort of artificially high because our ops are going through PCIe. But even then we can find some pretty interesting applications.
We're about 50X off native in our initial product right now,
which is literally thousands of times faster than other technologies.
We've run gem5 simulations: if you integrate this tech into the pipeline, you can get within 2x, no problem.
And here's what's really interesting.
Of that 2x, only a quarter of that is exposed encryption latency.
The rest is the algorithmic changes.
Because there's two things you have to do to algorithms when you basically go secret,
you have to eliminate the memory and the control side channels.
So that means you need to,
you can't use a secret variable to influence an if condition in the program.
So that means you need to do predication.
If everybody remembers predication from your architecture class or if conversion,
you're going to,
you're going to execute both the true and the false side of a,
of a condition test.
And then you're going to select the value using an encrypted primitive.
That way there's no variation in the underlying algorithm based on the data
that you're actually running on and doing decisions on.
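Here's what that looks like in a minimal Python sketch (plain integers stand in for encrypted values, and the arithmetic select stands in for an encrypted CMOV-style primitive):

```python
# Branching on a secret leaks through timing and the branch predictor:
def leaky_max(secret_a, secret_b):
    if secret_a > secret_b:          # control flow depends on secret data
        return secret_a
    return secret_b

# If-converted version: both sides are always computed, and the result is
# chosen with a select, so the control flow never depends on the secret:
def oblivious_max(secret_a, secret_b):
    pred = int(secret_a > secret_b)  # 0 or 1, never used to steer a branch
    return pred * secret_a + (1 - pred) * secret_b

assert oblivious_max(3, 9) == 9 and oblivious_max(9, 3) == 9
```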
And the second thing is you have to have oblivious memory accesses of some form.
We have a kind of ORAM that we deploy in our framework
and various optimizations around that as well.
But what it basically means is that as you change the inputs of an algorithm,
the way it accesses memory does not change in any statistical manner. It can be
random, for example, or it can be fixed, but it cannot change based on the input changing of the
data. Because if it does change, then you have a memory side channel, and you can do all your Spectres and Meltdowns and all those other things to try to figure out what's coming out of memory.
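And a correspondingly tiny sketch of an oblivious memory access, using a full linear scan as the simplest stand-in for the ORAM machinery he mentions (real ORAM schemes are far more sophisticated):

```python
def oblivious_read(array, secret_index):
    # Touch every element so the memory access pattern is identical no
    # matter what the secret index is -- costing O(n) per access instead
    # of O(1), which is where the order-complexity hits come from.
    result = 0
    for i, value in enumerate(array):
        match = int(i == secret_index)
        result = match * value + (1 - match) * result
    return result

assert oblivious_read([10, 20, 30, 40], 2) == 30
```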
Now, of that 2x overhead for your fastest private computation, one quarter is exposed ciphertext latency we weren't able to get rid of, and three quarters is those algorithmic changes. But we had a really cool result. I had two summer students work for me last summer, and we picked in VIP the benchmark that was hit hardest on its order complexity
because of if-conversion and ORAM.
And this was an algorithm called flood fill.
If you've ever played Candy Crush: when you crush a candy, the flood fill algorithm goes to all the other candies that are connected.
And that algorithm's order N squared for an N by N array, and it's completely recursive, right? So it's totally control-driven. And then I rewrote the algorithm to be what we call data-oblivious, and it was order N to the fourth, which is pretty bad. I'm not a great programmer, okay? And so I challenged them both to come up with their own algorithm that beat order N to the fourth. And they both came up with an order N squared algorithm,
which is super exciting to me, because that means that if there's an algorithm that's very inefficient when it's running privately,
just redesign the algorithm to not look at its own data. And it was so clever.
Both of them came up with such interesting, clever algorithms.
So I'm very bullish on even beating this 2x one day in terms of performance.
And I think if we can get within 2x or 1.5x for fully encrypted computation (and it's really cheap to deploy as well, I've got to say), then you can create a world where people can start to have control over their data,
but you're not hamstringing companies so that they can't render value off of your data.
So I'm super excited about what these technologies can do for the state of privacy and security in the future.
Yeah, that's really exciting. I think it's also a common theme in some of our other conversations
with folks. Like for example, even in the case of accelerators for specific domains,
you often need to rewrite the algorithm or change the algorithm so that it's amenable
to acceleration. And it sounds like even for security or privacy enhanced computing,
you probably want to tweak the algorithm so that it's more amenable to providing you
the properties that you want.
That's definitely very interesting.
And I think it's a good theme to take home.
Yeah, I was just going to say, for this whole system to work,
it sounds like all the pieces have to come together, right?
Because you've got this hardware layer that essentially
is this isolated encrypted mechanism for computation.
And then you've got the software layer, which does need to potentially be rewritten to sort
of mask any sort of potential code-based side channels.
And the two of them need to come together.
So for the stuff that Suvinay was just talking about,
where for domain-specific acceleration,
you might want to rewrite your algorithms, there's usually
some high amount of motivation to do that rewriting,
because it's like the person who is adding the value
to the data is the person who is rewriting the code.
But now in this case, the rewrite of the code
sounds like it might come from potentially elsewhere
than the company that's providing the value
via the software.
Is that the case?
And so then now there's potentially less motivation
to actually do that rewrite.
Do you have any thoughts on that?
Is that true?
It depends.
So part of the commercialization of this tech
is figuring out how do you get into people's hands with the least amount of friction.
So today the technology is a C++ or Python library where you actually declare variables encrypted if you care about them not being visible.
And then you can port code into it or you can write fresh code.
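As a sketch of what programming against such a library might look like, with the module and class names invented for illustration (this is not the actual API):

```python
# Hypothetical sketch of a sequestered-encryption library surface.
from secret_compute import SecretInt  # assumed import, illustrative only

salary = SecretInt(95_000)  # encrypted the moment it's declared
bonus = SecretInt(5_000)
total = salary + bonus      # computed entirely on ciphertext

print(total)                # prints ciphertext; only the key holder decrypts
# if salary > 90_000: ...   # would be rejected: a secret may not steer
#                           # control flow, per the if-conversion rules above
```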
Longer term, I think we want to, you know, put this into databases, so I can have encrypted columns and I can do all the things I want to do on my encrypted columns. I want to put this into the TensorFlow framework so I can have private
models or I can have private inputs on my machine learning models. I think the key to sort of reducing that friction of adoption
is to find frameworks
that it lends itself to well,
where privacy is important
and there's enough information
in the compilation framework
that you can automate the process.
We don't automate the transformations
of the programs today.
We can at some point. Today, what we do is: you define the variables you want to be encrypted in your program, and then we tell you what you can't do. So if you can express the computation, it is safe. I've ported tons of code in this framework, and when you port stuff over and there's an unsafe statement, it doesn't compile it into something unsafe. It just says, I can't do this.
So it's, unfortunately, if you want a system to be really safe, you have to eliminate the bad programming practices
that are so prevalent in the world today.
But fortunately, we can automate that process eventually.
And then can you automate that process without taking an order complexity hit
on your algorithm? Probably not.
So the longer-term vision is really, you know, what are the important algorithms that we took big order-complexity hits on, and how do we redesign those algorithms so that they can be data-oblivious and don't create side channels in how they do that? And then another thing: I'm trying to figure out what
else can we do with data-oblivious algorithms? So I'm now walking the earth, going to all my old places in computer architecture to figure out where I can use this technology. Because I want to create pull instead of push. I want people to say,
Oh, this is a very interesting property in an algorithm.
How can I use this to, you know,
solve whatever problem that I have
that has nothing to do with security?
And I think I can find a couple opportunities for that,
just to sort of build more momentum around this idea
of these very special algorithms
that don't have these side channels in them.
Cool. That sounds exciting.
So I think at this point, maybe we want to switch gears a little bit.
So Todd, I've known you quite a while,
and you're one of the most kind of like naturally enthusiastic people that I know.
And you've done a whole lot of different things, you know, across different domains, both academically in terms of subject matter,
as well as, you know, leading C-FAR, and this startup. And I remember the last time we spoke, you were talking about the work you were doing with this university in Ethiopia. Maybe you can talk a little bit about how you move between these different domains, always with such a smile on your face too.
Yeah, well, I like having lots of stuff to work on, you know,
especially when you're doing research, there's a lot of fits and starts in a project. And I find
that I get my best ideas on project X when I'm working on project Y. So I like to have a lot of
different things. You know, my PhD students all tend to be working on their own thing.
And I like to take on even non-technical projects like the stuff in Ethiopia.
For almost 13 years, I've been working with the university there to help them build their capacity for educating computer science students. Computer science is hot here at Michigan.
It's just as hot in Ethiopia too. Everybody wants to be a computer scientist, and they need more and more profs. So as part of that effort, I've been taking on research scientists here from that institution and sort of helping them get over the hump on their PhDs, and then helping them build up capacity there. And then, in the process,
I mean, I've recruited so many amazing students from Ethiopia into my PhD group here. And now, Suvinay and I were talking about it, there's one that's in his group at Google now. So it's really exciting. And the way you keep excited about it is, you know, it's like,
there's always a taller mountain to climb and interesting problems to work on. And you're, you know,
you're always surrounded by all these young people that are just trying to figure out how to do
research and how to be successful in their careers. And, you know, really being a part of that process
and helping mentor those people into that is super exciting. You know, I tell people I'm institutionalized: don't make any offers below seven figures for industry, because I really like being an academic. I really like the life that we have. I wish I could convince more of my students to go into academia.
My senior PhD student, she is
definitely interested in academia, so I have to keep nurturing that thought. But I love it,
and I love just bopping around to different projects and just having something interesting
to think about all the time. Great. So how did you get into architecture at all? And then you
talked about being institutionalized at Michigan.
Maybe you can talk a little bit about your origin story, I suppose.
Yeah, sure.
I come from southern Wisconsin, a little tiny town.
Early on, I discovered computing from a teacher I had at school
who was really into computers, and I got into computers. Where I came from, nobody really went to college; you either went into farming or the military. This is like the mid-70s. I thought computers were pretty cool, so I thought, I'm going to go to college. And so I went to the University of Wisconsin and studied there to get my electrical engineering degree, because I was also interested in hardware. I'm really
interested in how things work, how do things function. When I got my electrical engineering
degree from University of Wisconsin, I went to work. What was interesting, I didn't even know
what grad school was at that time. And I feel
like no one ever talked to me about grad school when I was an undergrad. I mean, it was, and I
see in my students today, a lot of them have parents that went to grad school, right? I think
grad schools, a lot of people learn from their parents. I certainly didn't. I always, I make it
a point now, whenever I teach an undergraduate class,
to always spend at least a half a lecture every semester talking about what is grad school? Why
would you want to go to grad school? And I'm very pro grad school. I mean, I always, when students
come to me and say, should I go to grad school? I said, no, that's not the right question. The
right question is, can I go to grad school? Because grad school is the best. And the master's degree, I think, is probably the very best degree, you know, in terms of how it can give you access to better
jobs and such. And then students say, they always ask me, what about money? And I say, well,
if you're worried about money, you already made a bad decision, right? You're an engineer, right?
You should have been a lawyer or a doctor. So don't worry about the money. You'll be fine.
So I didn't know about grad
school. I went to work at Xerox; that was my first position after my bachelor's degree. I was on this ginormous copier team. We were building this 3090 copier, and it was this huge embedded system. I was a real-time programmer on paper path, which means that I was part of the programming team that was making the paper go around.
And so you need to debug these systems.
We had one, for 200 programmers,
we had one system that we could debug on,
just one at the time.
And so, of course, I'm a junior member of the team.
I'm like debugging Tuesday night at 3 a.m.
That's when I'm debugging my code. And I got this idea: I'm going to build a simulator for this entire machine. It had Z80 processors on it, so I built a Z80 simulator, and I built network ports; I built simulators, basically emulators, for all that stuff. And I got to the point where you could literally boot the entire image of the system. And I built a little backend that would describe the physics of the photocopier so that, to a first order, you could tell if your stuff was actually going to run. Then you could just get onto the real hardware and do the tweaks that you needed to make sure
you were capturing everything. That was so successful, it started shrinking the schedule of this ginormous project. And so my manager said, yeah, Todd's going to lead a team to develop that for other projects as well. And they plunked me into what was called the Webster Research Center, which was the East Coast dual to
the Palo Alto Research Center. So now all of a sudden I'm in meetings with Mark Weiser and Butler Lampson and all these people I'd never heard of, actually, but now that I'm a computer science PhD, you know, these are super famous people. And I'm looking around, like, what do these people do? You know what they do? They do whatever they want. And so I figured, okay, to do this, they need a PhD.
And so then I thought, okay, I'm going to go back and get a PhD. And so I went back to Wisconsin. I'd taken a computer architecture class with Guri Sohi. I really like hardware and software; I like being at that boundary. So I went back and studied with Guri. I was with Guri for six years and learned so much there.
That was a really cool experience.
I don't know how many super awesome people came through Wisconsin, but it was just like, I mean, there was Babak Falsafi, Sarita Adve, Doug Burger, Scott Breach, and T. N. Vijaykumar. You were just tripping over people
that are
really well known today. It was a really cool experience. And then as I ended towards my
graduation, I thought like, oh, you know, I want to go into academia, but I haven't built a processor
or anything. Right. I'm going to be a computer architect that has never built a processor.
And so I thought like, well, I should probably go to like industry for a little bit.
At that point I went to Intel, and that for me was really my postdoc. I was on the Pentium 4 advanced technology development team.
That was so interesting.
What was so interesting to me at Intel was that the challenges they had
were not so much in architecture. They had all the solutions in architecture. Their challenges
were in verification cost. It was incredibly expensive to verify their hardware. And so
when I left Intel to take a position at Michigan, I wanted to work on verification costs. And I am not a mathematician,
so I can't do formal verification. I'm just a, you know, hardware guy. I just build interesting
architecture. So my first project at Michigan was this project called DIVA, which was a processor that could tolerate its own design bugs. It had a front-end processor, which was really big and capable,
and had a backend processor, which was very correct and simple. And then it used the front
end processor as a predictor for branches and stuff. And what was really interesting about
that project and what really changed my career was it was so different than anything I'd ever
had an opportunity to work on.
And the thing that was most exciting about it is when I would go and present it,
people thought it was either a really good idea or a really bad idea. I remember one time I gave a talk at Intel and somebody came up to me afterwards and says, gee, you're telling me
I'm going to release a processor with bugs in it? I go, well, not in the back end just in the front end I can't do that
and so I learned I learned my philosophy for research really got developed there and it it's
what I call a rule-breaking approach to research which is very simple find a rule nobody breaks
and these rules are they're very easy to find they're like in
the front of textbooks; there's stuff that's always taught to people. So one of the things you're taught in computer architecture is you can't release a processor until there's no bugs in it. But here's the reality: they all have bugs in them, I've got to tell you. Hopefully you just don't have bugs that are going to cause people a lot of real
big problems. So, you know, I started working on designs that would try to break that rule.
And sometimes when you break that rule, you find something really, really, really interesting.
That was my first project. And now I'm working on these encrypted systems, which is like,
you know, there's a rule in security, right? You will never build a system that is unhackable. Now that may be true, but I'll do my best to come as close as I possibly can. It is pure heresy to suggest you could build a system that cannot be hacked. So again, that's a rule I would love to break one day. It'd be really exciting.
Yeah, and then that's your way to live forever,
to find out if it's true, right?
Yeah, exactly.
So you've worn several hats in academia and industry, and you talked to us about the journey in which you learned a lot of different things from different places. But at a very high level, based on your experience:
What do you think the architecture
community is doing well? What are things that we are not maybe focusing on as much, either in the
teaching or in the research side? And how can we sort of improve that in the coming years,
especially as the landscape of computing is also evolving, right? That's a really great question.
Well, I'll say one thing I think we're doing pretty well is we're focusing on climate in the community,
which I really value those efforts and try to be a part of them whenever I can.
Because I think climate is super important.
We don't want to be places where we don't feel welcome.
We don't feel valued.
That's something I've been really involved with here at University of Michigan.
It's absolutely something I've been involved with in my startup.
And I'm really glad to see that the architecture community is involved in that as well.
I think that's incredibly important.
Something that comes to mind here is I think I always say that I think I wish the hardware community was more like the software community.
And I feel very much like we're starting to do that.
I'm super excited about RISC-V and the open source movement in the computer architecture community.
I think that's an incredibly powerful development.
The work that I did on the Morpheus project that I talked about earlier was completely enabled by the CPU work
that came out of Berkeley that, you know, rolled into the SiFive startup, et cetera. That was
completely enabled by that. So I love to see this movement of open source hardware.
People don't understand, you know, people in industry often have a difficulty understanding
why open source hardware is valuable.
And I just want to take a moment to explain to that person who's listening to this podcast and says, how do I make money on open source hardware?
When you create things in industry, you have a certain amount of money to spend on those things.
Nobody has infinite money except maybe a few of the bigger companies.
But when you only have a limited amount of money,
you want that investment to be as valuable as it can be. So what the software community does is
they say, there's certain things that everybody needs that don't provide value to the products
we're building. So let's make them free. And they're not really free, but what they are is
they're maintained by everybody and everybody shares and nobody owns or charges for those things.
And that makes a lot of sense.
To me, that makes a lot of sense even for an instruction set, right?
Like what instruction set does your phone use?
You know, well, probably ARM.
But if it used RISC-V, would you even know it?
No.
Why?
Because the instruction set provides no value to your phone. So why do we
have to pay for it? When that happens, all of a sudden, you can take that fixed amount of resources
that you would have spent building instruction sets and component layers and operating systems,
all this stuff that provides no value to your end product. And you can start to spend it on the accelerators and the new instructions and the other things you're going to add that are going to provide lots of perceivable value to your end product. And
so what open source really does is it provides a means for reducing the cost of producing
innovation. And that is something that we definitely need in the hardware community.
We need to focus on making it less expensive to innovate.
And I think if we could do that, that would be really beneficial.
I've had this interesting experience where now I'm kind of in a software org in many ways.
And the approach to building things between software and hardware is so, so different. With hardware, it's like, you have to do things so far in advance, get it all figured out and get it all set up, and then it's very rigid, and then you have to spend forever verifying it. With software, it's just like, let's just deploy a little something, and then we can fix it; we can continuously be agile about it. So that's one thing: software development is very agile, and hardware development is pedantic, get-it-all-figured-out-way-in-advance. Do you feel like the open-source-ness, so I get that it kind of reduces the cost of innovation because it provides everybody a substrate where it's just like, okay, the ground is there, not everybody has to dig their own sewer pipes or whatever.
They're just there.
But does it also enable more agility similar to the software world,
do you think?
Or is it just that kind of providing everybody a base ground floor?
That's a great question. I would say,
I think agility is less of a feature of the design,
more of a feature of the design language.
Most of the RISC-V stuff is specified in Chisel.
And Chisel is, it's flexible once you build that flexibility into it,
but building that flexibility into it is really, really challenging.
It requires you to build a program, a generator for what it is
you want. And if you want to change the things that it creates, you need to modify the generator.
And that takes a lot of background. You need to understand Scala. You need to understand
FIRRTL. You need to understand Chisel. It's very, very challenging. And pushing myself and
students over that hump was really,
really difficult. But when we did,
we got a lot of value out of that infrastructure. I,
I think that the challenge there is really just, you know,
what is the Python of HDLs? Well,
Chris Batten would say it's PyMTL, which is his Python-based HDL, which is actually pretty awesome, I must say. I think the key there is really innovation in the hardware description language. So look at, you know, Chris Batten at Cornell, Adrian Sampson at Cornell; those two folks are doing really interesting
work on taking advanced language features from programming and putting them into
hardware description languages. I've got a project here at Michigan called Twine,
which enhances Chisel to support heterogeneous design in a more sort of Lego block fashion.
So yeah, I think that's where those benefits will eventually come from.
I would love to see, that's another reason why I'm really excited about FPGAs on everything,
because if FPGAs are on everything and everybody can benefit from programming hardware in consumer class CPUs,
then it makes a lot of sense for the open source software community to really address the programmability of those systems.
It's interesting to look at Saman Amarasinghe's work at MIT, where he's kind of taking Jonathan Ragan-Kelley's Halide and putting hardware backends on it.
That's really interesting stuff, because now you can learn a language that is in a particular domain and generate hardware from it. It could be TensorFlow or Halide or any of these domain-specific languages, and then you generate hardware from the back end. That's probably when things will really get moving along. So I would say the cost of designing stuff is probably the worst thing that we do.
But things like open source hardware are really helping in that particular case.
There's also a movement towards open source FPGAs and open source EDA tools.
That's really incredibly exciting.
And I think I see a lot more hope for the hardware community with this kind of stuff.
As a hardware guy, every year, you know, the software world gets a little bit bigger and the hardware world gets a little bit smaller.
Maybe that makes our talents more valuable, but it also makes it a little more lonely.
And it makes innovation on this side of the fence, the architecture fence, a little less interesting.
So I really like the idea of all the open source stuff that's happening.
I'm super excited about that.
Also, I'm really excited about Intel buying Xilinx and AMD buying,
no, AMD buying Xilinx and Intel buying Altera.
Thank you.
I'm really excited about that because that, in a sense,
if they're smart, which I hope, well, smart by my standards,
they're going to slap FPGAs everywhere.
I think that's super exciting for a number of reasons.
One is now, you know, anybody that can write software can also design hardware as well.
And then the other thing that is really exciting to me is like, what are the accelerators of tomorrow, right?
Today, everything is all about heterogeneity. And, you know, how many neural net accelerators does the world need? Probably a couple more, so if you're on a project like that, it's okay. But I want to know what the next accelerators are, and I think that having FPGAs everywhere will help answer that question. And I think that's really exciting.
And again, that just really reduces the cost of design because I think design costs is really the biggest challenge for our community.
And I want to see more startups, more open source projects,
that kind of stuff.
I think we're moving in that direction.
I feel very positive about that trend,
even though the overall picture is one of shrinking hardware community,
shrinking sense of what we can deliver,
shrinking amount of value we can get out of our future silicon.
So I do like the open source trend to counter that.
I think that's a very valuable and timeless message on both on the technical side and the
broader ecosystem or the people side, which is how do you enable innovation, both in terms of
ideas and the technical ecosystem, but also in terms of the people, the climate and the values
in our community. So I think that's a message that's very well taken.
And hopefully we can push forward on all of these themes and, you know,
ensure that we continue innovating,
continue bringing the best people into this very exciting space.
Absolutely. And unfortunately,
despite the fact that we've had so much fun talking to you, Todd,
I think we're about out of time.
I want to close with saying, Professor Todd Austin, thank you so much for joining us today.
It's been a total delight to speak with you as usual, and we're glad you were able to join us.
Thank you. Thank you for having me. It was a pleasure.
And if anybody has any more questions, feel free to reach out and contact me.
It's austin@umich.edu. And
thank you so much for the invitation. I appreciate it. Yeah, thank you, Todd. It was great speaking
to you. And to our listeners, thank you for being with us on the Computer Architecture Podcast.
Till next time, it's goodbye from us.