Computer Architecture Podcast - Ep 3: Privacy-preserving Covid Tracing and the Hardware-Software Stack with Dr. James Larus, EPFL
Episode Date: December 24, 2020. Dr. James Larus is Professor and Dean of the School of Computer and Communication Sciences at EPFL. Prof. Larus has made contributions to several fields spanning programming languages, compilers, computer architecture, and computer systems. He co-led the Wisconsin Wind Tunnel project, started the Singularity project at Microsoft Research (MSR), and created the Orleans framework for cloud programming as director of the Extreme Computing Group at MSR. He talks to us about privacy-by-design, the associated challenges across the hardware-software stack, and the implications for the design of digital contact-tracing protocols (DP-3T) during the Covid-19 pandemic.
Transcript
Hi, and welcome to the Computer Architecture Podcast,
a show that brings you closer to cutting-edge work in
computer architecture and the remarkable people behind it.
We are your hosts. I'm Suvinay Subramanian.
And I'm Lisa Hsu.
Today, we have the pleasure of having Professor Jim Larus with us.
He is the Professor and Dean of the School of
Computer and Communication Sciences at EPFL in Lausanne, Switzerland.
Prior to this,
he was a researcher, manager, and director at Microsoft Research for over 16 years,
and prior to that, a professor at the University of Wisconsin. Professor Larus has made contributions to several fields spanning programming languages, compilers, computer architecture, and computer
systems. He has co-led the Wisconsin Wind Tunnel Project, started the Singularity Project at Microsoft,
created the Orleans Framework for Cloud Programming as Director of the Extreme Computing Group
at MSR, Microsoft Research, just to name a few of his notable projects.
Today, he's here with us to talk about rethinking the hardware-software system stack and his
views on research in industry and academia, and we're so excited to have him here today.
Before we begin, a quick disclaimer that all views shared on the show are the opinions
of individuals and do not reflect the views of the organizations they work for.
Jim, welcome to the podcast.
Thank you very much for the invitation.
Well, to start off, I think we generally start in broad questions, which is just now, how are you doing?
What are you up to at EPFL? What gets you up in the mornings these days?
We're doing as well as any of the universities are these days.
It's an interesting world. It's an interesting world.
It's a challenging world.
I actually spend most of my days working out of my apartment.
We're still doing what we've always done,
which is that we're very fortunate in computer science.
We can still do research.
We can teach.
We still have a lot of challenging problems to work on.
In particular, these days we're starting to think a lot about issues related to security
and privacy.
In particular, we did some work during the COVID pandemic to come up with the DP3T contact tracing protocol.
And that led us to a number of issues related to security, privacy, the platforms that people run on these
days, which are pretty much phones, and the ownership of those phones and the control that
the manufacturers have over what actually runs on those phones and how much you can trust
the software that runs on those platforms. It's an interesting world. It's quite different than
when I started, which was, you know, you had a computer, it was on your desk, it was your
computer, you controlled what was on it. That's not true anymore. The world has changed quite a bit.
Yeah, absolutely. I think one of those things that I wonder about oftentimes is when I have
a new app on my phone and it says, I need access to your camera. I don't really know if that means it needs access to my camera only when I take a picture,
or whether, in the background, it can always have access to my camera. And it expects me to hit yes
in order to use the app. And I don't know what scope or what duration this access really
has. Do you have any insights on that based on what you've been doing lately? You know, it's a strange question because if you look at it, I have an iPhone,
but if you look at it on the iPhone, they ask you a binary question. We know from experience that
almost everybody is going to say yes to that question because you're trying to do something
and this pop-up comes and you want to just get it out of the way.
And you think like the easiest way to get it out of the way is just to say yes.
This is a terrible way of doing security.
We've known this for a very long time.
The interesting thing is that on the iPhone, it's actually a little bit more differentiated than that.
If you look at the permissions for something like your camera or GPS, it asks, is this app allowed to use this resource?
Let's say GPS, which is more interesting:
while the app is running, all the time, or never.
So they actually give you a three-way choice, but they don't show that to you originally.
And I'm never quite sure what they do with it.
You know, Apple is pretty serious about privacy, pretty concerned
about privacy. They do try to preserve it. And I think that they've pushed fairly hard on that.
But, you know, what you give up is quite a bit of control over the software that you can write
for a phone and the software that runs on your phone.
In particular, you know, to write an application and distribute it on an iPhone or an Android phone,
you have to go through the App Store.
And the App Store comes with a very large set of rules about what you can or can't do as a program's author.
And I'm going to ignore the whole question of, you know,
charging money and everything else like that. But beyond that, there are rules that say, you know,
you can't write this kind of app. You can't use your own browser. You have to use the browser
that's installed on the phone. You can't have an app that has an interpreter in it. All the code
has to be statically compiled. You can't, you know,
download your own code onto the machine. You can't have really dynamic code generation on your app
so that Apple can run all sorts of tests before they allow your app to run in the app store
in the interest of security. And, you know, you could argue for it, but the consequence of it is that
we've given up control over the software that runs on the platform that the entire world uses
to two companies. And they have a very large amount of say. And so, you know, this, with the
COVID tracking, this really came to the forefront because Apple and Google decided that they were going to use decentralized contact tracing.
We developed at EPFL and ETH the DP3T protocol, which we think is the best protocol, and we
were happy that they decided to do it.
But there were other competing protocols in Europe, in particular Germany had a centralized
protocol, which they eventually gave up and went to the Apple-Google
protocol. France had their own protocol, which they have not given up. And they're still insistent
that the right way to do it is the centralized protocol. And as a consequence, they built an app
in France, which is incompatible with the ones in the rest of the world. It doesn't use the Apple-Google
framework. And moreover, it doesn't work very well because they can't get access to the Bluetooth stack
as well as the Apple and Google protocol can.
And so they couldn't build an app that did what they wanted.
And if you think about it, this is a very strange thing.
Here's a national government deciding that they're going to build an app to address a health crisis in their country.
And they have top scientists working at INRIA, the French National Computer Science Research Organization, develop a protocol and they want to implement it.
And the company that sort of supplies software that distributes the phones, makes the phones, says, no, you can't do that. I mean, they literally told a government that you're not
allowed to do this. We have decided that there's one way in which contact tracing is going to be
done and you're not doing it our way, so you cannot do it and we're not going to help you.
We're not going to give you access to the Bluetooth radios on the phone.
This is an interesting world where a company can stand up to a government.
It's quite interesting. The French digitalization minister was talking to Parliament about the situation. It was earlier in the spring, when the UK was in a similar position as France: they had their own contact tracing app, which didn't use the Google Apple framework.
And what he said was basically, you know, here are these two sovereign countries, both of which have the ultimate symbol of national sovereignty, nuclear weapons.
And we can't get these companies to do what we want to do. You know, the way in which he said it in French
comes across very much like a threat
that we would use nuclear weapons on these two companies,
which obviously they're not going to do.
But if you think about it, like, you know,
these are countries that could start World War III,
destroy the world, but they can't get Apple and Google
to let them develop an app on their phones.
Yes, I don't think I realized that. That's pretty incredible, in particular because this is a public
health issue and these are technology companies where public health is not necessarily their
domain or isn't their domain at all. Both companies see health as a very big market.
I don't know if you have an Apple Watch, but it's very health oriented.
You know, it works well. It's good.
But it's sort of golden handcuffs, right?
I mean, you know, you either do it the way that they want, or you don't do it at all.
Maybe you could talk a little bit about what these contact tracing protocols are doing and why they need access to the Bluetooth stack.
Because I'm not sure that the audience to this podcast necessarily knows that kind of stuff off the top of their head.
Well, hopefully you're all using them. But let me, in case you're not,
which I think is still probably true in most of the United States, what it is, it's a sort of
very simple idea, which is that, you know, classic contact tracing works that when somebody is
diagnosed with an illness, say you have a COVID test and you're positive, then you would like to trace the people that you were close to while you were contagious, which for COVID is about two days
before you show the symptoms. So that's actually bad because you're likely to have been out in
the world doing whatever it is you're doing, talking to people, and you have no symptoms. So
you didn't even realize that you were spreading COVID at that time. So what classic contact tracing does is it tries to reproduce what
you've done, talk about who you met over the past couple of days, where you were,
who you were close to, how long you were close to them. The World Health Organization basically said
if you're within two meters or 1.5 meters of a person for 15 minutes or
more, then there's a chance that they've been exposed to enough COVID that they should quarantine
themselves and get a COVID test. So that's the standard that pretty much everybody looks for.
You know, it's obviously difficult. It's difficult sometimes to remember who you were in close contact with. You were in situations like being on a public bus or a train. So why don't we use the phone to record who you were nearby,
so that if we need this information,
we can inform them of the fact that they were close to somebody
who came down with COVID
and they might be potentially contagious themselves
and they should get a test and quarantine themselves.
You want to do this in a way that's privacy preserving.
Obviously, you don't want to sort of keep track of the identity of everybody
because you don't actually need that information.
You just need to know that you are within a certain distance
for a certain period of time
with somebody who came down and was diagnosed with COVID.
So the protocol that we built, DP3T, is privacy preserving.
It uses random identifiers. It keeps all of the information
about who you are in contact with on your phone. And when somebody is diagnosed with COVID,
they upload their set of identifiers for the past couple of days onto a server, which broadcasts to
everybody else's phone. The matching is done on your phone. We use the identifiers to determine whether you're within, say,
two meters for a period of time.
And if you are, it depends on the country at this point,
but in Switzerland what happens is you get a pop-up that says
you may have been exposed to somebody who was diagnosed with COVID.
You should call this
number and get some counseling and you're entitled to a free COVID test in Switzerland.
So that's the basic idea.
That's how it works pretty much in all of the countries.
The last I heard, 10 or 15 US states had deployed it so far.
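To make that flow concrete, here is a minimal Python sketch of the decentralized idea described above: a phone derives short-lived random-looking identifiers from a secret it keeps locally, remembers the identifiers it overhears, and only a diagnosed person's own secrets are ever uploaded, with matching done on each phone. This is an illustrative approximation only, not the actual DP3T specification; the key derivation, epoch length, and all names here are assumptions made for the example.

```python
# Highly simplified sketch of the decentralized contact-tracing idea described
# above. It is NOT the real DP3T specification; key derivation, epoch lengths,
# and names are illustrative assumptions only.
import hashlib
import os

EPOCH_MINUTES = 15                       # how often a phone rotates its broadcast identifier
EPOCHS_PER_DAY = 24 * 60 // EPOCH_MINUTES

def daily_ephemeral_ids(daily_secret: bytes) -> list[bytes]:
    """Derive the random-looking identifiers a phone broadcasts during one day."""
    return [hashlib.sha256(daily_secret + i.to_bytes(2, "big")).digest()[:16]
            for i in range(EPOCHS_PER_DAY)]

class Phone:
    def __init__(self):
        self.daily_secrets = []          # stays on the phone
        self.heard = set()               # identifiers observed nearby (stays on the phone)

    def new_day(self):
        self.daily_secrets.append(os.urandom(32))

    def broadcast_ids_today(self) -> list[bytes]:
        return daily_ephemeral_ids(self.daily_secrets[-1])

    def observe(self, ephemeral_id: bytes):
        self.heard.add(ephemeral_id)

    def keys_to_upload_if_diagnosed(self) -> list[bytes]:
        # Only the diagnosed person's own secrets are uploaded, never their contacts.
        return list(self.daily_secrets)

    def check_exposure(self, published_secrets: list[bytes]) -> bool:
        # Matching happens locally: re-derive the identifiers of diagnosed users
        # and compare them against what this phone overheard.
        for secret in published_secrets:
            if self.heard.intersection(daily_ephemeral_ids(secret)):
                return True
        return False

# Usage: Alice and Bob are near each other; Bob is later diagnosed and uploads
# his daily secrets; Alice's phone detects the match locally.
alice, bob = Phone(), Phone()
alice.new_day(); bob.new_day()
for eid in bob.broadcast_ids_today()[:3]:
    alice.observe(eid)
print(alice.check_exposure(bob.keys_to_upload_if_diagnosed()))  # True
```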
So it's interesting.
It uses the Bluetooth low energy stack.
The phones have this broadcasting mechanism that sends out a beacon every once in a while. And you can use the beacon to identify that, you know, for instance, you're near your headphones or something, which are occasionally sending out a beacon and you can pair with it. We don't use the pairing facilities. We just use the fact
that phones can send out these Bluetooth beacons at very low cost. There's a certain beacon that's
sent out. It's identified with the COVID tracking app. Every five minutes or so, your phone wakes
up and listens to the set of beacons around it. It also records the signal strength, which is used as a proxy for how far the two phones
are apart from each other, which is obviously very problematic because there are a lot of
things that influence signal strength besides the distance between the two phones.
It could depend on the orientation.
You know, the phone that's in your
back pocket is different than the phone that you're holding in your hand if you're facing
another person because, you know, the radio signals going through your body are attenuated
quite a bit more because of the water in your body. So, you know, it's an approximate measure,
but the good thing is that the actual
COVID measure itself, two meters for 15 minutes, is a very approximate measure as well. It's not
like if you were, you know, 14 minutes or, you know, 1.9 meters, you would be absolutely safe.
And, you know, if you were 15 minutes, you're in danger.
Both of them are approximate.
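As a rough illustration of how a noisy signal-strength reading and an approximate threshold could be combined into the "within two meters for 15 minutes" criterion, here is a small Python sketch. The attenuation threshold, scan interval, and record format are invented for illustration; they are not the calibration actually used by DP3T or the Google-Apple framework.

```python
# Illustrative sketch of turning noisy Bluetooth measurements into the rough
# "within ~2 meters for >= 15 minutes" criterion discussed above. The threshold
# values, scan interval, and record format are assumptions, not DP3T's actual
# calibration.
from dataclasses import dataclass

SCAN_INTERVAL_MIN = 5            # phones wake up and scan every few minutes
ATTENUATION_THRESHOLD_DB = 55    # proxy for "roughly within 2 meters" (made up)
EXPOSURE_MINUTES = 15

@dataclass
class Sighting:
    ephemeral_id: bytes
    tx_power_dbm: int            # advertised transmit power of the other phone
    rssi_dbm: int                # received signal strength at this phone

def attenuation_db(s: Sighting) -> int:
    # Higher attenuation -> weaker signal -> probably farther away (very noisy:
    # pockets, bodies, and orientation all change this, as noted above).
    return s.tx_power_dbm - s.rssi_dbm

def exposed(sightings: list[Sighting], diagnosed_ids: set[bytes]) -> bool:
    """Sum up 'close enough' scan intervals against identifiers of diagnosed users."""
    close_minutes = sum(
        SCAN_INTERVAL_MIN
        for s in sightings
        if s.ephemeral_id in diagnosed_ids
        and attenuation_db(s) <= ATTENUATION_THRESHOLD_DB
    )
    return close_minutes >= EXPOSURE_MINUTES

# Example: four 5-minute scans of a diagnosed person's identifier at close range.
sid = b"\x01" * 16
scans = [Sighting(sid, tx_power_dbm=-20, rssi_dbm=-70) for _ in range(4)]
print(exposed(scans, {sid}))  # True (20 minutes of close contact)
```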
You know, we have enough evidence now in Switzerland of people being notified by the app, going and getting a COVID test, and finding out that they were COVID positive, to know that it does work.
You know, it's not a replacement for regular contact tracing.
It's not a replacement for all the other things that you have to do, wearing masks,
social distancing, washing your hands, but it's useful. It helps to identify a set of people that conventional contact tracing would not have identified. And it does it in a very scalable way.
You know, in a situation like now, when the number of cases has gone up quite fast, it's hard for
manual contact tracing to stay on top of it.
But the digital version scales very well.
Right.
Yeah, thank you so much for that context and for the explanation.
Just picking up from one of the threads that you mentioned earlier on the ecosystem for developing these kind of applications and how they're sort of tied into these large companies like Google and Apple.
I was just wondering: would there be a different organization that would have a different set of tradeoffs, one that would be more appealing, especially in the context of building secure systems?
You typically want some kernels
that are either trustworthy
or verifiable that they're trustworthy.
Who builds these things?
Who maintains these things?
Is there a different organization
that would lead to a different result
or make things easier?
So this is interesting.
Let's pull it apart a little bit.
There's two parts to it that are sort of crucial.
One is the Bluetooth itself.
So, you know, getting access to the Bluetooth to broadcast these beacons and listen for these
beacons, it's something you used to be able to do. And then both companies, Apple and Google,
realized that there's a serious privacy problem here because companies were using it to track people.
So, you know, you would have an app from, I don't know, pick a store or something like that.
And you would go into the store and the app would be broadcasting a beacon from your phone and they would be listening for it in the store and they would be able to identify you.
So there's a real serious privacy problem with
this happening in the background. So in particular, Apple does not allow applications to
sit in the background and listen for Bluetooth beacons. The operating system, the iOS system,
does this for you and it controls very much what the applications are allowed to do.
And this is why it's very difficult to build a digital contact tracing app without the support of Apple, because you cannot build an app that will sit on your phone all the
time.
So Singapore tried.
They were the first country to build one of these, and you had to basically keep your
phone unlocked and the app running all the time
which exhausted the battery and was very insecure, and nobody did it, so it didn't work very well.
So you needed the involvement of both manufacturers, both Google and Apple, to make this work. You know,
they put it into their bluetooth stack they basically controlled exactly what was broadcast and they provided a very,
very limited interface to it. Initially, the interface was so limited that it was very
difficult to implement the medical protocols that we wanted. So, you know, Switzerland,
for instance, had a standard that said 15 minutes of exposure to within two meters of any number of people who were diagnosed with COVID.
This was not a situation that Apple or Google had anticipated,
and so their interface did not really allow you to do this.
They had the notion that you would have 15 minutes of exposure over multiple days, not a per-day standard.
But the Swiss standard was written into the law
that authorized the use of these contact tracing apps.
And so we had to find a workaround to implement it.
Again, another situation where the control over the phone
makes it very difficult to do what you need to do.
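Here is a small Python sketch of the per-day aggregation issue described here: the Swiss standard accumulates exposure to any number of diagnosed people within a single day, rather than summing minutes across multiple days. The window format and function names are hypothetical; this is not the actual exposure-notification API.

```python
# Sketch of the per-day aggregation issue described above: the Swiss rule counts
# cumulative exposure to any diagnosed people within one calendar day, rather
# than summing minutes across several days. The window format and names are
# illustrative assumptions, not the actual exposure-notification API.
from collections import defaultdict
from datetime import date

# Each exposure window: (day, minutes spent close to some diagnosed person)
ExposureWindow = tuple[date, int]

def notify_per_day(windows: list[ExposureWindow], threshold_min: int = 15) -> bool:
    """Trigger a notification if any single day accumulates enough close contact."""
    minutes_per_day: dict[date, int] = defaultdict(int)
    for day, minutes in windows:
        minutes_per_day[day] += minutes
    return any(total >= threshold_min for total in minutes_per_day.values())

# Example: 8 + 9 minutes on the same day crosses the 15-minute threshold,
# even though no single encounter did.
windows = [(date(2020, 11, 3), 8), (date(2020, 11, 3), 9), (date(2020, 11, 4), 5)]
print(notify_per_day(windows))  # True
```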
So that's one level.
And then the next level is the storage of the beacons.
So, you know, the phones have secure storage on them, you know, cryptographically secure
storage.
So all of that code is basically running at a privileged level as part of the operating
system on the phone.
And we have to trust it.
No one's ever seen what this looks like. No one's ever seen
this. Apple and Google say, well, all these IDs are kept on your phone and they're kept securely
and we have to believe this. The next level, of course, is the apps themselves.
The apps are all open source. You can go onto the web and get our app or the Irish app or
a whole
bunch of other apps that have been developed and they're all open source
you can look at them but the sort of crucial part of the functionality is not
there it's part of the what's called Google and Apple exposure notification
framework and that's running at a privileged level on the phone and we
have to trust that it does what it's supposed to
do. And it's an interesting situation because it's obviously a very secure device compared to
a computer because the manufacturers are so controlling. But it's also a device that has a
lot of personal information on it. And people, I don't think, have a really very good sense about the sort of boundary of this.
So, you know, to give you an example, the apps built on the Google Apple exposure notification framework are about as privacy preserving as anything that has been developed, and certainly anything that's on most people's phones.
But people have a very strange sense that because this is dealing with medical information, because it might actually cause them to go into quarantine, they're very concerned about
the privacy of this.
And so, you know, I talked to people who were reluctant to put this app on their phone because
they're concerned about the privacy part.
And I asked them, well, do you have WhatsApp?
Do you have Facebook?
Do you have, you know, any number of other apps on your phone?
They're like, of course, everybody has those.
And they're right.
Everybody does have those.
But those apps collect all sorts of information.
They upload all sorts of information to these large corporations.
And people don't give it a second thought. But,
you know, with the privacy and health and medical and quarantine and so forth,
it forced people to think very differently about it.
In this sort of social sciences, public, you know, political world, oftentimes security and privacy
are sort of on different sides of the coin, meaning you give up your privacy in order to get security.
But then in this particular discussion, it sounds kind of like the way in tech we think about it is that better security gives you better privacy.
And they're not sort of flip sides of the coin.
Would you agree with that characterization and what's different about tech
versus the sort of public sector, I suppose?
Yeah, I don't think the trade-off is as simple as a coin with two sides.
You know, we couldn't have built the app without the security because,
you know, one of the sort of guarantees was that you would keep track of the contacts of people that you were near for the past 10 days or 14 days, depending on the country.
And these would be secure on your phone and you would never actually get access to them.
So you really couldn't go and try to reverse engineer who these people were, even though they were broadcasting random IDs.
So this is security.
On the other hand, the sort of bad side of security is that the control that Apple and
Google have and they exercise through their app store is in the name of security.
They basically say, look, you know, we're trying to keep malicious apps from being loaded
onto people's phones, because there's a lot of information on those phones and you wouldn't want an app that steals all this information.
And I think that that's absolutely true, and they've done a pretty good job of it. I think that,
you know, apps for phones are probably more secure than apps running on computers, for the
reason that they have to get past the gatekeeper to get on the phone.
You know, on the other hand, privacy has its downsides. So, you know, one of the things that has definitely come up with this app that we've built is that it has very, very little information
in it by design. You know, we don't keep track of where you were when this contact occurred,
because that would require us to sort of keep track of where you were over the past two weeks,
which is very, you know, a great deal of information about your personal life.
And we don't want that. We don't actually think it's necessary for contact tracing. On the other
hand, you know, health
authorities do want this information. They would like to be able to say, well,
you know, in this bar, there were a lot of people in this bar and, you know,
somebody came down with COVID and, you know, maybe we should notify everybody in
this bar, maybe we should, you know, use this as an indication that bars should be
closed. And so they come and they say, where is this information about where this person was?
And we have to tell them, no,
this is not part of the app by design.
So security and privacy are these sort of very rich areas
that don't have this binary quality to them.
There's sort of many degrees to it.
We tried very hard to design a system that
was absolutely minimal in terms of the information that it collected. It's called privacy by design,
and it leads you to build a system that does one thing. In this case, it does contact notification.
And, well, that's all it does. It doesn't collect epidemiological
information. You know, the epidemiologists were sort of appalled that we had no information that
would be helpful for them in terms of doing their papers. You know, they're envisioning,
you know, decades of graduate students writing papers about COVID-19, and we're not giving them the information they need for their dissertations.
And, you know, we have to say, yeah, that's absolutely true. We don't have it.
But, you know, it has problems for us because, you know, one of the arguments that we wanted
to be able to make is that this form of contact tracing is effective. And how do we get the
numbers for it? How do we demonstrate that we've actually
found people that have COVID and that it's turned out to be useful? And it turns out you can't
really do it directly. You have to have the hotline that people call, ask questions. Did
you get a notification from the phone? We have to have surveys of people and ask them whether they're using the COVID app.
We have to see whether they come down with it.
We have to sort of have a larger group of people that aren't part of this limited privacy
area to get the information the epidemiologists want.
And it's a really interesting world to be in, this world of designing an app for, you know, extremely widespread use in a very medically sensitive area in a time of crisis.
Because you're sort of getting pressures from all the different sides.
And, you know, I absolutely believe Apple and Google are getting the same pressures, if not more. And they're trying their best to come up with the solution
that sort of meets the largest number of needs
while preserving what they believe are people's privacy.
It's a challenge.
But from my point of view, what it made very clear is that
things that we think of as sort of net positives,
security, for instance,
we think more security has got to be good because, you know,
security is this thing that has this positive aura around it.
But we've always known that security also makes things harder to use.
But what we didn't realize is that security is also a mechanism of control.
You know, maybe we should have realized this, digital rights management,
the DVDs that you get used to have digital rights management
that you couldn't play them in different parts of the world
because the studios wanted to control them.
So they weren't just there to keep you from copying them.
They're also a mechanism of commercial control
so they can release at different times
in different parts of the world. You know, Apple and Google definitely use the security mechanisms
they have as mechanisms for control over their ecosystem. Right now, there are huge fights over Apple's requirements for companies to do in-product sales using Apple's payment mechanisms and pay Apple a 30% fee on this.
And they're able to do this because the only way you can distribute apps for the iPhone is through the App Store. And this is one of the contractual requirements
for distributing an app on the App Store, and you can't get around it. And, you know,
it started with an argument for security. If you go back to sort of the Steve Jobs presentation
on the App Store, it's like, you know, we'll have a curated set of apps. We'll make sure they're
secure. It wasn't that we will make sure that we make a lot of money off of the
app store. It was much more like we're doing this so that you will have a good quality set of apps.
But, you know, once you have the mechanisms in place, it's very tempting to use them for other
purposes. You know, let's get back to the computer architecture and programming systems community.
You know, when we start building these mechanisms into hardware,
and I think there are going to be a lot more of these mechanisms built into hardware,
we're giving somebody control.
You know, it used to be, you know, I had a computer, I could run whatever I wanted on it.
I don't think that that's going to be true in the future.
We are definitely going to move to the world where what runs on your computer is going to be very, very much controlled by maybe a small number of entities, but definitely not by the person whose computer it is.
And it's all going to be in the interest of security.
We're going to be able to say, look, you know, we want to prevent malware.
We want to prevent attacks on your computer,
so we need to be able to control it.
And companies have tried to do this for a long time.
If you work for certain companies, they give you a computer and it's very locked down.
It's very hard to install apps on it.
But that's entirely a software mechanism that the operating systems support.
And if you tried hard enough, you could always work around it.
But once we start building hardware mechanisms for this, it becomes harder and harder for somebody to work around it.
So what should hardware people think about as they do this?
Because I totally see what you're saying. I recently bought
a NAS myself to have sort of my own personal cloud in the house. But then I realized that
I still was dealing with this company that had preloaded this whole thing with software. And
the thing was only usable if I relied upon and trusted the software that was running on this
NAS. And so now it's almost like there's a layer between the final user and the hardware of
this software.
And as you say, there's basically these several behemoths, which are your gateway to hardware.
So what can the computer architecture community, the hardware community, who's been thinking
more about security and privacy than ever before do,
given that there's almost going to be
this big buffer layer between the people
whose security and privacy we want to help protect,
and the domain that we actually have control over,
which is the hardware design?
So I don't think any one level of the hardware software stack can solve this problem.
You know, the hardware architecture community is going to build mechanisms that enable us
to protect the systems better than we can do it now.
You know, maybe more isolation between parts of the systems, maybe more cryptography built into the system, maybe, you know, more of a trusted computing base built into the hardware.
All of which I think is probably justifiable and probably understandable in the context of providing better security.
The trouble is that, you know, it has a dual edge, right? You know,
it will, in the end, give someone control, because it is designed to prevent certain types of software
from running on the system, which is easy to characterize as malware, the bad stuff
running on your system. But what if you recharacterize malware as anything that wasn't approved
by the manufacturer of your PC,
anything that hasn't been sold to you by the manufacturer of your PC?
Then the world becomes quite different, right?
There's a gatekeeper.
We already have this on the phones.
I personally think that in the long run,
it's a real challenge
because the software industry
is going to be very much limited
in terms of what it can do
because the gatekeepers have set up rules
for what kind of software you can run on there,
what kind of apps you can do.
For instance, they don't allow a lot of games to run
on the phone.
They really, for some reason,
don't seem to like
game arcades, which let you
play many different games.
Microsoft has run into this. Facebook
has run into this. They literally
cannot get Apple to allow
them to
build the kind of software to distribute the games in
the way in which they want to.
This is a very different world.
And part of it is that the phone is a very closed ecosystem.
It has a very strong trusted computing base built into it, which I think most of us would
think of as a good thing.
We certainly would like, when we lose our phone, for it to be secure, for it to be very difficult
for people to get into.
But do we want one company telling us what the software is that we're going to be able
to run in there?
So I think there has to be a better balance.
I think that we do need mechanisms that control what software
runs on there. But the
control of those mechanisms
cannot be given just to one
company. Really, the controls
should, in the end, reside
on the owner of the computing
device. I should be able to say, sure,
I'm fine with Microsoft's app that allows me to load multiple games on my phone, because I want to play those games.
You know, I trust Microsoft. I'm pretty sure that they're not going to load malware onto my phone.
You know, maybe I'm not going to do this from, you know, brand X computing that I've never heard of before.
But, you know, reputation matters a lot in the real world.
And I think it matters in the cyberspace as well.
So, yeah, you know, I think that's a long answer.
But basically, the mechanisms that are built are really going to have to be used in a way that's respectful of the ownership of the device.
And this is not just a technical problem.
In the end, it's going to come down to a legal and political problem,
whether governments are able to force the companies to use these mechanisms in an appropriate way.
Because the natural thing to do is for the company to say,
hey, it's our phone, we built it,
we're doing this as a service to everybody
and we deserve to make money for providing this service.
So we're going to exercise a lot of control and make a lot of money.
Right.
So switching gears, you've donned several hats,
ranging from being a professor in academic universities
and also working in industrial research labs.
Could you tell us about what are
the differences you see between these environments? What are the challenges, opportunities, especially
in light of, I guess, emerging trends and newer paradigms looking into the future?
I just wanted to add on to that because I think we can tie it into this previous conversation a
little bit. One of the things about this whole contact tracing thing is it seems stunning to me
how rapidly it's been developed,
considered, deployed, and used. And coming from, you know, I work at Microsoft now, this massive
industrial complex, I wonder if it's tied into the fact that you're in academia now, or, you know,
there's this whole notion of projects, rapidity, agility. So maybe you can touch on that as well. Sure. So usually people ask me a much simpler question, which is what's better, you know, industry or academia?
And I have to say they're both good. You know, I very much enjoyed being a professor.
I went to Microsoft on sabbatical with no intention of staying.
And as I joke, it became a 16-year sabbatical at Microsoft.
And then I went back to academia, and I'm super happy to be back in academia as well.
So the answer is very much that both of them have different characteristics,
and it's important to find the place that corresponds to what you're
looking for. So, you know, in a university, as a professor, you have a great job. You know,
it may not seem that way to professors, because somebody told me when I became a professor,
enjoy it. It's the five best jobs you'll ever have. That really is true. I wish I
remembered who told it to me because it really was a good insight. You know, you have a million things
to do. You have a lot of responsibilities, but it's not boring. And the most important thing is
that you're your own boss. I mean, literally, you know, it's one of the few jobs in the world where
you legitimately are your own boss. And now that I'm dean, I can say that with even more assurance because, you know, none of the professors in my school will say that I'm the boss.
You know, and I certainly don't try to behave that way.
If I did, they would just laugh.
At a company, Microsoft or any company, you know, if you're someone's manager, you have a fair amount of say over their work environment,
what they do, how they get compensated. And that just doesn't exist in the university.
And so as a result, you know, as a professor, you know, I can get up in the morning and say,
oh, I've got a great idea. I can go into the lab and sort of start working on it.
Don't have to ask anybody. I can tell my grad students, hey, I got this great idea.
Drop what you're doing.
You know, let's do DP3T.
Let's do a COVID tracking app because we need to.
And, you know, I don't have to ask permission, which is different than a corporate environment.
You know, on the other hand, you know, one of the reasons I went to Microsoft and stayed there is because you can get access to tremendous resources in a
company. You know, at the time, I was very interested in software correctness and software
development. And Microsoft was obviously very interested in software correctness and software
development. They were very challenged in the late 90s. The software quality was bad.
The software development process was bad.
The bugs were there, and everybody knew it.
And so they were willing to provide a lot of resources.
I also felt like if we made advances inside Microsoft,
we could deploy them on a very large scale,
and they would influence a lot of people very quickly.
So it was a very attractive place to be.
Microsoft Research was a fantastic place.
It was fairly new when I joined.
It was growing extremely rapidly.
And there's really nothing more fun than being part of an organization that's growing rapidly.
You've got lots of resources.
You've got lots of problems.
You've got lots of new people.
Everybody's excited about what they're doing.
You get a lot of say about the long-term plans.
I think it's a fantastic situation to be in.
I don't regret for a second going to MSR and working there.
The place was fantastic.
The people I worked with were great.
No problem at all.
You know, 16 years at any place is a long time.
So I was happy to come back to academia
and then realize, you know,
academia has a lot of advantages too.
You can do what you want
without being part of a larger corporation.
You have students around,
which I see as a sort of really big plus,
particularly grad students.
You know, I very much enjoy working with somebody over a period of years,
like a PhD, because, you know, when they graduate,
when they get a PhD, they're researchers,
they produce a substantial piece of research.
And, you know, you get a real satisfaction by looking at that and saying, you know, I remember when they came and they didn't know how to do any of this.
So I really taught them a lot.
And so both are very good in that sense.
Both are quite different.
I was also in the part of the industry that was closest to research.
So Microsoft Research, particularly when I was there, was a rather unique organization.
There had been other research labs, Bell Labs for many years, IBM Research for many years.
They were very similar where they basically hired PhDs and they said, do research.
And they didn't really care too much on what people did research on.
They trusted the researchers to do research on the right topics
and to find things that were interesting to the company.
And beyond that, there was not a huge amount of control.
That's unusual.
There aren't too many companies that have that kind of freedom, have
that kind of money to do that. There are a lot of companies where PhD researchers get put into
product groups. In some cases, particularly let's look at Google back 10, 20 years. That was the
most exciting place to be if you were a PhD researcher in systems, because Google was building the world's largest scale distributed systems, bar none.
And they were doing things that were absolutely unprecedented that you couldn't do anywhere else in the world.
You certainly couldn't do it in academia.
So for a PhD researcher in systems, that was the place to be at the time. And you were in a product group, you were building a product, but you were building
a product that was orders of magnitude bigger and more ambitious than what anybody had ever
built before.
So it was an exciting time to do research there.
And, you know, obviously everybody knows the classic papers that came out of there, Bigtable,
MapReduce, and so forth. I mean, you know, these were real advances by researchers
who were faced with solving problems that hadn't been solved at that scale.
So that answers part of the question.
So your question, Lisa, you know, how did we do it so fast?
The answer was that, you know, we were in a university,
and when we decided we were going to do this,
I basically was working with a couple of other professors at the time.
We were all locked in our apartments.
We were using Zoom and Slack.
And what we did is basically what I call a startup in our apartments. Everybody said
it was there. We got a bunch of the grad students and postdocs from the labs. And we said, you know,
are you interested in dropping what you're doing and working on this? It might help get you out of
your apartment sooner. And pretty much every, not everybody we asked said yes, I would be happy to do that.
And we were able to get collaborators from all over the world, in particular from many other countries in Europe, because of the tools like Zoom and Slack.
And so, you know, we had, I would say, a core group of about 30 people working, and we never sort of got together face to face.
We never had any physical facility
but on the other hand, we were able to ramp up extremely fast because we had a very good, very
talented group of people available to us who were able to drop what they were doing without getting
permission. And the other thing is that, you know, I'm Dean. And so, you know, when we needed money
to pay for software development, I just said, sure, we'll pay for that. And we'll figure out
how to straighten this out. And, you know, the university administration was great. They said,
sure, that's the right thing to do. And the university got behind us a hundred percent as
well. So yeah. You know, could you have done this in a company? I don't think so.
You know, I think with a company, you know, you have a lot more issues, particularly around
something like COVID and health and being in all the countries of the world. You know, we didn't
think through any of those issues. We just basically said, here's an interesting challenge.
We want to do something that makes a difference and tries to help solve this
crisis a little bit. And we weren't in it to make money.
We weren't in it to really do a startup in the classic sense of, you know,
trying to get any revenue for it. You know,
it's cost us quite a bit of money.
But right from the beginning,
we said the whole thing was going to be open source.
We were going to give away the software we built.
We were going to give away all of the IP.
We never thought of trying to do this.
And Google and Apple came to us and said,
we want to use your protocol.
Actually, they came and said, we are using your protocol because it's clear they could.
We said, great, you know, fantastic.
So, you know, this was a sort of classic academic thing.
Take our ideas, use them, because for us, that's success. Right. It also sounds like this was a very good multi-domain collaboration because you
had domain experts from health, from systems, folks from public policy. You've often talked
about how doing cross-domain work requires people with multiple domains of expertise.
So what are the challenges here? Are there any principles that you've had either in this project
or even looking back at your experience in other
places? You know, how do you foster good multi-domain collaborations, either in academia or industry?
That's a great question. You know, the answer is that you should actively work to do this because
it makes for the most interesting projects. So, you know, this DP3T project, it turns out that,
you know, one of the faculty members in my school is an epidemiologist.
And he was sort of very much involved in Switzerland's response to the crisis.
And so, you know, we had a person who had the sort of background and training in epidemiology, knew about contact tracing, knew about sort of the spread of disease, working with us closely.
And so we were able to sort of draw on his expertise.
We had Carmela Troncoso, who was a privacy expert, who was sort of really pushing us
very hard to make sure that the app was as private as it possibly could be.
You know, we had collaborations with Srdjan Capkun at ETH, who's an expert in Bluetooth.
You know, Edouard Bugnion and I are systems experts.
We understood the systems part of it. Yes, we all brought the pieces that we had together.
And, you know, much more interestingly, we all learn from each other.
To me, a good project is a project where you sort of really don't know what the other people know, and you can learn from them. So I like to joke, I know a whole lot more about epidemiology than I ever wanted to know.
And I think probably a lot of people know more about epidemiology these days than they want to know,
but I'm more than the average.
You know, I know more about privacy than I knew before, and about privacy by design,
and how hard it is to do it correctly.
I learned something from doing this project.
You know, I think that my most successful collaborations all have this characteristic.
The argument I always have for collaboration is that you, first of all, learn something
from it and you learn a lot and it's interesting.
And second of all, you typically come up with solutions that because they span the boundaries
of the intellectual disciplines are much better solutions.
They're sort of more complete.
They solve the problem better.
They're more appealing than they would have been
if you just did it within the skills that you had.
So, you know, at Microsoft, I had Singularity,
which was an attempt to rethink how you would build an operating system
using safe languages.
And so, you know, I worked with Galen Hunt,
who was a stellar systems researcher, stellar operating systems person,
and Ralph Von Reck, who was a really talented programming languages
type theory person.
And, you know, we were able to sort of bring the pieces together and think about how you
would do protection using the type systems of languages and whether that would actually
lead you to a different architecture for a system and what the advantages of it are.
This is the kind of thing that you can do.
You know, I've been doing this.
I did this when I was at Wisconsin, the Wisconsin Wind Tunnel, which was a computer architecture
simulation that three assistant professors built to try to simulate parallel computers
running on a parallel computer. We had a CM-5, which was a collection of machines connected
with a fat-tree network, and we used it to simulate other parallel architectures
doing a very fast parallel simulation.
But even before that, the three professors involved
with this, Mark Hill and David Wood and I,
all came out of Berkeley.
We were luckily there, beginning of the Berkeley tradition
of doing large multidisciplinary projects around the
Spur project, which no one has ever heard of, but was one of Dave Patterson's projects to build a
multiprocessor Lisp workstation. We were only about 20 years early. I had a good laugh with Dave in 2004 when the multi-core chips started coming out and
becoming desktop machines because we had done this in the 1980s. So we were about 20 years
ahead of our time in terms of building a multiprocessor workstation for your desk.
But that's what academia should do is that they should go and look at these problems ahead of time.
And, you know, we didn't just build the chips.
We actually did build the chips.
There were three chips to build sort of a multiprocessor workstation.
John Ousterhout built a new operating system for it.
We built a parallel LISP system for it.
I wrote my dissertation on a
parallelizing compiler for LISP for this type of stuff. So there's a lot of really interesting
work that came out of that project. You know, it never became the commercial success that RISC
or RAID did, so it's not that well known. But you know, there are a lot of great computer architects
that worked on this. Mark and David
and Susan Eggers were all on this project. There were a bunch of other people who worked on this
project, Garth Gibson. So people got a great education. They learned a lot about many different
areas. And they went off to sort of extremely successful careers with this kind of background. And it's also a style of work that
I think many people find extremely attractive. And once you've done it, this is what you want
to do. And so I sort of joke that this is what I've gone through my career doing. And I think
many other people have done this as well for this type of large project. So there are a lot of people
that have come out of Berkeley directly.
There are a lot of people who sort of are, you know,
one or two generations off the people that came out of Berkeley
who are experienced with this type of project.
It's fairly unusual, actually.
I think in many different areas,
there is not this tradition of doing multidisciplinary projects.
You have sort of very focused research projects
where everybody's working on the same type of thing. But, you know, we're fortunate that we
were able to do it. We're fortunate that we have problems where there is a lot of diversity of
solutions and we can get the people together still and do this kind of challenge. So DP3T is just the latest example of it.
Yeah, I love it.
I love hearing how your career has really spanned a lot of different things.
And at each stage, you're learning something new about something different.
I mean, sometimes there are people who are like,
they're the interconnect person or they're the power person.
And I was trying to think of how to characterize it.
I had a really hard time.
I have a short attention span of like five or 10 years. I could not imagine the classic academic where you become the world expert on sort of one small area. And that's what you do for your
entire career. You just sort of write paper after paper after paper on it. That just has zero appeal
for me.
And I'm not telling people that that is not the right way to do it because there is enormous intellectual benefit in having experts of that sort who really sort of work at a problem and sort of solve it well and then, you know, continue it and sort of can deal with the new technologies that come and things like that.
Not saying that that's not the right thing to do.
It certainly is if that's what you want to do.
But that has just never appealed to me.
You know, I always told my students, you know, I want to write the first paper or the last
paper on the subject.
I really have no interest in sort of writing the intermediary papers in there that sort
of make the incremental advances.
Well, I think this has been a really wonderful conversation.
We're really appreciative
that you were able to join us today.
It's been a delight talking to you.
I hope you enjoyed yourself as well.
Absolutely.
And to our listeners,
thank you for being with us
on the Computer Architecture Podcast.
Till next time, it's goodbye from us.