Microsoft Research Podcast - 082 - The brave new world of cloud-scale systems and networking with Dr. Lidong Zhou

Episode Date: June 26, 2019

If you’re like me, you’re no longer amazed by how all your technologies can work for you. Rather, you’ve begun to take for granted that they simply should work for you. Instantly. All together. ...All the time. The fact that you’re not amazed is a testimony to the work that people like Dr. Lidong Zhou, Assistant Managing Director of Microsoft Research Asia, do every day. He oversees some of the cutting-edge systems and networking research that goes on behind the scenes to make sure you’re not amazed when your technologies work together seamlessly but rather, can continue to take it for granted that they will! Today, Dr. Zhou talks about systems and networking research in an era of unprecedented systems complexity and what happens when old assumptions don’t apply to new systems, explains how projects like CloudBrain are taking aim at real-time troubleshooting to address cloud-scale, network-related problems like “gray failure,” and tells us why he believes now is the most exciting time to be a systems and networking researcher.

Transcript
Starting point is 00:00:00 We have seen a lot of advances in, for example, machine learning and deep learning. So one thing that we've been looking into is how we can leverage all those new technology in machine learning and deep learning and apply it to deal with the complexity in systems. You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga. If you're like me, you're no longer amazed by how all your technologies can work for you.
Starting point is 00:00:40 Rather, you've begun to take for granted that they simply should work for you, instantly, all together, all the time. The fact that you're not amazed is a testimony to the work that people like Dr. Lidong Zhou, Assistant Managing Director of Microsoft Research Asia, do every day. He oversees some of the cutting-edge systems and networking research that goes on behind the scenes to make sure you're not amazed when your technologies work together seamlessly, but rather can continue to take it for granted that they will. Today, Dr. Zhou talks about systems and networking research in an era of unprecedented systems complexity and what happens when old assumptions don't apply to new systems, explains how projects
Starting point is 00:01:17 like CloudBrain are taking aim at real-time troubleshooting to address cloud-scale, network-related problems like gray failure, and tells us why he believes now is the most exciting time to be a systems and networking researcher. That and much more on this episode of the Microsoft Research Podcast. Lidong Zhou, welcome to the podcast. It's great to be here. As the Assistant Managing Director of Microsoft Research Asia, you are, among other things, responsible for overseeing research in systems and networking.
Starting point is 00:02:02 And I know you've done a lot of research in systems and networking over the course of your career as well. So in broad strokes, what do you do and why do you do it? What gets you up in the morning? Yeah, I think, you know, this is one of the most exciting times to do research in systems and networking. And we have already seen how advances in, you know, systems and networking have been pushing the envelope in many technologies. We've seen the internet, the web, web search, big data, and all the way to artificial intelligence and cloud computing that everybody kind of relies on these days. All those advances have created challenges of unprecedented complexity,
Starting point is 00:02:42 scale, and a lot of dynamism. So my understanding of systems is always, you know, system is about bringing order to chaos, the chaotic situation. So we are actually in a very chaotic situation where things change so fast and there are a lot of new technology coming. And so when we talk about system research, it's really about transforming all those unorganized
Starting point is 00:03:08 pieces into a unified whole, right? That's why, you know, we're very excited about all those challenges. And also we realized over the years that it's actually not just the typical system expertise. When we talk about distributed systems, operating systems, or networking, that's actually not enough to address the challenges we're facing. You have to actually also master other fields like database systems and programming language, compiler, hardware, and also artificial intelligence, machine learning and deep learning and what i do at microsoft research asia is to put together a team with a diverse set of expertise and inspire the team to take on those big challenges together by you know working together and you know that's
Starting point is 00:04:01 a very exciting job to have i I love the order out of chaos representation. If you've ever been involved in software code writing, you write this here and someone else is writing that there and it has to work together. And then you've got 10 other people writing and we all just take for granted on my end, it's going to work. And if it doesn't, I curse my computer. Yes, that's our problem. Well, I had Xiaowen Han on the podcast in November for the 20th anniversary of the lab there, and he talked about the mission to, in essence, both advance the theory and practice of computing in general. Your own nearly 20-year career has been about advancing the theory and practice of distributed systems, particularly. So talk about some of the initiatives you've been part of and technical contributions you've made to distributed systems over the years. You've just come off the heels of talking about the complexities now. How have you seen it evolve over those years?
Starting point is 00:04:58 You know, I think we are getting into the era of distributed systems. Being a distributed system person, we always believe, you know, what we're working on is the most important piece. I think Microsoft Research is really a great place to connect theory and practice because we are constantly exposed to very difficult technical challenges from the product teams. They're tackling very difficult problems. And we also have the luxury of stepping back and thinking deeply about the problem we're facing and thinking about what kind of new theory we want to develop, what new methodology we can develop to address those problems.
Starting point is 00:05:38 I remember in early 2000 when Microsoft started doing web search and we had a meeting with the dev manager You know, in early 2000, when Microsoft started doing web search, and we had a meeting with the dev manager, who was actually in charge of architecting the web search system. And so we had a very interesting discussion, and we talked about how we were doing research in distributed systems, how we have dealt with a lot of problems when servers fail. So you have to make sure that the whole service actually stay correct in face of all kinds of problems that you can see in a distributed system. I remember at that time we had Roy Levin, Leslie Lamport, you know, a lot of colleagues.
Starting point is 00:06:21 And we talked about protocols. And at the beginning, the manager basically said, oh, yeah, I know, you know, it's complicated to deal with all these failures, but it's actually under control. And a couple months later, he came back and said, oh, you know, there's so many corner cases. It's just beyond our capability of reasoning about the correctness, and we need the protocol that we were talking about. But it's also interesting that, you know, in developing those protocols, we tend to
Starting point is 00:06:50 make some assumptions. Say, okay, you know, we can tolerate a certain number of failures. And one question that gentleman asked was, you know, what happens if we have more than that number of failures in the system, right? And from a practical point of view, you have to deal with those kind of situations. In theory, when you work on theory, then you can say, okay, let's make an assumption, and let's just work under that assumption.
Starting point is 00:07:14 So you see that there's a difference between theory and practice. The nice thing about working at Microsoft Research is you can actually get exposed to those real problems and keep you honest about what assumptions are reasonable, what assumptions are not reasonable. And then you think about, you know, what is the best way of solving those problems in a more general sense rather than just solving a particular problem. Your work in networked computer systems is somewhat analogous to
Starting point is 00:07:41 another passion of yours that I'm going to call networked human systems. In other words, your desire to build community among systems researchers. How are you going about that? I'm particularly interested in your Asia-Pacific systems workshop and the results you've seen come out of that. So I moved to Microsoft Research Asia in late 2008. And when I was in the United States, clearly there is a very strong system community. And over the years, we also see that community sort of expanding into Europe. So the European system community sort of started the system workshop, and eventually it evolved into a conference called Eurosys, and very successfully. And, you know, we see a lot of people getting into systems and networking
Starting point is 00:08:27 because of the community, because of the influence of those conferences. And the workshop has been very successful in gathering momentum in the region. And so in 2010, I remember it was Chandrasekhar and Rama Kotler, who were my colleagues at Microsoft Research.
Starting point is 00:08:49 And they basically had this idea that maybe we should start something also in the Asia-Pacific region. At that time, I was already working in Beijing. And I thought, you know, this is also part of my obligation. So in 2010, we started the first Asia-Pacific System Workshop. And it was a humble beginning. We had probably about 30 submissions and accepted probably a dozen. It was a good workshop, but it was a very humble beginning, as I said.
Starting point is 00:09:24 But what happened after that was really beyond our expectation. It's like, you know, we just planted the seed and the community sort of picked it up and grew with it. And, you know, it's very satisfying to see that we're actually going to have the 10th workshop in Hangzhou in August. If you look at the organizing committee, they are really, you know, all world-class researchers from all over the world. It's not just from particular region, but, you know, really all the experts across the world contributed to the success of this workshop over the last, you know, almost 10 years now. And the impact that this workshop has is actually pretty tremendous. What would you attribute it to? I think it's really, first of all,
Starting point is 00:10:13 this is the natural trend, right? You go from U.S. was leading in system research and then expanded to Europe. And it's just a natural trajectory to expand further to Asia Pacific, given a lot of technological advances are happening in Asia. And the other reason is because the community really come together. There are a lot of top system researchers that originally, just like me, came from Asia Pacific region. So we had a lot of incentives and commitment to give back. And all those enthusiasm, passion,
Starting point is 00:10:51 or the willingness to help young researchers in the region, I mean, those actually contribute to the success of the workshop, in my view. Well, you were recently involved in hosting another interesting workshop or conference, the Symposium on Operating Systems Principles, right?
Starting point is 00:11:08 Right. SOSP? SOSP. And this was in Shanghai in 2017. It's the premier conference for computer systems technology. And as I understand, it's about as hard to win the bid for as the Olympics. Yes, almost. So why was it important to host this conference for you, and how do you think it will help broaden the reach of the systems community worldwide? So SOXP is one of the most important system conferences, and traditionally it has been held in the U.S., and later on they started rotating into Europe. And it was really a very
Starting point is 00:11:47 interesting journey that we went through, along with Professor Hai-Bo Chen, who's from Shanghai Jiao Tong University. We started pitching for having SSP in the Asia-Pacific region in 2011. That was like six years before we actually succeeded. We pitched three times. But overall, even for the first time, the committee was very supportive in many ways, so that we'd be very careful to make sure that the first one is going to be a success. And in 2017, when Haibo and I opened the conference, I was actually very happy that I didn't have to be there to make another pitch. I was essentially opening the conference. And it was very successful in the sense that we had a record number of attendees, over 800 people.
Starting point is 00:12:39 And we had almost the same number, if not a little bit more, from the U.S. and Europe. And we had, you know, many more people from the region, which was what we intended. And having conference in Asia-Pacific is actually very significant to the region. We're seeing more and more high-quality work and papers in those top conferences from Asia Pacific region, you know, from Korea, India, China, and many other countries. Right. And I like to believe that what we have done sort of helped a little bit in those regards. Let's talk about the broader topic of education for a minute. This is really, really important for the systems talent pipeline around the world. And perhaps the biggest challenge
Starting point is 00:13:38 is expanding and improving university-level education for this talent pipeline. MSRA has been hosting a systems education workshop for the past three years. The fourth is coming up this summer. And none other than Turing Award winner John Hopcroft has praised it as a step toward improving education and cultivating world-class talent. And he also said a fifth of the world's talent is in the Asia-Pacific region, so we better get over there. Tell us about this ongoing workshop. Yeah, actually, John really inspired us to get this started, I think more than three years ago. And I think we're seeing a need to improve, you know, system education. But more importantly, I think for MSR Asia, one of the things that we're very proud of doing is connecting educators and researchers from all over the world, especially connecting people from the U.S. and Europe with those in Asia Pacific region.
Starting point is 00:14:38 And the other thing that we are also very proud of doing is cultivating the next generation of computer scientists. And certainly, as you said, the most important thing is education. And during the process, what we found is that there are a lot of professors who share the same passion. And we're talking about a couple of professors, Lorenzo Arvizzi from Cornell and Robert Van Renissen from Cornell and Jeff Volker from UCSD, they actually came all the way from the U.S. just to be at the workshop, talking to all the system professors from all over the country in China. And so I attended those workshops myself. The first one was five days and the next two were like three days.
Starting point is 00:15:23 It's a huge time commitment. But you see all the passion from those professors. They're really into improving teaching. They're trying to figure out how to make students more engaged, how to get them excited about systems, even how to design experiments, all those aspects. I'm very optimistic that with those passionate professors, we're going to see a very strong new generation of system researchers. And this is, I think, the kind of impact we really want to see from the perspective of Microsoft Research Asia. It's not just about making the lab successful, but if we can make impact in the community in terms of talents,
Starting point is 00:16:07 in terms of the quality of education, that's much more satisfying. Before we get into specific work, I'd like you to talk about what you'd referred to as a fundamental shift in the way we need to design systems. And by we, I mean you. In the era of cloud computing and AI, you've suggested that things have changed enough that the older methodologies and principles aren't valid anymore. So unpack that for us. What's changed and what needs to happen to build next-gen systems? Yeah, that's a great question. I continue with the story about building fault-tolerant systems. So in the last 30 years, we have been working on system reliability, and we have developed a lot of techniques, a lot of protocols,
Starting point is 00:16:52 and we think it will solve all the problems. But if you look at how this thread of work started, it really started in the late 70s when we were looking at the reliability of airplanes and so on. Of course, there are assumptions we make about the kind of failures in those kind of systems. And we sort of generalize those protocols so that it can be applicable up until now. But if you look at the cloud, it's much more complicated in many dimensions. And the system also evolves very quickly. And a lot of assumptions we make actually start to break. And even though we have
Starting point is 00:17:32 applied all these well-known techniques, that's just not enough. So that's one aspect. The other aspect is, it used to be that, you know, the system we build, we can sort of understand how it works, right? And now the complexity has already gone beyond our own understanding. We can't reason about how the system behaves. On the other hand, we have seen a lot of advances in, for example, machine learning and deep learning. So one thing that we have been looking into is how we can leverage all those new technology in machine learning and deep learning and apply it to deal with the complexity in systems. And that's another very fascinating area that we're looking into as well. Well, let's get specific now. Another super interesting area of research deals with
Starting point is 00:18:25 exceptions and failures in the cloud scale era and how you're dealing with what you call gray failure. And you've also called it the gray swan, which I want you to explain, or the Achilles heel of cloud scale systems. So how did you handle exceptions and failures in a somewhat less complex pre-cloud era? And what new methodologies are you trying to implement now? Right. So as I mentioned, in the older days, we are targeting those systems with assumptions about failures, right? Like crash failure, you know, a component can fail. When it fails, it crashes. It stops working. And nowadays, we realize,
Starting point is 00:19:12 you know, this kind of assumption no longer holds. So this is why we define a new type of failures called gray failures. So thinking about what kind of name to give to this very interesting new line of research that we're starting. So we call it gray swan. People already know about black swan or gray rhino. So first of all, because we're talking about the cloud, we want something not as heavy as rhino. We want something that can fly. And the reason we call it gray is because, you know, a system component is no longer just black or white. It could be in a weird state where from some of the observers, it's actually behaving correctly, but from the others it's actually not. And that turns out to be behind many of the issues, the major problems that we're seeing in the cloud.
Starting point is 00:19:55 And it has sort of some components of black swan in the sense that some of the assumptions we're making break. So that's why everything we build on top of that assumption starts to break down. So for example, I mentioned the assumption about the failure, right? If you think that it either crashes or it's correct, then it's a very simple kind of world, right? But if it's not the case,
Starting point is 00:20:19 then all the protocols that will work under that assumption will cease to work. It also has this connection with Grey Rhino because Grey Rhino is this problem that everybody sort of sees coming, and it's a very major problem, but people tend to ignore it for the wrong reason. And in our case, we know that for the cloud, all those service disruptions happen all the time, and there are actually failures all over the places. It's just very hard to figure out which ones are important, but we know something big is
Starting point is 00:20:52 going to happen at some point, right? So we try to use this notion of Grace Wand to describe this new line of thinking where we really think about failures that are not just crash failures or not even, you know, Byzantine failures where it's essentially arbitrary failures. But there's something in between that we should reason about and then using those to reason about the correctness of the whole service. So does the word catastrophic enter into this at all, or is it? Yes, that could be catastrophic eventually. How does that kind of thinking play into what you're doing? If you look at the cloud system, it's like a rhino sort of charging towards you.
Starting point is 00:21:37 And before it hits you, there are a lot of dust and noise and other things. But you just don't know when and how something bad is going to happen. It could be catastrophic. It happens actually a couple of times already. And so one of the things we try to do is to try to figure out when and how bad things could happen to prevent catastrophic failures from all the dust and maybe, you know, other signals we have in the system. There are signals. It's just we don't know how to leverage them. Part of your approach to coping with grave failures is a line of research you call CloudBrain. Right. And it's all about
Starting point is 00:22:16 automatic troubleshooting for the cloud. It's actually a huge issue because of the remarkable complexity of the systems. So tell us how CloudBrain and what you call DeepView is actually helping operators, the people that have to deal with it on the ground, simplify how they write troubleshooting algorithms. So I think CloudBrain is one of the efforts that we have to deal with great failures. And I remember, you know, we talked about
Starting point is 00:22:43 the challenges that come from complexity of the system or the scale of the system. We really have a huge number of components interacting with each other. But on the other hand, we can really leverage the scale of the system to help us in terms of diagnosis and all detecting problems, even figure out where the problem is. And this is the premise of the CloudBrain project. So it has actually three components, three ideas. The first one is really the notion of near real-time monitoring. And so instead of trying to look at the logs after the fact and then analyze what happened,
Starting point is 00:23:27 we try to have a pulse on what the system is doing, how it's doing, and so on. So that's the first component. And the second component is we really want to form a global view. So it's not just one observation we make about a system, but really observations for all over the systems combined. So we can actually understand how system is behaving and which part is actually having problem. And then the third part is once you have, you know, all this global observations that come in real time, then we can use statistical methods to really reason about, you know, what's abnormal and so on. So this is where we really leverage the scale,
Starting point is 00:24:07 the huge amount of data that used to be a challenge and now becomes an opportunity for us to actually come up with new solutions to handle the complexity of the system. So how does that help an operator simplify writing an algorithm? Right. So now the operator actually has all the data in your real time. And you know, you can write this very simple algorithm that operates on the data, sort of like a SQL query. Right. And then you can emit signals and, you know, tell people that something is wrong or
Starting point is 00:24:39 something is correct. Or maybe we have to pay attention to part of the system that seemed to have some problems. So where is this gray failure research with all its pieces and parts in the pipeline for production? Overall, we are not at the stage where we solve other problems, but we have pieces of the technology we developed to solve some specific problems, like DeepView and CloudBrain are, you know, the two projects that have already been incorporated in Azure to deal with network-related problems, for example. But, you know, we're far from solving the problem. It's really sort of a research agenda that we set out probably for years to come. And one idea that we have been working on, which is actually
Starting point is 00:25:25 very interesting, is that we really have to change how we view programs. In the past, for defensive programming, we have been trained to handle exceptions. And it turns out that handling exceptions in a large complex system is not enough. So one of the ideas that we've been thinking about is changing exception handling into exception or error reporting. So you start to collect all those signals that we talk about, you know, the dust when the rhino comes charging at you. So you have to really collect those dusts towards one place so that you can actually reason about the behavior of the system. And that's one of those major shifts that we see coming
Starting point is 00:26:14 even in how we develop systems, not just after the fact, we already have this beast, and now we need to understand what's going on. So those methodology, I think, is where we're pushing. It's not just solving specific problem. We have an incident. We try to solve this problem. Yeah, we can do that.
Starting point is 00:26:32 But more importantly, this goes back to the theory meets practice. So we need to come out of looking at the specific instances, but think about what methodology we should adopt to change the status completely. So how do you implement then a brand new thing? I mean, we talk about the beast that already exists and is growing. What are you proposing with your research? Right. So this is always a hard problem. We already have something running and it has to keep running. And now it has all the problems you need to solve. So one of the ways we deal with those challenges is trying to solve the current problems, you know, like CloudBrain and DeepView
Starting point is 00:27:17 sort of try to fit in to the current practice. But for some other projects, what we do is like, you know, what I talked about changing from exception handling to error reporting. That actually is a system we build that we can transform automatically a piece of code that does error handling in the traditional way into a piece of code that actually does our reporting in the way that we desire. And that helps because we don't want everybody to rewrite the whole code base. It's just not possible. So we have to find ways to help developers to sort of do the transformation and also live with the current boundary of the system. And hopefully, gradually, we'll move towards the right direction.
Starting point is 00:28:06 MELANIE WARRICK- Yeah, I think you see that in just about every place software exists, is there's a legacy system. You've got a retrofit, some stuff that added complexity to it. JOHN WHYTE, That's right. MELANIE WARRICK- But you can't just make everyone throw out what they're already using.
Starting point is 00:28:19 So this is a big challenge. I'm glad you're on the job. Well, we talked about what gets you up in the morning and all the work you're doing to make sure that everything goes right. That is basically what you're doing is trying to make everything go right. Right. But as we know, as you know more than I know, something always goes wrong. Right. Unfortunately. The rhino. So given what you see in your work every day, is there anything that keeps you up at night?
Starting point is 00:28:56 Yes. I think we're realizing that the kind of distribution system we're designing or building are becoming more and more important. They're becoming part of the sort of critical infrastructure of our society. And that puts a lot of burden on us to make sure that whatever we're building can be mission critical. Right. And, you know, we have a lot of researchers working on formal methods, verification, just to make sure that the core of the system can be verifiable.
Starting point is 00:29:25 It'll give some assurance that it's actually working correctly. And we talked about applying machine learning and deep learning mechanism, but it's statistical. So sometimes, it's actually naturally there are cases where it breaks. So how we can safeguard this kind of system from what you call catastrophic issues. And this is also another thing that we have been putting a lot of thoughts into. And we're not short of challenges, especially on making the cloud infrastructure really mission critical. Li Dong, tell us your story. How did you end up at Microsoft Research?
Starting point is 00:30:02 And how did you develop your path to the positions you hold right now? Looking back, I remember when I finished my PhD, I started job hunting and I got a couple of offers. And I talked to my advisor, of course. That's what you do when you're a graduate student. And he basically gave me a very simple piece of advice. He basically said, well, just go where you can find the best colleagues, the colleagues with maybe, you know, Turing-Ward caliber. So I ended up going to Maxwell Research Lab, where at that time, we didn't have a Turing-Ward winner, but within
Starting point is 00:30:40 10 years, we had two. So that was how things started. Looking back, what's really important is the quality of colleagues you have, especially in the early stage of my career. I learned how to do research in some sense. It's not about getting papers published. It's internal passion that drives research. And I think the first phase of my career is more on personal development. I remember being pushed by my manager at the time, Roy Levin, to get out of my comfort zone. We started as a sort of technical contributor, but then I was pushed to lead a project. And there are always new challenges that you face, and you get a lot of support from your colleagues to get to the next stage, and that's very satisfying. And then I went to MSR Asia, where I later become a manager of a
Starting point is 00:31:39 research group. And I think that's sort of the second phase of my career where it's not about my personal career development. It's also about building a team and how you can contribute to other people's success. And that turns out to be even more satisfying to see the impact you can have on other people's career and their success. And also during that period of time, I also realized that it's not just about your own team. You know, we can build the best system research team in Asia Pacific, but it's more satisfying if you can contribute to the community. And we talked about starting the workshop and getting the conference into Asia Pacific and in a lot of other things that we do to contribute to society, including, you know, the talent fostering and many other
Starting point is 00:32:32 things. And those in my mind are becoming even more critical as we move on in our career. So I view this as sort of the three stages of my career. It started with personal development, learning what it means to love what you do and do what you love. And then you think about how you can contribute to other people's success and increase your ability to influence others
Starting point is 00:32:59 and impact others positively. And finally, what you can contribute to the society, to the community. And I've been very fortunate to have been working with a lot of great, you know, leaders and colleagues. And I've learned a lot along the way. And I remember, you know, I work with a lot of product teams as well. And they also offered a lot of career advice and support. So this is just, you know, my story, I guess. You know, it sounds to me like almost a metaphor. You know, you start with yourself, you grow and mature outwards to others, and then the broader community impact that ultimately a mature person wants to see happen, right? I hope so.
Starting point is 00:33:45 I get the sense that it is. It's just about seeking the truth. It's not about, you know, getting papers published. It's not about, you know, chasing fame or, you know, all those things that we start to lose sight of, you know, what the true meaning of research is. It's not about all this result that we try to get, but truly it's about finding the truth and enjoying the process along the way. At the end of each podcast, I ask my guests to give some parting advice to our listeners.
Starting point is 00:34:15 What big unsolved problems do you see on the horizon for researchers who may just be getting their feet wet with systems and networking research? Well, I think they are very fortunate to be a young researcher in system networking now. I remember I was talking to Bud Lapson when I started my career in 2003, and he said, you know, he was feeling lucky that he was doing all the work in the late 70s and early 80s because it was the right time to see a paradigm shift. And I think now we are at the point that we're going to see another major paradigm shift. Just like, you know, folks in Zero Spark, what they did was essentially to define the computing for the next 30 years. Even now, we're sort of living in the world that they defined, looking at the PC, even with the phone, I mean, it's just a different form factor, right? They sort of defined the mouse, the laser printer,
Starting point is 00:35:16 all the things that we know about, and the user interface. And the reason that happened at that time was because the computing was becoming more powerful from supercomputers to personal computing because we can pack so much computation power into a small machine. And now I think the computation power has reached another milestone where computing capability is going to be everywhere. And we're going to have intelligence everywhere around us. The boundary between sort of the virtual world and computers and our physical world will disappear. And that will lead to really paradigm-shifting opportunities where we figure out what computing really means
Starting point is 00:36:04 in the next 10 years, 20 years. And this is what I would encourage everyone to focus on rather than just incremental improvement to the protocols and so on. Because we are really seeing a lot of assumptions being invalidated. I would really have to look at the world in a very different view. And from how we interact with sort of the computing capability and how we expose computing capability to do what we need to do. And it's not just doing computing in front of a computer,
Starting point is 00:36:39 but doing everything with sort of the computing capability around us. And that's just exciting to imagine. doing everything with sort of the computing capability around us. And that's just exciting to imagine. And I can't even describe what the future will look like, but it's up to our young researchers to really make it a reality. Li Dongzhao, it's been an absolute pleasure. Thanks for joining us in the booth today. Thank you, Gretchen. Really a pleasure. To learn more about Dr. Li Dongzhao and how researchers are working to bring order out
Starting point is 00:37:12 of systems and networking chaos, visit microsoft.com slash research.
