Microsoft Research Podcast - 097 - Optics for the cloud: storage in the zettabyte era with Dr. Ant Rowstron and Mark Russinovich

Episode Date: November 6, 2019

Remember when a hard drive that could hold a terabyte of data was a big deal? Well, we’re now in an era where peta-, exa- and even zetta-bytes are the bytes of the day, and it turns out it’s hard to fit that many zeroes on a hard drive. That’s where Dr. Ant Rowstron, Deputy Lab Director of Microsoft Research Cambridge, and Mark Russinovich, Chief Technical Officer of Azure, come in. Their respective teams are working on paradigm-breaking solutions to give us phenomenal storage power in an itty-bitty living space. Today, Ant and Mark discuss their roles in the development of new optical technologies, like Project Silica, for cloud-scale storage demands, and talk about the Optics for the Cloud Research Alliance, an exciting new collaboration between academic researchers and MSR. They also explain how just the right mix of innovation and engineering can make the cloud more powerful and less expensive to use and, at the same time, deliver “forever” storage that’s both dishwasher and microwave safe! https://www.microsoft.com/research

Transcript
Starting point is 00:00:00 Welcome to another two-guest episode of the Microsoft Research Podcast, where we're exploring the latest in disruptive technologies for data storage in the cloud. In an era in which we're pushing the very limits of our current storage technologies, who better than the CTO of Azure and the Deputy Lab Director of MSR Cambridge to give us
Starting point is 00:00:21 the lowdown on the promise of new technologies like quartz glass for storing and protecting all things digital. You're listening to the Microsoft Research Podcast, a show that brings you closer to the cutting edge of technology research and the scientists behind it. I'm your host, Gretchen Huizinga. Remember when a hard drive that could hold a terabyte of data was a big deal? Well, we're now in an era where peta, exa, and even zettabytes are the bytes of the day.
Starting point is 00:00:59 And it turns out, it's hard to fit that many zeros on a hard drive. That's where Dr. Ant Rowstron, Deputy Lab Director of Microsoft Research Cambridge, and Mark Russinovich, Chief Technical Officer of Azure, come in. Their respective teams are working on paradigm-breaking solutions to give us phenomenal storage power in an itty-bitty living space. Today, Ant and Mark discuss their roles in the development of new optical technologies like Project Silica for cloud-scale storage demands, and talk about the Optics for the Cloud Research Alliance, an exciting new collaboration between academic
Starting point is 00:01:30 researchers and MSR. They also explain how just the right mix of innovation and engineering can make the cloud more powerful and less expensive to use, and at the same time, deliver forever storage that's both dishwasher and microwave safe. That and much more on this episode of the Microsoft Research Podcast. The word excited is overused, but there's really not a better word for how I feel about the guests I have in the booth today. First, we have all the way from Cambridge, UK, Dr. Ant Rowstron, who's the Deputy Lab Director of Microsoft Research Cambridge. And with him, I have Mark Russinovich, who is the Chief Technical Officer of Microsoft Azure.
Starting point is 00:02:22 Welcome, Ant and Mark. Mark and Ant. Thank you. Thanks. Your titles give us a picture of who you are at Microsoft, but they don't really say what you do. And I'd like each of you to give us a little more detail on what gets you up in the morning. Maybe a sort of elevator pitch on what you do and why you do it. Ant, why don't you give us a start here? Okay. So, you know, basically, as Deputy Lab Director in Cambridge, I come in every day and I get to work on cloud infrastructure and future technologies for the cloud.
Starting point is 00:02:52 And it's really fun. We're doing a lot of stuff. We're using a lot of high-tech stuff. We're doing, you know, a lot of stuff around optics, femtosecond lasers and things like that. And so, yeah, I come in every morning and I get to play with the greatest tech and try to create technologies that will help the cloud in the future. As a deputy lab director, are you doing research yourself in addition to sort of overseeing the lab work? Yeah, so I try to spend some of my time each week
Starting point is 00:03:18 actually doing research as well. And I'm very involved in a small number of the projects and I help oversee a number of the other projects as well in the group. As some people say, to keep your research muscle strong. Yeah, that's right. That's right. It's the most exciting thing. It's the bit I really enjoy, coming in and doing stuff. Mark, what do you do?
Starting point is 00:03:33 So my role is leading technical strategy and architecture for Azure, our cloud platform. So every day is completely different. It could be meeting with customers, meeting with internal partner teams, meeting with engineers deep on teams, writing papers, reading papers, meeting with analysts. I mean, every day is a new adventure. So as chief technical officer, how technical do you get to get into the weeds? I get into the weeds. Yeah, that's the fun part. So I think that that is a really great part of the job is that I can be at a very high level like strategy, like directions, trends, and then get down into implementation details. In fact, I'm involved with a couple of the projects at MSR Cambridge at a very deep technical level. Well, perhaps to
Starting point is 00:04:18 some of our listeners, you aren't obvious booth mates. So let's explain why you're both here in the booth together. There's an interesting story of how Azure met Optics during a disruptive technology review, where researchers present the work they're doing to senior leadership team members. How did that go down? Ant, why don't you go first? Okay. So, I mean, you know, it's really interesting. We've been working with the Azure storage guys for some time on some storage projects, looking at trying to use traditional technologies to sort of store long-term data. And it had got to the stage where I think, you know, both sides had realized that we'd pushed the existing technologies to the limits,
Starting point is 00:04:55 and both sides were looking around for new technologies. And it happened that someone in the Azure Storage side and us both saw a sort of a newsfeed article about a breakthrough at Southampton and how to store data in glass. And we both approached from very different angles. The Azure storage side, that person flew over to Southampton and had a look at it. We were there downloading the papers, trying to read them, trying to understand them. As computer scientists, it was quite hard. And then we started to talk about how the technology could be taken forwards. And, trying to convince the company that this was going to be something worth doing,
Starting point is 00:05:28 we started getting it ready for this disruptive technology review. We reached out to Mark to help us because it was sort of key to get both sides really bought in to what was happening. And then we got the opportunity to present it to the SLT, and Jason was there and you were there, and that really led to the point where people said, hey, let's make it work together. Let's work collaboratively between Microsoft Research and Azure Storage to get it forward. And so that's how Mark got involved and how it sort of got going.
Starting point is 00:05:55 Yeah, in fact, I remember meeting with Ant before this, where I got my first overview of what the project was all about and the promise of it. The disruptive technology review happened, and the SLT was like, this looks like really exciting stuff, but that alone isn't enough to actually cause things to go forward.
Starting point is 00:06:11 What happened after that is that me and Douglas Phillips, who was in charge of storage, and Jason Zander, head of Azure, met together to talk about how we wanted to take it forward. And we'd kind of estimated with MSR Cambridge the resources required and the amount of money required to actually get the thing moving to a point
Starting point is 00:06:29 where it could potentially intersect with production in Azure at some point. And that was the go big or go home. Either we had to commit and jointly fund it with MSR Cambridge, or it just wasn't gonna happen. And then we decided to do that. You know, at the time, we didn't really have any people who worked in the space of physics or in the space of optics, so it really was a decision point: are we
Starting point is 00:06:48 actually going to start doing this? And if we are, we've got to go out and hire new people with new skills that we don't have traditionally in MSR and bring them on board. So it was a big decision to do it, and thankfully we made it. Mark, Microsoft wasn't first to the cloud, but it's a big player now, and you're at the technical epicenter of Azure. How have you seen the cloud computing landscape evolve over the time you've been here, both from an industry point of view and from Microsoft's point of view? Just in the landscape of what cloud was and what it is now, it's been a fascinating journey. Because when I started in Azure in 2010, Azure had been underway for a few years, just commercially launched. But Azure was tiny, basically in two data centers, a few thousand servers. And the number of companies that were saying they were going to do cloud was probably 20, 25 companies. And about four years ago, they
Starting point is 00:07:41 were still about 17. Three years ago, we were down to 13. Two years ago, it was down to six. And there's still six in what Gartner classifies as their magic quadrant for infrastructure as a service. And I think there'll be more paring down to come. And Azure has been continuously in this strong position, rising, rising. And a couple of years ago, the market and analysts started to recognize us as the strong number two in the space. What are the factors that go into making you a player in this field?
Starting point is 00:08:15 It's a huge amount of capital. It has to be a core business priority for the company. Yeah. And expertise from hardware data centers all the way up into software stacks and massive scale distributed systems. And also time is another factor because this stuff all needs to mature. Yeah. I think it's hard to underestimate the scale of this, right? About six, seven years ago, I was lucky enough to go to the Dublin data center, which is, I guess, by modern standards, a big one, but not the biggest. And in fact, then it was still being built out as it were. And we got there and you sort of walked for just mile after mile or kilometre after kilometre, depending which
Starting point is 00:08:53 country you're in. And you just saw servers and it's just room after room after room. And each time you go in it, it is just thousands and thousands and thousands of servers. It was just a moment where, you know, it actually changed the way that I thought about research because there was no point in trying to do the stuff that we were trying to do because when you realize how it operates, and in that scale, you're just wheeling in a rack. You don't care about it.
Starting point is 00:09:16 They're just delivering racks, and they have, you know, there's sort of a production line. You know, racks roll in at one end. It comes through a center. It's rolled in, plugged in, a few cables, power. And just the scale was just phenomenal. And I think unless you've actually been to one of those data centers, it's often hard to sort of appreciate. The ironic thing is that there's many people working in Azure that have never been to a data center, so they have not experienced it.
Starting point is 00:09:35 They should go. Yeah, I know they should. It just reminds me of that final scene of Indiana Jones, where they put the Ark of the Covenant in an army storage house, and then the camera pans back, and it's like, you'll never find it. Yeah. Yeah. In fact, you have this great photo, I think, that you show sometimes, which shows sort of like a data center from the front, and you go, this is like the size of a football
Starting point is 00:09:55 pitch. And then you show it from the other side, and it's like 10 times bigger, and you say, and this is what we're currently building out to extend that one data center. It's just unbelievable. I love that Ant says football pitch and he probably means soccer field. Soccer field, yeah. I translated that.
Starting point is 00:10:09 I translated that. All right, well, let's talk about where and how we store our digital stuff, because this is a huge part of what we're talking about here. We have storage units, and I put air quotes around that right now, for our digital stuff, but they weren't designed with the cloud in mind. And so, Ant, what are we up against with cloud-sized demands on current technologies? Is Moore's Law really dead? And what are researchers exploring in optical technologies to help us out? If you think of most of the technologies we use to store data today, things like flash, things like hard disk drives, things like tape,
Starting point is 00:10:59 it's true to say that they were all designed before the cloud existed. And in fact, they were all designed to work in multiple scenarios. You know, the flash chips have to work in your phone as well as in the data centers, or the hard disk drives have to work in the desktop machines as well as in data centers. So there's compromises that have been made. And in fact, many of those technologies, if you think of a sort of S-curve where you've got time and you've got the metric of interest, you get a new technology comes along like a hard disk drive. It starts off very slowly, it then hits its sort of main speed, and it's sort of going nicely, it's doubling every two years or three years if you're a hard disk drive. And then you get to the point where it actually stops just basically exploiting the property they were exploiting,
Starting point is 00:11:36 and it rolls over and it becomes flat. I think one of the challenges for us as a company and us as an industry is that many of the technologies that we rely on are beginning to get to the point where either they're at the end or they're starting to get to the point where you can see the end. And of course, Moore's law is a well-publicized one and we hit it some time ago. And that's a great opportunity because whenever you get that rollover, you get an opportunity to better do things differently, to have new ways of doing things. I mean, we used FPGAs a lot and this was something that we used to actually allow us to do stuff on the computational side. On the storage side, we started to look at these optical technologies and to see how we might exploit things and new discoveries in order to create new storage
Starting point is 00:12:13 technologies and new storage media that we could use when the traditional ones are reaching the end of their life cycle. Right. So I want to drill in just a little bit there, because when we talk about technologies that are invented before something else happens, now you have an opportunity with optics to build from the ground up. Does that affect the way you think about designing things? Does it change your framework? Yeah, I think it does. I mean, I think it goes back to this point about going around a data center. When you actually see the scale of a data center, you start to think about things very differently. You start to think about what does it mean to actually make something that works in that environment. And those trade-offs that you can make,
Starting point is 00:12:47 perhaps it's, you know, something can be a bit bigger. It doesn't need to be a particular size. It can be long and thin. It can have different dimensions because that's not the dominating factor is to get it under a desk as well as in the data center. Right, right, right. Mark, do you have anything to add to that?
Starting point is 00:13:03 Well, I think kind of related to the scale that Azure is going towards, I think we're going to talk about the scale of the data that's coming to the cloud and being generated in the cloud, and that's the other dimension here. Once you get to that kind of scale, small optimizations, much less big ones, have massive impact on cost of storage, cost of keeping the media fresh. I mean, when we take a look at the amount of data that's being generated in the cloud and the way that that data is managed by existing technologies, which have limited lifetimes, and we're talking about here in the cloud, there's going to be data
Starting point is 00:13:35 that's going to be persisted for decades or hundreds of years. And media like Flash needs to be constantly scrubbed. Tape needs to be refreshed periodically. And if you think about like tape, archival media, petabytes, exabytes, zettabytes of it having to be constantly read and rewritten, that's part of the cost model in these existing technologies. Mark, Azure has, to use a poker term, kind of gone all in on optical technologies and become a leader in this arena, both in thought and deed, as it were. So why is optics important to Azure? And where are these leading edge technologies in the product pipeline for you guys? And how does Microsoft compare with other players in the cloud market space as it pertains to optical technologies? As far as I'm aware, we're the only cloud that's pursuing this kind of technology. Now,
Starting point is 00:14:28 others might be pursuing it and being quiet about it, but I think that that's a differentiation of Microsoft and Microsoft Research with regard to exploring these things, and that is that we openly talk about it. And I think that does a few things. One is attracts researchers and engineers that are interested in working on cutting-edge technology when they get excited about being able to really disrupt the future by working on a project like that. The other one is that it shows our customers that we're a cloud that's constantly innovating and exploring new ideas and looking at ways we can make the cloud and their usage of the cloud much more cost-efficient and powerful. So both of
Starting point is 00:15:05 those I think are reasons why we are so transparent about the things we work on. Now as far as where it is in the pipeline, we don't know exactly when this is going to land in a way that it works at scale in the environments that we're going to be deploying it in. There's still a lot of questions to be answered. There's some fundamental questions that Ant's still exploring the answers to. And I think that once we get the answers to the fundamental questions, that's when you start to engineer for production and get it to the point where you can stamp these things out and deploy them around the world. One of the great things is working so closely with Azure means that as we do answer more of these questions and we get more of a handle on it, we're actually answering the questions that make it more feasible for it to be deployed in Azure. And I think that's one of these things about working in such close collaboration with the teams that are actually
Starting point is 00:15:50 trying to deploy storage today and trying to use storage today. It's a really powerful thing. You get that feedback loop going, you get people actually sort of able to consume the technology, which is often very hard for a new technology. And that allows you to zoom in on the answer, an optimal answer for what comes up. What are those fundamental questions that you still haven't answered, Ant, and why haven't you? What's taking you so long? That's right. Dang it.
Starting point is 00:16:16 So, you know, we're at the stage where we're still trying to really control the technology. We're still trying to understand how to improve the performance and the way in which it operates. When we first saw the technology, you walked into a room in Southampton, and they were effectively writing a few bits per second, reading a few bits per second. And you've got to get something that is able to do hundreds of megabytes per second. There's a long way to go. And to get there, it's a mixture of engineering and innovation. So some of the problems you engineer out, other problems you have to innovate around.
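To put the gap Ant describes into numbers, here is a rough back-of-the-envelope sketch. The specific rates and the one-terabyte payload are assumptions chosen only to stand in for "a few bits per second" and "hundreds of megabytes per second"; they are not figures from the conversation.

```python
# Back-of-the-envelope: how long would it take to write one terabyte?
# All constants below are hypothetical placeholders, not measured values.

TERABYTE = 10**12                 # bytes (decimal terabyte)
LAB_RATE = 4 / 8                  # assumed "a few bits per second": 4 bits/s = 0.5 bytes/s
TARGET_RATE = 200 * 10**6         # assumed "hundreds of megabytes per second": 200 MB/s
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def seconds_to_write(n_bytes: int, bytes_per_second: float) -> float:
    """Time in seconds to write n_bytes at a constant rate."""
    return n_bytes / bytes_per_second


lab_seconds = seconds_to_write(TERABYTE, LAB_RATE)
target_seconds = seconds_to_write(TERABYTE, TARGET_RATE)

print(f"lab rate:    {lab_seconds / SECONDS_PER_YEAR:,.0f} years")
print(f"target rate: {target_seconds / 60:,.0f} minutes")
print(f"speed-up needed: {TARGET_RATE / LAB_RATE:,.0f}x")
```

Under these assumed numbers the required improvement is eight to nine orders of magnitude, which is the scale of the "long way to go" being described.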
Starting point is 00:16:43 And we've been quite successful at innovating around some things. We've also been quite successful at engineering things. But we've still got a long way to go, I think, to make it really happen. But we can see the path, which is always good. Right. I mean, that's half the battle, is seeing where you're going. Well, let's talk for a second about the balance between pushing the limits of existing technologies and exploiting recent discoveries in new media like glass. We'll go deep into Project Silica in a bit, but first talk briefly about the current state of the art in storage
Starting point is 00:17:12 technologies along with some other explorations across the labs in other media. Today we're using tape, we're using hard disk drives and we use flash in the cloud and you know a few years ago the team in Cambridge worked very hard at trying to build out the lowest possible cost storage using hard disk drives. And, you know, we went to extreme lengths. In a rack, normally you have a couple of hundred fans. We built a rack with just three fans. You have sort of 10 kilowatts going in to power it. We got it down to a few kilowatts.
Starting point is 00:17:41 And I think the thing that you realize when you do this sort of thing is, as you push the envelope more and more and more, you just hit the limitations. And in fact, you know, we were sort of getting to the stage where you pick up a hard disk drive and think, I wish I could crack it open and start to change things inside it, because I've done everything else I can do outside of the hard disk drive. And I think this is a challenge in things like Azure. You get components coming in, and some of them you can really change, some of them are more fixed. And one of the things I think about this sort of state-of-the-art stuff and picking new media is it's really about then going to the next level. You know, we talk about this vertical sort of integration, going from the
Starting point is 00:18:12 really the bottom layer up, whether it's glass, whether it's DNA. Well, what if you can start from nothing and work up the stack and not have these kind of constraints of what is currently done, right, or boundaries, often artificial boundaries, about what you can and can't change? The fascinating thing about some of these things, to me, and this really highlights the power of MSR, is that we're very much heads down in Azure, looking really kind of a short term,
Starting point is 00:18:37 maybe a little bit longer term in some cases, but really focused on things that look like we've got to really nail to be successful. Meanwhile, MSR Cambridge has a lot more flexibility and freedom to go and just explore ideas that sometimes look kind of out there. Well, and that actually brings up my next topic, which is the unique relationship between research and product teams here at Microsoft. And I would love to know from each of your perspectives
Starting point is 00:19:05 how important this partnership is in the cloud era. According to many of my guests who sat in your chairs, this is one of those historical turning points like the Xerox PARC people defined computing for the next 60 years back in the 70s. How are research and product working together on this? And what do each of you get out of the relationship? Like I just was talking about, the fascinating and really exciting part of MSR is that they can go start to explore these things.
Starting point is 00:19:32 Even Ant was exploring glass before we had the opportunity to tell him, this looks ridiculous, stop looking at it. You know, he got it to the point where we could see the potential value and promise of it. But you're right about this being a turning point. This is a massive shift in technology that's happening right now, going from the client-server era really into the cloud era and the cloud plus edge era. And we're defining what that world looks like right now. The systems we define, the services we make, the APIs we create, the technology that we use to power it,
Starting point is 00:20:06 all of it is open for disruption, for being new. And some of it has to be new just because it doesn't exist yet. So like Ant was talking about earlier, the opportunity to have this incredible impact on the future with something like glass. Potentially, you know, if this plays out in two decades, everybody will be like, oh, yeah, of course, glass. And that's just the way that you store data long-term is glass. Right now, nobody's thinking about it, but that's incredibly exciting, fun. Yeah, I mean, I think one of the great things
Starting point is 00:20:38 about where the cloud is at the moment is that you guys are so innovative, and you are so hungry for innovation, right? I've been at Microsoft for 20 years, and it's rarely that I've worked with product groups that are so open. And I think it's because it's a competitive world out there. I mean, it's like, you know. I've heard that.
Starting point is 00:20:53 Yeah. And, you know, anything you can do to innovate makes sense. So I think from a research perspective, it's really great because you get to work with people, they get the real-world experience, and they tell you about the things that are working and are not working. And then the flip side is, you know, within MSR, we get the space to fail. And I think often when you're in a product group, people want to work on things that ship and are successful. And so, you know, in MSR, we can really do high risk stuff. And failure is almost as good as making it work because at least you
Starting point is 00:21:19 tried to make it work. And you learn something. And you learn something. And that's exciting. And I think that's something that's really great about this whole sort of MSR relationship with product groups is that we get that space and yet we get that input
Starting point is 00:21:30 that allows us to really make sure that when we do get to the end, if it has worked, we can get it over. So how often does the phrase that's ridiculous preface a breakthrough? Maybe all the time.
Starting point is 00:21:43 It's probably, yeah. In the history of science and engineering, a lot. Yeah. Let's talk about this new program called the Optics for the Cloud Research Alliance. It's a kind of big consortium of researchers, academia, MSR is obviously part of it. Why do we need this and what are its goals? Within the team in Cambridge, we started to do a lot of stuff looking at, you know, broadly in different ways, optics for the cloud. If you think of it today, optics are used in the cloud.
Starting point is 00:22:29 We use it to replace copper wires. We have a fiber optic cable and you use photons rather than electrons. But, you know, the question was, could we do more things? Can we make all-optical networks? Can we build all-optical compute? Can we do optical storage? Now, as MSR, we don't have a lot of people with that kind of experience. So one of the great things we seemed to do was to put together this research alliance to go out and find universities that we can partner with. We can talk to them. We can tell them about some of the problems we're having in the cloud and let them think about the technologies they're currently working on, how they relate to that. It allows us to have conversations and it allows us to sort of build a community thinking around how this optics for the cloud could look.
Starting point is 00:23:06 And I mean, one of the interesting things is, you know, people often say, well, why did it start in the UK? And we started it in the UK. And I think, you know, historically, we had companies like British Telecom, who were the telephone company in the UK, who invested a lot in optics, particularly in the area of sort of optical networking and things like that. So we have these great places like Southampton University, like UCL, teams full of people with a huge amount of experience of using optics in different ways. And in fact, the silica stuff, you know, it started at the University of Southampton. You know, they had this breakthrough of how to store data in glass. And we've been able to sort of help grow that, help expand that. It's a very sort of powerful way to partner with these people and to collaborate. So, Mark, does it percolate up to you? I mean, this is a research alliance with a bunch of academics and you're over here in Azure running the joint. Is it opaque what they're doing,
Starting point is 00:23:54 except as it presents itself to you with new technologies? No, not completely opaque. I mean, one of the ways that Azure and MSR stay very connected is tight interactions that are constantly ongoing. So it's not like we drift apart and then MSR shows up at the door and says, hey, we're thinking of this, and this project's been going for three years, and can you take it and use it? We realized a long time ago that we needed to start engaging very early in these things. So, in fact, in Azure, I am the champion for the MSR-Azure relationship. As part of that, I run a bi-yearly summit, effectively, between Microsoft Research and Azure, where Microsoft researchers will come and present on projects that they've started on, ideally early enough that we can start to influence them if they look interesting to Azure.
Starting point is 00:24:41 We've got one coming up just actually next week. This one we're actually doing a little bit differently. And at the previous ones, I've stood up to the researchers and told them a little bit about what Azure's challenges are and what our priorities are. At this one, we're bringing in the heads of the Azure engineering groups to each talk about their own areas. So they'll get a deeper view of what Azure is looking at. And this might hopefully, you know, spur them to say, wait a minute, I'm passionate about this technology. This is a problem that Azure has.
Starting point is 00:25:07 Maybe we can apply it there, and then that'll start the formation of a new project. And of course, some of the leaders from Azure were in MSR, like Albert Greenberg and David Maltz and people like this. So there's quite a cross-pollination of people as well, which helps keep that communication channel going, because, you know, you're talking to someone that you,
Starting point is 00:25:24 you know, 10 years ago, you knew because they were in the office down the corridor from you. And now they're over in Azure. You don't just stop communicating because you change, you know, it's good. Let's talk about Project Silica or Azure on Glass, as it were. And you've referred to this as the journey from metal to glass. I don't know if that's a rock and roll thing or what. Let's go deep. Tell us all about Project Silica. It's not incremental. It's disruptive. It's brand new, ground up, clean slate technologies. Tell us about it, how it came about, how it works and why it's cool. So how did it come around? I guess, you know, you've got a piece of glass, you fire a femtosecond laser into the piece of
Starting point is 00:26:05 glass and you focus that laser in a particular point in the piece of glass. And what happens is the structure forms a nanograting and the structure has a sort of orientation to it and a depth and you can control the orientation and the depth. That was sort of the discovery at Southampton that you could fire these in, create these nanogratings and then control them and therefore encode potentially data in them. And you can think of this piece of glass as having layers of data in it. So it's a sort of three-dimensional type of storage. So we've been working very hard to work out the number of layers that you can have, how
Starting point is 00:26:36 you can actually read the data out from these layers, and how you can control the laser in order to be able to actually form the voxels that you want with the right properties that you want. Go in on what a voxel is, because not everyone will know. So it's just a little structure. You can think of it as an upside-down iceberg, almost. And it has depth to it, and you can control that. But it also has sort of little lines on it, a little grating on it. And you can control the angle at which that grating is formed. And it's really small. Think of it as, you know, sub-micron. And that's how you encode data in the glass. Then you can make lots and lots of them within the glass. And once you've made that change to the glass, that change is going to exist in the glass
Starting point is 00:27:13 for hundreds of thousands to millions of years. And quartz glass, it's a very pure type of glass. And once you've made those changes, it doesn't sort of decay. If you think of things like hard disk drives, you get bit rot, you get bits that flip because of things that happen. With this, if it's there, it stays there. You can boil it in water, you can put it in a microwave, you can put it in an oven, you can do almost anything you want to it and you're not going to actually destroy the data that's in it.
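As a rough illustration of the voxel idea described above, where each voxel carries information in its grating orientation and its depth, the following Python sketch maps bytes to (orientation, depth) pairs and back. The level counts (4 orientations, 2 depths, so 3 bits per voxel) and the function names are assumptions made purely for illustration; the conversation does not say how many levels Project Silica actually uses or how it lays out data.

```python
# Hypothetical parameters: the transcript gives no actual level counts.
ORIENTATIONS = 4      # assumed number of distinguishable grating angles
DEPTHS = 2            # assumed number of distinguishable voxel depths
BITS_PER_VOXEL = 3    # log2(ORIENTATIONS * DEPTHS)


def encode(data: bytes) -> list[tuple[int, int]]:
    """Turn bytes into (orientation, depth) pairs, 3 bits per voxel."""
    bits = "".join(f"{b:08b}" for b in data)
    bits += "0" * (-len(bits) % BITS_PER_VOXEL)  # pad to a whole voxel
    voxels = []
    for i in range(0, len(bits), BITS_PER_VOXEL):
        symbol = int(bits[i:i + BITS_PER_VOXEL], 2)   # value 0..7
        voxels.append(divmod(symbol, DEPTHS))         # (orientation, depth)
    return voxels


def decode(voxels: list[tuple[int, int]], length: int) -> bytes:
    """Rebuild the first `length` bytes from (orientation, depth) pairs."""
    bits = "".join(
        f"{o * DEPTHS + d:0{BITS_PER_VOXEL}b}" for o, d in voxels
    )
    return bytes(int(bits[i:i + 8], 2) for i in range(0, length * 8, 8))


if __name__ == "__main__":
    message = b"Zero Day"
    voxels = encode(message)
    print(len(voxels), "voxels")
    print(decode(voxels, len(message)))
```

A real system would add error correction and a layer and position layout on top of this; the sketch only shows how two controllable physical properties can jointly carry several bits.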
Starting point is 00:27:39 That's one of the things that's really quite exciting about it is that this fact that you don't need to ever really check it. If the data's there when you've written it and you can read it out after you've written it, in 100 years' time, you'll be able to read it out again. All right. Does that make sense? Well, to someone it does. Mark, does it make sense to me? Okay, that's all right. Well, okay. So let's talk a bit about compelling applications for the kind of data that you would want to put here. When I talked to you before, you referred to three kinds of storage, hot, warm, and cold.
Starting point is 00:28:14 Tell us a little bit about the different needs for storage and how this plays into that. Sure. So, you know, hot data is data that you're accessing all the time. And, you know, if you think of things like email in your inbox, that's the email that's arriving today. You're looking at it, you're probably looking at it on a lot of different devices. That's the data which is hot. As data gets older, it sort of becomes less hot often. And so it becomes sort of warm. So you could think of that as the emails you perhaps received last week or last month. And then you've got the stuff that you perhaps, you know, never look at. Your emails from four or five years ago, which you just rarely have ever looked at. And that's really cold
Starting point is 00:28:49 data. And there's all sorts of things which end up being cold. And with the glass stuff, we're really looking at the cold data, the data which basically is just sat there. And it's only read when it's really needed. Examples of this are sort of backups of your photos and things like this. Are you enabling me to be a hoarder, a digital data hoarder? Yeah, I mean, that's the idea. I mean, if you think of it at the moment, right, actually, you know, one of the challenges is storage costs, right? So you only store the data that you can really afford to keep, right? As you reduce the cost of storing data, you can store more and more of it because you can afford to store data that you think is cheap. And I have this example, which is, you know, probably five years ago, when you took
Starting point is 00:29:28 pictures on your phone, you would go through and edit those pictures and decide which ones to keep and which ones to throw away. Today, you probably just don't bother. You just keep them all. I've got 32,000 pictures in the cloud. There you go. That's the benefit of cheap storage. Or the detriment. But, you know, that's a lot of the thing about this data that we keep. Sometimes we don't know the value of it when it's generated. It's in the future that you suddenly discover, you know, that something was valuable and you just got rid of it because at that moment in time,
Starting point is 00:29:55 you didn't realize it was going to be valuable. Right, right, right. Well, Mark, talk about some compelling applications from an Azure perspective. Why would we want phenomenal storage power in this itty-bitty living space? Well, like Ant was saying, I think there's lots of applications where you've got data that you're generating that you never know when you'll need to go back and look at it. So archival data, data that's being generated from the ambient environment or from systems that we use or from business processes, that can be potentially interesting as we develop new ways to get insights out of data. So if you take a look at, for example, even the U.S. Census, where they're constantly going back, there's demographers that are looking back historically to see trends and try to identify patterns in the past that
Starting point is 00:30:40 might be applicable in the future. There's huge amounts of data that they typically have had to throw away just because we can't store it all and we can't process it even if we could store it. Now with the cloud scale and these kinds of technologies, you can keep all of that and you don't know what's going to be useful or not. In the future, at some point, we might come across something, some question or some new way of looking at data that we can go back now and get insights and use out of that previous data. But one of the most exciting areas that we've seen immediate interest in is from media and entertainment. Oh, interesting. Because if you take a look at these companies, their total IP is the content that they generate. Right. And so they want to preserve that. In many cases, they take a piece of content they've generated and they use that in multiple ways in the future.
Starting point is 00:31:28 And they don't know how they'll want to use it in the future. So they keep it forever because it might be eventually valuable. If you take a look at, for example, movies, they'll store movies. And when a new type of format comes out like 4K or 8K or Blu-ray or whatever, they'll re-release the movie in that new media format. Or they'll take clips from old movies and use them in new productions. So they keep everything. Now, the only way that they trust the storage of this media, because of some of the issues that Ant's talked about with things like flash and hard disk, is they'll take it and they'll put it in analog form on film, even if it was digitally
Starting point is 00:32:06 mastered. And they do that because film actually has better properties in terms of degradation than some of these digital media. They'll take it, put it on film in black and white, and then have other strips of films in the color separations. They'll roll them up into canisters, make two copies and put them in salt mines. And whenever they need them, they'll go back and read them out. So wait, that's the state of the art right now, is going back to film. Yep. And I mean, you can already think about the kind of costs and issues with that.
Starting point is 00:32:40 This is because media that lasts for a really long time just doesn't exist. I mean, if you think of all the media we talked about today, hard disk drives, tape, flash, you really wouldn't trust any of them for more than about 15 to 20 years at most, right? Well, and even film. Yeah. I think we may have thought at some point we need to get film onto something else because it degrades. Yeah. I mean, so they vacuum seal these.
Starting point is 00:33:00 They put them in these very cold environments with no light. And so they do last longer than, you know, film that's lying around. But still, that's the only media that they actually trust with this kind of long-term data. So when they heard about this, because we've talked to someone, they immediately got excited, because now they can write it once, they never have to go back and refresh it, and it's 100% durable. And the cost that we imagine here is low, because the crystal glass is incredibly cheap. Cost is not actually a big factor for them. It's really the other factors. So you could put Gone with the Wind on a piece of glass and then use it as a trivet if you'd like. No, I'm kidding. Yeah, it's actually fun. Ant's lab put one of my novels on a
Starting point is 00:33:42 piece of glass, Zero Day. So now I've got that preserved for millions of years. Okay, so that actually opens up a couple questions. One being, this has application for any kind of cultural artifacts that we might want to keep, literature, film, music, etc. Can we talk a little bit about how you get to it? I mean, I know I have, like I say, 32,000 photos. Part of my problem isn't that I have them. It's that I can't find what I'm looking for. Yeah, well, that's a research problem that we haven't been able to crack down on.
Starting point is 00:34:14 It's one of those foundational, fundamental questions. We're ready to solve the actual storing of the bits. How do you actually find those bits? That's up to you. You haven't sorted through your stuff. Well, no, seriously. How do you address access? Well, I think, you know, increasingly there's sort of like machine learning technologies
Starting point is 00:34:32 that are able to take images and recognize things in them. And so you can do search and things like that over your image sets. So, you know, show me pictures of, you know, person X outside. And that's actually what they would do with movies, for example. They'll have machine learning algorithms run on the media and say character X is at minute 43 to minute 44, 33 seconds. And that goes into an index. So when they say, I want clips of this person, they can go quickly find them.
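The indexing step described here can be sketched in a few lines of Python: detections from some recognition model go into an index keyed by character, and a clip request becomes a lookup rather than a scan of the whole archive. The names `Appearance`, `build_index`, and `find_clips`, and the shape of the detection tuples, are hypothetical; the conversation does not describe the actual systems media companies use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Appearance:
    """One time span in which a character is on screen (hypothetical schema)."""
    title: str
    start_s: int  # start offset in seconds
    end_s: int    # end offset in seconds


def build_index(detections):
    """Group (character, title, start_s, end_s) detections by character."""
    index = {}
    for character, title, start_s, end_s in detections:
        index.setdefault(character, []).append(Appearance(title, start_s, end_s))
    return index


def find_clips(index, character):
    """Return every indexed appearance of a character, ordered by title and time."""
    return sorted(index.get(character, []), key=lambda a: (a.title, a.start_s))


if __name__ == "__main__":
    # "Character X is at minute 43 to minute 44, 33 seconds."
    detections = [("character X", "Film A", 43 * 60, 44 * 60 + 33)]
    index = build_index(detections)
    for clip in find_clips(index, "character X"):
        print(clip.title, clip.start_s, clip.end_s)
```

Only the small index needs to sit on fast storage; the footage itself can stay on cold media until a specific span is requested.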
Starting point is 00:35:00 Brilliant. Yeah. Because the thing is, a lot of media is historical media, right? And so it's like some event happens, somebody dies or something happens, and you want to find things relevant to that. And you don't want to pull back the whole archive just to find two hours of video out of a thousand hours or something. What about news organizations?
Starting point is 00:35:20 Is that another area? Absolutely. I mean, what Ant was just referring to, historical footage. You know, when a politician stands up and says, I never said that, they can just quickly go back to the index and very quickly say, yeah, you actually said it right here. Listen, this whole thing is a Black Mirror episode anyway, because it's like that, you know, rewind thing. Well, we're quickly moving from a terabyte
Starting point is 00:35:45 and petabyte era to an exabyte, zettabyte, yottabyte era. But let's just say the cloud is a data beast, and it needs to be taken care of. So not only do we need storage, we need other aspects to make sure that things work. And we've just alluded to access. What about compute and networking? What do we need there and what's being worked on? Well, so there are, I think, a few things. If you take a look at processing data and doing it very efficiently, you want to get the data to the compute very quickly.
Starting point is 00:36:19 So looking at reducing that latency, whether that means putting the compute right next to the data or having very low latency connections between the data and the compute. And then large-scale parallelized compute, because oftentimes being able to efficiently process data requires more than just one computer, one CPU; it requires a fleet of them. So being able to have very low latency, high bandwidth interconnections between the computers themselves. Large AI model training is an example of that kind of workload that requires these kinds of very low latency, high bandwidth connections between the different computers. And lots of the larger machine learning models don't fit on a single system or even a dozen.
Starting point is 00:37:10 They require hundreds or thousands of them. For example, our OpenAI announcement that we made about building an AI supercomputer with them, that's going to have these characteristics of having large-scale data, plus compute, plus networking, all working together. Yeah, we're doing some of the work on optical networking in the lab in Cambridge in Sirius, and that's looking at how to build out these all-optical networks for these kinds of things, which is all about trying to reduce the latency and make the networks effectively higher bandwidth, lower latency, lower power, which is lower cost, and higher density too, which is also lower cost. Yeah. So you referred to that as Sirius. Yes, Sirius. Project Sirius is another one of the projects.
Starting point is 00:37:45 Not like, are you serious? But Sirius. Sirius Black. Yeah. I think somebody from Azure probably said, are you serious? That's ridiculous. Well, it's about now that I like to have a short conversation on the spectrum of risks associated with people's work. I call this the what keeps you up at night section. What could possibly go wrong with anything that the two of you are working on collectively in this area? So one of the things that keeps me up at night, particularly with our storage work,
Starting point is 00:38:31 is if you have a storage media, you expect it to store things. And we're making decisions today, and we're making design choices today that potentially in 50 or 100 years' time, we'll want to make sure that they can still get the data out, even in 1,000 years' time or 10,000 years' time. I often think, are the decisions we're making today the right decisions to enable that to happen? And, you know, if people store really important things like movies or their life's photos in it, you want to make sure that it's really there when you say it will be there. And that keeps me awake a lot of the time. It's a big responsibility
Starting point is 00:39:00 in my view. You, Mark? I have high confidence in Ant's team, and the partnership we've got going with them gives me high confidence that we're actually going to get there on this one. I think what keeps me up at night in a space like this in general is somebody else coming up with something that disrupts the space and leaves us behind. You mentioned it's a competitive environment. And so this is another reason why it's so important for us to be exploring these different areas that initially look like they have little chance of success, but if they do succeed, would massively transform things. And this is one of those examples. So Mark, you actually have a unique story
Starting point is 00:39:39 about how you came to Microsoft, but people can find that story in Wired magazine. So I want to just ask Ant here about his path to Microsoft, Microsoft Research, and if there's anything, Ant, that you have had happen in your life that's informed or influenced your career that landed you here doing the work you're doing. So I moved to Microsoft about 20 years ago, and before that, I was at the University of Cambridge as a research associate. And one of the really interesting things was I worked with a guy called Andy Hopper there. And I finished my PhD and went to work with him.
Starting point is 00:40:15 And as many academics do, or as many young PhD students becoming academics do, I just sort of worked on the next thing that was falling off from my thesis. And he came to my door and said to me, Ant, I want you to build me a robot football team to compete in the 1998 RoboCup in Paris, which is when the World Cup in Paris was. I sort of ignored it, carried on, and a few weeks later he came back and he said, you know, Ant, I really want you to build me a robot football team. So I built a robot football team, and they played on a table tennis table. And there were like five robots, and they were all autonomous.
Starting point is 00:40:48 So you sort of press go, and they had to play football. I still have the robot on my shelf in the office. But, you know, the one thing that it did for me was that it really just taught me that I could dive in. I mean, I used to work on digital systems and programming languages and where they met. And I was ending up building robots and working with a lot of other people building robots and building systems like that. Throughout my career at Microsoft, that's given me the confidence to just dive in and do other things. And I think that's a moment for me that's had a big influence on my career. As we close, there are 7 billion people on the planet, 4 billion of them are online
Starting point is 00:41:20 and computing is moving to the cloud. Ant and Mark, any parting shots? What would you like to say to our listeners in the form of advice or inspiration, wisdom or warning, or just plain predicting the future? What's on the horizon from each of your perspectives? I don't think I'd give any warning. I'd give encouragement. Again, it's an incredible opportunity
Starting point is 00:41:38 to really shape the future. And it's not just moving to the cloud. I think the important aspect to this is the edge as well, because computing is getting more organized on the edge, and you'll see more of it on the edge than what we've seen up to now, which is why Microsoft's focused on intelligent cloud, intelligent edge. So that's a huge part of our charter, and Azure, too, is looking at that space as well. And there's a bunch of problems and new technologies that need to be developed there.
Starting point is 00:42:03 And how about you? As I said earlier about the technologies and existing technologies getting to the end of their lifetimes, or at least getting to the stage where it's not so easy just to get the next cycle out of them, I think it's just a huge opportunity to change things. And I think as a researcher, anywhere where there's a sort of huge opportunity for things to be done differently, it's just really exciting. This has been incredibly fun. Thank you, Ant Rowstron and Mark Russinovich, for joining us in the booth today. Thank you. Thanks.
Starting point is 00:42:41 To learn more about Dr. Ant Rowstron and Mark Russinovich and how Microsoft is working to meet all your digital storage needs, visit Microsoft.com slash research.
