Microsoft Research Podcast - 097- Optics for the cloud: storage in the zettabyte era with Dr. Ant Rowstron and Mark Russinovich
Episode Date: November 6, 2019
Remember when a hard drive that could hold a terabyte of data was a big deal? Well, we’re now in an era where peta-, exa- and even zetta-bytes are the bytes of the day, and it turns out it’s hard to fit that many zeroes on a hard drive. That’s where Dr. Ant Rowstron, Deputy Lab Director of Microsoft Research Cambridge, and Mark Russinovich, Chief Technical Officer of Azure, come in. Their respective teams are working on paradigm-breaking solutions to give us phenomenal storage power in an itty-bitty living space. Today, Ant and Mark discuss their roles in the development of new optical technologies, like Project Silica, for cloud-scale storage demands, and talk about the Optics for the Cloud Research Alliance, an exciting new collaboration between academic researchers and MSR. They also explain how just the right mix of innovation and engineering can make the cloud more powerful and less expensive to use and, at the same time, deliver “forever” storage that’s both dishwasher and microwave safe! https://www.microsoft.com/research
Transcript
Welcome to another two guest episode of
the Microsoft Research Podcast,
where we're exploring the latest in
disruptive technologies for data storage in the Cloud.
In an era in which we're pushing
the very limits of our current storage technologies,
who better than the CTO of Azure and
the Deputy Lab Director of MSR Cambridge to give us
the lowdown on the promise of new technologies like
quartz glass for storing and protecting all things digital.
You're listening to the Microsoft Research Podcast, a show that brings you closer to
the cutting edge of technology research and the scientists behind it.
I'm your host, Gretchen Huizinga.
Remember when a hard drive that could hold a terabyte of data was a big deal?
Well, we're now in an era where peta, exa, and even zettabytes are the bytes of the day.
And it turns out, it's hard to fit that many zeros on a hard drive.
That's where Dr. Ant Rowstron, Deputy Lab Director of Microsoft Research Cambridge,
and Mark Russinovich, Chief Technical Officer of Azure, come in.
Their respective teams are working on paradigm-breaking solutions to give us phenomenal storage power in an itty-bitty living space.
Today, Ant and Mark discuss their roles in the development of new optical technologies
like Project Silica for cloud-scale storage demands,
and talk about
the Optics for the Cloud Research Alliance, an exciting new collaboration between academic
researchers and MSR.
They also explain how just the right mix of innovation and engineering can make the cloud
more powerful and less expensive to use, and at the same time, deliver forever storage
that's both dishwasher and microwave safe.
That and much more on this episode of the Microsoft Research Podcast.
The word excited is overused, but there's really not a better word for how I feel about the guests I have in the booth today.
First, we have all the way from Cambridge, UK, Dr. Ant Rowstron, who's the Deputy Lab Director of Microsoft Research Cambridge.
And with him, I have Mark Russinovich, who is the Chief Technical Officer of Microsoft Azure.
Welcome, Ant and Mark. Mark and Ant.
Thank you. Thanks.
Your titles give us a picture of who you are at Microsoft, but they don't really say what you do.
And I'd like each of you to give us a little more detail on what gets you up in the morning.
Maybe a sort of elevator pitch on what you do and why you do it. Ant,
why don't you give us a start here?
Okay. So, you know, basically, as Deputy Lab Director in Cambridge, I come in every day
and I get to work on cloud infrastructure and future technologies for the cloud.
And it's really fun. We're doing a lot of stuff. We're using a lot of high-tech stuff. We're doing,
you know, a lot of stuff around optics, femtosecond lasers and things like that. And so,
yeah, I come in every morning and I get to play with the greatest tech and try to create technologies
that will help the cloud in the future.
As a deputy lab director,
are you doing research yourself
in addition to sort of overseeing the lab work?
Yeah, so I try to spend some of my time each week
actually doing research as well.
And I'm very involved in a small number of the projects
and I help oversee a number of the other projects
as well in the group.
As some people say, to keep your research muscle strong.
Yeah, that's right. That's right.
It's the most exciting thing. It's the bit I really enjoy, coming in and doing stuff.
Mark, what do you do?
So my role is leading technical strategy and architecture for Azure, our cloud platform.
So every day is completely different.
It could be meeting with customers, meeting with internal partner teams, meeting with engineers deep on teams, writing papers, reading papers, meeting with analysts. I mean, every day
is a new adventure. So as chief technical officer, how technical do you get to get into the weeds?
I get into the weeds. Yeah, that's the fun part. So I think that that is a really great part of
the job is that I can be at a very high level like strategy,
like directions, trends, and then get down into implementation details. In fact, I'm involved
with a couple of the projects at MSR Cambridge at a very deep technical level. Well, perhaps to
some of our listeners, you aren't obvious booth mates. So let's explain why you're both here in
the booth together.
There's an interesting story of how Azure met Optics during a disruptive technology review,
where researchers present the work they're doing to senior leadership team members.
How did that go down? Ant, why don't you go first? Okay. So, I mean, you know, it's really interesting. We've been working with the
Azure storage guys for some time on some storage projects,
looking at trying to use traditional technologies to sort of store long-term data.
And it had got to the stage where I think, you know, both sides had realized that we'd pushed the existing technologies to the limits,
and both sides were looking around for new technologies.
And it happened that someone in the Azure Storage side and us both saw a sort of a newsfeed article about a breakthrough at Southampton and how to store data in glass.
And we both approached from very different angles.
The Azure storage side, that person flew over to Southampton and had a look at it.
We were there downloading the papers, trying to read them, trying to understand them.
As computer scientists, it was quite hard.
And then we started to talk about how the technology could be taken forwards and,
trying to convince the company that this was going to be something worth doing,
we started getting it ready for this disruptive technology review.
We reached out to Mark to help us because it was sort of key to get both sides really bought in
to what was happening. And then we got the opportunity to present it to the SLT, and Jason was there and you were there, and
that really led to the point where people said, hey, let's make it work together.
Let's work collaboratively between Microsoft Research
and Azure Storage to get it forward.
And so that's how Mark got involved
and how it sort of got going.
Yeah, in fact, I mean, I remember,
so meeting with Ant before this,
where I got my first overview of what the project
was all about and the promise of it,
the disruptive technology review happened,
the SLT was like, this looks like really exciting stuff,
but that alone isn't enough to actually cause things
to go forward.
What happened after that is that me and Douglas Phillips,
who was in charge of storage,
and Jason Zander, head of Azure,
met together to talk about how we wanted to take it forward.
And we'd kind of estimated with MSR Cambridge
the resources required and the amount of money required
to actually get the thing moving to a point
where it could potentially intersect
with production in Azure at some point.
And that was the go big or go home.
Either we had to commit and jointly fund it
with MSR Cambridge or it just wasn't gonna happen
and then we decided to do that.
You know, at the time, we didn't really have any people
who worked in the space of physicists or in the space of optics so it really was a decision point are we
actually going to start doing this and if we are we've got to go out and hire new people with new
skills that we don't have traditionally in MSR and bring them on board so it was a big decision to do
it and thankfully we made it. Mark, Microsoft wasn't first to the cloud, but it's a big player now, and you're at the technical epicenter of Azure.
How have you seen the cloud computing landscape evolve over the time you've been here, both from an industry point of view and from Microsoft's point of view?
Just in the landscape of what cloud was and what it is now, it's been a fascinating journey.
Because when I started in Azure in 2010, Azure had been underway for a few years, just commercially launched. But Azure was tiny, basically in
two data centers, a few thousand servers. And the number of companies that were saying
they were going to do cloud was probably 20, 25 companies. And about four years ago, there
were still about 17. Three years ago, we were down to 13.
Two years ago, it was down to six.
And there's still six in what Gartner classifies as their magic quadrant for infrastructure as a service.
And I think there'll be more paring down to come.
And Azure has been continuously in this strong position, rising, rising.
And a couple of years ago, the market and analysts
started to recognize us as the strong number two in the space.
What are the factors that go into making you a player in this field?
It's a huge amount of capital. It has to be a core business priority for the company.
Yeah.
And expertise from hardware data centers all the way up into software
stacks and massive scale distributed systems. And also time is another factor because this stuff
all needs to mature. Yeah. I think it's hard to underestimate the scale of this, right? About six,
seven years ago, I was lucky enough to go to the Dublin data center, which is, I guess, by modern
standards, a big one, but not the biggest. And in fact, then it was still being built out as it were. And we got there and you
sort of walked for just mile after mile or kilometre after kilometre, depending which
country you're in. And you just saw servers and it's just room after room after room. And each
time you go in it, it is just thousands and thousands and thousands of servers. It was just
a moment where, you know, it actually changed the way that I thought about research
because there was no point in trying to do the stuff
that we were trying to do
because when you realize how it operates,
and in that scale, you're just wheeling in a rack.
You don't care about it.
They're just delivering racks,
and they have, you know, there's sort of a production line.
You know, racks roll in at one end.
It comes through a center.
It's rolled in, plugged in, a few cables, power.
And just the scale was just phenomenal. And I think unless you've actually been to one of those data
centers, it's often hard to sort of appreciate. The ironic thing is that there's many people
working in Azure that have never been to a data center, so they have not experienced it.
They should go.
Yeah, I know they should.
It just reminds me of that final scene of Indiana Jones, where they put the Ark of the Covenant in
an army storage house, and then the camera pans back, and it's like, you'll never find it.
Yeah.
Yeah.
In fact, you have this great photo, I think, that you show sometimes, which shows
sort of like a data center from the front, and you go, this is like the size of a football
pitch.
And then you show it from the other side, and it's like 10 times bigger, and you say,
and this is what we're currently building out to extend that one data center.
It's just unbelievable.
I love that Ant says football pitch
and he probably means soccer field.
Soccer field, yeah.
I translated that.
I translated that.
All right, well, let's talk about where and how we store our digital stuff, because this is a huge part of what we're talking about here. We have storage units, and I put air quotes around that
right now, for our digital stuff, but they weren't designed with the cloud in mind. And so, Ant,
what are we up against with cloud-sized demands on current technologies?
Is Moore's Law really dead?
And what are researchers exploring in optical technologies to help us out?
If you think of most of the technologies we use to store data today,
things like flash, things like hard disk drives, things like tape,
it's true to say that they were all designed before the cloud existed.
And in fact, they were all designed to work in multiple scenarios. You know, the flash chips have to work in your phone as well as in the data centers,
or the hard disk drives have to work in the desktop machines as well as in data centers. So
there's compromises that have been made. And in fact, many of those technologies, if you think
of a sort of S-curve where you've got time and you've got the metric of interest, you get a new
technology comes along like a hard disk drive. It starts off very slowly, it then hits its sort of main speed, and it's sort of going nicely,
it's doubling every two years or three years if you're a hard disk drive. And then you get to the
point where it actually stops just basically exploiting the property they were exploiting,
and it rolls over and it becomes flat. I think one of the challenges for us as a company and
us as an industry is that many of the technologies that we rely on are beginning to get to the point where either they're at the end or they're starting to get to the point
where you can see the end. And of course, Moore's law is a well-publicized one and we hit it some
time ago. And that's a great opportunity because whenever you get that rollover, you get an
opportunity to better do things differently, to have new ways of doing things. I mean,
we used FPGAs a lot and this was something that we used to actually allow us to do stuff
on the computational side. On the storage side, we started to look at these optical technologies and
to see how we might exploit things and new discoveries in order to create new storage
technologies and new storage media that we could use when the traditional ones are reaching the
end of their life cycle. Right. So I want to drill in just a little bit there, because when we talk
about technologies that are invented before something else happens, now you have an opportunity with optics to build from the ground up. Does that
affect the way you think about designing things? Does it change your framework?
Yeah, I think it does. I mean, I think it goes back to this point about going around a data
center. When you actually see the scale of a data center, you start to think about things very
differently. You start to think about what does it mean to actually make something that works in that environment.
And those trade-offs that you can make,
perhaps it's, you know, something can be a bit bigger.
It doesn't need to be a particular size.
It can be long and thin.
It can have different dimensions
because that's not the dominating factor
is to get it under a desk as well as in the data center.
Right, right, right.
Mark, do you have anything to add to that?
Well, I think kind of related to the scale that Azure is going towards,
I think we're going to talk about the scale of the data that's coming to the cloud
and being generated in the cloud, and that's the other dimension here.
Once you get to that kind of scale, small optimizations, much less big ones,
have massive impact on cost of storage, cost of keeping the media fresh.
I mean, when we take a look at the amount of data
that's being generated in the cloud and the way that that data is managed by existing technologies,
which have limited lifetimes, and we're talking about here in the cloud, there's going to be data
that's going to be persisted for decades or hundreds of years. And media like Flash needs
to be constantly scrubbed. Tape needs to be refreshed periodically. And if you
think about like tape, archival media, petabytes, exabytes, zettabytes of it having to be constantly
read and rewritten, that's part of the cost model in these existing technologies.
Mark, Azure has, to use a poker term, kind of gone all in on optical technologies and become a leader in this arena,
both in thought and deed, as it were. So why is optics important to Azure? And where are these leading edge technologies in the product pipeline for you guys? And how does Microsoft compare
with other players in the cloud market space as it pertains to optical technologies?
As far as I'm aware, we're the only cloud that's pursuing this kind of technology. Now,
others might be pursuing it and being quiet about it, but I think that that's
a differentiation of Microsoft and Microsoft Research with regard to exploring these things,
and that is that we openly talk about it. And I think that does a few things. One is
attracts researchers and engineers that are interested in working on cutting-edge technology
when they get excited about being able to really disrupt the future by working on a project like
that. The other one is that it shows our customers that we're a cloud that's constantly innovating
and exploring new ideas and looking at ways we can make the cloud and their usage of the cloud
much more cost-efficient and powerful. So both of
those I think are reasons why we are so transparent about the things we work on.
Now as far as where it is in the pipeline, we don't know exactly when this
is going to land in a way that it works at scale in the environments that we're
going to be deploying it in. There's still a lot of questions to be answered.
There's some fundamental questions that Ant's still exploring the answers to.
And I think that once we get the answers to the fundamental questions, that's when you start to engineer for production and get it to the point where you can stamp these things out and deploy them around the world.
One of the great things is working so closely with Azure means that as we do answer more of these questions and we get more of a handle on it, we're actually answering the questions that make it more feasible for it to be deployed in Azure. And I think that's one of
these things about working in such close collaboration with the teams that are actually
trying to deploy storage today and trying to use storage today. It's a really powerful thing. You
get that feedback loop going, you get people actually sort of able to consume the technology,
which is often very hard for a new technology. And that allows you to zoom in on the answer,
an optimal answer for what comes up.
What are those fundamental questions that you still haven't answered, Ant, and why haven't you?
What's taking you so long?
That's right.
Dang it.
So, you know, we're at the stage where we're still trying to really control the technology.
We're still trying to understand how to improve the performance and the way in which it operates.
When we first saw the technology, you walked into a room in Southampton,
and they were effectively writing a few bits per second, reading a few bits per second.
And you've got to get something that is able to do hundreds of megabytes per second.
There's a long way to go.
And to get there, it's a mixture of engineering and innovation.
So some of the problems you engineer out, other problems you have to innovate around.
And we've been quite successful at innovating around some things.
We've also been quite successful at engineering things.
But we've still got a long way to go, I think, to make it really happen.
But we can see the path, which is always good.
Right. I mean, that's half the battle, is seeing where you're going.
Well, let's talk for a second about the balance between pushing the limits of existing technologies
and exploiting recent discoveries in new media like glass. We'll go deep into Project Silica in a
bit, but first talk briefly about the current state of the art in storage
technologies along with some other explorations across the labs in other
media. Today we're using tape, we're using hard disk drives and we use flash in the
cloud and you know a few years ago the team in Cambridge worked very hard at trying to build out the lowest possible cost storage using hard disk drives.
And, you know, we went to extreme lengths.
In a rack, normally you have a couple of hundred fans.
We built a rack with just three fans.
You have sort of 10 kilowatts going in to power it.
We got it down to a few kilowatts.
And I think the thing that you realize when you do this sort of thing is, as you push the envelope more and more and more, you just hit the limitations. And in fact, you know,
we were sort of getting to the stage where you pick up a hard disk drive and think, I wish I
could crack it open and start to change things inside it, because I've done everything else I
can do outside of the hard disk drive. And I think this is a challenge in things like Azure: you get
components coming in, and some of them you can really change, some of them are more fixed. And
one of the things I think about this sort of state-of-the-art stuff and picking new media is it's really about then
going to the next level. You know, we talk about this vertical sort of integration, going from
really the bottom layer up, whether it's glass, whether it's DNA. Well, what if you can start from
nothing and work up the stack and not have these kind of constraints of what is currently done,
right, or boundaries, often artificial boundaries,
about what you can and can't change?
The fascinating thing about some of these things, to me,
and this really highlights the power of MSR,
is that we're very much heads down in Azure,
looking really kind of a short term,
maybe a little bit longer term in some cases,
but really focused on things that look like
we've got to really nail to be successful.
Meanwhile, MSR Cambridge has a lot more flexibility and freedom to go
and just explore ideas that sometimes look kind of out there.
Well, and that actually brings up my next topic,
which is the unique relationship between research and product teams here at Microsoft.
And I would love to know from each of your perspectives
how important this partnership is in the cloud era.
According to many of my guests who sat in your chairs,
this is one of those historical turning points
like the Xerox PARC people defined computing
for the next 60 years back in the 70s.
How are research and product working together on this?
And what do each of you get out of the relationship?
Like I just was talking about, the fascinating and really exciting part of MSR is that they can go start to explore these things.
Even Ant was exploring glass before we had the opportunity to tell him, this looks ridiculous, stop looking at it.
You know, he got it to the point where we could see the potential value and promise of it.
But you're right about this being a turning point.
This is a massive shift in technology that's happening right now,
going from the client-server era really into the cloud era and the cloud plus edge era.
And we're defining what that world looks like right now.
The systems we define, the services we make, the APIs we create,
the technology that we use to power it,
all of it is open for disruption, for being new. And some of it has to be new just because it
doesn't exist yet. So like Ant was talking about earlier, the opportunity to have this incredible
impact on the future with something like glass. Potentially, you know, if this plays out in two decades,
everybody will be like, oh, yeah, of course, glass.
And that's just the way that you store data long-term is glass.
Right now, nobody's thinking about it,
but that's incredibly exciting, fun.
Yeah, I mean, I think one of the great things
about where the cloud is at the moment
is that you guys are so innovative,
and you are so hungry for innovation, right?
I've been at Microsoft for 20 years,
and it's rarely that I've worked with product groups that are so open.
And I think it's because it's a competitive world out there.
I mean, it's like, you know.
I've heard that.
Yeah.
And, you know, anything you can do to innovate makes sense.
So I think from a research perspective, it's really great
because you get to work with people, they get the real-world experience,
and they tell you about the things that are working and are not working. And then the flip side is, you know, within
MSR, we get the space to fail. And I think often when you're in a product group, people
want to work on things that ship and are successful. And so, you know, in MSR, we can really do
high risk stuff. And failure is almost as good as making it work because at least you
tried to make it work.
And you learn something.
And you learn something. And that's exciting. And I think that's something that's really
great about this whole sort of MSR relationship
with product groups
is that we get that space
and yet we get that input
that allows us to really make sure
that when we do get to the end,
if it has worked,
we can get it over.
So how often does the phrase
that's ridiculous
preface a breakthrough?
Maybe all the time.
It's probably, yeah.
In the history of science and engineering, a lot.
Yeah.
Let's talk about this new program called the Optics for the Cloud Research Alliance.
It's a kind of big consortium of researchers, academia, MSR is obviously part of it.
Why do we need this and what are its goals?
Within the team in Cambridge, we started to do a lot of stuff looking at, you know,
broadly in different ways, optics for the cloud. If you think of it today, optics are used in the cloud.
We use it to replace copper wires. We have a fiber optic cable and you use photons rather than
electrons. But, you know, the question was, could we do more things? Can we make all optical networks?
Can we build all optical compute? Can we do optical storage? Now, as MSR, we don't have a lot of people with that kind of
experience. So one of the great things we seemed to do was to put together this research alliance
to go out and find universities that we can partner with. We can talk to them. We can tell
them about some of the problems we're having in the cloud and let them think about the technologies
they're currently working on, how they relate to that. It allows us to have conversations and it
allows us to sort of build a community thinking around what this optics for the cloud could look like.
And I mean, one of the interesting things is, you know, people often say, well, why did it start in the UK?
And we started it in the UK.
And I think, you know, historically, we had companies like British Telecom, who were the telephone company in the UK, who invested a lot in optics, particularly in the area of sort of optical networking and things like that. So we have these great places like Southampton University, like UCL, teams full of people with a huge amount of experience of using
optics in different ways. And in fact, the silica stuff, you know, it started at the University of
Southampton. You know, they had this breakthrough of how to store data in glass. And we've been
able to sort of help grow that, help expand that. It's a very sort of powerful way to partner with
these people and to collaborate. So, Mark, does it percolate up to you? I mean, this is a research alliance with a bunch of
academics and you're over here in Azure running the joint. Is it opaque what they're doing,
except as it presents itself to you with new technologies?
No, not completely opaque. I mean, one of the ways that Azure and MSR stay very connected
is tight interactions that are constantly ongoing.
So it's not like we drift apart and then MSR shows up at the door and says, hey, we're thinking of this and it's project's been going for three years and can you take it and use it?
We realized a long time ago that we needed to start engaging very early in these things.
So, in fact, in Azure, I am the champion for the MSR-Azure relationship. As part of that, I run a bi-yearly summit, effectively, between Microsoft Research and Azure,
where Microsoft researchers will come and present on projects that they've started on,
ideally early enough, that we can start to influence them if they look interesting to Azure.
We've got one coming up just actually next week.
This one we're actually doing a little
bit differently. And at the previous ones, I've stood up to the researchers and told them a little
bit about what Azure's challenges are and what our priorities are. At this one, we're bringing in the
heads of the Azure engineering groups to each talk about their own areas. So they'll get a deeper view
of what Azure is looking at. And this might hopefully, you know, spur them to say, wait a
minute, I'm passionate about this technology.
This is a problem that Azure has.
Maybe we can apply it there, and then that'll start
the formation of a new project.
And of course, some of the leaders from Azure
were in MSR, like Albert Greenberg and David Maltz
and people like this.
So there's quite a cross-pollination of people as well,
which helps that communication channel going,
because, you know, you're talking to someone that you,
you know, 10 years ago, you knew because they were in the office down the
corridor from you. And now they're over in Azure. You don't just stop communicating because you
change, you know, it's good. Let's talk about Project Silica or Azure on Glass, as it were.
And you've referred to this as the journey from metal to glass. I don't know if that's a rock and roll thing or what. Let's go
deep. Tell us all about Project Silica. It's not incremental. It's disruptive. It's brand new,
ground up, clean slate technologies. Tell us about it, how it came about, how it works and why it's
cool. So how did it come around? I guess, you know, you've got a piece of glass, you fire a
femtosecond laser into the piece of
glass and you focus that laser in a particular point in the piece of glass. And what happens
is the structure forms a nanograting and the structure has a sort of orientation to it and
a depth and you can control the orientation and the depth. That was sort of the discovery at
Southampton that you could fire these in, create these nanogratings and then control them and
therefore encode potentially data in them.
And you can think of this piece of glass as having layers of data in it.
So it's a sort of three-dimensional type of storage.
So we've been working very hard to work out the number of layers that you can have, how
you can actually read the data out from these layers, and how you can control the laser
in order to be able to actually form the voxels that you want with the right properties that
you want. Go in on what a voxel is, because not everyone will know. So it's just a little
structure. You can think of it as an upside down iceberg almost. And it has depth to it,
and you can control that. But it also has a sort of little lines on it, a little grating on it.
And you can control the angle at which that grating is formed. And it's really small. Think
of it, you know, sub micron. And that's how you encode data in the glass. Then you can make lots and lots of them within the
glass. And once you've made that change to the glass, that change is going to exist in the glass
for hundreds of thousands to millions of years. And quartz glass, it's a very pure type of glass.
And once you've made those changes, it doesn't sort of decay. If you think of things like hard
disk drives, you get bit rot, you get bits that flip because of things that happen.
With this, if it's there, it stays there.
You can boil it in water, you can put it in a microwave,
you can put it in an oven,
you can do almost anything you want to it
and you're not going to actually destroy the data that's in it.
That's one of the things that's really quite exciting about it,
this fact that you don't need to ever really check it.
If the data's there when you've written it and you can read it out after you've
written it, in 100 years' time, you'll be able to read it out again.
All right. Does that make sense?
Well, to someone it does. Mark, does it make sense to me?
Okay, that's all right.
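The voxel idea described above (a tiny structure whose grating angle and depth can each be controlled) can be sketched in code. This is only an illustration under stated assumptions: the number of distinguishable angles and depths, the bit packing, and the function names are all hypothetical and are not taken from Project Silica.

```python
# Hypothetical sketch: pack bytes into voxels, each described by
# (angle_index, depth_index). Level counts are assumptions, not real figures.
N_ANGLES = 4        # assumed number of distinguishable grating orientations
N_DEPTHS = 2        # assumed number of distinguishable depth levels
BITS_PER_VOXEL = 3  # log2(N_ANGLES * N_DEPTHS)


def encode(data: bytes) -> list[tuple[int, int]]:
    """Turn bytes into a list of (angle_index, depth_index) voxels."""
    bits = "".join(f"{b:08b}" for b in data)
    bits += "0" * (-len(bits) % BITS_PER_VOXEL)  # pad to a whole voxel
    voxels = []
    for i in range(0, len(bits), BITS_PER_VOXEL):
        symbol = int(bits[i:i + BITS_PER_VOXEL], 2)
        voxels.append(divmod(symbol, N_DEPTHS))  # (angle, depth)
    return voxels


def decode(voxels: list[tuple[int, int]], n_bytes: int) -> bytes:
    """Recover the first n_bytes from a list of voxels."""
    bits = "".join(f"{a * N_DEPTHS + d:0{BITS_PER_VOXEL}b}" for a, d in voxels)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))
```

With these assumed levels each voxel carries three bits, so `decode(encode(b"Zero Day"), 8)` round-trips the original bytes.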
Well, okay. So let's talk a bit about compelling applications for the kind of data that you would want to put here.
When I talked to you before, you referred to three kinds of storage, hot, warm, and cold.
Tell us a little bit about the different needs for storage and how this plays into that.
Sure. So, you know, hot data is data that you're accessing all the time.
And, you know, if you think of things like email in your inbox, that's the email that's arriving today. You're
looking at it, you're probably looking at it on a lot of different devices. That's the data which is hot.
As data gets older, it sort of becomes less hot often. And so it becomes sort of warm. So you
could think of that as the emails you perhaps received last week or last month. And then you've
got the stuff that you perhaps, you know, never look at. Your emails from four or five years ago, which you've just rarely, if ever, looked at. And that's really cold
data. And there's all sorts of things which end up being cold. And with the glass stuff, we're
really looking at the cold data, the data which basically is just sat there. And it's only read
when it's really needed. Examples of this are sort of backups of your photos and things like this.
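The hot, warm, and cold distinction above can be sketched as a simple rule on how long ago data was last accessed. The thresholds below (one day and ninety days) and the function name are assumptions chosen for illustration; the conversation gives only rough examples (today, last week or month, four or five years ago).

```python
from datetime import date, timedelta

# Assumed thresholds, for illustration only.
HOT_MAX_AGE = timedelta(days=1)
WARM_MAX_AGE = timedelta(days=90)


def storage_tier(last_accessed: date, today: date) -> str:
    """Classify data as hot, warm, or cold by time since last access."""
    age = today - last_accessed
    if age <= HOT_MAX_AGE:
        return "hot"
    if age <= WARM_MAX_AGE:
        return "warm"
    return "cold"
```

Under these assumed thresholds, an email read today is hot, one read last week is warm, and one untouched for years is cold, which is the kind of data the glass is aimed at.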
Are you enabling me to be a hoarder, a digital data hoarder?
Yeah, I mean, that's the idea. I mean, if you think of it at the moment, right, actually,
you know, one of the challenges is storage costs, right? So you only store the data that you can
really afford to keep, right? As you reduce the cost of storing data, you can store more and more
of it because you can afford to store data that you think is cheap. And I have this example, which is, you know, probably five years ago, when you took
pictures on your phone, you would go through and edit those pictures and decide which ones to keep
and which ones to throw away. Today, you probably just don't bother. You just keep them all.
I've got 32,000 pictures in the cloud.
There you go. That's the benefit of cheap storage.
Or the detriment.
But, you know, that's a lot of the thing about this data that we keep. Sometimes we don't know
the value of it when it's generated. It's in the future that you suddenly discover, you know,
that something was valuable and you just got rid of it because at that moment in time,
you didn't realize it was going to be valuable.
Right, right, right. Well, Mark, talk about some compelling applications from an Azure perspective.
Why would we want phenomenal storage power in this itty-bitty living space?
Well, like Ant was saying, I think there's lots of applications where you've got data that you're generating that you never know when you'll need to go back and look at it.
So archival data, data that's being generated from the ambient environment or from systems that we use or from business processes, that can
be potentially interesting as we develop new ways to get insights out of data. So if you take a look
at, for example, even the U.S. Census, where they're constantly going back, there's demographers
that are looking back historically to see trends and try to identify patterns in the past that
might be applicable in the future. There's huge amounts of data that they typically have had to throw away just because we can't store it all and we can't process it even if we could store it.
Now with the cloud scale and these kinds of technologies, you can keep all of that and you
don't know what's going to be useful or not. In the future, at some point, we might come across
something, some question or some new way of looking at data that we can go back now and get insights and use out
of that previous data. But one of the most exciting areas that we've seen immediate interest in is
from media and entertainment.
Oh, interesting.
Because if you take a look at these companies,
their total IP is the content that they generate.
Right.
And so they want to preserve that. In many
cases, they take a piece of content they've generated and they use that in multiple ways in the future.
And they don't know how they'll want to use it in the future.
So they keep it forever because it might be eventually valuable.
If you take a look at, for example, movies, they'll store movies.
And when a new type of format comes out like 4K or 8K or Blu-ray or whatever, they'll re-release the movie in that new media format.
Or they'll take clips from old movies and use them in new productions. So they keep everything.
Now, the only way that they trust the storage of this media, because of some of the issues that
Ant's talked about with things like flash and hard disk, is they'll take it and they'll put it
in analog form on film, even if it was digitally
mastered. And they do that because film actually has better properties in terms of degradation than
some of these digital media. They'll take it, put it on film in black and white, and then have
other strips of films in the color separations. They'll roll them up into canisters, make two
copies and put them in salt mines. And whenever they need them, they'll go back and
read them out.
So wait, that's the state of the art right now, is going back to film.
Yep. And I mean, you can already think about the kind of costs and issues with
that.
This is because media that lasts for a really long time just doesn't exist. I mean,
if you think of all the media we talked about today, hard disk drives, tape, flash,
you really wouldn't trust any of them for more than about 15 to 20 years at most, right?
Well, and even film.
Yeah.
I think we may have thought at some point we need to get film onto something else because it degrades.
Yeah.
I mean, so they vacuum seal these.
They put them in these very cold environments with no light.
And so they do last longer than, you know, film that's laying around. But still, that's the only media that they actually trust with
this kind of long-term data. So when they heard about this, because we've talked to someone, they
immediately got excited, because now they can write it once. Now they never have to go back and refresh
it, and it's 100% durable. And it's the cost that we imagine here, because
the crystal glass is incredibly cheap. Cost is not actually a big factor for them. It's really
the other factors.
So you could put Gone with the Wind on a piece of glass and then use it as a
trivet if you'd like. No, I'm kidding.
Yeah, it's actually fun. Ant's lab put one of my novels on a
piece of glass, Zero Day. So now I've got that preserved for millions of years.
Okay, so that actually opens up a couple questions.
One being, this has application for any kind of cultural artifacts that we might want to keep, literature, film, music, etc.
Can we talk a little bit about how you get to it?
I mean, I know I have, like I say, 32,000 photos.
Part of my problem isn't that I have them.
It's that I can't find what I'm looking for.
Yeah, well, that's a research problem that we haven't been able to crack down on.
It's one of those foundational, fundamental questions.
We're ready to solve the actual storing of the bits.
How do you actually find those bits?
That's up to you.
You haven't sorted through your stuff.
Well, no, seriously.
How do you address access?
Well, I think, you know, increasingly there's sort of like machine learning technologies
that are able to take images and recognize things in them.
And so you can do search and things like that over your image sets.
So, you know, show me pictures of, you know, person X outside.
And that's actually what they would do with movies, for example. They'll have machine learning algorithms run on the media
and say character X is at minute 43 to minute 44, 33 seconds.
And that goes into an index.
So when they say, I want clips of this person,
they can go quickly find them.
Brilliant.
Yeah.
Because the thing is, a lot of media is historical media, right?
And so it's like some event happens, somebody dies or something happens and you want to
find things relevant to that.
And you don't want to pull back the whole archive just to find two hours of video out
of a thousand hours or something.
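The index described above, where a recognition step records that a character appears between two timestamps and a later query pulls back only those clips, can be sketched as a small lookup structure. The class and method names here are hypothetical, and the recognition step itself is assumed to happen elsewhere.

```python
from collections import defaultdict


class ClipIndex:
    """Hypothetical index from a label (e.g. a character) to time ranges."""

    def __init__(self):
        self._by_label = defaultdict(list)

    def add(self, label: str, title: str, start_s: int, end_s: int) -> None:
        """Record that `label` appears in `title` from start_s to end_s (seconds)."""
        self._by_label[label].append((title, start_s, end_s))

    def find(self, label: str) -> list[tuple[str, int, int]]:
        """Return all recorded clips for `label`, sorted by title and start time."""
        return sorted(self._by_label.get(label, []))


index = ClipIndex()
# "Character X is at minute 43 to minute 44, 33 seconds."
index.add("character X", "some film", 43 * 60, 44 * 60 + 33)
```

A query such as `index.find("character X")` then returns only the matching time ranges, so the whole archive never has to be read back to find a few clips.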
What about news organizations?
Is that another area?
Absolutely.
I mean, what Ant was just referring to, historical footage.
You know, when a politician stands up and says, I never said that, they can just quickly
go back to the index and very quickly say, yeah, you actually said it right here.
Listen, this whole thing is a Black Mirror episode anyway, because it's like that,
you know, rewind thing.
Well, we're quickly moving from a terabyte
and petabyte era to an exabyte, zettabyte, yottabyte era. But let's just say the cloud is a data beast,
and it needs to be taken care of. So not only do we need storage, we need other aspects to make
sure that things work. And we've just alluded to access. What about compute and networking?
What do we need there and what's being worked on?
Well, so there are, I think, a few things.
If you take a look at processing data
and doing it very efficiently,
you want to get the data to the compute very quickly.
So looking at reducing that latency,
whether that means putting the compute
right next to the data or having very low-latency connections between the data and the compute. And then large-scale
parallelized compute, because oftentimes being able to efficiently process data requires more than
just one computer, one CPU. It requires a fleet of them. So being able to have very low-latency,
high-bandwidth interconnections between the computers themselves.
Large AI model training is an example of that kind of workload that requires this kind of very low-latency, high-bandwidth connection between the different computers.
And lots of the larger machine learning models don't fit on a single system or even a dozen.
They require hundreds or thousands of them. For example, our OpenAI announcement that we made about building an AI supercomputer with them, that's going to have these characteristics of having large-scale data plus compute plus networking, all working together.
Yeah, we're doing some of the work on optical networking in the lab in Cambridge in Sirius,
and that's looking at how to build out these all-optical networks for these kinds of things,
which is all about trying to reduce the latency
and make the networks effectively higher bandwidth, lower latency.
Which is lower power.
Lower power.
Which is lower cost.
Lower cost, and higher density too, which is also lower cost.
Yeah. So you referred to that as Sirius?
Yes, Sirius. Project Sirius is another one of the projects.
Not like, are you serious? But Sirius.
Sirius Black.
Yeah. I think somebody from Azure probably said, are you serious?
That's ridiculous.
Well, it's about now that I like to have a short conversation on the spectrum of risks
associated with people's work.
I call this the what keeps you up at night section.
What could possibly go wrong with anything that the two of you are working on collectively
in this area?
So one of the things that keeps me up at night, particularly with our storage work,
is if you have a storage medium, you expect it to store things.
And we're making decisions today, and we're making design choices today that potentially in 50 or 100 years' time,
we'll want to make sure that they can still get the data out,
even in 1,000 years' time or 10,000 years' time.
I often think, are the decisions we're making today the right decisions to enable that to happen
And, you know, if people store really important things like movies or their
life's photos in it, you want to make sure that it's really there when you say it
will be there. And that keeps me awake a lot of the time. It's a big responsibility,
in my view.
You, Mark?
I have high confidence in Ant's team, and the partnership we've got going with them gives me high confidence that we're actually going to get there on this one.
I think what keeps me up at night in a space like this in general is somebody else coming up with something that disrupts the space and leaves us behind.
You mentioned it's a competitive environment. And so this is another reason why it's so important
for us to be exploring these different areas
that initially look like they have little chance of success,
but if they do, would massively transform things.
And this is one of those examples.
So Mark, you actually have a unique story
about how you came to Microsoft,
but people can find that story in Wired magazine.
So I want to just ask Ant here about his path to Microsoft, Microsoft Research, and if there's anything, Ant, that you have
had happen in your life that's informed or influenced your career that landed you here
doing the work you're doing.
So I moved to Microsoft about 20 years ago, and before that,
I was at the University of Cambridge as a research associate.
And one of the really interesting things was I worked with a guy called Andy Hopper there.
And I finished my PhD and went to work with him.
And as many academics do, or as many young PhD students becoming academics do, I just
sort of worked on the next thing that was falling off from my thesis.
And he came to my door and said to me, "Ant, I want you to build me a robot football
team to compete in the 1998 RoboCup in Paris," which is when the World Cup in
Paris was. I sort of ignored it, carried on, and a few weeks later he came back
and he said, "You know, Ant, I really want you to build me a robot football team." So
I built a robot football team, and they played on a table tennis table.
And there's like five robots, and they were all autonomous.
So you sort of press go, and they had to play football.
I still have the robot on my shelf in the office.
But, you know, the one thing that it did for me was that it really just taught me that I could dive in.
I mean, I used to work on digital systems and programming languages and where they met.
And I was ending up building robots and working with a lot of other people building robots and building systems like that. Throughout my career at Microsoft, that's
given me the confidence to just dive in and do other things. And I think that's a moment
for me that's had a big influence on my career.
As we close, there are 7 billion people on the planet, 4 billion of them are online
and computing is moving to the cloud. Ant and Mark, any parting shots? What would you
like to say to our listeners
in the form of advice or inspiration, wisdom or warning,
or just plain predicting the future?
What's on the horizon from each of your perspectives?
I don't think I'd give any warning.
I'd give encouragement.
Again, it's an incredible opportunity
to really shape the future.
And it's not just moving to the cloud.
I think the important aspect to this is the edge as well,
because computing is getting more organized on the edge,
and you'll see more of it on the edge than what we've seen up to now,
which is why Microsoft's focused on intelligent cloud, intelligent edge.
So that's a huge part of our charter, and Azure, too, is looking at that space as well.
And there's a bunch of problems and new technologies that need to be developed there.
And how about you?
As I said earlier about the technologies and existing technologies getting to the end of their lifetimes or at least getting to the stage where it's not so easy just to get the next cycle out of them, I think it's just a huge opportunity to change things.
And I think as a researcher, anywhere where there's a sort of huge opportunity for things to be done differently, it's just really exciting.
This has been incredibly fun.
Thank you, Ant Rowstron and Mark Russinovich,
for joining us in the booth today.
Thank you.
Thanks.
To learn more about Dr. Ant Rowstron and Mark Russinovich and how Microsoft is working to meet all your digital storage needs,
visit Microsoft.com slash research.