Microsoft Research Podcast - 023 - Clouds, Catapults and Life After the End of Moore’s Law with Dr. Doug Burger
Episode Date: May 9, 2018
Some of the world's leading architects are people that you've probably never heard of, and they've designed and built some of the world's most amazing structures that you've probably never seen. Or at least you don't think you have. One of these architects is Dr. Doug Burger, Distinguished Engineer at Microsoft Research NExT. And, if you use a computer, or store anything in the Cloud, you're a beneficiary of the beautiful architecture that he, and people like him, work on every day. Today, in a fast-paced interview, Dr. Burger talks about how advances in AI and deep machine learning have placed new acceleration demands on current hardware and computer architecture, offers some observations about the demise of Moore's Law, and shares his vision of what life might look like in a post-CPU, post-von-Neumann computing world.
Transcript
These are people who just have the scientific and engineering discipline, but also that
deep artistic understanding, and so they create new things by arranging things differently
in this bag.
And when you see a design that's done well, it's beautiful.
It's a beautiful thing.
You're listening to the Microsoft Research Podcast, a show that brings you closer to
the cutting edge of technology research and the scientists behind it.
I'm your host, Gretchen Huizenga.
Some of the world's leading architects are people that you've probably never heard of,
and they've designed and built some of the world's most amazing structures that you've probably never seen,
or at least you don't think you have.
One of these architects is Dr. Doug Burger, Distinguished Engineer at Microsoft Research NExT.
And if you use a computer or store anything in the cloud,
you're a beneficiary of the beautiful architecture that he and people like him work on every day.
Today, in a fast-paced interview, Dr. Burger talks about how advances in AI and deep machine learning have placed new acceleration demands on current hardware and computer architecture, offers some observations about the demise of Moore's Law, and shares his vision of what life might look like in a post-CPU, post-von Neumann computing world.
That and much more on this episode of the Microsoft Research Podcast.
Doug Burger, welcome to the podcast today.
Thank you. Great to be here.
We're in for an acronym-rich recording session, I believe. I assume that our audience knows more about your world than I do. But since
my mom listens, I'm going to clarify a couple times. I hope that's okay with you. Whether it's
acronym heavy is TBD. And you are an SME. You are actually a distinguished engineer for MSR NExT.
Tell us what that is. The title is a standard Microsoft title for somebody that is in a technology leadership
position.
MSR NExT is a branch of MSR, or an organization within MSR.
So when you think about MSR, we have a lot of diversity.
We have geographic diversity.
We have disciplinary diversity.
And we have some organizational diversity.
And so NExT is just a different
organizational structure that tends to produce different outcomes. That's how I like to think of
it. So you do research in computer architecture. Give our listeners an overview of the problems
you're trying to solve. What gets you up in the morning in computer architecture research?
Great question. So formally, computer architecture is defined as the interface between hardware and software. And architecture is what the software is able to see of the hardware. So if I build a chip and a system and a disk drive and some memory, all of those things, they're just dead hardware until you do something with them. And to do something with them, you have to have a way to talk to them.
And so all of these things expose a way to talk to them. And that really is the architecture.
You can think of it as the language of the computer at its very lowest level.
Machine language.
Machine language. Right. And how that translates from people into the transistor.
That's exactly right. There's actually a lot of layers in between the person and the transistor. And the architecture that I just described is one of those layers,
more towards the bottom, but not at the bottom. Right. Speaking of the stuff that's at the bottom,
transistors and devices and things like that, we've experienced a very long run of what we
call Moore's law. Yes, it's been wonderful.
Which is transistors get smaller,
but their power density stays constant.
And some people, including you,
have suggested that Moore's Law is probably coming to an end.
Why do you think that is?
Let me start with a little bit of context.
At its heart, all of this digital computing
we're doing is built up of switches and wires.
It's a light switch.
You flick it, it turns on. You flick it, it turns off.
That's really what a transistor is. What we do is we start with a wire and a transistor,
which, remember, is just a switch. It's a fancy name for a switch.
And then we start putting them together. You put a few more transistors together,
and then you have a logic gate, something that can say AND, OR, NOT.
Right? Take a zero and turn it into a one, that's a NOT. Or two ones,
ANDed together, become a one, and a one and a zero, ANDed together, become a zero. Just the
very basic universal logic functions. Long ago, in the time of Babbage, we were building computers
that really didn't work, out of mechanical relays. Then we had vacuum tubes, which at least were
electronic. And then, of course, the transistor was invented, and the integrated circuit, and these were a really big deal because now it allowed you to miniaturize these things on a chip.
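To make the switch-and-gate picture concrete, here is a minimal Python sketch, purely illustrative and not tied to any real hardware design: a single NAND primitive is enough to build the NOT, AND, and OR functions described a moment ago.

```python
# A tiny model of the "switches and wires" story: one universal gate (NAND)
# is enough to build NOT, AND, and OR. The function names are illustrative.

def nand(a: int, b: int) -> int:
    """NAND of two bits: 0 only when both inputs are 1."""
    return 0 if (a == 1 and b == 1) else 1

def not_(a: int) -> int:
    return nand(a, a)

def and_(a: int, b: int) -> int:
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    return nand(not_(a), not_(b))

if __name__ == "__main__":
    # The examples from the conversation: NOT 0 -> 1, 1 AND 1 -> 1, 1 AND 0 -> 0.
    print(not_(0), and_(1, 1), and_(1, 0))  # 1 1 0
```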
Sure. And then what Gordon Moore did in his seminal paper in the mid-60s, he pointed out that he
expected these things to double in density, to be able to fit twice as many of them on a
chip, on an integrated circuit, every year or so.
And then he revised the law in 1975 to a slightly slower cadence.
And then we've been doubling the density every couple of years since then.
For 50 years. It's been a 50-year exponential, where the cost of computing has been dropping the whole time. And that's what's given us this crazy society we have with, you know, the internet
and computational science and sequencing genomes for incredibly cheap compared to what they were.
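As a rough worked example of what that exponential means, assuming the commonly quoted doubling period of about two years (an approximation, not an exact industry figure):

```python
# Back-of-the-envelope compounding for a density doubling every ~2 years.
# The cadence is the commonly quoted approximation, not an exact figure.

years = 50
doubling_period_years = 2
doublings = years / doubling_period_years   # 25 doublings
density_growth = 2 ** doublings             # about 33.5 million times denser

print(f"{doublings:.0f} doublings -> ~{density_growth:,.0f}x more transistors per chip")
# 25 doublings -> ~33,554,432x more transistors per chip
```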
It's nuts. And so are we nearing the end of this era? So I get asked this a lot and there's two
versions of Moore's law. There's the actual one that he published, which talks about the rate at which chips get denser.
And then there's what I call the New York Times version, which is something loosely associated with an exponential in computing.
Like performance is growing exponentially.
Well, and that's what we hear.
That's right. That's what people think. On the precise version, which is where I am because that's my field, the rate of that density
increase has slowed down for the first time in 50 years. So it hasn't stopped, but it's noticeably
slowed. Now maybe the chip manufacturers will pick the cadence back up again, and then I'll
issue a mea culpa and say I was wrong. But I think we're in the end game. I think we're in a period
right now where it's slowed. It may be that we're on a new cadence that's just slower, or it may be that the length
of time between each new generation of semiconductor technology lengthens out. And the problem really
is that we're running up against atomic limits. I was just going to say, you've defined these,
you know, size limits in terms of atoms and how small they are
now. If you halve it again and halve it again, we're pretty soon down to a couple of atoms. I can't even
comprehend how small that is. Yes. Now, people have built single atom or single electron transistors,
right? So it's not a question of can we, it's more a question of can you build a chip that has 10 billion of these on it economically? It's really a question of economics because we
know we can build smaller devices, but can you make them work and sell them cheaply enough to
actually be worth doing? The other thing to note is that as we started to run into atomic limits,
when we were scaling down the transistors, the old standard, very regular structure of a
transistor wasn't possible anymore. I mean, it used to be a very simple thing that had, you know,
three bars of material with something sprayed on top and wires connected to some of those terminals.
Now these things are actually very intricate 3D structures. So they're just getting more and more
complex as we try to scale them down. Of course, that's to control quantum tunneling of electrons,
so we're not leaking current.
When the switch is off, it should be off.
And there have been challenges all along,
and we always find interesting ways to deal with it.
And people are doing incredibly creative things now.
It's just getting really hard.
I suspect, even as you look at this complexity of the transistors getting smaller and smaller,
that you're looking at other ways to keep the progress.
Absolutely.
So that's what I want to ask you next is, let's just assume that you're right.
And what are you doing in a parallel universe to keep the progress of computing power and speed and efficiency moving forward?
We're now in an uncertain era where, you know, in one generation your cost gains might be smaller
than expected or your power gains might be smaller, you might not get any speed and you're
trading these things off. There are still lots of advances going in the other parts of computing,
things like memory density, flash memory, network speeds, optics. So there's all the ancillary parts. People are also working on
new types of computing. Let me try to bucketize what some of these might be. So programmable biology is
a super exciting one, right? Like DNA is a computer. DNA is a stored program that can actually replicate
itself.
Yeah.
But there's a program encoded in it, and there are lots of rules about it. And now we're actually starting to leverage that. There's another one, which you could think of as just digital computing that's not silicon-based.
So people have been looking at carbon nanotubes,
different materials.
None of it looks very close to me.
It looks like we're kind of 10 years away
from any of it getting competitive.
And of course, silicon has had so much investment.
Like hundreds of billions of dollars, if not trillions.
Something that always worries me as a researcher, and I know I'm jumping around a little bit here, is
if you're on technology X and then there's technology Y that is not only better, but will
take you farther. But if you don't get Y started early enough, X gets advanced far enough that the
amount of money you need to bootstrap Y is just too high and it never happens.
Right. And so that's going to be really interesting to see that play out when
we think about silicon and post-silicon technologies. It may be that there are magical,
much better things out there that we'll never achieve because we've invested too much in
silicon. And it seems like one of the drivers is the cost of something. I mean, if you made a
compelling case for something and it was cheap, people would adopt it. Yeah. And so these computing systems we have have become exponentially cheaper for many decades.
They're also very general. They do everything. And that's based on something called von Neumann
computing. And that's a paradigm. You write software, it's run on a CPU. That's kind of
the paradigm we've been with for a very long time. And as the silicon gains are
slowing, the gains you get from that paradigm are also slowing. And so we're starting to see even
the digital computing ecosystem fracture and diversify because of the huge economic need to
do more. Let me roll back a minute and get to the other bucket. So there's neural computing. Those are also programmable,
albeit learning, machines. And they're incredibly interesting. I mean, just profoundly interesting.
We don't really understand how and why they work yet, despite all the progress we've made in digital
AI. Yeah. There's something super magical there that may not even be understandable at the end
of the day. I hope it will be, we don't know. And then of course there's, you know, chemistry. And so there's just
all these other ways to compute. And of course, the nice thing about the paradigm we've been on
is all of the levels are deterministic. You know exactly what they're capable of,
what they can express is bounded, but very powerful. Sure. Turing complete, if you're a computer scientist.
And so it's tractable.
And so you can, each layer hides the complexity underneath,
presenting you with a relatively simple interface
that allows us to do all this wonderful stuff.
And then now things are getting more complex and interesting.
Yeah.
But also harder.
Well, and I, as, you know, a non-scientist here,
look at the simplicity of the interface,
and the underneath part is intimidating to me in terms of trying to get my brain around it.
But like you say, when you unpack what's in the box, people go,
aha, I don't have to ignore that man behind the curtain anymore.
That's right.
I am that man behind the curtain.
Pay no attention, no way. I didn't say that.
Suddenly I joined the technopoly.
That's right.
Yeah, let me make a comment on that.
These things are both incredibly simple
and incredibly complicated at the same time.
But they've got interfaces that are very clean
and the concepts are pretty simple,
like switch, AND gate, adder, add two numbers,
binary arithmetic, right?
It's just math with zeros and ones instead
of zeros through nines. But then the number of things we do to optimize the system, it's insane.
And the complexity of these things, that's some of the most complex things humans have ever built.
I mean, you think about 5 billion switches on one chip, it's a small postage stamp size thing
with 5 billion switches organized in a way to make stuff go really fast.
I mean, that's amazing. Simon Peyton Jones said that computers and software and architecture
are some of the most amazing structures people have ever built.
They are amazing structures.
And in the architecture field, when you're designing one of those interfaces
and then deciding what you put in the chip underneath to support that interface,
the cool thing is it's unlike software, where it's much more open-ended. I mean, to me,
when I write software, I feel too free. There are no guardrails. I can do anything. I have too
much freedom. And when you're doing the hardware, you have a budget. You have an area budget. I have
this many square millimeters of silicon. Yeah. And you have to decide what to fill it with
and what the abstractions are you expose
and how much to spend on performance optimization versus features. If you want to put something else
in, you have to take something out. So your bag is of a finite size and you're trying to figure out
how to fill up that bag with the right components interconnected in the right way
to allow people to use it for lots of stuff. You want it to be general.
Yeah.
You want it to be efficient. You want it to be fast. You want it to be simple for software
to use. And so all of those things you have to balance. So it's almost like an art
rooted in science. There are a small number of people, and I don't count myself among them,
who are the true artists. You know, Jim Keller is a very famous one
who's active in the area.
Bob Colwell, retired from Intel.
Uri Weiser, also from Intel.
I mean, these are some of the more recent examples,
but these are people who just have
the scientific and engineering discipline,
but also that deep artistic understanding.
And so they create new things
by arranging things differently in this bag.
And when you see a design that's done well, it's beautiful.
It's a beautiful thing.
Let's talk about a project that you co-founded and co-led called Project Catapult.
Tell me about Project Catapult.
Before I moved to Microsoft,
I had started some research with one of my PhD students,
Hadi Esmaeilzadeh, who's a professor now at the University of California at San Diego.
And at the time, the community was moving towards multi-core.
And there was not a global consensus,
but definitely it was the hot thing.
And people were saying,
if we just figure out the way to write parallel software,
we'll just scale up to thousands of cores.
And Hadi and I really, I said, this is, you know,
when everyone's buying, it's time to sell.
Right.
And so we wrote a paper that ended up getting published in 2011
and got some pretty high visibility.
It didn't coin the term dark silicon, but it popularized it.
And the observation was that because the transistors aren't getting more power efficient, we can
keep increasing the number of cores, but we're not going to be able to power them.
So even if you have parallel software and you drive up the number of cores, the benefits
you get are much lower than you've gotten historically.
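A toy calculation in the spirit of that dark-silicon argument, with made-up numbers rather than figures from the 2011 paper: if each generation doubles the core count while per-core power falls more slowly than that, a fixed power budget lights up a shrinking fraction of the chip.

```python
# Toy dark-silicon model: cores double each generation, but per-core power
# (by assumption here) only falls by 30%, so a fixed budget can't feed them all.
# All numbers are illustrative, not measurements or figures from the paper.

power_budget_watts = 100.0
cores = 4
power_per_core_watts = 25.0   # starts exactly at the budget

for generation in range(1, 6):
    cores *= 2                          # Moore-style density doubling
    power_per_core_watts *= 0.7         # assumed per-core power improvement
    powered = min(cores, int(power_budget_watts // power_per_core_watts))
    dark_fraction = 1 - powered / cores
    print(f"gen {generation}: {cores} cores, {powered} powered, {dark_fraction:.0%} dark")
```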
And what that really said to me is that that's a great avenue, but we're also going to need
something else. And so that something else started to be specialized computing where you're optimizing
hardware for specific workloads. And the problem with that is that building custom chips is
expensive. And so what you didn't want is, say, a cloud where people are renting computers from us
to have 47,000
different types of chips and try to manage that and have that be your strategy going forward.
And so we took a bet on this hardware called FPGAs. Now we're into your acronym soup. It stands
for Field Programmable Gate Array. What they are is programmable hardware. So you do a hardware
design and you flash it on the FPGA chip. That's why they're called field programmable, because you can change them on the fly in
the field.
And as soon as you're done using it, you can change it to something else.
You can actually change them every few seconds.
So what we ended up doing was to say, let's take a big bet on this technology and deploy
it widely in our cloud.
And we'll start lowering important things into hardware on that fabric.
We designed a pretty interesting system architecture, too, and then that's going to be our general-purpose
platform for hardware specialization. And then once you have hardware designs that are being
baked onto your FPGAs, you can take some of them or all of them and then go spin those off into
custom chips when the economics are right. So it's sort of a way to dip your toe in the water, but also to get a very clean, homogeneous abstraction
to get this whole thing started.
And then while stuff is evolving rapidly,
or if its units are too small,
you leave it on the programmable fabric.
And if it becomes super high scale,
so you want to optimize the economics,
and or it becomes super stable,
you might harden it to save money
or to get more performance.
So there's flexibility there that you...
Yeah, the flexibility is a really key thing. And again, the FPGA chips had been used
widely in telecom. They're very good at processing streams of data flowing through
quickly and for testing out chips that you were going to build. But in the cloud,
nobody had really succeeded at deploying them at scale
to use not as a prototyping vehicle for acceleration,
but as the actual deployment vehicle.
Well, so what can you do now post-Catapult or with Catapult
that you couldn't do on a CPU or a GPU?
Well, let me first say that the CPUs and GPUs are amazing things
that focus on different types of workloads. The CPUs are very general. And what a CPU actually does well is it takes a small amount of
data called the working set sitting, you know, for those of you who are architecture geeks,
you know, in its registers and level-one data cache, and then it streams instructions by them,
operating on those data. We call that temporal computing. If the data that those
instructions are operating on are too big,
the CPU doesn't actually work very well.
It's not super efficient, which is why, for example,
processing a high bandwidth stream coming off the network,
you need custom chips like, you know, NICs for that,
because the CPU, you know,
if it has to issue a bunch of instructions to process each byte,
and those bytes are coming in at 12.5 billion bytes a second, you know, that's a lot of overhead.
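To put rough numbers on that overhead (the instructions-per-byte and per-core instruction rate below are placeholder assumptions, not measurements):

```python
# Rough arithmetic on why per-byte CPU processing of a fast network stream
# is expensive. The per-byte instruction count and the per-core instruction
# rate are assumed round numbers, for illustration only.

bytes_per_second = 12.5e9              # ~100 Gb/s, the figure from the conversation
instructions_per_byte = 4              # assumption: a few instructions to touch each byte
core_instructions_per_second = 10e9    # assumption: ~10 billion instructions/s per core

needed = bytes_per_second * instructions_per_byte
cores_required = needed / core_instructions_per_second
print(f"{needed:.1e} instructions/s needed -> ~{cores_required:.0f} cores just to keep up")
# 5.0e+10 instructions/s needed -> ~5 cores just to keep up
```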
So what the GPUs do well is something called SIMD parallelism, which stands for single instruction, multiple data.
And the idea there is you have a bunch of tasks that are the same, all operating on similar but not identical data. So you can issue one instruction and that instruction ends up doing the same operation on say eight
data items in parallel. Okay. So that's the GPU model and then the FPGAs are
actually a transpose of the CPU model. So rather than pinning a small amount of
data and running instructions through, on an FPGA we pin the instructions and then
run the data through.
I call that structural computing.
Other people have called it spatial.
I mean, both terms work.
But the idea is you take a computational structure,
you know, a graph of operations,
and you pin it,
and then you're just flowing data through it continuously.
And so the FPGAs are really good
for those sorts of workloads.
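One loose way to picture the three models just described, as a mental sketch in plain Python rather than how CPUs, GPUs, or FPGAs are actually programmed: temporal computing streams instructions past pinned data, SIMD applies one instruction across many data items, and a spatial or structural design pins a graph of operations and streams the data through it.

```python
# Illustrative contrast of temporal, SIMD, and spatial/structural computing.
# This is a mental model only; real hardware does not work in Python.

data = list(range(8))

# Temporal (CPU-like): pin a small working set, stream instructions past it.
def temporal(x):
    for op in (lambda v: v + 1, lambda v: v * 2, lambda v: v - 3):  # "instruction stream"
        x = op(x)
    return x

# SIMD (GPU-like): one instruction applied to many data items at once.
def simd(xs):
    return [v + 1 for v in xs]  # one "instruction", eight lanes

# Spatial/structural (FPGA-like): pin a pipeline of operations, stream data through.
pipeline = (lambda v: v + 1, lambda v: v * 2, lambda v: v - 3)
def spatial(stream):
    for v in stream:            # data flows through the fixed structure
        for stage in pipeline:
            v = stage(v)
        yield v

print([temporal(v) for v in data])  # instructions move, data stays
print(simd(data))                   # one instruction, many lanes
print(list(spatial(data)))          # structure stays, data moves
```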
And so in the cloud,
when we have functions that can fit on a chip and
you want to pin it and stream data through at high rates, it works really well. And it's a
nice complement to the CPUs. Okay. So Catapult is? Catapult is our project code name for Microsoft's
large-scale deployment of FPGAs in the cloud. Sort of covers the boards and the system architecture,
but it's really a research project name. I was just going to say, is this now in research? Is it beta? Is it production? Where are
you with it? In late 2015, Microsoft started shipping one of the Catapult FPGA boards in
almost every new server that it bought. That's Bing, Azure, and other properties. And so by this
point, we've gone to very large scale.
This stuff is deployed at ultra large scale worldwide.
We're one of the largest consumers of FPGAs on the planet.
And so, of course, there are now teams all over the company
that are using them to enhance their own services.
When you are a customer using the accelerated networking feature,
that speed that you're getting,
which is a huge increase in the speed,
both over what we had before, but also it's faster than any of our competitors' networks,
is because the FPGA chip is actually rewriting the network flows with all of the policies that we have
to keep our customers secure and keep their virtual private networks private and make sure
that everyone adheres to our policies.
And so it's inspecting the packets as they're flowing by at 50 gigabits,
50 billion bits a second,
and then rewriting them to follow our rules
and making sure that they obey the rules.
If you try to do that on the CPUs,
which is where we were doing it before,
the CPUs are very general.
They're programmable.
They do a good job,
but you use a lot of them to process flows at that rate.
And so the FPGAs are just a better alignment for that computational model.
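A highly simplified sketch of that match-and-rewrite idea; the rule names, packet fields, and policies below are invented for illustration, and the real pipeline operates on raw packet headers in FPGA logic at tens of gigabits per second, not on Python dictionaries.

```python
# Toy match-action flow rewriting, in the spirit of what's described above.
# Fields, rules, and policies are invented; a real SmartNIC/FPGA pipeline
# does this on packet headers at line rate.

RULES = [
    # (match predicate, rewrite action)
    (lambda p: p["dst_vnet"] == "tenant-a",
     lambda p: {**p, "encap": "vxlan", "vni": 1001}),   # keep the tenant's VNet private
    (lambda p: p["dst_port"] == 25,
     lambda p: {**p, "action": "drop"}),                # example policy: block SMTP
]

def process(packet: dict) -> dict:
    """Apply the first matching rule; otherwise pass the packet through unchanged."""
    for match, rewrite in RULES:
        if match(packet):
            return rewrite(packet)
    return packet

print(process({"dst_vnet": "tenant-a", "dst_port": 443}))
print(process({"dst_vnet": "tenant-b", "dst_port": 25}))
```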
Talk about Brainwave, Project Brainwave.
So Brainwave, there's a big move towards AI now, as I think, you know, just about everyone listening will know.
Very hot area.
And in particular, it was spurred by this thing called deep learning, which I think many of your listeners will know too.
But what they figured out was that with the deep models, you know, deep neural networks, if you add more data to train them, they get better.
As you add more data, you make them bigger and they get better. And they kind of marched through a lot of the traditional AI spaces like machine translation, speech
understanding, knowledge encoding, computer vision, replacing the dedicated set
of algorithms for each domain that had been developed painstakingly over years. And that's
really what spurred, I think, a lot of this huge movement, because people saw, okay, there's
something here, these things are very general. And if we can just make them bigger
and train more and more data, we can do even more interesting things, which has largely been borne
out. It's really interesting. And so of course that put a lot of pressure on the silicon given
the trends that we were discussing. And so now there's a big investment in custom architectures
for AI machine learning and specifically deep learning.
And so Brainwave is the architecture that we built in my team,
working with partners in Bing and partners in Azure
to deploy it to accelerate Microsoft services.
Right.
So sort of our version of a deep neural network processor,
what some people call a neural processing unit or NPU.
And so for an organization like Bing, who's really compute bound, like they're trying to learn more and more so they
can give you better answers, better quality searches, show you what you need to see.
And so we've been able to deploy larger models now that run in the small amount of time that
you're allowed to take before you have to return an answer to a user. And so we've been running it at worldwide
scale for some time now. And now what we announced at Build is that we're bringing it to Azure
for our customers to use. And also a private preview where customers in their own servers
in their companies can actually buy one of our boards with the Catapult architecture and then
pull models down, AI models down from
Azure and run them on their own site as well. Wow. So they become an Azure endpoint in some
sense that benefits from all of the AI that's sitting in Azure. One other thing about the
Brainwave stack itself that's significant is that I think right now for inference, which is the asking
of questions, a lot of the other technologies use something called batching, where you have to
glom say 64 different requests together and ship them as a package,
and then you get all the answers back at once. The analogy I like to use is if you're standing
in a bank line and you're the second person, but there's 100 people in line, the teller
processes them all by taking all their IDs and then asking them all what they want and then
withdrawing all the money and then handing it to each person. And, you know, you all finish at the same time, right?
That's batching.
I love it.
It's great for throughput on these machines, but not so good for latency.
So we're really pushing this real-time AI angle.
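The bank-teller analogy in numbers, using made-up figures for the per-request time and batch size: batching keeps the machine busy, but the second person in line waits for the whole batch.

```python
# Toy latency comparison between batched and per-request ("real-time") serving,
# following the bank-teller analogy. Times and batch size are invented.

per_request_ms = 2.0   # assumed time to serve one request
batch_size = 64

# Batched: everyone's answer comes back only after the whole batch is processed.
batched_latency_ms = batch_size * per_request_ms        # every request sees ~128 ms

# Real-time: the second person in line waits only for the request ahead of them.
second_in_line_ms = 2 * per_request_ms                  # ~4 ms

print(f"batched:   every request waits ~{batched_latency_ms:.0f} ms")
print(f"real-time: the second request waits ~{second_in_line_ms:.0f} ms")
```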
That leads me to a kind of philosophical question.
How fast is fast enough?
Well, how fast is fast enough really depends on what you're trying to do. So, for example, if you are taking multiple signals from around the web
and trying to figure out that, for example, there's an emergency somewhere,
a couple minutes might be fast enough.
You know, a millisecond is not life and death.
If you're doing interactive speech with somebody,
actually very small pauses matter for the quality of the experience.
We've all experienced that in, you know,
watching a television interview where there's latency
between the question and the answer.
And you often step on each other.
That's right.
Another good example, another piece of AI tech that Microsoft unveiled
was its HPU, which goes in the HoloLens headset.
And that's also processing neural networks. That was a very, very amazing team in a different organization, the HoloLens
organization that built that chip, working with Ilan Spillinger's team. But that chip, if you
think about the latency requirements, it's figuring out the environment around you so that it can hang
virtual content that's stable as your eyes are darting around. So, you know, even a couple of milliseconds there is a problem. So really how fast depends on what
you're trying to do. Yeah. So speed is one component and the other is cost. Yeah. You know,
so if you have billions of images or millions of lines of text and you want to go through and
process it so that you can, you know, understand the questions that people commonly ask in your company, or you want to look for signs that this might be cancer in a bunch of
radiological scans, then what really matters there is cost to serve. So you have a sort of a trade
off between how fast is it, how much processing power are you going to throw at it right away,
and how cheap is it to do one request. And the great thing about the Brainwave system is I think
we're actually in a really good position on both.
I think we're the fastest
and we are in the ballpark to be the cheapest.
What was your path to Microsoft Research?
How did you end up here?
I was a professor at the University of Texas for 10 years. I worked very closely with a gentleman named Steve Keckler. We were sort of two in a box. And we did this really fun and ambitious project called TRIPS. That wasn't an FPGA project, so I'm really actually a hardened
chip guy.
And so we came up with some ideas and said, hey, there's a better way to do CPUs.
And it's really a new architecture, a new interface to the machine that uses different
principles than the historical ones.
And so we raised a whole bunch of DARPA money. DARPA was super interested in the technology. We
built a team in the university, went through the grind, built a working chip.
I mean, Steve was an expert in that space.
He really led the design of the chip, did a phenomenal job.
We worked together on the architecture and the compiler.
So we built this full working system, boards, chips, compilers, operating system, all based on these very new principles.
And academically, it was a pretty influential project and pretty high profile.
But we got to the end of that.
It was too radical to push into industry.
It was too expensive to push into a startup.
You know, semiconductor startups weren't hot at the time.
But after that, I really wanted to go and try to influence things more directly.
And so Microsoft came calling right around the time I was wondering what's next.
And it just seemed like time for new challenges.
So your work has amazing potential.
And that usually means we need to be thinking
about both the potential for good
and the potential for not so good.
So is there anything about what you're doing
that keeps you up at night?
I'll tell you that I don't worry
about the AI apocalypse at all.
I think we're still tens or hundreds of thousands
of times away from the efficiency of a biological system. And these things are really not very smart. Yes, we need to keep an eye on it
and on ethics. I mean, frankly, I worry more about global warming and climate change. That's kind
of my big thing that keeps me up at night. So to the extent that our work makes computing more
efficient and we can help solve big, important scientific problems, then that's great. So as we close, what's the most exciting challenge or set of challenges that you see maybe on the
horizon for people that are thinking of getting into research in hardware systems and computer
architecture?
Really, it's find a North Star that you passionately believe in and then just drive to it
relentlessly. Everything else takes care of itself. And, you know, find that passion. Don't worry about the career tactics and, you know, which
job should you take? Find the job that puts you on that path to something that you really think
is transformational. Okay. So what might be one of those transformational end goals? For me,
and this is, I think the first time I've talked about it, I think we're beginning to figure out that there is something
more profound waiting out there than all of these heterogeneous accelerators
for the post von Neumann age. So we're at the end of Moore's law. We're kind of in the silicon
end game and von Neumann computing, meaning, you know, these, the stream of instructions I
described, you know, has been with us since von Neumann invented it, you know, in the 40s. And it's been amazingly
successful. But right now we have von Neumann computing, and then a bunch of bolted-on
accelerators, and then kind of a mess with people exploring stuff. I think there's a deeper truth
there for that next phase, which in my head, I'm calling structural computation. So that's what I'm thinking
about a lot now.
I'm starting to sniff out
that I think there is something there
that might be general
and be that next big leap
in computer architecture.
And of course, the exciting thing
is I could be totally wrong
and it's just a mess,
but that's the fun of it.
Doug Burger,
I wish I had more time with you.
Thanks for coming in
and sharing the depth of knowledge that you've got stacked in that brain.
My pleasure.
Thank you for the great questions and for all your listeners.
To learn more about Dr. Doug Burger and the latest research on acceleration architecture for next-generation hyperscale clouds, visit microsoft.com slash research.