Not Your Father’s Data Center - The Myth of Data Center Inefficiency
Episode Date: May 4, 2020

With their constantly whirring server fans and enormous cooling systems in cavernous buildings, data centers seem to be sucking up a lot of energy and contributing more than their fair share to climate change. But are they? Dr. Eric Masanet and several colleagues decided to put it to the test. What they found is that, while demand for compute instances has gone up sixfold, the energy trend is close to flat. With servers becoming more efficient and virtualization coming more into vogue, the data center has seen its energy per compute instance plummet. "That has gone down by about 20 percent every year, and that pace of efficiency improvement is far greater than we can see in any other sector of the energy system," Masanet said. "We wanted to point out that, even though the data center industry sometimes gets beaten up about its energy use or its contributions to climate change, that is really a remarkable energy improvement and better than nearly any other sector for which we have data." While the news is mostly positive for data centers, Masanet and his team also want to push for more data to be made available on data center energy consumption and sound a warning about major challenges on the horizon. "We think there are three trends that really need to be better understood," he said. "One is artificial intelligence, which could require a lot of computational intensity. The second is 5G, which could spur lots of new demand for data center services, and the third is edge computing, moving potentially some workloads and compute instances much closer to the end user in smaller data centers. We really don't know yet how that's going to play out from an energy perspective." The path certainly is not linear, but with progressive solutions, data centers can continue to be mindful about their energy use.
Transcript
Welcome to Not Your Father's Data Center podcast, brought to you by Compass Data Centers.
We build for what's next. Now here's your host, Raymond Hawkins.
Welcome everybody to another edition of Not Your Father's Data Center.
I'm your host, Raymond Hawkins, and today we'll be joined by
Dr. Eric Masanet. Eric joins us with an impressive resume, holding degrees from the
University of Wisconsin, a master's from Northwestern, and a PhD from that little
school you might have heard of, Cal Berkeley. On today's podcast, we'll be talking about energy use in the data center industry and
how it impacts global energy use. You guys will be surprised to learn that this question is asked
by some really smart people who've done a lot of incredible study on it and released
some academic papers on how the data center is impacting energy use on our globe today.
And look forward to digging into
that subject with Dr. Masanet. Eric, I think you're joining us today from Santa Barbara. Is
that right? I think that's a recent move for you. That's right. Yeah. UC Santa Barbara.
All right. Lovely out there. And again, I think I'm struggling to understand how you could leave
Chicago in February for UC Santa Barbara, but there's no explaining taste.
So good move on your part. Yeah. Well, I think what we want to compare notes on today
is to talk about energy in the data center. I think our space largely gets misunderstood.
Is it a net positive for society? Is it a net negative? Is it hurting the planet? Is it helping
the planet? There's lots of good things that come out of computing, lots of modeling and lots of
understanding, but also lots of energy use and trying to balance this perception of is the data
center space and the technology suite we support, is it a net good thing for mankind and the planet
or a net negative?
And I think, Eric, you and your team have homed in on the question specifically around energy use. Is that too big a summary or is that pretty accurate?
No, that's pretty accurate.
But if we think about the energy impacts of digital services, so streaming versus driving to the video store in the old days or what most of us are doing now, remote working as opposed to driving into the office.
To figure out whether that's a net positive from an energy perspective or any other environmental impact, one has to look at the whole system.
And we focus specifically in this paper on the data center, a key component of the digital system,
because there is a lot of misunderstanding about just how much energy data centers use and where that energy use is trending.
So, and I appreciate you alluding to the study, Eric.
Do you mind doing a two- or three-minute summary of the paper and of the research, and I'll try to make sure that I can track with it?
The research you've been doing for quite a while, but also the paper that you most recently published. Sure. Yeah, I'd be happy to do so. Well, the paper was written to fulfill
really three goals. The first is there is a lot of misperception out there about the energy use
of data centers. Depending on what you read, you may think that data centers are gobbling up all the world's energy and the situation will get even worse
in the future. Or you may read that data centers are really efficient. And we wanted to basically
try to recalibrate the public's understanding and also policymakers by putting out what we
thought were the most rigorous, best estimates
of global data center energy use, kind of to set the record straight.
So that was one goal of the paper.
I wrote this paper with four colleagues, two of whom I've been working with for a
long time, Dr. Arman Shehabi and Dr. Jonathan Koomey. We go way back to a study in 2007 where we were asked to
calculate the energy use of U.S. data centers for the U.S. Congress. And ever since then,
we've been working to maintain a data center energy model that we update with recent data,
and occasionally we'll publish a study using the model to weigh in on where energy use stands for
data centers. So really, the first goal was to try to put out what we thought were the best
available numbers on global data center energy use.
The second key goal was we wanted to ring the alarm bell somewhat.
We've been enjoying a lot of efficiency gains over the last decade in data centers,
which we can talk about more.
But really those efficiency gains can't last forever because demand for
data center services is poised to grow rapidly.
It has been growing rapidly, and it'll continue to grow rapidly with the emergence of some key trends like artificial intelligence or 5G or edge computing.
And then the third motivation was really there aren't enough of us, frankly, in the research community who study data center energy use, who publish numbers,
and we wanted to motivate more research with this paper. So we released all of our data sets,
our entire model. We're opening it up for critique because we want more people to be
working on this issue because we need to get a better handle on where energy use is going
moving forward. And there just aren't
enough analysts out there who specialize in data centers. Eric, in the paper, you guys talk about,
first of all, love the idea of it being open, love the idea of having other smart guys look
at smart guys' work and check. I think we all get better, iron sharpening iron. So I think that's a great move. And it makes me think of open source computing, right? Let's have all the best
minds look at it and let's make sure that we're thinking about this the right way.
Because I think it's an important question, right? Digitization continues to increase. So
this question isn't going away, right? People aren't going to put away their smartphones
and be happy to go back to their flip phones or not be connected at all, right? So the digitization of the world is underway and
probably not reversible. So how do we manage these resources in a way that are smart, wise,
and good for the planet? You talk in the paper about proliferation sort of beginning in 2010
and the growth of data centers, footprints growing. I think you mentioned in the paper four or five, maybe even sixfold, but that although
the volume of compute resources is going up, that it's not a perfect straight line correlation of
energy use going up and some of the reasons why. Could you dig into a little of that? Because I
think that's important. Yes, there's more compute cycles being used, but the energy cycle isn't matching it one for one
and the why behind that. Yeah, no, you're exactly right. Demand for compute instances or data
center services has gone up sixfold, at least sixfold. But the energy use associated with
providing that level of compute hasn't risen nearly as fast. In fact,
we found that the energy trend isn't quite flat, but it's pretty close to flat. So in the paper,
we put down some numbers. Compute instances have grown by over sixfold over the period 2010 to
2018. And over that same time period, energy use, we think, only rose by about 6%. That shows that there's this enormous efficiency effect, meaning the data center industry is getting better and better at providing core services with very little additional energy.
And we found that there are really three major reasons for this trend. The first is that the IT devices that are the workhorses of the data center, servers, storage devices, network switches, those devices have gotten a lot more efficient.
And we see this in our everyday lives, right?
Our cell phones can last a lot longer on a charge than they used to 10 years ago, and we get a lot more service.
It's just a general trend due to technological advances in processing technologies, memory technology, storage technologies.
So that's one explanatory factor that's quite large.
We just have much more efficient IT equipment now than we had 10 years ago.
The second major trend is that especially large data centers are virtualizing their servers to a greater degree than in the past.
What virtualization means for servers in particular is that servers, when they idle, they still use power.
And so if we can utilize servers at a much greater capacity level,
we're spreading out that idle energy use across many more workloads,
and the net effect is that each workload comes with less energy.
So the second major trend was virtualization of servers,
which has increased quite drastically
over the last decade.
And the third trend is that a lot of workloads
have been shifting to much larger cloud
and hyperscale data centers,
which are run with much greater cooling efficiencies.
So some of the biggest data centers in the world
where there's a lot of compute happening
have PUEs [power usage effectiveness, total facility energy divided by IT energy] of 1.1 or in that ballpark.
So it's really those three factors, more efficient IT
equipment, greater capacity utilization of that equipment, particularly through virtualization
of servers, and then shifting workloads to the cloud and to hyperscale, which have much greater
cooling and power provision efficiencies. We found that those three effects largely explain this
near plateau in energy use over the last decade.
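(A minimal sketch of the idle-power arithmetic behind the virtualization point above; the wattages, capacities, and utilization levels are illustrative assumptions, not figures from the study.)

```python
# Why busier servers mean less energy per workload: idle power gets
# amortized across more work. All numbers are assumed for illustration.

IDLE_WATTS = 100.0  # assumed draw of a server doing nothing
PEAK_WATTS = 300.0  # assumed draw at full utilization

def server_watts(utilization: float) -> float:
    """Simple linear power model: idle draw plus a load-proportional term."""
    return IDLE_WATTS + (PEAK_WATTS - IDLE_WATTS) * utilization

def watts_per_workload(workloads: int, capacity: int) -> float:
    """Power per workload on one server hosting `workloads` of `capacity`."""
    return server_watts(workloads / capacity) / workloads

print(watts_per_workload(2, 20))   # lightly used server: ~60 W per workload
print(watts_per_workload(16, 20))  # virtualized, consolidated: ~16 W per workload
```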
So love those three trends, and I think it helps explain to a layperson like me,
just because our compute cycles are growing and the amount of capacity we have is going up globally,
just thinking about the global compute utilization, doesn't mean that energy is going up at the same pace
and great big reasons why.
We're really efficient.
Our tools are better.
Your example of the phone is great.
I can run my iPhone a lot longer than I could run my phone from 10 years ago.
The ability of the compute equipment to run at a higher optimum level via virtualization
and then running in the most efficient locations,
putting that workload instead of in a less efficient, smaller data center, having it
in a very large, efficient facility.
When we think of people like Microsoft's Azure platform or Amazon's AWS or Google's
GCP, when you run workload there, it's running in an extremely efficient place.
That's a fair description of each one of those?
Oh, yeah, absolutely.
And I would say that the nuance about your phone lasting a lot longer than it did 10 years ago,
you're also getting a heck of a lot more service out of that same phone.
So it's a perfect visual of getting more computational service for less energy as time goes on.
Right, right, right.
That's an easy one.
We all have them in our pocket, and that's an easy one to get our arms around and understand.
So as we think about moving forward, we turn around and we've looked back a decade and we said, OK, these trends have happened, and this is why energy isn't going
up at the same pace as compute cycle.
What were your insights, as you looked at the data, about the power curve, meaning literally energy use, and the capabilities in the compute, as those two curves move forward?
Do you think they're going to continue to look like they have over the last 10 years? What did
you guys learn in that regard? Yeah, it's a really great question. So this is a good moment for me to point out that we really struggle
in the analyst community with having data on data center operations, on the energy use of servers,
storage devices, and so forth. And so one of the backstories to this study is it took us an
awful long time to put together the data sets that we could use with confidence in order to weigh in
on global data center energy use. And the reason for that is most data center operators don't
report their energy use. Server manufacturers may report some component-level energy use data,
but in a very limited way and in a very small public data set.
So we had to rely on a lot of inference from the data we had, a lot of talking to industry experts
about the trends we were seeing. And this is one of the reasons why we want to promote more research
in this area. And part of that will be hopefully getting companies and data center operators,
meaning operators, and by companies, I mean device manufacturers,
server manufacturers, to open up a bit, sharing more data about the energy use and other
characteristics of the equipment. But from what we could tell, looking at the scientific literature,
the empirical data we could find, the data sets that are being reported,
the energy use of servers is really tricky to pin down. On the one hand, we have these nice component-level trends. My colleague Jon Koomey has tracked the computations we're getting out of servers compared to the amount
of energy we're putting in for a number of years, and has found, in the so-called Koomey's law, which he
didn't name himself, but others named, that the computations per unit of energy have been doubling
around every 2.6 years or so. But that's been slowing down a bit as processors are getting close to their physical limits
in terms of the hardware and so forth.
So the short answer is looking at the data,
looking at the trends in energy use of servers in particular,
we're finding that they're certainly
getting more efficient.
But those efficiency gains are beginning to slow down.
The other major effect is that servers that are being deployed
are using
more memory, more storage, and so forth. And so we still have room to enjoy a lot of efficiency
gains, we think, but that pace is slowing down a bit. Meanwhile, demand is going up quite rapidly,
and we expect the trend, the rapidly rising demand trend for data center services, whether it be
streaming or artificial
intelligence or collaborative tools, as the world gets more connected, as we have greater data
speeds, as we depend on the internet more and more for our daily lives, there's no doubt in our minds
that demand will increase. We're finding that from the data we're seeing and in our conversations
with hardware manufacturers, that absolutely there's still a lot of room for the devices to continue becoming more efficient.
That's slowing down a bit as we're running up against the physical limits of the hardware.
At the same time, demand is increasing even more rapidly.
So our conclusion was that there still is room for the industry to maintain this near
plateau in energy use to meet what we said would be the
next doubling of demand for compute instances. Beyond that, though, given how fast we think
demand will rise, it's very likely that energy use will start ticking up again because efficiency
won't be able to keep pace with rising demand. That was our major conclusion. And so we wanted to
ring the alarm
bell, so to speak, saying, yes, we've been enjoying a lot of efficiency gains in the past, and yes,
we can continue to enjoy them in the near term. But at some point this decade, we're going to
have to reckon with demand increases outpacing the ability of efficiency to keep up. And what do we
do about that as a society? And who are the stakeholders who can
help us manage that potentially large source of growing energy use?
So Eric, I want to get into this alarm bell idea for a second. But before we do that,
I want to go back to Koomey's law. So Moore's law, I'm familiar with. I think a lot of folks in the
tech business are very familiar with that. I think it's essentially saying that compute capability doubles somewhere between every 12 and 18 months,
really centered around a founder of Intel and the ability to put transistors on silicon.
I think that's Moore's Law.
Can you give me one more take on Koomey's law, just to make sure?
I think I understood what you said, but I'm going to ask you to say it one more time and then I'll see if I can
paraphrase. Sure. So Koomey's law is based on observations of the computational power of
servers and their energy use. And it's been adjusted once. So the very first study that
Koomey did, after which, you know, this term Koomey's law came out, showed that the computations we're
getting from servers for every unit of energy that we put into them, that will double roughly
every two years. Most recently, when Jon went back and looked at more recent data, he found
that the computations per unit of energy doubled roughly every 2.6 years, which suggests that the ability of servers
to provide computations in an efficient way is slowing down slightly.
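(A quick numerical sketch of what those doubling times imply; the ten-year horizon is just an illustrative choice.)

```python
# Growth in computations per unit of energy under a Koomey-style doubling.

def efficiency_multiplier(years: float, doubling_time_years: float) -> float:
    """How much computations-per-joule grow over `years` at a given doubling time."""
    return 2 ** (years / doubling_time_years)

print(efficiency_multiplier(10, 2.0))  # ~32x per decade at the original ~2-year pace
print(efficiency_multiplier(10, 2.6))  # ~14x per decade at the slower 2.6-year pace
```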
Okay.
So I want to make sure I'm going to use numbers here, even though they're made up.
So what I think Koomey's law is saying is if I could get 100 outputs for 10 kilowatts of power, in his modified number now, I could get 260 outputs for 100
kilowatts of power. In other words, I'm getting 2.6 years to double the capacity, the compute
capacity on the same amount of energy. Is that the right way to say it? So I would say that if we
take as our baseline 100 units of compute, let's say per
unit of energy, in 2.6 years, we'll get 200 units of compute. 200 units, okay. For the same amount.
For the same energy. All right, thank you. That helps me pick it up. Whatever compute measure I
have, I'll get twice that much in 2.6 years for the same amount of energy input, because of
improving efficiency. And that's back to
your point of it appears that our ability to be efficient or achieve those efficiencies is slowing
down a bit, but I'm still every 2.6 years being able to get twice as much compute power for the
same unit of energy. Okay. That helps me understand. All right. Very, very good. Okay. Back to this
concept of an alarm bell. And can you give me just a little bit of a sense,
and I think it was mentioned in the article when I read it, if you think about the last decade,
2010 to 2020, the amount of compute, so let's use the same thing, the amount of compute power
globally compared to the amount of energy globally, what's been the rate of rise of those two numbers over the last decade?
And I think if we understand that looking backwards,
then that'll tell us why you guys are considering,
as you've looked at the data,
trending a little bit more quickly than it has over the last decade,
why we ought to be thinking about this.
What did it look like over the last decade, those two numbers?
Those two numbers.
So just to clarify, the two numbers
are the amount of compute globally and the energy inputs
into the...
And at a super macro level, which I ultimately
think is sort of the spirit of the study,
how do those two numbers look compared to each other?
Yeah, that's a really good question.
And I should be careful.
I need to frame this quite carefully.
So in our study, we looked at the period 2010 to 2018. So it's not quite a decade. And the reason we did
that is it takes a few years for data to appear. And the most robust assessments are always
retrospective because the technology in the IT sector and, frankly, data center operations change
quite quickly. So it's typically most
credible to look back when you have reasonable data on the past to weigh in on these trends.
So let me take the first number, which is the amount of compute globally. So we don't know
that number precisely. What we had to use in our study as a reasonable proxy was the number of
compute instances. And this was defined by Cisco.
They have a report which they've been publishing
each year for roughly the last decade,
the Global Cloud Index Report, GCI,
where they estimate the number of workloads
and compute instances running in the world's data centers.
But we don't necessarily know the computational intensity
of each of those.
We have to take an average.
But if we use that value as a proxy, the number of compute instances, that has gone up by
more than sixfold over the period 2010 to 2018.
So a sixfold increase.
Sixfold.
600%.
Sixfold, yes.
Yeah.
A factor of six.
Okay.
Over that same time period, our best estimates suggest that the energy used by
data centers has only risen by about 6%. So that is, we calculated in the paper, that equates to
an energy intensity reduction. If we take the amount of energy in the numerator and the number
of compute instances in the denominator, so energy per compute instance, that has gone down by around 20% every year.
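(That roughly 20 percent figure checks out with simple compound-rate arithmetic; this is a back-of-the-envelope verification, not the paper's exact method. With energy $E$ up by a factor of 1.06 and compute instances $C$ up sixfold over the eight years from 2010 to 2018:)

$$\frac{E_{2018}/C_{2018}}{E_{2010}/C_{2010}} = \frac{1.06}{6} \approx 0.177, \qquad 0.177^{1/8} \approx 0.805,$$

i.e., energy per compute instance fell by about $1 - 0.805 \approx 20\%$ per year.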
And that pace of efficiency improvement is far greater than we can see in any other
sector of the energy system, global industry, global aviation,
global transport sector.
And we wanted to point out that even though the data center industry sometimes
gets beaten up a bit about its energy use or about its contributions to climate change,
that is really a remarkable efficiency improvement and better than nearly any other sector for which
we have data. Okay, so we're on a data center podcast, Eric, so I'm going to ask you to drive
this one home. I think what I just heard you say is, if you run the math, essentially the data center industry, or the compute footprint on the planet, has improved its efficiency by about 20% a year, and stacked up against any other industry,
that's fantastic.
Is that a simple way of saying what I think I just heard you explain?
I think that's a good description of it.
We couldn't find any other sector that has improved its efficiency per unit of service
provided anywhere near what the data center industry has been able to accomplish over
the last decade.
And that's 20% per year.
It's not like over the last decade, the industry's figured out how to get 20% better.
It's delivered that kind of performance on average for several years, for almost the last decade. Is that
fair? Because sixfold growth in compute footprint with only a 6% growth in
electricity consumed, I mean that's consistent performance year after year
after year. That's a fair way to say it, right? That's a fair way to say it, yes.
20% reduction in energy intensity per year. Granted, the caveat there is that we're looking
at the number of compute instances hosted in global data centers. That's our proxy,
but we felt it was a reasonable proxy to show that the energy required for a unit of service
that's delivered by a data center has been dropping rapidly. Yeah, well, I think that historically, great job by the technology industry, great job by the data
center industry, great job on, hey, we've got to manage this proliferation of compute cycles
in a wise way and make sure that it doesn't get out of balance from an energy consumption
standpoint. But also just what an
incredible tool to solve so many other problems, right? I mean, there's all these incredible
ancillary benefits. And as a consumer, right, I think of the easy ones like Uber has changed my
ability to get around, especially when I'm out of town, or my kids love getting food delivered, you know, just the convenience of Uber Eats or Postmates or Grubhub.
Those are simple consumer understandings of how technology helps us.
But, I mean, there's an incredible set of benefits that come along with this increased energy use that I think helps tilt the scale towards technology being an overall benefit.
I'm not asking you to weigh in necessarily from a global energy footprint, but from what you guys studied, hey, it seems like the industry is doing a heck of a job.
Is that a fair statement?
I think that's a fair statement.
Now, trying to understand the net benefits of digitalization is notoriously very difficult,
partly because we don't have great data on the entire system, but partly because you have to set up the calculus
for understanding whether streaming is better than going to the video store, which frankly,
nobody does anymore, or teleworking is better than commuting. You need to look at a really
broad system. You need to look at, on the one hand, I've now eliminated a commute,
but what did I eliminate? Was it public transit or was it me driving 30 miles each way
as a single rider in an SUV?
You know, the benefits of eliminating that commute
really depend on what's being eliminated.
And then on the teleworking side, if I'm at home,
now with my air conditioning on and all the lights
and music blaring, versus if, you know,
I'm just using a little bit of additional
energy to connect my computer to the internet.
The benefits of any of those shifts really depend on the specific situation.
And teasing that out has always been difficult for the research community.
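(A toy version of the system accounting Masanet describes; every figure below is an illustrative assumption, not data from the study.)

```python
# Net-energy comparison for one avoided commute day, on assumed numbers.

COMMUTE_MILES = 60        # assumed 30 miles each way, single rider
SUV_KWH_PER_MILE = 1.5    # assumed gasoline energy per mile for an SUV
HOME_EXTRA_KWH = 5.0      # assumed extra home cooling/lighting for the day
NETWORK_DC_KWH = 0.2      # assumed network and data center energy for the day

saved = COMMUTE_MILES * SUV_KWH_PER_MILE
added = HOME_EXTRA_KWH + NETWORK_DC_KWH
# Positive means teleworking saved energy under these assumptions;
# swap in a short transit commute and the balance can shift.
print(saved - added)
```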
But in general, what we're seeing from the literature is that digital services, because
we're moving bits and not physical, you know, stuff. Digital
services are generally much more environmentally efficient than the
physical services that they replace. So you mentioned in the study, and you
mentioned in our conversation here, that you guys are doing the best with the
data that's made available, and that you'd love to see the industry make more
data about their facilities and about their compute
devices available, specifically around energy usage. It can be hard to know exactly,
but you think it's a good proxy. I love that term, using this Cisco GCI as a great proxy.
As we think about a large data center and getting all the reports around a 50 or 100 or 200
megawatt data center,
we can get a lot of information out of that one building.
As the trend of edge computing begins to pick up pace and where our compute cycle sits starts to distribute,
do you think that's a net positive for energy utilization, a net negative,
or is it just a whole other set of problems around reporting?
And maybe I'm teeing up the question too much, but how do you guys as a group think about edge
computing changing this equation? Yeah, well, edge computing is certainly something that needs a lot
more study. So we think there are three trends that really need to be better understood. One is
artificial intelligence, which could require a lot of computational intensity.
The second is 5G, which could spur lots of new demand for data center services.
And the third, as you mentioned, is edge computing, moving potentially some workloads and some compute instances much closer to the end user in smaller data centers.
And we really don't know yet how that's
going to play out from an energy perspective.
On the one hand, we can imagine it bringing us back in time
to having many smaller data centers that
have perhaps less efficient cooling systems,
less ideal infrastructure systems, and so forth.
And that could drive energy use up.
But if they're well-managed edge data centers, meaning you're using the most efficient equipment,
it's run at high-capacity utilization, PUEs are kept to their practical minimum,
then perhaps the net effects on energy use will be more minor.
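(To make the PUE comparison concrete: total facility energy is IT energy times PUE. The hyperscale value echoes the 1.1 ballpark mentioned earlier; the edge value is a hypothetical assumption.)

```python
# PUE (power usage effectiveness) = total facility energy / IT energy.

def facility_kwh(it_kwh: float, pue: float) -> float:
    """Total energy for a data center given its IT load and PUE."""
    return it_kwh * pue

IT_LOAD_KWH = 1_000_000  # assumed annual IT energy for a fixed set of workloads

print(facility_kwh(IT_LOAD_KWH, pue=1.1))  # hyperscale ballpark: 1.1M kWh
print(facility_kwh(IT_LOAD_KWH, pue=1.6))  # hypothetical small edge site: 1.6M kWh
```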
But the fact of the matter is we don't really know. And it's a topic of
interest that really needs a lot more exploration from an energy analysis perspective. And it's
really on the agenda, I think, for us, my colleagues, but also other data center energy
researchers. Getting a handle on AI, 5G, and the edge, I think, is the most important job we have
as analysts for weighing in on where
data center energy use may go in the near future. Well, Eric, I think those are all big ones that
you hear talked about a lot in our industry and certainly are going to have a major impact on
not only how we do business and what we engage with from a technology perspective,
but how they impact energy use, which is certainly right in your wheelhouse. Thank you so much, Eric.
Thank you, Raymond. Really great questions, by the way. It was a lot of fun talking to you.