Catalyst with Shayle Kann - The rise of flexible data centers
Episode Date: April 9, 2026

As the buildout of data centers accelerates on a dramatic trajectory, its strain on the electric grid has increased in turn; forecasts suggest they could consume up to 17% of all US power by 2030. To avoid higher rates and slower AI growth, the industry has embraced a promising solution: data center flexibility.

In this episode, Shayle speaks with Varun Sivaram, the CEO of Emerald AI. Coming on the heels of a $25 million investment round led by Energy Impact Partners, Varun returns to the show to provide an update on the "wickedly complicated" challenge of aligning utilities, cloud providers, and the grid.

Shayle and Varun explore topics like:
- Tapping into the 100+ gigawatts of unused grid capacity
- Why the "Watt-Bit spread" is shifting to make power flexibility profitable
- The differences between training and inference flexibility, including Google’s new "flex" and "priority" tiers
- The "mini dispatch curve" for data centers created by batteries, gas turbines and fuel cells
- Emerald’s plans to collaborate with NVIDIA and other partners on the world’s first 100-megawatt, truly power-flexible AI factory

Resources
- Catalyst: The mechanics of data center flexibility
- Catalyst: The potential for flexible data centers
- Latitude Media: How the world’s first flexible AI factory will work in tandem with the grid
- Latitude Media: Nvidia and Oracle tapped this startup to flex a Phoenix data center
- Latitude Media: A reality check on flexible data centers
- Latitude Media: Can VPPs unlock grid capacity for data centers?

Credits: Hosted by Shayle Kann. Produced and edited by Max Savage Levenson. Original music and engineering by Sean Marquand. Stephen Lacey is our executive editor.

Catalyst is brought to you by FischTank PR, an award-winning climate and energy tech, renewables, and sustainability-focused PR firm dedicated to elevating the work of both early-stage and established companies. Learn more about their PR approach and how they can support your company’s messaging by visiting fischtankpr.com.

Catalyst is brought to you by EnergyHub. EnergyHub helps utilities build next-generation virtual power plants that unlock reliable flexibility at every level of the grid. See how EnergyHub helps unlock the power of flexibility at scale, and deliver more value through cross-DER dispatch with their leading Edge DERMS platform, by visiting energyhub.com.
Transcript
Latitude Media, covering the new frontiers of the energy transition.
I'm Shayle Kann. I lead the early-stage venture strategy at Energy Impact Partners.
Welcome to Catalyst.
So my friend Varun Sivaram came on this podcast back in August 2025, or roughly a century ago in AI terms.
At that time, we talked about his mission at Emerald AI to make data centers flexible,
specifically at that time by shifting AI workloads to deliver compute flexibility in
response to grid signals. It was then and remains now a somewhat controversial concept,
largely, I think, because the history of data centers, dating back to the emergence of the
cloud industry, always suggested that they are perhaps the most inflexible load on the planet.
Not only would they generally pass on participation in demand response programs, but they
actually needed N-plus-2 reliability just to ensure that the spice would always continue to flow.
But the world has changed in a bunch of ways since then.
The strain that the data centers are putting on the grid has become clearer and more present than ever.
Pressure on electric power rates is high, and so affordability is top of mind across the board.
And data center flexibility, either through compute orchestration or through behind-the-meter resources,
has started to become a mainstream concept.
I actually went on my own journey on this concept and spent a lot of time on it over the last year.
And here's where I came out.
From first principles, data centers should be flexible assets.
They just should.
They have many different types of workloads
with different degrees of urgency,
and it's crazy to think that they couldn't differentiate.
And in fact, some of them are now starting to differentiate.
But the actual mechanics of getting them to do so
and getting all the players aligned is really tricky.
Anyway, long story short, I invested in Emerald.
We announced just a couple weeks ago
that we at EIP led a $25 million round in Varun's company.
So I brought him back on today
for an update, really, on the extremely dynamic world
of data center flexibility.
That's coming up next.
When utilities need flexible capacity they can count on, they turn to EnergyHub.
EnergyHub works with more than 170 utilities, coordinating over 2.5 million devices
to manage 3.4 gigawatts of flexibility built for the moments when utilities can't afford uncertainty.
EnergyHub builds and operates virtual power plants that utilities actually stake their grid planning on,
coordinating EVs, batteries, thermostats, and more through a single platform,
built for utility scale.
Predictive, verifiable, and designed to perform when it counts.
Learn more at energyhub.com.
Trillions of dollars are flowing into clean and critical infrastructure,
but those investments aren't driven by technology alone.
They're shaped by markets, by policy, by capital,
and by the institutions that connect them.
I'm Alfred Johnson, CEO of Crux,
and host of a brand-new podcast, Critical Capital.
Each episode, I talk with people deploying capital,
shaping policy and building the clean economy.
Tune in as we unpack how progress is actually made.
Listen to Critical Capital on Spotify, Apple, or wherever you get your podcasts.
Catalyst is supported by FischTank PR,
an award-winning PR firm focused on climate and energy tech, renewables, and sustainability.
FischTank is known for generating prominent and effective media coverage for the brands they work with.
If you want a PR partner that's thoughtful, shoots straight, and gets results, you'll like FischTank PR.
To learn more about FischTank's approach, visit fischtankpr.com.
That's F-I-S-C-H-T-A-N-K-P-R dot com.
Varun, welcome back.
Shayle, thank you for having me back.
It's nice to be on the other side and talking as partners in crime in your business.
So you were back on this podcast in August of last year, which, depending on how you look at it,
is either a very long time or a very short time ago.
I guess I want to start with just, like, your high-level perspective on what has
happened in the market. And I guess by the market in this context, I mean data centers and the grid,
and then specifically also your market, which is the provisioning of flexibility for data centers
in the grid. So just over that last, whatever that is, nine, ten months, like, what do you see
as having happened?
I think, Shayle, it is a very long time, given the dynamics of data centers
and the grid. But actually, before I answer that question, let me first say how delighted I am
to get to work with you as a partner in crime.
Ten years ago today was when you and I co-authored a Nature Energy article.
We wrote this article about setting a new cost target for solar power,
really ambitious one, 25 cents per watt fully installed.
And I think, you know, the latest prices in China show that the cost is roughly double that.
So we're almost there.
More to come on that, because I have been doing some things that are still trying to hit
that target. As you said, even in China, we have not hit that target yet. I still think it's
possible. But yes, it's fun to know. I did not realize it was 10 years ago to the day,
but obviously it's been a fun journey. So good to be on the same team here.
Absolutely. Data centers and the grid. Absolutely. So let me get back to that overview.
It's been a long 10 months since I was last on the pod, and here's what's changed. Data centers
are even more of an energy issue than we thought was the case back in 2025.
The latest numbers: in January of this year, NERC forecast a summer peak increase of
224 gigawatts, almost all of which will come from data centers.
Data centers now account for 94% of PJM's projected peak load growth, and by 2030 EPRI
forecasts that data centers could use up to 17% of America's power.
All of these are incredible, insane statistics, and they're reshaping the landscape of energy as we know it.
And you've obviously talked about this a lot on your other podcast sessions.
But I also think that, you know, this particular topic we're talking about today, it's important to keep talking about it.
The last time we talked about it was a different context than today.
224 gigawatts of peak load growth is between a quarter and a third of peak demand.
And so that's a massive increase that data centers are going to be driving.
And if we try and build our way out of this, as I told you on the last podcast, we risk higher rates and slower AI growth.
And that's good for nobody.
I would add, you said higher rates.
I mean, I think one thing to me, like, yes, it is true that the buildout, the data center buildout has just continued to accelerate over the past nine or ten months.
Like, nothing has stopped that upward trajectory.
And maybe it has gone parabolic;
it's hard to tell exactly because we're in the middle of it.
The one thing, though, that I think is much more front of mind today than it was in August
of last year, you alluded to, which is affordability.
Like, that has come front and center now in a way that I think it wasn't
quite, you know, nine, ten months ago; it was just starting to be.
And now it's like every conversation is about affordability.
Oh, absolutely.
And look, we should be clear that historically the drivers of rate increases
may very well not have been data centers.
Data centers may have been conflated in the data.
And some data shows that in areas where data centers grew more quickly,
rates actually increased more slowly.
But I think it's incontrovertible that going forward,
if data centers do drive most peak load growth,
and peak load drives most rate increases,
then data centers, absent some mitigations to help with this exorbitant grid build-out,
could very well drive
affordability difficulties.
And that's why I think that, you know, there's a risk that they become a problem,
but there's also a real opportunity for data centers to become the true hero of solving this
affordability challenge.
Right.
And there's multiple ways to do that, and lots of them are starting to emerge.
These, you know, unique tariffs that data center operators or developers are starting to sign
with utilities, I think, are interesting.
And there's a bunch of different structures of those.
Let's narrow in, though, on the portion of that that is most relevant to you,
which is making data centers flexible assets.
So on that front, over the past nine, 10 months,
like what has changed?
A lot and a little has changed, I'd say.
Let me put it this way.
There is this deep divide that I observe between the level of flexibility,
service levels and tiers that the compute industry offers,
and the level of flexibility in the service tiers and level
that the utility and grid operator ecosystem offers in terms of electric service.
And that schism is one of the reasons that this is such a hard problem to solve.
Even though, you know, I'm super excited and want to talk to you about the five commercial demonstrations we've done at Emerald AI since you and I last talked,
it's still a fundamental challenge to make sure that we can take advantage of all of what I call the stranded power on the grid.
So to back up, Shayle, for those viewers who didn't hear the last episode, data center flexibility, why is it important?
Data center flexibility is important because if these new AI factories,
as Jensen Huang at Nvidia calls them,
if they can be flexible, just a little bit of the time,
they can utilize this vast amount of unused capacity on the grid.
The grid is utilized at roughly 50% or less
during most of the year.
And therefore, we have all of this 100-plus gigawatts
of stranded power capacity that could power AI factories
connecting to the grid today if during those rare peak moments,
AI factories could ramp down partially
for a limited amount of time.
That's why flexibility is so important.
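To make the stranded-capacity argument concrete, here is a minimal Python sketch. It is not anything Emerald or a utility runs: the toy load profile, the 100 GW capacity figure, and the function names are all assumptions chosen for illustration. It computes average utilization and counts the hours in which a new flat load would push the system past its capacity.

```python
def utilization(hourly_load_gw, capacity_gw):
    """Average load as a fraction of capacity over the period."""
    return sum(hourly_load_gw) / (capacity_gw * len(hourly_load_gw))


def curtailment_hours(hourly_load_gw, capacity_gw, new_load_gw):
    """Hours in which existing load plus the new load would exceed capacity."""
    return sum(1 for load in hourly_load_gw if load + new_load_gw > capacity_gw)


# Toy profile (assumed, not real grid data): 8,660 ordinary hours at 50 GW
# and 100 peak hours at 95 GW, on a system with 100 GW of capacity.
profile = [50.0] * 8660 + [95.0] * 100
capacity = 100.0

print(f"utilization: {utilization(profile, capacity):.1%}")
print("hours a new 10 GW load must curtail:",
      curtailment_hours(profile, capacity, 10.0))
```

In this toy case the system sits at roughly half utilization, and the new load only has to back off during the 100 peak hours. A real study would use measured hourly load and local transmission limits rather than a single system-wide number.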
Now, I mentioned the schism because just this month,
we've seen a lot of movement on the AI and compute side
in terms of flexibility tiers.
Just earlier this week, Google announced
its flex and priority inference tiers.
So if you're a developer and you are buying tokens
of artificial intelligence, you're running models,
you're being served inference,
you can choose to have those delivered
immediately, or you can choose to have those delivered after a delay, and you'll pay different
prices. But more importantly, it's not just the price component. It is the service component.
The service literally changes between those tiers. And it's not just Google that does this.
You know, Anthropic has a peak period for serving tokens, and you tend to hit your rate
limits earlier, and it has a non-peak period. I know all of this because we have one of these
token maxers, Nick Hill, on our Emerald AI
team, and he always tells me every time he hits a rate limit. So given that there are literally
different service levels, it's going to be possible, I believe, relatively quickly, to take advantage
of all this flexibility in the different ways people use compute to throttle their power consumption.
The question, however, is on the other side. On the utility side, it is still a work in progress
for a utility to offer a different service level, right? Today, you typically get one service level.
The service level is you get power. There isn't a service
level that says for most of the year you get power, but some of the year, we're going to ask you
to be flexible.
There is to some extent, which is demand response, right? That's the legacy
version of what we're talking about, which is you enroll in a program. This is less so
for data centers historically, but for large loads: you enroll in a program, and as part of
that enrollment, you know, we are going to send you a signal or we're going to call you, as it's
been historically, a couple of times a year at peak hours, and we're going to ask you to ramp down.
This is kind of an extension of demand response, right?
Like, what do you think of as being the same and different?
I completely agree that, first of all, there are natural pricing tiers.
So if you're in ERCOT in Texas, you can pay more money for power right now,
or you can ramp down and pay less money because power prices are high.
Then there are demand response programs, as you mentioned.
A utility might say you can make some money if you agree to curtail during this period.
You might own a Nest thermostat and be enrolled in a smart thermostat program,
and they will pay you if you are willing to reduce your consumption.
There are even mandatory programs where a prerequisite of your enrollment is your promise to curtail and a hefty penalty if you do not curtail.
But what's missing from all of these is the ability to offer the service of curtailment at the large scale that data centers could theoretically provide it and get a real benefit out of it.
And that benefit is not just a cheaper cost of power or a flexibility payment.
that benefit is a larger power connection or faster access to the power grid.
Whether you're an existing data center seeking to increase your capacity
or a brand new data center seeking to connect,
you should be allowed to harvest the existing stranded capacity on the grid that can be served to you
if you are willing to curtail every so often.
And that product doesn't exist today.
And with good reason, you know, the electric utility industry for over a century
has wanted to promise firm service to its customers.
And so there really isn't this non-firm service tier that allows you to skip the line.
And there are all kinds of legal considerations that come into play.
But that's absolutely where we have to go.
You know, the interesting thing about that: my guess is that
if you ask most folks who are kind of in and around this market what
the limiter is on data centers becoming flexible grid assets, particularly leveraging flexibility
in compute as opposed to leveraging on-site
behind-the-meter resources, which we will talk about as well, I would guess that most people would say
the main limiter is actually on the compute side. It's because in, you know, in the legacy cloud world
and then AI as it has emerged since then, there was an expectation of extraordinarily high reliability
and low latency. And so, you know, your expectation of your workload getting curtailed is very
low. And so that would have been the limiter. You know, it's interesting that you're saying it's
actually on the other side, because on that side, the compute side, it seems to be emerging.
I saw that Gemini announcement as well, and that's cool to see Google doing that, and that makes
the new limiter on the other side of the equation, which again is, as you said, the problem is not
offering differential pricing or savings based on curtailment. It is actually saying, look,
if you agree to curtail a certain amount for certain times, we will interconnect you faster or we
will give you a larger interconnection. That's the thing that's missing.
Yeah, exactly. And let me first just preface by saying, it's not the case that faster and larger connections because you're willing to be flexible are not happening anywhere in the world. In fact, Google has now reached a gigawatt of contracted flexible capacity across, I think, five different utility territories, at least some of whom are willing to provide some of these benefits. So Google's really been a pioneer. I believe across the more than 3,000 American utilities, we have a long way to go to bring these
differential service tiers onto the market, but still, we're making good progress.
But the point you make, Shayle, I think, is a really important one.
The point you made, Shayle, is, look, everybody sort of discounts data center flexibility
because they're thinking about the compute side.
They say these AI GPUs or accelerators are extremely valuable
and produce wildly valuable tokens of artificial intelligence.
If you watched Jensen's keynote at Nvidia GTC,
you saw some of the ridiculous economics of operating token factories, AI factories, right?
It's a great idea to maximize the tokens you are generating per watt of power.
And so the intuitive response is it is a terrible idea to ever curtail any GPU
because the economics just absolutely don't make sense.
That, by the way, is one of the reasons why I feel
fortunate that basically no one else is doing what Emerald does, because that in itself is an
intuitive blocker to founding a company like this. But I believe it's actually on the other side,
as you said, Shayle. I believe that if electric power utilities and grid operators offered
a range of different service tiers, just like on the compute side the cloud operators
offer a range of different service tiers to their end AI customers,
innovation would solve this problem.
And you would absolutely have data centers taking advantage
of the lower service tiers.
And they're not that low, by the way.
It's just, call it, 50 or 100 or 200 hours a year
that you'd have to curtail.
Data centers would very happily take utilities up
on these lower service tiers
in order to accelerate their connection
or get a larger connection.
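As a quick check on how small those curtailment windows are, the short Python sketch below converts 50, 100, and 200 hours into a share of a year. The function name and the 8,760-hour non-leap year are assumptions for illustration, and the sketch says nothing about how deep each curtailment would be.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def availability(curtailed_hours, hours_per_year=HOURS_PER_YEAR):
    """Fraction of the year in which the load is not asked to curtail."""
    return 1 - curtailed_hours / hours_per_year


for hours in (50, 100, 200):
    print(f"{hours:>3} curtailed hours/year -> "
          f"{availability(hours):.2%} of hours uncurtailed")
```

Even the 200-hour case leaves the data center uncurtailed for more than 97% of the hours in a year.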
Yeah, I think there's another way to put it,
which is that it is
true economically that it's probably a dumb idea to curtail, to not maximize token generation,
if the benefit of doing so is purely a cost savings on your electricity bill.
Right. And so if it is traditional demand response or something like that,
and the benefit that you get is just you save some money on your bill, those numbers don't
pencil, largely for the reason of, you know, Brian Janous
coined the watt-bit spread term. The watt-bit spread is so big that it's just not that valuable to you
to save some money on your electricity bill relative to the revenue that you're going to generate
with your chips. So that is true, generally. You could disagree with me if you want,
but the economics of getting a data center connected larger or faster are orders of magnitude
different. And so if that is a benefit that you can get, it actually does flip those
economics.
So I completely agree with your second point, and I sort of half agree and half disagree
with the first point, right? On the second point, completely agree: if you've got a 200-megawatt
data center and you are able to increase its capacity to 230 megawatts just a year ahead of schedule
and you can swap out to liquid cooling and next-generation Nvidia GPUs, you create billions of
dollars of value. Even netting out the cost of some of the downtime of curtailing during those rare
peak load hours, you should absolutely take the deal
on a new electric utility service tier.
The earlier point you made though, which is,
is there ever an economic incentive
that makes it worthwhile on the fly
to reduce your operational expense,
your OPEX by reducing your power cost
or getting a flexibility payment to curtail a little bit?
Even if the answer isn't yes in all cases today,
and I think there are some cases where it is yes,
I think the answer will increasingly become yes in the future
as the cost of inference, the cost of token generation,
asymptotically approaches the cost
of power, which is the only real operational expense input into the cost of intelligence generation.
And even today, there are lots of efficiencies we can harvest on a temporary basis that mean I can
reduce peak power by a larger amount than the token generation that I'm forgoing, simply
by operating, and this is getting a little technical, a little differently on the power
performance curve, on a part of it where I'm not losing as much performance
but I am reducing quite a bit of power.
There's a pretty good tradeoff to be made on a temporary basis, for example, for some
inference workloads.
Microsoft has done a great job quantifying this.
So there is some low-hanging fruit to harvest here, which means, I believe even today,
it makes sense to be responsive.
And in the medium and longer run, it's going to make a ton of sense to be responsive
just in terms of reducing your power bill.
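The power-performance tradeoff described above can be sketched in a few lines of Python. The curve here is purely hypothetical: it assumes throughput scales as the power fraction raised to an exponent `alpha` below 1, which is one simple way to get diminishing returns near full power. Real curves are measured per chip and per workload, and neither the shape nor `alpha = 0.5` comes from the episode.

```python
def relative_throughput(power_fraction, alpha=0.5):
    """Hypothetical concave power-performance curve: throughput = power ** alpha."""
    if not 0 < power_fraction <= 1:
        raise ValueError("power_fraction must be in (0, 1]")
    return power_fraction ** alpha


for cap in (1.0, 0.9, 0.8, 0.7):
    perf = relative_throughput(cap)
    print(f"power cap {cap:.0%}: power down {1 - cap:.0%}, "
          f"throughput down {1 - perf:.1%}")
```

Under this assumed curve, every power cap costs proportionally less throughput than the power it saves, which is the kind of temporary tradeoff being described. Whether a given workload actually behaves this way is an empirical question.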
It's a good point.
Yeah, it's a good point particularly about over time.
We're in a moment right now that is not reflective of
where we're going to be in five years or 10 years or who knows how many years, two years maybe,
but like inference should and probably will get cheaper and cheaper.
The market will be saturated with it.
At that point, the cost of energy does matter.
So that is a good point.
You alluded to one thing I wanted to ask you about.
As you think about workload flexibility, talk to me about the different workloads
and like what is more and less suited.
Obviously there's the training versus inference distinction, and I'm interested in your perspective
on that. But even within a category, even within inference, for example, like, what have you learned, having run all these demos now about, like, what types of workloads seem to be the ones where there is enough volume? They are a big enough portion of the overall workloads to matter and where they have the most flexibility.
Yeah, absolutely. There are many different AI workloads. It's more than just a simple dichotomy of training and inference. Each of these categories has lots and lots of different subtypes. I will say,
as a good reference, there's a March 2026 paper that was published actually by Emerald's chief scientist,
Professor Ayse Coskun at Boston University, with two others, both at BU and at Emerald AI,
that shows an 18 to 55 percent power flexibility opportunity across a range of different representative AI workloads,
spanning training, inference, fine-tuning, and all of their subtypes.
So there's a lot of inherent flexibility is kind of my overarching point.
And then I can go into the various kinds, as you suggested.
As you mentioned, we've now done these five demonstrations at data centers with
Nvidia, with EPRI's DC Flex initiative, and with many other partners from Oracle to
Nebius to National Grid.
And in each of these, we've tried to reenact real production grade actual workloads.
So, for example, in London, we ran workloads from real models, whether they're, you know,
OpenAI's models, Meta's open-source model, even
an Alibaba model from China, to showcase that we could achieve performance levels that real customers
find acceptable, making sure not to throttle workloads that the customer labels as mission
critical or that should not be throttled, while at the same time precisely meeting grid
objectives, whether that is responding within seconds to a scenario of a lightning strike or
reducing power by 30 or 40% during the halftime of a soccer game, where in England,
everybody turns on their tea kettles. So we believe that many AI workloads can be throttled
in a way that's acceptable to customers. And many of these will be fine-tuning, post-training,
and training-type workloads, and, by the way, some inference workloads
as well, like batch inference.
There are other workloads that,
even if they can't be throttled,
might be batched differently.
So, again, you're still basically reducing
the power consumption in one data center,
or they can be migrated.
So with Oracle, we showcased migrating AI workloads
from one location, Virginia, to another, Chicago.
And this was during
the Dominion winter peak period:
the inference queries got rerouted
in such a way that you were able
to precisely meet the Dominion grid's power constraint
while utilizing capacity far away,
while the queries moved within milliseconds halfway across the country.
So the user experience really wasn't changed.
And so if you're chatting with the chatbot,
which by the way, is just one of like a million different AI use cases,
that isn't an experience that's going to change very much
if what's happening is this geo-shifting under the surface.
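The geo-shifting idea can be illustrated with a small Python sketch. This is a toy greedy allocator, not the method used in the Oracle demonstration: the `Site` class, the megawatt figures, and the assumption that one megawatt of shifted work draws one megawatt at the destination are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    load_mw: float  # power drawn by shiftable inference work at this site
    cap_mw: float   # power the local grid currently allows


def geo_shift(sites):
    """Greedily move load from sites over their cap to sites with headroom.

    Mutates the sites in place and returns a list of (source, destination, mw).
    Any excess that no site can absorb is left where it is.
    """
    moves = []
    for src in sites:
        excess = src.load_mw - src.cap_mw
        for dst in sites:
            if excess <= 0:
                break
            headroom = dst.cap_mw - dst.load_mw
            if dst is src or headroom <= 0:
                continue
            shifted = min(excess, headroom)
            src.load_mw -= shifted
            dst.load_mw += shifted
            excess -= shifted
            moves.append((src.name, dst.name, shifted))
    return moves


sites = [Site("Virginia", load_mw=100.0, cap_mw=80.0),
         Site("Chicago", load_mw=50.0, cap_mw=90.0)]
print(geo_shift(sites))
```

With these made-up numbers, the 20 MW that Virginia is over its cap moves to Chicago, which has room for it. A production scheduler would also have to weigh network latency, server availability, and which workloads the customer allows to move.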
So my only point here is there are so many different use cases.
You know, Google, when they announced their flex and priority inference tiers yesterday,
they cited cases like background CRM updates, large-scale research simulations,
agentic workflows where a model is browsing or thinking in the background,
as cases of workloads that are inherently flexible, that you probably don't need an answer from right this second.
And, of course, that helps Google to better optimize its own AI infrastructure, right?
You might be able to use your AI servers for queries that are more urgent.
But it also helps us to throttle power use by tapping into the inherent flexibility of computational workloads.
And so the last thing I'd close on this is, historically, data centers have been optimized as a closed system.
There are computers, CPUs, GPUs, memory, storage.
There are fiber optic networks.
And so there are multiple data centers across the country.
And you optimize this closed system.
Nobody's ever considered adding within this
closed system, if you weren't listening to this podcast and you were looking at my hands,
one circle is around the data center system, and a bigger circle is around the data center plus
the grid system. No one's ever added the grid to the closed system. And once we add the grid,
then we are optimizing not just for where there are servers available or where there are
fiber optic congestion constraints, but also where there are transmission line congestion constraints
or where there is inadequate generation. And that overall optimization problem causes you to
harvest computational flexibility in a different way and often in a way that utilizes this massive
electric grid fixed asset and save everybody on.
Virtual power plants are becoming a reliable way for utilities to manage capacity, but enrolling
devices is just the start. What really matters is confidence, knowing those resources will
perform when dispatched and being able to prove it from the control room to the living room.
EnergyHub's platform handles the full picture, from near real-time forecasting to
locational dispatch, and the kind of rigorous verification that holds up when regulators,
grid operators, or leadership ask, did it deliver?
Easy enrollment creates momentum; proven performance builds trust.
That's why more than 170 utilities rely on EnergyHub to manage over 2.5 million devices
delivering 3.4 gigawatts of flexible capacity.
See what that looks like at energyhub.com.
We're living through a profound economic shift, and energy sits at the center of all of it.
Trillions of dollars are flowing into power plants, transmission lines, battery factories,
data centers, but the future of energy isn't shaped by technology alone. It's shaped by markets,
by policy, by capital, and by the institutions that connect them. I'm Alfred Johnson, CEO of
Crux, the capital platform for the clean economy. Join me for my brand new show, Critical Capital,
as I talk with people deploying capital, shaping policy and building projects. Together, we
unpack how risk is priced, how incentives are
structured and how progress is actually made. Listen to Critical Capital on Spotify, Apple,
or wherever you get your podcasts.
Are you tired of overpaying for big-name PR firms, but not
really knowing what they're delivering? Is your comms team wasting time reviewing lengthy messaging
briefs and decks instead of engaging journalists or producing content? Are you wondering
why your competitors are getting press and you aren't? FischTank PR is an award-winning climate
and energy tech, renewables, and sustainability-focused PR firm dedicated to elevating the work
of both early stage and established companies.
Whether you need to position yourself as a thought leader
in between project announcements
or translate complex ideas and technologies
into tangible, compelling stories that resonate with the media,
FischTank can help.
Check out fischtankpr.com.
That's F-I-S-C-H-tankpr.com.
One thing that, as I've spent more and more time with you
and sort of learning about what Emerald is building
and more broadly learning about the concept of compute flexibility with data centers.
One of the challenges, it seems to me, is just it's a multi-party situation.
There are a bunch of actors in any given situation.
It's not as simple as you want it to be.
It's not like there's data center and grid.
Even within the data center, quote unquote, right, somebody is operating the data center,
somebody else might be the cloud provider to the data center.
Somebody else might be running the workloads or actually being the customer.
So can you walk me through how you think about the stack of who needs to do what?
If we're going to deliver on this promise and we're going to take advantage of the 100 gigawatts of latent capacity we've got on the grid by making data centers flexible, like who needs to sign off on what?
Yeah, it is a wickedly complicated multi-party problem.
I'm delighted, Shayle,
you got comfortable getting your hands around this, and now we can work together.
Look, I go back to an earlier question you asked, which is, what's the
most critical thing that has to happen? The most critical thing that has to happen is power utilities
and system operators and regulators and governors saying, if you are willing to be power flexible,
O data center, we want you in our state. You get to skip the line. You get to connect faster.
As a flexible load fast track, you get a bigger data center, et cetera. If that happens, I believe
everything else quickly falls in line. Now, you're right. The data center is not one monolithic
entity, it comprises a lot of players. You might have, for example, a data center developer,
owner, and operator. I'll make one up, Digital Realty, a terrific one that we partner with,
that is operating a data center within which they have a tenant, right? That tenant might be one of
the many folks we've partnered with, like Nebius, for example, or Oracle or Lambda.
And within that cloud provider, by the way, it could be a hyperscaler as well, within that cloud
provider, you might then have a customer. And that customer, by the way, may not be the end customer.
You might have Together AI or Fireworks, which is an inference serving service, which is then serving
tokens and enabling an end customer to run models on them. And there may be N layers here.
And ultimately, all of those layers have to work together so that the data center at the point
of common coupling, that interconnection point to the grid, has to adjust its net withdrawal
from the system consistent with the grid signals, right?
This is complicated.
Emerald seeks to be the easy button for data center flexibility, and in order to do that,
we basically have to have modules at every layer of this stack, right?
We have a module for utilities.
We have a module for the data center operator to interact with the utility and communicate.
We have modules for the cloud operator, for the end user to have Emerald agents to help
them to most gracefully throttle the workloads that they may want to throttle.
We have agents elsewhere that are working on the on-site energy resources to harness all of those
energy resources as well.
So it is a complicated stack, but I will say everybody becomes much more willing to work
together when there's a real economic incentive, and it's the grid that sets that incentive,
which is to say you get connected faster, you get a bigger data center if you're willing to do this,
everybody else will work together.
Yeah, I agree with that.
That's the prerequisite. If the grid says the right thing,
everybody else falls into line.
You just mentioned on-site resources.
I want to talk about that for a second, too,
because I think part of what's happened
as the concept of data center flexibility
has gained more prominence,
is that it has morphed somewhat.
Sometimes people, when they say data center flexibility,
they're talking about workload flex.
Other times they're talking about,
from the grid's perspective,
what makes a flexible data center.
And in many cases right now,
what's happening is that data center developers and operators are putting a bunch of assets behind
the meter. Usually what that means is gas turbines of one kind or another, maybe some BESS,
some battery storage, maybe there's some other generation. Solar could be behind the meter as well.
But like, it's behind the meter generation and storage. And so there's one version of data center
flexibility, which is just, you know, grid says or utility says, I need you to curtail now this amount,
and you just fire up your behind-the-meter generator
and you don't do any workload flex at all.
There's another version where these things all play together.
So walk me through how you see the landscape emerging
with the relationship between behind-the-meter physical resources
and workload flex.
Totally.
I'll say a prefatory point, which is,
I believe AI factories belong on the grid.
I think it's best for everybody.
This is counterintuitive, by the way.
A lot of folks might
say, well, if the data centers just went off the grid, that would insulate the ratepayers
from the peak load increase that the data centers would cause and the bill increases,
etc. But I believe that's a little bit short-sighted. The far-sighted way of thinking about this
is, as data centers become, by the end of this decade, up to 17 percent of America's load,
and, you know, in the decade beyond, a quarter, and a third, and half of America's load,
it would be a catastrophe if data centers were entirely decoupled from the electricity system,
because the system loses their biggest source of anchor tenant revenue
and the most exciting engine of American economic growth.
It's a terrible idea to be completely off-grid forever.
But in order to achieve that,
there may be a period of time in the near term
where data centers say, I need to get online right this second.
And so therefore, I'm going to build myself bridge power.
It's highly rational.
And the hope is we will be able to quickly bring those resources behind the meter
to bear to support the broader electrical system
and connect those data centers.
With NVIDIA, we made a major announcement at CERAWeek that
NVIDIA has a reference architecture.
It's called DSX.
It's how AI factories should be laid out and should operate.
One element of it is DSX Flex, the capability to be flexible,
and Emerald is a software partner that helps to operationalize that.
And we joined six of the largest American power companies to say,
even if you are putting on bridge power,
we're going to incorporate that into the DSX reference design.
We're going to call them hybrid AI factories,
and we're going to make sure that they can work together as a single unit
to provide services back to the grid.
In some sense, this is a super-flexible AI factory facility.
And the reason for this is you can coordinate the on-site resources,
the gas generators, the batteries, alongside the computational flex.
Because AI factories, these token factories,
are inherently flexible, as I mentioned,
that 18 to 55% inherent flex built into AI workloads,
you can take all of this together, and when you do get a grid connection,
and ideally it comes faster than it otherwise should because you're flexible,
you're able to provide real services back to the grid.
So one of the things that at Emerald we've been focused on
is orchestrating not only the computational resources,
slowing down workloads, moving workloads,
but also doing that in tandem with the on-site energy resources,
generators, cooling, batteries, fuel cells.
The reason that's important is you might have a microgrid on-site.
It may be operated through the software systems of a Siemens or an Eaton or a GE Vernova,
all of whom actually just entered this round in Emerald AI that you led, Shayle.
And we will play very nicely with all of them.
We'll integrate and we'll say Emerald will help to recruit the amount of generation and flexibility
from these on-site energy resources and coordinate it with the computational flexibility from the GPUs on-site.
And that entire unified amount of flexibility is what the grid sees in terms of a change in the net withdrawal of energy from the grid.
So I look at bridge power, behind-the-meter resources, as really a way to supercharge flexibility and not a way to remain as a permanent island.
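Editor's sketch: the "unified amount of flexibility" described above can be read as simple arithmetic at the interconnection point. The snippet below is a minimal illustration, not Emerald's software; the `SiteState` class, the `net_grid_withdrawal` function, and every number are hypothetical, chosen only to show how compute throttling, cooling, generators, and batteries add up to one net change that the grid sees.

```python
from dataclasses import dataclass


@dataclass
class SiteState:
    """Snapshot of a hypothetical hybrid AI factory; all values in MW."""
    compute_mw: float            # GPU/IT load after any workload throttling
    cooling_mw: float            # cooling and other facility overhead
    generator_mw: float          # output of on-site generators or fuel cells
    battery_discharge_mw: float  # positive when discharging, negative when charging


def net_grid_withdrawal(state: SiteState) -> float:
    """Power drawn from the grid at the interconnection point, in MW."""
    gross_load = state.compute_mw + state.cooling_mw
    onsite_supply = state.generator_mw + state.battery_discharge_mw
    return gross_load - onsite_supply


# Illustrative numbers only (assumptions, not figures from the episode).
baseline = SiteState(compute_mw=80.0, cooling_mw=20.0,
                     generator_mw=0.0, battery_discharge_mw=0.0)
during_event = SiteState(compute_mw=68.0, cooling_mw=17.0,
                         generator_mw=10.0, battery_discharge_mw=5.0)

reduction = net_grid_withdrawal(baseline) - net_grid_withdrawal(during_event)
print(f"Grid sees a {reduction:.0f} MW reduction in net withdrawal")
```

In this made-up case, half of the 30 MW reduction comes from throttled compute and the cooling it no longer needs, and half from on-site resources; the grid only observes the total.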
Yeah, I'll tell you the way that I think about it, and you can tell me if this resonates with you, you know, there's a dispatch curve in electricity in general.
And when there's a certain amount of demand, you start at the bottom of the dispatch curve, which is basically the cheapest resource to
generate and you keep moving up the dispatch curve until you meet the demand, right? And so in the
context of the broader electricity market, the low end of the dispatch curve is stuff with no
marginal cost, which is solar and wind, mostly, right? And then you get further and further up and
other things get dispatched more and more. When there's a single asset, a single data center that
has multiple resources that it can draw upon to meet a need, which in this case is going to be,
you know, some amount of curtailment from the grid's perspective, it's kind of like, it's kind of
like a little mini dispatch curve, right?
And they may have one thing that can go in that dispatch curve, or they may have six.
It doesn't really matter.
And if you think about it in that context, then it should be that workload flex is the bottom
of that dispatch curve.
In other words, the cheapest thing to deploy, assuming that you still, you know, to your
point, you're taking advantage of the latent inherent flexibility.
In other words, you're not sacrificing customer SLAs.
you're not sacrificing performance to customers, things like that.
If you take that to be true, then workload flex is the cheapest thing you can do.
And you should do as much of it as you can as long as you don't sacrifice customer performance.
Then, if you need more, which you may well, you should then dispatch things that cost more money.
And that may be firing up your generator that you have behind the meter.
It may be firing up your battery or dispatching your battery or fuel cells or whatever it is.
All those things come at a significantly higher cost to dispatch.
but you might be able to get more out of them, right?
You might have a gas generator behind the meter
that is rated to the same capacity
as the entire data center.
And so you can, you know, if you need to flex down to zero,
that's the way to do it.
But if you need to flex down 20%,
it actually might make more sense
in most cases just to do the workload flex.
So I think of it as this like little mini dispatch curve
that some data centers will ultimately have,
but really only the ones that do have
the behind-the-meter resources.
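Editor's sketch: the "mini dispatch curve" described above is a merit-order stack, so it can be illustrated in a few lines. This is a static simplification under stated assumptions: the `FlexResource` class, the `dispatch` function, the capacities, and the costs are all hypothetical, and workload flex is assumed to be the cheapest block only while it stays within customer SLAs.

```python
from dataclasses import dataclass


@dataclass
class FlexResource:
    name: str
    capacity_mw: float    # maximum curtailment this resource can provide
    cost_per_mwh: float   # assumed marginal cost of using it


def dispatch(resources, required_mw):
    """Fill a curtailment request from the cheapest resource upward.

    Returns the dispatch plan and any shortfall that could not be met.
    """
    plan = []
    remaining = required_mw
    for res in sorted(resources, key=lambda r: r.cost_per_mwh):
        if remaining <= 0:
            break
        used = min(res.capacity_mw, remaining)
        plan.append((res.name, used))
        remaining -= used
    return plan, max(remaining, 0.0)


# Hypothetical 100 MW data center; all capacities and costs are made up.
stack = [
    FlexResource("gas_generator", capacity_mw=100.0, cost_per_mwh=90.0),
    FlexResource("workload_flex", capacity_mw=25.0, cost_per_mwh=5.0),
    FlexResource("battery", capacity_mw=30.0, cost_per_mwh=40.0),
]

plan_20, short_20 = dispatch(stack, 20.0)     # a 20% curtailment request
plan_100, short_100 = dispatch(stack, 100.0)  # flex all the way to zero
print(plan_20)
print(plan_100)
```

With these assumed numbers, the 20 MW request is met by workload flex alone, while flexing to zero draws on all three resources in cost order.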
I love the dispatch
curve analogy. All I'll say is I don't think it's a static dispatch curve.
So in electricity markets, it's always that gas peaker that's going to set that marginal
price when you have sufficient demand, right?
It's always that skinny, pointy one at the right side of the dispatch curve.
Whereas in the data center, you'll have a complicated, dynamic, constantly changing dispatch
curve.
I agree with you that there's always going to be a fat short part of the, on the left-hand side
of the dispatch curve that's going to be some latent workload flexibility that we can just harvest.
There will be customers who are willing to tolerate a little bit of flexibility,
and there will be workloads that are inherently tolerant to some flexibility.
I'll just note here, by the way, there are so many reasons that AI users are tolerant to workload
flexibility because all other kinds of things can happen in a data center that might require them to be flexible.
So power is just the next thing that we ask them to be flexible about.
But in addition, in addition, there will be workloads that are less inherently flexible
or that are higher up on that dispatch curve.
They might sandwich the battery.
The battery, by the way, might have an operating constraint.
It can provide you a certain amount for a certain amount of time.
That sets the width of that bar, so to speak.
But you might have some very interesting dispatch algorithms.
One day you might even have what I call energy token
arbitrage or watt-token arbitrage, where you might actually choose to throttle tokens even before
the grid actually requires you to do so because it's economically optimal in this particular case to
charge your battery, let's say. And I believe that as we build in intelligence such as forecasting,
which many of our five demonstrations now have done, we'll be able to forecast on both sides,
both the grid side when we expect an event to arrive and on the AI side, when we expect a job
that is more or less flexible to arrive on the scheduler.
All of this means it's just a more complex dispatch curve,
but I love the analogy.
And for us grid wonks, it's a useful organizing principle.
Yeah.
It's a good point on the charge-the-battery one.
I think people haven't really thought this one through, right?
We're going to put a lot of batteries behind the meter at data centers.
Like, I'm pretty convinced that that's going to happen.
But let's say that you're a data center that has 200 megawatt,
you're a 200 megawatt data center,
and you have a 200 megawatt interconnect,
and you add a battery.
How do you charge that battery?
Right? Like, you kind of either need a bigger interconnect.
Your total load is actually 200 megawatts plus the size of the battery, which is going to be big.
Or you need to figure out how to be flexible on your power consumption from the data center,
such that some of the time you can be simultaneously charging the battery and pulling from the grid.
And so that's like an inherent workload flex requirement that you're going to have to solve
unless you are going to get a much bigger interconnection, which nobody can get.
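Editor's sketch: the charging problem above comes down to headroom under the interconnect limit. The snippet below works through the 200 megawatt example; the function names, the 80 MWh battery size, and the 10% flex figure are hypothetical, and charging losses are ignored.

```python
def charging_headroom_mw(interconnect_mw: float, site_load_mw: float) -> float:
    """Grid capacity left over for battery charging, in MW."""
    return max(interconnect_mw - site_load_mw, 0.0)


def hours_to_charge(battery_mwh: float, headroom_mw: float) -> float:
    """Time to fill the battery from empty, ignoring charging losses."""
    if headroom_mw <= 0:
        return float("inf")  # no headroom means the battery never charges
    return battery_mwh / headroom_mw


interconnect_mw = 200.0  # from the example in the conversation
full_load_mw = 200.0     # data center running flat out
battery_mwh = 80.0       # assumed battery size, for illustration only

# At full load there is no room under the interconnect to charge.
no_flex = charging_headroom_mw(interconnect_mw, full_load_mw)

# Flexing workloads down by an assumed 10% frees 20 MW for charging.
flexed_load_mw = full_load_mw * 0.9
with_flex = charging_headroom_mw(interconnect_mw, flexed_load_mw)

print(no_flex, hours_to_charge(battery_mwh, no_flex))
print(with_flex, hours_to_charge(battery_mwh, with_flex))
```

Under these assumptions, a 10% workload flex gives 20 MW of headroom and fills the battery in about four hours, without a larger interconnection.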
Yeah, completely agree.
And Shayle, I guess I just don't want to lose sight of the overall, the overarching story here, which 38 minutes in I'm now going to share.
That, what you just shared, Shayle, is an important functionality, and I'll call it the fourth bird that you can kill with a stone.
But the first three birds are, first, let's get AI factories, data centers connected much more quickly and at larger capacities to grids thanks to flexibility.
Second, let's keep rates low and stable, thanks to flexibility by avoiding unnecessary grid build-out.
We've still got to build, but nevertheless, if we can harness flexibility, we can build less quickly, to meet less quickly rising peak demand, while bringing on massive amounts of energy demand, megawatt-hour demand, from data centers that help to pay for the whole system.
And third, let's keep the system reliable.
The third bird here is if AI factories can respond to system needs, that lightning strike, that soccer game tea kettle spike, a heat dome we demonstrated in Portland, Oregon with PGE, the U.S.
and NVIDIA, and many other potential reliability issues, well, then we'll be able to,
with one solution, we'll be able to basically get the grid we want and the AI adoption that we
want.
It's that really rare, holy grail solution.
It's why there's so much chatter about it, but you've correctly laid out the reasons
it's hard.
There's a lot of actors that you have to coordinate.
There's a lot of ongoing forces, such as the push to just go entirely off-grid.
And of course, there's the lack of those differentiated service tiers from the electric power system.
I think later this year, as you know, Emerald and NVIDIA and some other partners, Digital Realty, EPRI, Dominion, and PJM, we will put together the world's first 100-megawatt commercial-scale AI factory that is truly power flexible.
It's custom designed from the ground up to be power flexible, and it's going to be able to respond precisely
to all of these grid needs, but at a commercial scale.
My hope is the community sees that in parallel,
we're getting to the point where more electric utilities are understanding
they have to offer these differentiated service tiers
and give you accelerated interconnection and larger connection sizes.
And that's when, you know, late 2026 and 27, this really takes off
and it kind of solves all three of those problems.
All right, Varun, this was fun, as always.
Appreciate you coming back.
Really appreciate it, Shayle.
Thanks for having me.
This show is a production of Latitude Media.
You can head over to latitudemedia.com for links to today's topics.
Latitude is supported by Prelude Ventures.
This episode is produced by Max Savage Levenson, Anne Bailey, and Sean Marquand.
Mixing and Theme Song by Sean Marquand.
Stephen Lacey is our executive editor.
I'm Shayle Kann, and this is Catalyst.
