Catalyst with Shayle Kann - Frontier Forum: An energy-first approach to data centers
Episode Date: October 22, 2024
AI is enabling a multitude of solutions across power, industry, and transportation. But AI energy demands are increasingly stressing the electric grid, creating a bottleneck for growth and new challenges for clean energy supply. The mounting tension highlights the need for an energy-first approach to computing. Developer Crusoe is building AI infrastructure that takes advantage of clean energy to power workloads for AI modeling. Likewise, Nvidia, Crusoe’s primary GPU supplier, has been consistently improving the energy efficiency of its GPUs. Both demonstrate the innovation that’s happening in the marketplace to create a 'climate-aligned cloud' for customers. In the AI era, how do you build data centers with an energy-first approach? In this Frontier Forum, Stephen Lacey explores all sides of the AI-energy nexus in a conversation with Chase Lochmiller, the co-founder and CEO of Crusoe. They discuss innovations in data center design, why the energy demands of AI could be higher than projected, and why that shouldn't scare us. Chase Lochmiller will be speaking at Latitude Media’s Transition-AI conference on December 3rd in Washington, DC.
Transcript
This is a Frontier Forum, brought to you by Latitude Studios.
Long before most of us woke up to the high-powered demands of artificial intelligence, Chase Lochmiller witnessed it firsthand.
Chase specialized in AI at Stanford.
He later became a quantitative researcher developing algorithms for high-frequency trading.
And those algorithms were built using both classical machine learning and deep learning through neural networks.
We started training workloads on GPUs instead of CPUs.
The training time extended out, and the data center
bill went up significantly. And it was kind of one of these things where, you know, you'd get the
data center bill at the end of the month and, you know, it'd be kind of an eye-popping number in
terms of how much money we were spending on power. Now, it made sense for us, but, you know,
it was just kind of one of those things that always stuck with me. Later, Chase became a general
partner at Polychain Capital, a multi-billion dollar cryptocurrency and blockchain investment
fund. That is, of course, another industry that requires massive amounts of power, and he soon
became interested in reducing the environmental impact of computation. After I left Polychain, I was
really wanting to do something at the infrastructure layer of computing.
And I've been thinking quite a bit about that energy side of things.
Chase linked up with an old friend named Cully Cavness,
who'd worked across investment banking, renewable energy development, and fossil fuel exploration.
And while working in the fossil fuel business, Cully witnessed the problems with gas flaring.
And so he and Chase co-founded Crusoe Energy Systems,
which utilized wasted gas to run small data centers,
which were built right next to drilling operations.
By bringing mobile and modular data centers to the location of this wasted flare gas
and then utilizing it to power the data center,
we were essentially able to bring a market to an otherwise stranded energy resource.
And then, as Crusoe scaled its digital flare mitigation approach,
a new computational challenge emerged: mass-market AI products.
Seeing the opportunity, the Crusoe team raised hundreds of millions of dollars
to build a fleet of permanent data centers that, if fully constructed, could reach gigawatts
of capacity. Since then, it's continued to invest in this theme of taking this very energy-first
approach to developing compute infrastructure and trying to do things in a lower cost as well as
lower impact capacity to power future innovations. I'm Stephen Lacey. This week, we're featuring a
conversation with Chase Lochmiller, the co-founder and CEO of Crusoe. It was recorded live as part of
Latitude Media's Frontier Forum series. In the AI era, how do you build data centers with an energy-first
approach? Crusoe is scaling what it calls climate-aligned computing infrastructure, spanning data
centers, networking, and a cloud product. The company has raised billions in both venture capital
and project finance to deploy it. I talked with Chase about innovations in data center design,
why the energy demands of AI could be higher than projected, and why that shouldn't necessarily
scare us. I'm a firm believer in AI's potential to completely transform the entire human experience,
and with that, it will require tremendous amounts of power.
So let's talk more about that shift. In the last year or so, you've expanded to focus on these permanent data centers, more specifically to serve AI.
These are facilities in the hundreds of megawatts.
They will have hundreds of thousands of GPUs.
What does it mean to apply an energy-first lens to building those larger data centers?
We are thinking about both the cost and the environmental impact of the energy resources that are going to be powering our
data centers. So, you know, in the same way that we solved this natural gas flaring problem,
we've tried to focus on building data centers and building large clusters for AI computing
in locations that have low-cost, clean, and abundant energy. So markets like West Texas
that have been, you know, heavily developed with wind and solar, but have massive amounts
of curtailment and negatively priced power, you know, other markets like upstate New York,
You know, we're working on a former Alcoa factory that's, that's, you know, powered by a large hydro dam that's, you know, being currently massively underutilized.
And so a lot of these, like, more brownfield type assets that can be utilized in a way to power energy-intensive compute workloads, as well as new greenfield development where, you know, you can look at an asset and say, hey, it sure would make a lot of economic sense if you had a use for the power to build generation in this location.
Well, we can say, great, let's partner up with an IPP and develop a data center alongside new greenfield generation capacity.
What we do is, you know, we partner with renewable energy producers like these IPPs, and we can partner with them behind the meter.
We're taking load, you know, behind the meter.
And we leverage the existing substation infrastructure and actually set up a grid connection as sort of a backup, you know, resource.
So you looked at this space and you looked at AI specifically and determined that you were in a once-in-a-lifetime moment to meet the computing demands of this technology as cleanly as possible.
So there are a lot of different projections out there for how big the energy impact will be.
IEA, Goldman Sachs both believe electricity demand will double in the next three to five years.
Morgan Stanley projects that in 2025, generative AI alone could account for a third of the total computational demand from data centers we saw in 2022.
Regulated utilities could see capital investments of $5 to $10 billion annually to meet this resulting power demand.
When you look at these projections and what is actually happening in the industry, how explosive do you think energy demand from data centers will be,
and why did you decide that this was the area that you wanted to serve?
I sort of think that the demand is actually under forecast.
All that being said, I think there's a lot of incredible opportunities with that.
Power usage on the surface isn't necessarily a bad thing,
especially if it's advancing humanity forward and advancing and developing all these incredible new technologies,
including inventing new ways to mitigate climate change and sort of advance an effective energy
transition. You know, we work with a number of different companies that, you know, are doing just
that, ranging from advanced physics modeling systems to develop next generation fusion architectures,
developing, you know, more advanced climate adaptation strategies with, you know, advanced weather
modeling techniques, you know, developing new battery chemistries. And, you know, we just had a great
press release with SES around, you know, a lot of the battery chemistry modeling
that is fundamentally pushing forward the ways that we can invent new techniques to basically
create more effective grid storage battery solutions. We were even talking to someone that's
working on a custom engineered material that's empowered by a foundational inorganic chemistry
model that can provide more cost-effective and more engineered direct air capture systems.
I think there's just a lot of very, very interesting innovations that are going to take place from this technology, but we do believe that it's going to consume tremendous amounts of power. I think we kind of view that as an opportunity, that load itself. If that load can catalyze net new greenfield generation that's clean, whether it's building data centers in areas like West Texas that have abundant wind and solar and you can deploy large battery clusters as the cost curve comes down, or, you know,
exploring, you know, one area we're spending a tremendous amount of time is on carbon capture and sequestration, so net new gas generation with carbon capture and sequestration, where we can use a lot of the existing fossil fuel production and infrastructure to produce power, but capture the negative impact from the CO2 emissions there. So we're sort of looking at everything. And, you know, I think the energy demand is immense. You know, one other great example of the
type of deployment that we do is, you know, we're doing quite a bit in Iceland, where geothermal
and hydropower is low-cost, clean, and abundant. And we're able to sort of develop these large
GPU clusters in Iceland powered entirely by clean energy. And, you know, obviously there's
benefits there in terms of it being an easier environment to manage from a cooling perspective.
But, you know, we think that's increasingly going to become an important market to
power next generation AI applications.
I appreciate that perspective because I think when you talk to, let's say, folks at the large
tech companies privately, they'll say, yeah, this is going to use a lot of power.
But then publicly they'll say, no, we think we can solve for this problem.
But, you know, we think the benefits to society are much greater than the resulting power
demand increases.
And they hedge a little bit.
What you're saying is like, yeah, this is going to take a lot of power.
And, yeah, I think that the net benefit is going to be
very high for society and like let's just deal with the power increases. Is that a fair way to
characterize it? I like that you're tackling it head on and not like hedging. Yeah, I mean,
we need to address it head on because, you know, and I think what you're seeing is that like,
I mean, just if you think about just the power consumption of an H100, right, which is the current
generation of Nvidia chips, like, you know, if every, you know, American adult used half an H100
as, you know, a form of co-pilot for their, you know, daily workflows and, you know,
social interactions, et cetera. Like, that would require 250 gigawatts of power, right? These are,
these are orders of magnitude bigger than the forecasts that people are providing. And, you know,
these models are only getting bigger, more complex. And, you know, there's a bit of a positive
reflexivity to it, right? Where it's like, you know, as you actually get utility from using these,
you know, AI applications, whether it's for, you know, scientific discovery,
or social interactions, as they get better, you use them more, right?
That's like the positive reflexivity aspect to this that is going to therefore drive more
demand for more compute and that is going to consume more power.
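The back-of-envelope figure above can be reproduced as a rough sketch. The conversation supplies two inputs: half an H100 per American adult and, elsewhere, about 700 watts per H100. The adult population (roughly 260 million) and the overhead multiplier for servers, networking, and cooling are assumptions added here for illustration, not numbers from the conversation.

```python
# Back-of-envelope estimate of the power needed if every adult used a
# fraction of one GPU. Values marked "assumption" are illustrative and do
# not come from the conversation.

US_ADULTS = 260e6      # assumption: rough count of American adults
GPU_FRACTION = 0.5     # from the conversation: half an H100 per adult
H100_WATTS = 700.0     # from the conversation: about 700 W per chip


def fleet_power_gw(adults, gpu_fraction, watts_per_gpu, overhead=1.0):
    """Total power in gigawatts.

    overhead is a hypothetical multiplier covering everything beyond the
    chip itself (host servers, networking, cooling); 1.0 means chip only.
    """
    return adults * gpu_fraction * watts_per_gpu * overhead / 1e9


def implied_overhead(target_gw, adults, gpu_fraction, watts_per_gpu):
    """Overhead multiplier that would be needed to reach target_gw."""
    return target_gw / fleet_power_gw(adults, gpu_fraction, watts_per_gpu)


chip_only_gw = fleet_power_gw(US_ADULTS, GPU_FRACTION, H100_WATTS)
needed = implied_overhead(250.0, US_ADULTS, GPU_FRACTION, H100_WATTS)

print(f"Chip-only estimate: {chip_only_gw:.0f} GW")
print(f"Overhead multiplier implied by 250 GW: {needed:.2f}x")
```

Under these assumptions the chips alone come to roughly 90 gigawatts, so the 250-gigawatt figure would imply an all-in draw a little under three times the chip power. The conversation does not say which assumptions sit behind the quoted number.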
I don't think this is an unsolvable challenge, is my point.
And I think it's actually an interesting opportunity for the ICT industry at large to
basically help shape what the future of, you know, our grid and what our generation infrastructure
looks like. Because, you know, one of the superpowers of AI that we haven't really talked about
is that, you know, it is far more tolerant of latency than many, many other applications. So
you can actually position where that load goes, you know, and you can really take this sort of
energy-first approach. That's why, you know, we're building data centers
in West Texas. It's not a traditional data center market. But a lot of people are coming there because
the product itself is so compelling from clean and large-scale data centers that can power
their compute workloads. So I want to get to some of the solutions. Just another question
about the power supply issue. How much of a bottleneck is power supply compared to, say,
chip availability right now for scaling AI? Is power the main constraint?
Power is the main constraint right now. I would say, you know, the bottlenecks in AI kind of like have moved around. Like, a year ago today, there was like this, you know, fever of, you know, being able to get, you know, enough chips. And I think Elon Musk famously said something along the lines of like, you know, Nvidia H100s are significantly harder to buy than illegal drugs right now. And, you know, it was definitely like a crazy fever to, you know, get the chips.
But once the data center capacity was sort of fully absorbed, you know, we ended 2023,
sub 2% in vacancy rates across data centers in the U.S., you know, when you think about what
that is compared to, you know, another commercial real estate asset, that's like, you know,
tiny crumbs of capacity that are sort of spread across people's portfolios.
You know, today, I think it's sub 1%.
So, you know, there's just not enough data center capacity.
And what's limiting that data center capacity build oftentimes is the ability
to access the power. And there's multiple different reasons for that. It can be grid interconnection
queues. It can be high voltage transformers. It can be switchgear. It can be backup generation.
You know, there's all these different components in the overall supply chain that prevent us from,
you know, being able to sort of meet the demand side of the industry. So talk about the cloud product
that you've developed. You call this the climate aligned cloud. You know, you're helping some of these
companies actually train their models in a less energy intensive or emissions intensive way.
How is your cloud product actually structured for these customers?
Yeah, for our customers, what they consume is, you know, virtual IT resources.
So what we provide is virtual machines and compute, storage, and networking in a high-performance
configuration that's really purpose-built and designed for AI innovators to run generative
AI applications, train large language models, and do so with sort of the highest performance,
compute and networking architectures.
Now, what makes it special and, you know, sort of, or one of the things that makes it
special for those that care about the emission footprint is that it's powered, you know, by clean
energy.
And like I mentioned, we are making significant investments in Iceland.
This is not a new trade, you know, so to speak.
you know, the aluminum smelters are famous for going to Iceland to run the very energy-intensive
process of smelting in Iceland. So they actually ship the raw aluminum ore across the ocean.
They do the smelting in Iceland, and then they sort of ship the finished products to their end
destination. Well, if you think about a large language model and what needs to be done to train one,
you know, this is a far, training a large language model is a far better manifestation of that trade,
where you can move data across subsea cables
to get to a data center based in Iceland.
You can run your backpropagation and large-scale
pre-training on significant amounts of data
to train your LLM,
and then you can sort of ship the finished product
to wherever you need it to go,
or you can run inference by sort of feeding tokens,
again, across those subsea cables,
and then getting the output back to wherever you need it.
So I think there's this philosophy
that moving data is quite a bit easier than moving, you know, real world physical materials.
And, you know, it's a way of sort of co-locating the energy-intensive compute workload
in an area that has that low-cost, clean, and abundant energy.
What's harder?
Building the data center, building the network, or building the cloud product?
They're all hard for different reasons.
They all sort of face their own set of challenges and engineering problems.
We've sort of brought together a great group
of experts across each individual component.
Building a data center, you're sort of amassing
a lot of different labor and trades to make it happen.
And I think that's actually one of the interesting phenomena
of this AI boom that not a lot of people are talking about
is sort of the boom in blue collar labor workforces.
So we're experiencing massive shortages of electricians,
of welders, of plumbers, like these large,
these large-scale liquid cooling architectures that we've described that require tremendous
amounts of plumbing, right? So you're building all these different pipes to move water around to
cool the data center. And, you know, a lot of these blue-collar trades are just inherently,
we just don't have enough people to do them. And it's sort of creating a boom for that whole
sector of the economy and a revitalization of, you know, kind of areas like the Rust Belt. So
to me, that's one exciting trend from a labor perspective. I think there's a lot of challenges
from that angle on the data center side.
The network, it all depends on kind of where you're at, you know, the complexity there,
but, you know, there's a bunch of different, you know, facets of the network engineering
challenges that we face between, you know, the long-haul physical network of how we're going
to get data in and out of the data center to the high-performance network, the local area
network within the data center that enables people to share data from, you know, GPU to GPU
in a super high performance capacity
with our rail-optimized
networking architecture.
And being able to virtualize
all that and do so through
our software-defined network is like
a whole new set of
engineering challenges that
have been fun to work through
as a business.
Are there any big picture changes
you see in the data center industry
right now
that you think are shaping
the industry broadly or shaping the size of data centers that you're pursuing? What do you
think some of those big shifts are right now? I think the biggest shift is just the scale.
Like the number of people I talk to that are like, yeah, I want a gigawatt data center. I mean,
it's like a remarkable number of people that are asking for that scale of capacity, right?
Will you build at that scale, or are you in the hundreds of megawatts? Yeah, that's like in our
pipeline. I think we have about 12 gigawatts in our pipeline, either in development or
in advanced commercial discussions.
So it is absolutely kind of in our wheelhouse to kind of go that big.
But I do also think that people are trying to find ways because some of the folks asking for
one gigawatt actually want 10 gigawatts.
And that sheer scale, it's a very challenging problem to kind of get in a single location.
And so people are looking at ways to sort of link together data centers across, you know, different geographies and sort of create, you know, a more synchronized, decentralized computing footprint.
I think one thing to remember for folks, especially for, you know, training these cutting edge foundational models, is that one of the complexities that we've actually had to engineer around is that the cluster itself is, like, moving
in sync with one another. So, you know, you're sort of, uh, all of the data is sort of being
broadcast, um, out across this high performance network, you know, to the GPUs. They're running
their compute workloads and then, and then publishing the results to everything. So it's,
it's almost like the cluster is breathing. Like you have these like big spikes in power and then
sort of low, low spikes in power and then big spikes in power. And it's like basically every
GPU is sort of moving in sync. Um, and that frequency and, you know, they're doing this hundreds of
times or thousands of times a second. And with that crazy frequency of power demand draw,
it sort of has introduced other complex challenges in terms of power management.
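As a toy illustration of why synchronized GPUs make power management harder, the sketch below compares the swing in total power draw when every GPU alternates between a compute phase and a communication phase in lockstep versus when the phases are staggered. The wattages, the even split between phases, and the cluster size are hypothetical. Real training clusters stay synchronized by design, so the staggered case is there only for contrast.

```python
# Toy model: each GPU alternates between a high-power compute phase and a
# low-power communication phase. All numbers are hypothetical.

PEAK_W = 700.0    # assumed draw while computing
IDLE_W = 200.0    # assumed draw while waiting on the network
PERIOD = 10       # time steps per compute/communicate cycle


def gpu_power(step, phase_offset):
    """Power of one GPU at a time step: first half of the cycle is compute."""
    position = (step + phase_offset) % PERIOD
    return PEAK_W if position < PERIOD // 2 else IDLE_W


def cluster_swing(phase_offsets, steps=100):
    """Difference between the highest and lowest total cluster power."""
    totals = [
        sum(gpu_power(step, offset) for offset in phase_offsets)
        for step in range(steps)
    ]
    return max(totals) - min(totals)


num_gpus = 1000
in_sync = [0] * num_gpus                            # every GPU in lockstep
staggered = [i % PERIOD for i in range(num_gpus)]   # phases spread evenly

sync_swing = cluster_swing(in_sync)
staggered_swing = cluster_swing(staggered)

print(f"Swing with GPUs in sync: {sync_swing / 1000:.0f} kW")
print(f"Swing with staggered phases: {staggered_swing / 1000:.0f} kW")
```

In lockstep, the swing grows linearly with the number of GPUs, which is the "breathing" effect described above.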
So you work across the entire value chain. When you look at future efficiencies of chip
design, like Nvidia chip design, how are you collaborating on data center design
to maximize performance and sustainability potential?
I think the big architectural shifts that we have to engineer around are really this power density aspect.
If you look at like the last generation Nvidia chip, it was the Ampere.
So the A100, that consumed about 300 watts per chip.
The Hopper, the current generation, the H100, H200, those are about 700 watts per chip.
And the Blackwell is sort of the next generation, the GB200.
That'll be about 1,200 watts per chip.
So like the power density is going up, which sort of, you know, I sort of alluded to this earlier, but, you know, that's the biggest engineering challenge from the data center perspective of, you know, how do you cool those systems effectively. You know, from a sustainability aspect, you know, I think there's, you know, there's the energy usage, but there's also things like water usage, right? So we've tried to do things with closed loop systems. We're able to sort of minimize the amount of, you know, net water
that we're consuming from the data center's perspective.
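To make the power density trend concrete, the short sketch below computes the generation-over-generation increase from the approximate per-chip figures quoted above. As a purely hypothetical extension, it also computes the chip-level power of a rack. The 72-chips-per-rack value is an illustrative assumption, not a figure from the conversation, and the totals ignore CPUs, networking, and cooling.

```python
# Approximate per-chip power figures as quoted in the conversation (watts).
CHIP_WATTS = {
    "A100 (Ampere)": 300,
    "H100 (Hopper)": 700,
    "GB200 (Blackwell)": 1200,
}

CHIPS_PER_RACK = 72  # hypothetical value chosen only for illustration


def growth_factors(chip_watts):
    """Ratio of each generation's power to the previous generation's."""
    names = list(chip_watts)
    return {
        names[i]: chip_watts[names[i]] / chip_watts[names[i - 1]]
        for i in range(1, len(names))
    }


def rack_kilowatts(watts_per_chip, chips_per_rack=CHIPS_PER_RACK):
    """Chip-only rack power in kW (excludes CPUs, networking, cooling)."""
    return watts_per_chip * chips_per_rack / 1000


for name, factor in growth_factors(CHIP_WATTS).items():
    print(f"{name}: {factor:.2f}x the previous generation")

for name, watts in CHIP_WATTS.items():
    print(f"{name}: {rack_kilowatts(watts):.1f} kW per {CHIPS_PER_RACK}-chip rack")
```

On the quoted figures, per-chip power rises fourfold from the A100 to the GB200, and under the assumed rack size the heat to be removed from each rack rises by the same factor.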
What kind of advancements are firing you up right now?
What gets you excited about the future?
I had a really cool conversation with this company, Orbital Materials.
And Orbital Materials, you know, it was a, the team had previously come from DeepMind
working on a lot of the foundational models for inorganic chemistry.
And one of the things that they were proposing to me was
basically a purpose-built material for our heat profile.
Like, you know, our big, one of our big, like, byproducts of running a data center is heat, right?
It's, you know, either hot water or hot air sort of coming out of the cooling systems.
Now, you know, that's typically going to waste or, you know, we have to, you know, chill the water back down or, you know, the hot air is kind of going out the tail end of the data center.
But what was being proposed is, like, you can actually custom engineer a material,
a direct air capture material that can sort of utilize heat at a specific temperature range to basically absorb carbon and also, you know, in the water case, chill the water.
You know, this is a bit like science fictiony, but like you could imagine a future where, you know, you're powering the data center with on-site clean energy, the cooling loop of the system is actually running a direct air capture system,
and you're sort of creating a net carbon negative data center running large-scale AI workloads.
Are there any other major advancements that you're working on that you will iterate over the course of coming data centers?
Do you think we'll see major changes from over the course of your next couple of data centers?
How much do you expect things to change technologically?
Yes, I do,
and a lot of it is coming from
how do we optimize the process of deploying
large-scale clusters of GPUs
and how do we stand up these data centers
in a faster and more cost-effective manner?
So a lot of that sort of comes back
to this modularity aspect of
if you're going to build in a very remote market
that has this low-cost and clean energy,
oftentimes getting labor there
for on-site construction
can be a huge
challenge. So, you know, the more you can do off-site, the faster you can deploy on-site,
and the lower your costs can be because you're actually doing a lot of the hard manufacturing
work in a controlled manufacturing environment. I think a lot of the work we're doing is directly
in that domain. A lot of the other work is in terms of, you know, how do we manage cabling in a more
cost-effective and speed-effective manner? You know, if you look at one of these big clusters that
we're deploying, it's over a million strands of fiber, right?
That's just like a crazy amount of things to manage.
So anything that you can do and value engineer from the data center perspective
to streamline a lot of those processes is pretty cool.
So there's a lot of concern about the energy impact of this industry.
And what I'm hearing you say is you think we have a lot of the solutions to solve for the problem.
Of course, power
availability is a huge constraint right now for scale. But it feels to me like you think we have a lot of
the business models and clean energy technologies to solve a lot of the problem. Is that right?
And you seem pretty optimistic. Like, is there
anything that you want the people who are pessimistic about our ability to solve this challenge
to know about what you're seeing out there that makes you optimistic? Look, I'm, I'm optimistic
in terms of, you know, the load itself has control over, you know, how we're going to be demanding
power just as an industry. And I think, you know, people are philosophically aligned with, like,
people want the future of AI to be clean and sustainably powered. You know, I think to the pessimists,
what I would say is that the technological leap forward that we stand to gain from deploying this
technology at scale is so massive in terms of being able to invent any solution we could imagine
that would help solve all of our sustainability challenges as a society at large is like
right there for us. It's like you could take two different perspectives, right? One is like,
let's put the genie back in the bottle and, you know, we opened Pandora's box and, you know,
we need to shut this thing down before it destroys us all.
My perspective is like that's not going to happen.
So, you know, why try to fight it? Recognize that the innovative potential of this thing is absolutely enormous in terms of, you know, inventing things that, you know, fundamentally wouldn't be possible with, you know, traditional techniques across every industry, but, you know, especially around sustainability and clean energy production.
Chase Lochmiller is the co-founder and CEO of Crusoe.
Chase, I really enjoyed this a lot.
Thank you.
Thank you, Stephen. I enjoyed it as well.
This conversation was recorded live as part of Latitude Media's Frontier Forum with Crusoe.
And there is so much more to this conversation.
This is an edited version.
So if you want to watch the full video with lots of listener questions and a lot more technical details,
head on over to Latitudemedia.com slash events and click watch recording.
You can read more about Crusoe's energy-first approach to data centers at crusoe.
and if you want additional coverage on AI applications and AI power demand, subscribe to our new
newsletter, the AI Energy Nexus. I am one of the writers there. You can find it at latitudemedia.com
slash newsletter. Thanks so much for listening.
