In The Arena by TechArena - Data Center in the AI Era with Jen Huffstetler
Episode Date: August 22, 2024. Join Allyson Klein as she welcomes former colleague and industry innovator Jen Huffstetler. Jen shares her extensive experience driving advancements from client devices to the data center, including groundbreaking technologies like Centrino and 3D packaging.
Transcript
Welcome to the Tech Arena,
featuring authentic discussions between
tech's leading innovators and our host, Allyson Klein.
Now, let's step into the arena.
Welcome to the Tech Arena.
My name is Allyson Klein, and I am so excited because I
have former colleague and friend Jen Huffstetler back on the show. Welcome to the show, Jen.
Thanks, Allyson. It's a pleasure to be here.
You have an incredible history of driving innovation
across the industry from client to mobile and into the data center. Can we just start with a brief
history of your career and the areas of the industry that you've touched?
Yeah, I've had this incredible fortune to be at the center of several inflection points
throughout the computing industry. Harkening way back to my early client days, what folks
might not know is in the late 90s, despite the internet boom
taking off, laptops were still large and luggable, the battery life didn't really last very long,
and you had to plug into a network cable. And so you could move your PC, but it couldn't
really be connected everywhere. And that was where my journey began with what we called Centrino
technology, which was the first mainstream laptop focused on being thin and light,
providing all day battery life, integrating wireless connectivity so that you could have
pervasive connectivity. And this was way back in 2003. And we all just probably forget what life
was like without being able to go to an airport or a coffee shop
or anywhere and now have that pervasive connectivity. So that was the first tectonic
shift I was in. And then since then, I've mostly been focused on the data center side,
building server systems for network storage, mission critical applications,
helping our LXM business turn profitable there. It's not an easy thing to do if you're familiar
with this very competitive space. I also then focused on componentry in the data center space,
leading product across CPU, GPU, DIMMs, building an industry first silicon as a service business,
and also looking out to the future of the data center in partnership with technical leaders across the
industry on what we need to be successful with the future of compute in the data center.
Some of the things that came out of that were integrated optical IO POCs that you've seen
publicly announced recently. And then most recently, I've been focused on product sustainability
across all of these devices and what that means for fab processing, chip design, software,
and the system level, up to the data center level, inclusive of AI frameworks and beyond.
That last bit is what I want to talk to you about today. I want to talk about the data center and what's going on in it. There is such a wide gamut of topics to address when we think about the data center. We stand at an
amazing moment for data center innovation. And I talk about this every day on the Tech Arena, but
we don't get a chance to talk about it that often on air. It's really filled with opportunity and
challenges. And I guess one question that I have
for you as somebody who has been in this space for a long time, just like me, how do you see
this moment and how do you compare it to other moments that we've lived through?
Yeah, right now
we are in the midst of this unfolding data center of the future. These last few years, we've been
experiencing a shift from homogeneous compute,
where there were millions of the same computer, to a much broader expansion into heterogeneous
compute. And what that means is having specialized accelerators that more closely match the workload
characteristics. These examples have been unfolding and accelerating in terms of volume
and deployment over the last few years, starting first with offloaded network, storage, security
applications. There's many names for this in the industry. That's continuing. And in this AI moment,
we're now seeing not only the GPUs, but also specialized accelerators for inference,
for training. This is very new
in this moment. In the past, it was really, how do we take advantage of like compute and run
it most efficiently, almost a Southwest Airlines model, where every airplane is the same
and we can run it really efficiently. The workloads have become so complex that the infrastructure
providers are now seeing the need to innovate at that silicon level to
match it. We're also seeing a lot of innovation at the systems architecture level. You see some
of these accelerator companies now deploying at a pod level. That's very different than the past,
where we had singular nodes and everything fit into what you and I would call a pizza box, a 1U or a 4U. But the design point now is at the multi-rack level. That's huge
innovation in the infrastructure. And it really allows them to integrate these new cooling
technologies like liquid cooling. And we're seeing even more innovation we'll talk about, I'm sure,
in the software layers and the data types.
I see it as these two inflection points are happening. There's that system architecture piece, but we're also at another computing inflection point where AI is becoming the
killer app that will define everything in the next 10, 15, 20 years. And that's really what
we're seeing unfold today with this extensive deployment of high power
GPUs, ASICs to train these ever increasing large language models. We see hyperscalers investing
heavily in this space, seeking to retain their market position, but they're also encountering
challenges because they weren't planning for this much power. And when you build a data center,
that's a long range project. You need to procure that power years in advance to ensure that it's going to be there.
So right now this energy load is actually constraining the compute growth. That's very
unique. Whereas in the past, a lot of what we saw was consolidation and energy efficiency. So it was
actually more work being done with less. We're now seeing a
very different model where they're just deploying more and more product to meet the ever-growing
size of these large language models and running into this constraint with regards to energy
availability.
Now, you described a really beautiful picture of all of this innovation,
and a lot of it
aimed at hyperscale today because they're chasing the monetization of AI.
When I think about the next five years, one question that I have is, how will all of this
infrastructure innovation actually influence the deployment of technology across the large
hyperscalers and enterprise? And when do you think enterprise will have fully digested the fact that AI is that killer
app and they have it running across their use cases and their workloads?
It's a great question.
It's like the $64,000 question from that old show.
I think for the hyperscalers, they're already feeling that
competitive threat now. So they've fully embraced and are fully adopting it. On the enterprise side,
this is so new that right now, I think a lot of the corporate strategy teams at these large
enterprises around the globe, they're really thinking about how are we going to integrate
this new technology and how are we going to prepare for
the impacts of this on our business? Where are the competitive threats? How do we need to
reallocate our resources so that we're investing for the future and the implications that broad
deployment of AI will have on us? So I actually think it's top of mind for every enterprise around
the globe, but it's going to unfold very
differently. There's a lot of opportunity in this space for companies to take advantage of these
new resources, to lower costs, use digital twins, upskill their workforce, integrate autonomous
support in their factories, manage their energy footprint, as we were just talking about. But when I
think about how it compares to what we've already experienced,
I think it's helpful to go back in time. And that's probably going to inform us for what this
future two decades is going to look like. And you and I both watched enterprises of all sizes over
the last two decades embrace virtualization, cloud capabilities at different rates. There was no one
journey. There wasn't one trajectory. It depended on the
company, what their competitive environment looked like, what their financial profile looked like.
Enterprises by nature are pretty risk averse in terms of new technology deployment. They don't
want to disrupt services for their current customers or their existing revenue streams.
But what you are going to see is just this massive shift. I think Gartner's
predicting it's going from like 6% in 2023 to 80% of enterprises expected to have deployed
some kind of POC by 2026. So we're in this moment where everyone's trying to figure it out.
We know that the pace of cloud adoption took off for a few reasons, but those reasons were
really centered at the heart of user pain points or ease of use.
And it was the developers who found it easier to build new services in a cloud instance
than it was to call their IT department, wait nine weeks for a server to show up, another
nine weeks for that server to get deployed.
It just provided that instant access.
Some of those software as a service providers, Workday, Salesforce, they provide new capabilities
over what existed in the monolithic solution. And it saved IT time to configure all of this.
And now I think what we're seeing with microservices taking off is that a programming
shift has happened. And that's going
to help accelerate the adoption of these techniques. And you're seeing some of the
largest GPU providers put forth tools to help onboard these developers to take advantage of
the compute capability in the easiest way. I think when you streamline that for the developers,
that's really what's going to help enterprises move most quickly through this adoption:
being able to take advantage of the new tools and techniques that exist.
Now, I want to go back to that multi-rack configuration that you were
talking about earlier. And I want to talk to you about the rising power that some of these
configurations are taking and the change to power and cooling technologies
that are going to be required.
There have been some statistics that say
that data center power demand is going to grow
well beyond the roughly 2% of the world's consumption
where it sits today, to alarming rates.
And I don't even want to cite any of them
because there's so much divergence
in people's thinking on this.
But how do you see
the fundamental metrics of power in the data center growing in the AI era? And how do you
see that influencing both how people think about greenfield and power delivery and think about
cooling technologies and dissipation of heat from all of this infrastructure?
Yeah, it's a great question.
I'll go back to, if we look to the past
to help inform the future in this space,
what we know is that there has been
ongoing data center energy efficiency happening.
From 2000 to 2010, we saw increased processor core counts.
We saw virtualization, cloud computing,
consolidating servers,
which provided a significant boost in the compute work done per watt or kilowatt.
Intel's reported that from 2010 to 2020, they saved a thousand terawatt hours just from
processor innovation. We're going to continue to see that happen. We know
that hyperscalers like Google are
publishing their own results about how they take their existing infrastructure and are able to
focus on energy management and on utilization of their fleet, and save 40% of their power.
You're going to keep seeing that. We've seen PUEs reduce globally from over 2 down to a global average of around 1.6, with world-class facilities at less than 1.05. That type of innovation is going to continue to happen.
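To put that metric in concrete terms: PUE (power usage effectiveness) is just total facility energy divided by the energy that reaches the IT equipment. A minimal Python sketch, using the averages quoted here as example inputs:

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: total facility energy divided by the
    energy that actually reaches the IT equipment. 1.0 is the ideal."""
    return total_facility_kwh / it_equipment_kwh

# A site drawing 1,600 kWh to deliver 1,000 kWh of IT load sits at the
# ~1.6 global average; a world-class site delivers the same IT load
# while drawing only ~1,050 kWh.
print(pue(1600, 1000))  # 1.6
print(pue(1050, 1000))  # 1.05
```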
So I'm very optimistic that this won't be a runaway energy explosion, as some folks are potentially forecasting. In the short run, you're seeing brownfield data centers bump up against that wall that we talked about, so they're acquiring fuel cells and whatever else they can to meet their energy needs today. For their greenfield sites, they're planning out mega-sites with long-range power procurement planning with the local utilities and developers to ensure that they have the headroom for what might be a linear
trajectory, which is the right thing to do, right? How do we prepare for the future? But in the long
run, I think just like those last two decades, we're going to be surprised by how much more
efficient all of this gets, not only with new hardware that comes out, which can provide
tremendous advantages in performance per watt, but the software innovations
that have yet to unfold. We've yet to see the application of AI on the software stacks,
which in some companies I've seen just by analyzing your code and rewriting it,
it can save 30% of energy. There's also a lot of innovation happening in the industry in this
space. Collectively, there's something called the MX Alliance, and they are innovating on next
generation, more efficient data formats.
So what used to be FP16 moved to FP8, these floating point formats, and now there's a new data type
called MXFP4. It's narrower and less precise, but it allows those AI models to take up much less
space inside the computer. It requires fewer fetches from memory and runs more efficiently
overall. And we're seeing companies supporting this in next-gen hardware. And even before they
do that, they're putting out software solutions. And in some of the software-only examples, you're getting a six and a half X improvement in performance per watt.
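To make the data format idea concrete, here is a toy numpy sketch of microscaling-style quantization. The block size of 32 and the E2M1 (FP4) value grid follow the published OCP MX format description, but the scale selection and rounding below are simplified assumptions for illustration, not the MX Alliance's actual specification:

```python
import numpy as np

# Non-negative magnitudes representable by the 4-bit E2M1 element format
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)

def quantize_mx_fp4(x: np.ndarray, block: int = 32) -> np.ndarray:
    """Toy MXFP4-style quantizer: each block of `block` values shares one
    power-of-two scale, and each scaled value snaps to the nearest
    representable FP4 magnitude (the sign is kept separately)."""
    out = np.empty_like(x, dtype=np.float32)
    for i in range(0, len(x), block):
        blk = x[i:i + block].astype(np.float32)
        amax = float(np.abs(blk).max())
        # Shared exponent: floor(log2(amax)) minus E2M1's largest exponent
        # (2, since its largest representable value is 6 = 1.5 * 2**2)
        scale = 2.0 ** (np.floor(np.log2(amax)) - 2.0) if amax > 0 else 1.0
        scaled = blk / scale
        # Snap each magnitude to the nearest FP4 grid point
        idx = np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
        out[i:i + block] = np.sign(scaled) * FP4_GRID[idx] * scale
    return out

weights = np.random.randn(1024).astype(np.float32)
q = quantize_mx_fp4(weights)
print("mean abs quantization error:", float(np.abs(weights - q).mean()))
```

Each value drops from 16 bits to 4, plus one shared scale per 32 values, which is where the smaller memory footprint and fewer memory fetches come from.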
So to me, that's just a taste.
We're at the very beginning of this journey of the innovations that are yet to unfold
in the software space.
And I just think it's super exciting what's yet to come.
There are, of course, open source frameworks. Python
has libraries that automate popular model optimizations: quantization, pruning,
knowledge distillation, across multiple deep learning frameworks.
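As a flavor of what those libraries automate, here is a minimal PyTorch sketch on a toy model, applying magnitude pruning and post-training dynamic quantization. The model, the 30% pruning amount, and int8 as the target type are all arbitrary choices for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy two-layer network standing in for a real model
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

# Pruning: zero out the 30% smallest-magnitude weights in each Linear layer
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # bake the zeros into the weight tensor

# Dynamic quantization: Linear weights are stored and executed in int8
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # torch.Size([1, 10])
```

Knowledge distillation, the third technique mentioned, would instead train a small student model against a large teacher's outputs rather than shrinking the model in place.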
You're really going to see 100X, and then we all need to see the 1000X improvements, in that combination of not only the chip hardware and the data center hardware, but also the software solutions and how tightly coupled that software is into this infrastructure moving forward.
It's so interesting what you said about AI driving software innovation
in terms of efficiency. It makes perfect sense. And we're talking about the fact that coding is
going away. We haven't really broadly discussed the implications of that. And I can't wait to
hear more about this particular topic. And we'll definitely be bringing more about that onto the
Tech Arena. You didn't talk about cooling, so I've got to go back to it. Cooling is always
an interesting technology in terms of the historical debate of
when is the moment that liquid cooling is going to become the predominant technology in the data
center. And I feel like we finally made it to that moment. What do you think? Do you think that
this is just a hyperscale trend? Do you see liquid cooling playing out in other environments in the
enterprise or at the edge?
Yeah, it's a great question. I peppered it in. I didn't go deep. I think liquid cooling has a long history in all of the supercomputers around the globe, HPC at enterprises. I know Ford's had
it deployed for 10 years in their HPC clusters. We are at a point now where some of these chips have exceeded a thermal density
that no longer makes air cooling a possibility. And so we have to remember, we're talking about
a heterogeneous compute environment. Even when you talk about hyperscale, I think where you're going
to see liquid cooling take off first, and it already is, is what I was talking about with those pods.
When you're designing at a rack level, you can integrate cold plate cooling, for example,
as your standard unit of delivery, so that everything will be as efficient as possible.
It will be liquid cooled, and it now, of course, offers you that opportunity for energy reuse. I think we're going to see more regulations expecting that of data centers, because unfortunately data centers are not the economic boost in a
community that they could be. They don't require a large labor force, but there are other
benefits they could be providing to that community, like that energy reuse. So I think you're definitely
going to see it take off in these large accelerator deployments where they're reaching
a kilowatt, two kilowatts
per chip, where you're going to need much more efficient solutions. I know everyone's continuing
to look at immersion cooling as well, and when that will take off, because it provides so many
other benefits inside a data center: you can make your data center smaller, and it doesn't have to have
these really tall ceilings for airflow movement. Everything can change in the design of it all, but there are some barriers to adoption
there. They include, one, the facilities, and that's true for both cold plate and immersion cooling. That's one of the things I think the industry is starting to
come together on: how do you standardize what that looks like? There are also barriers in terms of maintenance.
When you now have a million nodes, you're changing the ergonomics of how somebody
has to do the predictive maintenance, the failure maintenance, on a server.
That's very different.
It changes everything: if you've seen a tank, you now need a crane to pick it up.
I think this intersectionality of AI and robotics, and you're seeing more and more folks talk about autonomous robots in factories, could be one of those inflection
points that helps some of these technologies that challenge the dominant logic, the dominant
way you train a data center technician to manage the servers. I think that's actually going to be an accelerant to that most sustainable solution
of immersion. When we talk about enterprises, again, they adopt technology at a faster rate
if it's addressing a pain point that they have. Like at the edge you mentioned, if you can deal
with the harsh conditions in a desert,
in extreme heat, or with high pollutants in the air, I think you're going to see more adoption
more quickly in locations like that where the technology is really solving the customer
pain point. Inside a traditional enterprise data center, things are likely just going to move a
little bit slower. They're juggling a lot of things. A data center in most cases isn't their business like it is for a hyperscaler, and with their
investments, they'll have a lower risk tolerance for what that takes and what that means. But I know
that the expansion of POCs is ongoing and growing.
Given the stakes of this moment, we can expect to
see winners and losers across industries. There's so much technology inflection going on.
What advice would you offer IT leaders to ensure that they have the right strategies
in place?
I think IT leaders really need to be mindful about how many different projects they're
taking on, and about choosing partners to navigate this journey together, because nobody has the solution today. And you're going to need
an ecosystem to support the deployment of these new use cases, whether it's in your factory,
whether it's just in managing your internal infrastructure, whether it's helping to
improve the productivity of your knowledge workers, there are going to be different experts
and there's going to be a lot of noise in the system. So finding a partner that you can go on that journey with is, I think,
the number one piece of advice I'd have. In terms of managing your infrastructure,
really understand your current compute footprint.
Are you utilizing all of it today?
Do you have waste?
Is there underutilization?
You're wasting money if you have zombie instances that somebody spun up and nobody's
touched in three years. So there's a lot of opportunity, but you really need to take that
first step. And you can start with the cost profile. I know a lot of large enterprises
have done that. And that can help you to focus on how to manage your infrastructure.
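As one concrete way to take that first step, here is a hedged sketch of a zombie hunt, assuming an AWS environment with boto3. The 14-day lookback and 2% CPU threshold are arbitrary illustration values; a real audit would also look at network and disk activity:

```python
import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client("ec2")
cloudwatch = boto3.client("cloudwatch")
now = datetime.now(timezone.utc)

# Flag running instances whose daily average CPU never exceeded 2%
# over the last two weeks: candidates for reclamation.
reservations = ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)["Reservations"]
for reservation in reservations:
    for instance in reservation["Instances"]:
        datapoints = cloudwatch.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": instance["InstanceId"]}],
            StartTime=now - timedelta(days=14),
            EndTime=now,
            Period=86400,          # one datapoint per day
            Statistics=["Average"],
        )["Datapoints"]
        if datapoints and max(d["Average"] for d in datapoints) < 2.0:
            print("possible zombie:", instance["InstanceId"])
```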
And specifically in this AI realm and moment, we talked about the model sizes, right?
It will be bimodal.
Enterprises are going to deploy smaller domain-specific models, emphasize data quality over the quantity.
A smaller data set is going to use less energy when you're training that model.
You'll have lighter ongoing compute and storage implications from that as well.
Think about the level of accuracy you need. When you throw more compute resources
at it, you're going from like 99.7 to 99.8 or 99.9% accuracy. What's really needed for each
use case? These domain-specific models are really going to be helpful for enterprises,
especially working with partners who have
even done training with the specific language of your industry. There will be healthcare-specific
models, legal-specific models. How can you accelerate your time to deployment
by choosing partners that really deeply understand your business and that can help you on that
journey? Because at the end of the day, every enterprise is going to be orchestrating a lot of different models to seek outcomes and transformation in their business.
And I think the most important thing is just don't wait. Every company is facing this moment together
and the future is on the line, right? How do you support your teams so that they can get upskilled quickly to help you solve for your future AI strategy?
So I've been thinking about your story about Centrino throughout this interview,
and I have a question for you.
If you were able to go back in time and talk to that young engineer
who was working on Centrino about what you've seen and witnessed in the industry, what would
you tell her? And what would you tell the young engineers working on AI today in terms
of their opportunity to be part of this larger story?
I think when I go back and sit in the
shoes at that moment, we were building the future and putting forth this bold vision.
And it seemed like it was going to be impossible at the time. You literally didn't have wireless
hotspots in an airport. You had to make sure, okay, who on the team is going to go work on
the ecosystem partnership so that we hit the major airports like San Francisco? Who's going to work
with a Starbucks?
So really think about a clear vision of what you think is possible, and get advice and support, whether it's within your own company or from advisors, and there are many advisor networks around the globe,
to help you think through that holistic strategy. Because it wasn't all engineering. It was
understanding who those go-to-market partners
would be to help fulfill that vision. And so if I move over today to the AI space,
the future is in the hands of every AI engineer today. I think you're getting some taste of
what that could look like from some of the leaders, in the large keynotes they give, painting a vision
for you.
What vision do you want to be a part of?
What conversation do you want to be a part of or story to be telling 20 years from now
that you made happen?
And I don't think there's any limit to what you can dream.
It's just having a clear goal and a vision, knowing what role you play and what help you
need to build out that vision, whether it's internally, if you're at a corporation, or through an advisor network, if you're just
getting started.
That's fantastic.
Jen, one final question for you.
I'm sure that people who are listening are going to want to engage with you.
Where do you want them to reach out and talk further?
Thanks, Allyson.
You can always find me on LinkedIn, where you will continue to see me advocating for sustainable computing so that together we have a sustainable future.
Thanks so much for being on the program today.
Thank you. My pleasure.
Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net. All content is copyright by The Tech Arena.