In The Arena by TechArena - Driving Sustainable Data Center Infrastructure for the Next Wave of Greenfield Buildout with Jon Summers

Episode Date: May 3, 2024

TechArena host Allyson Klein chats with Research Institutes of Sweden’s Jon Summers about the latest research his team has conducted on efficient infrastructure and data center buildout in the wake of massive data center growth for the AI era.

Transcript
Starting point is 00:00:00 Welcome to the Tech Arena. My name is Allyson Klein. We're recording from OCP Lisbon this week, and I am delighted to be joined by Jon Summers, Scientific Leader in Data Centers at the Research Institutes of Sweden. Welcome to the program, Jon. Hello, I'm glad to be here. So you have quite a background in terms of data center technologies, looking at IT, looking at cooling technologies, looking at all sorts of things. You haven't been on the program before, so why don't you go ahead and introduce yourself and your role here at OCP. Okay, yeah. So, I mean, it goes back quite a few years.
Starting point is 00:00:32 I was working in a university as a lecturer in engineering, teaching fluid dynamics. But I also used computation a lot, so we used computers as part of what we were doing. And coming from thermodynamics, I realized that there is thermodynamics in computers, but there is a problem with high-speed computers to construct a generally accurate and even dimensionally accurate software. So what we usually do is we put a coin on the line and we get it really in the head of the man. So HPC, high-performance computing, embraced WPG during many, many years.
Starting point is 00:01:07 Sure. And this was a new idea for AIRFAR. It was right back to the WPG. So we were invading the IT space. But I've done lots more lab, started a lot of research, and I started working with companies that start off
Starting point is 00:01:32 when they're setting up in this area, quickly extracting the heat. And the lab I had was only 40 square metres. So then I had the opportunity to move to Sweden and join RISE, where we have a lab which is a thousand square metres. So we've got a lot of space and that's the
Starting point is 00:01:52 main deal. It's got a lot more power than I ever had, and a great deal of equipment. So we have a lot of opportunities. So we've built some tests to try and help the industry deal with some of that voltage refinery, some of the thermal management issues, some of the operational issues, so we've got IT and we're looking at different ways of operating the equipment. It's a very interesting environment to be in. As a result of that, and as a result of coming to the end of the talking event, from Bo research, we have become an OCP Experience Center.
Starting point is 00:02:35 That's primarily the point. We are located in the north of Sweden, right next to a large data center with one of the founding members of OCP, which is the Meta data center. And they have donated a lot of their hardware to us as researchers to use, and allow the environment. So it's given us a great opportunity to explore different things.
Starting point is 00:02:57 When you think about 2024, we find ourselves in a moment where there hasn't ever been more demand on data centers, with the advent of AI and an incredible capacity buildout of very high-powered data centers. You come from the HPC arena, so you know that AI training clusters are very similar to HPC equivalents. And we also are at a moment where the sustainability of data centers, circularity, and energy efficiency are becoming more and more important,
Starting point is 00:03:32 not just to green initiatives, but also to companies' bottom lines. Where do you think we are in this? And where do you think the industry has made great progress? And where do we need to invest more to develop technologies to address this challenge? Yeah, okay. I mean, that's a very good question, isn't it?
Starting point is 00:03:55 I think what's the case at the moment is HPC is kind of an early adopter of a lot of the technology. Their requirement was, I want the compute as close together as possible, because I've got users that have applications that require access to all the compute, but the interconnect through the compute needs to be low,
Starting point is 00:04:19 low latency, so you bring everything close together. Because you bring all that huge compute together, you still need to get that heat out. So it's an obvious space for liquid cooling. And they've been doing it for a long time. I think what's happening with AI is it's got the same requirements as HPC, but
Starting point is 00:04:35 we're in the commodity space. We're in a much more... It's in everybody's face, if you like. Hardly a day goes by when somebody on the news is not talking about the next issue or the next great thing about AI. So AI has driven us into this, to looking at technologies that were developed for HPC
Starting point is 00:04:58 and saying, well, can we use this in these power-hungry data centers? We have a problem in the fact that it falls. You want to deliver the compute, but you have to factor in the fact that you need so much power that you're driving. Of course, all that power is coming in to the data center. It just stops there. The only way to get it out is through thermal management. And, of course, the sustainability side of this is that we're putting in more resources, material resources, and we're drawing more power.
Starting point is 00:05:36 So there's a real pull in different directions. It's a very tricky situation. Now we're trying to optimize, actually. That's the great thing about engineering. Engineering is always complex. You're always trying to optimize your system. It's always about pattern. Maybe, you know,
Starting point is 00:05:56 another thing that liquid cooling comes with is that you can elevate your temperatures a little bit higher than you can with air cooling. Then you've got the possibility of having less work to do with the heat when you're rejecting the heat outside. I question your view. If it's not what you want, so when you're consuming power off the grid, converting it into low-grade heat, it should be changed. There is a strong push for heat recovery, which challenges data centers because they provide digital services.
Starting point is 00:06:35 They're not a provider of heat. Right. It's just a consequence of the problem of the women's area, which is waste heat recovery. So I think data centers can do its own. It's perfectly possible. Partnership, too. So there are technology things.
Starting point is 00:06:51 The technology we've got to focus on is, you know, why do those processors need so much power? Maybe we need to ask some serious questions about, if we use PQ, how can we compute with less energy? Right. Can that be done? And what are the technologies in that space? Of course, that opens up all kinds of things, from reversible computing
Starting point is 00:07:13 to superconducting computing, to cryogenics, and you get to quantum computing. There are a lot of things happening in labs today that could create a paradigm shift in the industry. But not everything has to be AI. Not everything has to be intense
Starting point is 00:07:33 work. There's a lot of useful things that we get from IT systems and standard workloads that will still continue to operate in a traditional way, in a traditional data center. Perhaps we're trying to get those workloads closer to the end user. And then there was this little kind of a fight between AI, which was inference, which was transactional.
Starting point is 00:07:58 So you're interacting with an AI system in inference. But there's also lots of technologies and applications that we use. So we need to not lose the spirit of that. We just need a heterogeneous environment. So it's not only complicated technical issues, which are great, but I think there are also social issues,
Starting point is 00:08:30 learning issues, and collaboration. Because OCP is all about collaboration, so this is only going to happen through collaboration. What kind of things are you doing in Europe with some compute projects? I've been hearing interesting stories about heat reuse and the collaborations between data centers and different organizations to take that heat and put it into something productive. What are you seeing across Europe in this? What are the most optimistic looks at how we can actually capture this heat and do something useful with it? I think the most interesting thing, or the most reliable approach that has been adopted from those of the Hispanic, those of the Brazilians, is connecting to a district heating network,
Starting point is 00:09:21 or being part of a district heating and a district cooling network. So, and of course, you know, district heating networks, particularly in the Nordics, in Reichenborn, where there are a lot of things, a lot of heat is distributed through networks. It's not the case everywhere in Europe. In some parts of Europe they have gas grids, with gas fed to residential and commercial buildings. That's converted into heat at the site. But with a data center, that running product is safe. If you were extracting the heat of a roadway from fairly high rains to require network, you've got to upgrade that heat.
Starting point is 00:10:03 The only way to upgrade that heat is using heat pumps. So there are some good examples in France, for example, and a couple of other places where they're using... Well, there are two aspects. Either you're a data
Starting point is 00:10:20 center that says, I'm going to do the upgrade myself, rather than from your network. It's all good. The energy that goes into the heat pump is energy consumed by the data center. Okay. It ruins, it increases your PUE, power usage effectiveness; it's a metric that's being used in the industry. But there is also another metric which is called the energy reuse factor,
Starting point is 00:10:51 which is to measure how much of your energy is being used outside of your boundary. Not only used internally, but on the energy. So they can claim that, which is the energy reuse factor, because they're upgrading the heat. But it damages the PUE value, so it's the heat pump.
Starting point is 00:11:11 So that's that aspect. Or there's the other aspect where you give the heat away to a district heating network, and they upgrade it for the network. And that heat pump that they have is not factored into your... Sure. It's not energy the data center is consuming. So these two things are different, very different. The cooling that comes back to the data center, you should factor that into your theory
Starting point is 00:11:36 because you're getting energy cooling back to your data center and you're using that to reduce your energy. You should actually factor in the value of that, which is that reduction in the cold, the chilled water coming back here. That isn't done today, so it isn't really a level playing field. And I know we shouldn't be using these metrics to compare one data center against another. And I know the metrics are being used for mandating
Starting point is 00:12:03 the build-outs of data centers in different geographical locations, but it does really challenge the fact that it doesn't write a lot of point. I hear from the guys, I don't sit on any of the standards, but I talk to the guys that sit on the standards. There is a revision to take all this away. It seems like it has an incredible potential to extend the value of the energy investment.
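The PUE and energy reuse factor trade-off described in this answer can be sketched numerically. This is a minimal illustration, not a standards-compliant calculation: it assumes the usual definitions (PUE as total facility energy divided by IT energy, ERF as reused energy divided by total facility energy), the `DataCenterYear` class and every kWh figure are hypothetical, and the exported heat is held equal in both cases purely for simplicity.

```python
from dataclasses import dataclass


@dataclass
class DataCenterYear:
    """Annual energy figures in kWh. All values used below are hypothetical."""
    it_kwh: float         # energy delivered to the IT equipment
    overhead_kwh: float   # cooling, power distribution losses, lighting
    heat_pump_kwh: float  # heat pump electricity inside the data center boundary
    reused_kwh: float     # heat exported across the boundary for reuse

    @property
    def total_kwh(self) -> float:
        # Everything the facility draws, including any heat pump it runs itself.
        return self.it_kwh + self.overhead_kwh + self.heat_pump_kwh

    def pue(self) -> float:
        # Power usage effectiveness: total facility energy / IT energy.
        return self.total_kwh / self.it_kwh

    def erf(self) -> float:
        # Energy reuse factor: reused energy / total facility energy.
        return self.reused_kwh / self.total_kwh


# Case A: the data center upgrades the heat itself, so the heat pump counts against it.
own_heat_pump = DataCenterYear(it_kwh=1000.0, overhead_kwh=200.0,
                               heat_pump_kwh=150.0, reused_kwh=900.0)
# Case B: the district heating operator runs the heat pump outside the boundary.
operator_heat_pump = DataCenterYear(it_kwh=1000.0, overhead_kwh=200.0,
                                    heat_pump_kwh=0.0, reused_kwh=900.0)

for name, site in [("own heat pump", own_heat_pump),
                   ("operator heat pump", operator_heat_pump)]:
    print(f"{name}: PUE={site.pue():.2f}, ERF={site.erf():.2f}")
```

With these made-up inputs, the site that runs its own heat pump reports a PUE of 1.35 against 1.20 for the site that hands the heat to the network operator, even though both export the same amount of heat. That is the uneven playing field the speaker is pointing at.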
Starting point is 00:12:32 Yeah, and I think that's, yeah, because it will help data centers. I think they need all the help they can get, particularly for the colocation data centers that are reusing the heat, and trying to sell space to end users.
Starting point is 00:12:47 But that has a requirement for the PUE to be lower than a particular value. And because they're coming to a data center that's reusing the heat, but the PUE is slightly higher. So they will say, well, I want to go somewhere else. But it's not factored in, the fact that they're reusing the heat. Now let's talk about cooling itself. I know that you're talking about immersion cooling here. Yeah.
Starting point is 00:13:12 You talked about the fact that liquid has been something that has been embraced by the HPC community for a long time. It's now making its way into the hyperscale community. Is immersion in liquid superior to air? How should we view the trade-offs associated with the various cooling techniques? Yes, that's a good question, Allyson. I think we're in a situation where
Starting point is 00:13:33 I think it's very confusing because you do have predominantly two ways of cooling the IT with liquids, and traditionally HPC has always used direct-to-chip, so the data center looks, it's just up to with the little pipes coming out of all your IT that's connected to some kind of manifold, you're sending it to what we're in.
Starting point is 00:13:59 But that has been done in HPC, less so immersion. HPC from the days of old has not really embraced immersion. I mean, it has happened, but I don't think it's as mainstream. The interesting thing about immersion
Starting point is 00:14:19 is that, for starters, you are basically immersing all of the electronics, so all of the energy you want to extract is in the heat, all of the heat. They're in tanks. So basically, they have wax and then you tickle that element. And so the heat is going to be in there. Because we have gravity to account for with the immersion tank. But I think in that situation, it's useful that you are extracting ink
Starting point is 00:14:52 with now the requirements of memory, DDR 6, for example. So if you're not cooling the memory or the other peripheral components with liquid in direct-to-chip, you still have to manage some heat and it's going to generate a great service. So you probably use a combination of rear door and direct-to-chip. What might be the differences with this that you have to have that on a set as a fancy that they do it, um, and the most
Starting point is 00:15:27 you know, I think it's not so much a technological
Starting point is 00:15:35 issue as perception issues. Will it be embraced by a lot of the
Starting point is 00:15:41 hyperscalers? Will it be embraced by, you know, footwear workflows? I think it's still impossible, you know, to actually immerse. A lot of the hyperscalers would have been raised by footwear or workflows. I think it's still a good thing, really,
Starting point is 00:15:47 because actually it must be really expensive. There's other problem ones, actually, in some ways. I feel confident. Two benefits. It's just that I can manage to be a curve, a trust curve. I'm sure. People get to me, yeah, I trust it.
Starting point is 00:16:10 But I think there's also, they may see that it works, but there's also the, you've got to overcome the different approaches to maintenance. But there are technical issues and there are racing technologies.
Starting point is 00:16:31 I think we miss this conversion, but I'm not sure how we can say who's going to win this race. But it is a race. And we could vote together. They offer different programs.
Starting point is 00:16:52 Now, regulations need to be discussed when we talk about these topics, because there is a lot of regulatory work going on, especially in the EU. What is that doing in terms of the technology for challenges for operators in terms of the energy regulations for their performance? What do you see changing for them?
Starting point is 00:17:15 I think if you look at the EU's request for data, it's a visible data. Whatever regulatory organization is in the member states. I think it's an EU-wide directive. It's not EU-wide work, but it's got to be managed
Starting point is 00:17:35 in member states. It's really not a preliminary executive or a regime that directs it. I'm not sure of the mechanisms. I mean, that's sort of an illiterate of a tooth. But first and foremost,
Starting point is 00:17:50 these guys have to work basically. But you have to collect data from everything. Because they're asking for mutual utilization. How to utilize this. Because that is really difficult. And they're asking
Starting point is 00:18:07 for location for that. The guy is the alpha white customer, so he'll be in LHT. And they're asking those guys to say, can you get your collocated partner to give you
Starting point is 00:18:23 data about the rolling swing? That's nice. Yeah. And so I don't know the kinds of systems they're using to collect this data. And what I think we see is that the data collection side of things is probably where the pain point is at the moment. The first thing is you have got to have one year's worth of data by September '24.
Starting point is 00:18:47 Which means they're collecting the data now. How are they collecting the data? So I think what you're going to have is you're going to have a long run-in, and day one is going to be a disaster. They're going to have the data
Starting point is 00:19:03 and they're going to say, well, this doesn't make any sense. We're not going to be able to do anything with this. But they need to refine it. So they need to engage, they need to work with the community, the citizens of the community, to say, well, I think,
Starting point is 00:19:19 how can you help them give us what we need? But why do they need it? The question is, why do they need it? So I think what they want to find out is they want to accurately tell the EU taxpayer in Europe, and the consumer as well, about how energy is used. Because you have pain points in Europe where there is quite an intense use of energy from the data centers. I think Tenerife, one of the major companies in Europe, is to our own definitely, is an ancient percentage of their group,
Starting point is 00:19:54 the rest of the population. Denmark and Netherlands, too, I'm asking. You can see that these three are trying to address these issues, and they have very different tax sanctions for those guys that come in and do does rip this through the community so I'm not saying I just want to build
Starting point is 00:20:18 another data center. We are having blackouts or brownouts because there are intermittencies, and we're trying to move towards more renewable sources of energy, so we're reducing our base load to bring in more intermittent challenges for the world.
Starting point is 00:20:40 So you're getting these brownouts, and it's making people upset. I should have done this, I could get a little upset. Why should a data center get all the power? Brand war. Yeah, what about our hospitals? That's really good. It's a fair community. They're fair questions. Yeah, it's a fair question.
Starting point is 00:20:56 And I think we need to engage everybody. We need to understand, because the first thing that will happen is if you say, well, okay, we're going to turn the data centers off for one day, we don't have any access to data,
Starting point is 00:21:08 so we're absolutely bugging because they won't be able to do anything. How many
Starting point is 00:21:15 times do you have an IT problem at work or at home? It kind of stifles
Starting point is 00:21:22 your ability to work or even plan your day, or interferes with your day. We are credit to the system. And I don't think, on the one hand, we get the problem of fossil problem access and everything. Digitally first. On the other hand, they're saying, centers, you know, to use in some materials. Except we depend on them for everything.
Starting point is 00:21:52 Final question for you, Jon. I know that you're doing research all the time in data centers. What is the focus for 2024, and is there something exciting that you're working on that you'd like to share? Yeah, I think I'm working on this mainly around how can we look at the roadmap of the IT and what can we do when we're more for forces.
Starting point is 00:22:18 Caleb. That's it. In an efficient way. But I also, it's not my area of expertise, but I'm also reading around the physical limits. It's one of the things that
Starting point is 00:22:34 I know is linked, because there was a famous paper written by Rolf Landauer. He was an IBM researcher, and in 1961 he wrote a paper called Irreversibility and Heat Generation in the Computing Process.
Starting point is 00:22:47 So it focused on the fact
Starting point is 00:22:52 that computer architecture is irreversible, and as a consequence of that it generates
Starting point is 00:22:58 heat. That's what we're moving, because we have the heat today,
Starting point is 00:23:05 no heat tomorrow. I don't know either. So there could be a paradigm shift. It has happened before. We went from bipolar technology to CMOS, and the heat flux dropped. Because you've been
Starting point is 00:23:21 really, and it's at that point, we don't need the 4.0. Then we do everything there. Right. Because since that point in the 80s, you know, a heap of that daily eating. The power draw of these microprocessors has just been going up and up and up. I don't see it levelling off. It's a bit, you know, the chip manufacturers, the DLN, the body design, you just need
Starting point is 00:23:45 more and more from you. We're going to be buying recent GPUs and video games. We need more GPUs. We need faster GPUs. We need more power. So, where's it going to end?
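The physical limit tied to the Landauer paper mentioned in this answer can be put into numbers. The bound usually associated with that work is that erasing one bit dissipates at least k_B · T · ln 2 of energy. The sketch below only evaluates that formula; the 300 K temperature and the erasure rate are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def landauer_limit_joules(temperature_k: float) -> float:
    """Minimum energy dissipated when one bit is erased at the given temperature."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2)


def minimum_erasure_power_watts(bits_per_second: float, temperature_k: float) -> float:
    """Lower bound on power for a given rate of irreversible bit erasures."""
    return bits_per_second * landauer_limit_joules(temperature_k)


room_temperature_k = 300.0   # assumed operating temperature
erasures_per_second = 1e18   # hypothetical rate, chosen only for illustration

print(f"{landauer_limit_joules(room_temperature_k):.3e} J per bit")
print(f"{minimum_erasure_power_watts(erasures_per_second, room_temperature_k):.3e} W")
```

At room temperature the bound is on the order of 10^-21 joules per erased bit, so even a very high erasure rate gives a floor of milliwatts. The gap between that floor and the power drawn by real processors is what makes approaches such as reversible computing interesting to the speaker.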
Starting point is 00:24:00 The focus from where we start is really on extracting the heat. Fantastic. Well, thank you for being on the program today. I learned a lot. I'm sure my listeners learned a lot too. Where can folks engage with you and continue the conversation? They can reach out to me on LinkedIn.
Starting point is 00:24:17 You'll find me, Jon Summers, or look on rise.ri.se and just search for my name. I do really a guest affair. Happy to answer questions. All the help is out there. Fantastic. Thank you so much for being on the program today. It was so much fun to talk to you. That's all I have to do.
Starting point is 00:24:43 Thanks for joining the Tech Arena. Subscribe and engage at our website, thetecharena.net. All content is copyright by the Tech Arena.
