@HPC Podcast Archives - OrionX.net - @HPCpodcast-76: TOP500 at SC23

Starting point is 00:00:00 Are you in Denver at SC23? Be sure to visit Lenovo in booth 601 on the show floor of the Colorado Convention Center through November 16th, 2023. You can also visit lenovo.com/HPC to learn more about Lenovo's HPC solutions. Aurora is getting 55%. This is the ratio of Rmax over Rpeak with a matrix size of 22 million. Frontier is getting 75% efficiency with a 24 million size matrix. And the fact that every nation on the planet

Starting point is 00:00:42 really should strive to be a high-performance computing nation, that handwriting is pretty much on the wall now, and AI is making that even more clear. From OrionX in association with InsideHPC, this is the @HPCpodcast. Join Shahin Khan and Doug Black as they discuss supercomputing technologies and the applications, markets, and policies that shape them. Thank you for being with us. Hey everyone, this is Doug Black. I'm Editor-in-Chief at InsideHPC and Shahin, great to be with you again. Today we're talking about the TOP500 list released at the SC23 conference in Denver.

Starting point is 00:01:17 To get right down to it, Frontier remains the number one system. This is the HPE Cray system powered by AMD chips coming in at 1.2 exaflops. Probably the biggest piece of news about this list is Aurora, the new Intel HPE Cray system, coming in at number two. It's at 585 petaflops. That is considerably less than the two exaflops performance that the system is expected to deliver. Number three is Eagle. This is a new system on the list installed by Microsoft in its Azure cloud. That's 561 petaflops. Fugaku, former number one system, is now number four. This is the ARM-based system, 442 petaflops. LUMI, HPE Cray EX system, AMD powered, is number five now at 380 petaflops after an upgrade. Number six is Leonardo at the CINECA Supercomputing Center in Italy.

Starting point is 00:02:20 This is an Atos Eviden BullSequana XH2000 system, 238.7 petaflops. Summit, which has been on the list since I believe 2018, the IBM-built system, 149 petaflops. MareNostrum, this is another new system at the Barcelona Supercomputing Center. Again, this is an Atos Eviden BullSequana system. 138.2 petaflops. The new Eos system is number nine. This is an NVIDIA DGX SuperPOD system. It's at NVIDIA, and that's 121.4 petaflops. And number 10 is Sierra IBM system, 94.6 petaflops. That's the top 10. The fact that this is the 62nd edition of this report is always significant to me. They've been doing it for 31 years and that's quite wonderful. I do, in agreement with you, think this is a step function this year. We're really getting more comfortable with over 500 petaflops and the Aurora number seems to be for half of the system. I would have

Starting point is 00:03:26 expected even half the system would be close to what Frontier would have. But when I look at the efficiency that they're getting and the matrix size that they use to get that number, Aurora is getting 55%. This is the ratio of Rmax over Rpeak with a matrix size of 22 million on a side. Frontier is getting 75% efficiency with a 24 million size matrix. So Aurora has more efficiency to gain by optimizing the system. And that's what Rick and Mike were alluding to in our podcast with them a couple of days ago. That's Rick Stevens and Mike Papka of Argonne. Their case was that Aurora is a nine-month installation process to get the system fully installed and running, and they're about halfway through that. Exactly. The other thing, as you

Starting point is 00:04:15 mentioned, is there are four new systems in the top 10, and that's quite a good-sized shakeup. We don't always see that. That also pushes a bunch of other systems down. As you mentioned, some of the systems that continue to be pretty beefy are now like way down and some of them don't even make it to the top 10 anymore. So that also is an indication of the competitiveness of this entire area. And the fact that every nation on the planet really should strive to be a high performance computing nation, that handwriting is pretty much on the wall now. And AI is making that even more clear. The ability to build these exascale systems is really a national capability that people are investing money in. Yeah, Shahin, you know, in my eight years of covering the TOP500, I've never seen such a shakeup. It's an obvious indication that more countries, more regions

Starting point is 00:05:02 are investing more in HPC. As many of our guests on this podcast have observed, HPC has really broken out from being a niche into a category, a type of technology that's really driving so much that's important in the world of computing and IT with broad, broad implications. For sure. So speaking of countries, we did a tally of who's got how many. Now, it's true that China basically stopped playing. So whatever presence they have is not growing as fast as it probably otherwise would have. But right now, the US has 161 systems on the list. Europe collectively has 134. And that's including anywhere from Belgium with one system to Germany with 36 systems. But you add it all up and I included the UK because it's Europe, not European Union. The UK

Starting point is 00:05:53 has 15 systems, France has 23, etc., etc. So Europe collectively has 134. China shows up with 104, despite kind of not playing. Japan is 32. South Korea is 12. And then you have Canada with 10, Brazil with 9, Saudi Arabia with 7, Australia with 6, Taiwan with 5, India with 4, and Singapore with 3. So really, one notable item here is India only having four systems. One would expect that would be a much larger number, and hopefully it will be in the future. Okay, what about the Green500? The Green500, the list doesn't change a whole lot. I think, you know, Aurora, because the efficiency is lower than it needs to be, and I think they just submitted the number to participate,

Starting point is 00:06:43 and I'm really grateful that they did. Once they get their efficiency, they'll be higher up. But right now, the Henri system at the Flatiron Institute, that's a Lenovo system, continues to be the number one. And that's at 65.4 gigaflops per watt. The Frontier testbed shows up at number two, and Frontier itself shows up later on. I know that you're an advocate of this list, which not everyone agrees with, but you see real value in TOP500.

Starting point is 00:07:12 Share some of your thoughts about this list and its significance. I do. I think it's a repository of data over many years and that right there is very valuable. It indicates the kind of performance that you get for various architectures. There have been a lot of different architectures over the past 30 years. And some of those architectures may become interesting again in the future when different technologies advance at different rates. And what used to work 30 years ago may in fact just be interesting again. And it gives you that sort of historical perspective.

Starting point is 00:07:45 And of course, it has a very important predictive power in my mind. I think if you know how to interpret the number, you can put two and two together and say, okay, if you're doing this with HPCG and this with HPL, and here's the matrix size, and here's the efficiency you got, and here's the interconnect, I can therefore conclude that you would perform like this for this particular application. And I think that's also highly valuable.
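As a rough illustration of the kind of reading described here, the sketch below works only from the figures quoted earlier in the episode for Frontier and Aurora (Rmax, the Rmax/Rpeak efficiency, and the matrix size on a side). The function names are hypothetical, the inputs are the speakers' rounded numbers rather than official list entries, and the run-time estimate assumes only the leading-order (2/3)N^3 operation count of a dense LU factorization, so treat the output as a back-of-envelope estimate.

```python
"""Back-of-envelope HPL arithmetic from figures quoted in the episode."""

# Values as quoted by the speakers (rounded); not official list entries.
# name: (Rmax in petaflops, quoted Rmax/Rpeak efficiency, matrix order N)
SYSTEMS = {
    "Frontier": (1194.0, 0.75, 24e6),
    "Aurora": (585.0, 0.55, 22e6),
}


def implied_rpeak_pf(rmax_pf: float, efficiency: float) -> float:
    """Rpeak implied by the definition efficiency = Rmax / Rpeak."""
    return rmax_pf / efficiency


def matrix_bytes(n: float, bytes_per_element: int = 8) -> float:
    """Memory needed for a dense N x N matrix of 64-bit floats."""
    return n * n * bytes_per_element


def approx_runtime_s(n: float, rmax_pf: float) -> float:
    """Assumption: leading-order LU cost (2/3) * N^3, run at the Rmax rate."""
    flops = (2.0 / 3.0) * n ** 3
    return flops / (rmax_pf * 1e15)


for name, (rmax, eff, n) in SYSTEMS.items():
    print(
        f"{name}: implied Rpeak ~{implied_rpeak_pf(rmax, eff):.0f} PF, "
        f"matrix ~{matrix_bytes(n) / 1e15:.2f} PB, "
        f"estimated run ~{approx_runtime_s(n, rmax) / 3600:.1f} h"
    )
```

The point of the sketch is the relationship between the three quoted numbers: a lower efficiency at a similar matrix size means a longer run and a larger gap between delivered and theoretical performance.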

Starting point is 00:08:10 There are a few other things that we looked at. One of them is vendors. Lenovo is number one with 170 systems, and that includes 112 systems that are outside of China. So it is important to note that Lenovo's global presence and where they are playing is increasingly systems outside of China. So you'll see quite a few systems in Europe, in the US, etc. HPE Cray is 105. That includes two that they did with Hitachi some time ago when Cray was handling it. Eviden is at 48.

Starting point is 00:08:41 And that's interesting because they, besides HPE, really, they're the only other one that can play at the exascale level. And they are with the systems that we see in Europe. Inspur is 34, but they've kind of stopped playing. Dell is at 32. NVIDIA is at 19. So for those of you who don't think NVIDIA is a systems player, think again. That's a pretty nice number. Fujitsu is at 13, so they're behind NVIDIA. NEC is at 13. Sugon is at nine, but they've kind of stopped playing too. And Microsoft is at seven.
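To put the vendor tally in proportion, here is a small sketch that converts the counts quoted above into shares of the list. It assumes the list has 500 entries and uses the numbers exactly as spoken; the variable names are hypothetical, and vendors not named in the conversation are lumped into a single remainder.

```python
# System counts per vendor as quoted in the conversation (not re-verified).
LIST_SIZE = 500  # assumption: one entry per system on the TOP500

vendor_counts = {
    "Lenovo": 170,
    "HPE Cray": 105,
    "Eviden": 48,
    "Inspur": 34,
    "Dell": 32,
    "NVIDIA": 19,
    "Fujitsu": 13,
    "NEC": 13,
    "Sugon": 9,
    "Microsoft": 7,
}

# Share of the list held by each named vendor, in percent.
shares = {name: 100.0 * count / LIST_SIZE for name, count in vendor_counts.items()}

named_total = sum(vendor_counts.values())
unnamed_total = LIST_SIZE - named_total  # systems from vendors not mentioned

# Lenovo's split, from the quoted 112 systems outside of China.
lenovo_outside_china = 112
lenovo_in_china = vendor_counts["Lenovo"] - lenovo_outside_china

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:10s} {vendor_counts[name]:4d} systems  {share:5.1f}%")
print(f"Named vendors: {named_total}, all others: {unnamed_total}")
print(f"Lenovo systems in China: {lenovo_in_china}")
```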

Starting point is 00:09:16 So kudos to Microsoft for playing. And kudos for Microsoft to have the number three system on the list. This, I believe, is the first time that a cloud-based system is showing up at such a high entry. Well, you know, it's fertile ground that we're talking about. And I think, again, it all reflects the growing significance of HPC, especially HPC in combination with AI. More countries and companies want to invest in it. And what's going on is tremendously significant. Thank you. The few other things we've done, let me take you through that in terms of CPUs. Intel has 339 systems. AMD is at 140.

Starting point is 00:10:34 So together, they have 479 systems that are x86. So the rumors of x86's demise are highly exaggerated. Fujitsu is really playing with ARM, and they've got eight systems. IBM Power is at seven. Both of those chips are absolutely fabulous, and you would think that they should do better, and I hope that they do. NEC's vector systems have five, and that basically kind of gives you, there's one system with the ShenWei chip. Generally, it's an x86 play. Now in terms of accelerators, surprise, surprise, NVIDIA is 166. And that includes 10 systems with their latest H100. There's still a lot of systems there with the A100 and the previous generations that they

Starting point is 00:11:19 had. AMD has 11. Intel Max has four. And then there are 314 systems on the list with no accelerator. So there's fertile ground for accelerators to penetrate the list. Yeah, and I think looking forward, that'll be a big continuing change that we see as Intel and AMD join NVIDIA with accelerators coming onto the market, and then more systems adopting that. Yeah. This will continue to roil the list. That's right. So then you look at interconnects. InfiniBand shows up with 219 systems.

Starting point is 00:11:54 There are 39 systems that are custom and proprietary interconnects. So that implies high end. And then there are 33 systems with Omni-Path that continues to be a player. So you add them all up and you get on the order of 300 systems that have high end interconnects. There are 209 systems with gigabit Ethernet. And that, of course, with the Ultra Ethernet Consortium that has emerged, that number may increase with PCIe switches, that number may increase. CXL on top of that might also change something. So those are all developments to watch. One other item, if you're in the mood? I am, Shahin. All right. So that is efficiency, like how much are you

Starting point is 00:12:38 getting out of your system? The top efficiency is an HPE Apollo system in Japan showing up at number 80, and they're getting 98% of their peak performance. So that indicates what you could do if you really wanted to optimize the heck out of the system. So there are 17 systems on the list that are getting better than 90% of their peak performance. Wow. There are 60 systems that get between 80% and 89%. 93 systems between 70% and 79%. 98 systems between 60% and 69%. So really the bulk of the market is somewhere between 60% and 80% of peak. However, there are like 167 systems that are between 50 and 59%. Now that

Starting point is 00:13:28 indicates to me that folks just run the benchmark and don't spend a whole lot of time trying to optimize it. And maybe that's just as well, because that's not their day to day operate, you know, workload anyway. Yeah. But it also shows just how much more performance there is to gain if you wanted to really optimize HPL for these systems. And then in terms of power requirements, there are three systems that have between 20 and 30 megawatts of power requirements, five that are between 10 and 20. So really, the two of them together are like eight systems that are more than 10 megawatts. About 80 systems are one to 10 megawatts. And the rest, 411 systems, are less than one megawatt. So really, the bulk of the list is less than one megawatt of power needed. Yeah, it'll be interesting as the list gets broken down

Starting point is 00:14:18 the percentage of total floating point operations, if you will, come out of the top 10 versus the bottom 490. This is very true. Yeah, yeah. I mean, one exaflop, like Frontier, 1.194 or 1.2, I think, as you rounded it up, that's at least the next two systems combined. And then you put the top 10, and that's probably more than all the rest of them combined. You should do that analysis. One final thing, and then we can call it good, is the total number of cores. So the Sunway TaihuLight of some years ago continues to lead with 10,649,600 cores.

Starting point is 00:14:56 Frontier is second with 8,699,904 cores. Fugaku comes third at 7,630,848. And then Tianhe-2A, another Chinese system that was based on accelerator technologies, comes in at nearly 5 million cores. And the smallest number of cores on the list is 1,664 cores. And that comes in at number 442. So scale continues to be where a lot of these challenges are, and it shows up all over the place in terms of how hard it is to get these things going. Amazing, amazing stuff. All right, Shahin, thanks so much, and we'll talk again soon. All right, take care, everybody. See you at SC23. Cheers. That's it for this episode of the @HPCpodcast. Every episode is featured on InsideHPC.com and posted on OrionX.net.

Starting point is 00:15:53 Use the comment section or tweet us with any questions or to propose topics of discussion. If you like the show, rate and review it on Apple Podcasts or wherever you listen. The @HPCpodcast is a production of OrionX in association with InsideHPC. Thank you for listening.
