@HPC Podcast Archives - OrionX.net - HPC News Bytes – 20260504

Episode Date: May 4, 2026

- Nvidia’s secret weapon
- Why CPU-only supercomputers
- China’s LineShine supercomputer
- Japan’s Fugaku Next

Transcript
Starting point is 00:00:04 Welcome to HPC News Bytes, a weekly show about important news in the world of supercomputing, AI, quantum computing, and other advanced technologies. Hi, everyone. Welcome to HPC News Bytes. I'm Doug Black, and with me is Shaheen Khan. One of my favorite Wall Street voices, name of Josh Brown, a regular panelist on CNBC, came out recently with an interesting note about Nvidia in his newsletter, in which he makes the often-said but still provocative assertion that NVIDIA's secret, quote-unquote, weapon isn't its hardware, it's its software. Of course, to people close to the industry,
Starting point is 00:00:44 NVIDIA and its software development platform, Kuda, is no secret. Brown's note comes under the headline, can NVIDIA double from here, meaning the stock price? And he seems to be saying that with NVIDIA's Kuda advantage, the answer is probably yes. On the harbor side, Brown looks at the growing adoption of ASICs, which are GPUs designed for specific tasks. When targeted effectively at a given workload, ASICs don't need all the bells and whistles of the NVIDIA Blackwell, which is a general-purpose GPU. Brown writes, quote, some workloads will be sent to GPUs while others are handled by ASICs that are purpose-built cheaper to run and good enough for the job.
Starting point is 00:01:27 we'll see hyperscalers getting a lot more focused on which chips, for which workloads, for which customers, at what time of day, etc. And so people worry that Nvidia will somehow be forced to exceed market share. But Brown doubts this line of thought, citing skyrocketing AI CAPX spending that indicates enormous growth in the total available AI market. Andy cites Google as an A6 powerhouse that hasn't slowed Nvidia's growth. quote, Google has been building its own custom AI chips, TPUs since 2016, NVIDIA's data center revenues still grew 142% last year.
Starting point is 00:02:06 But more than that, Brown says, broad adoption of Kuda as a de facto industry standard for AI application development effectively locks up the AI market for NVIDIA. Citing the 20-year head start Kuda has over competing development platforms, Brown said, the genius move wasn't the technology, it was the go-to-market. NVIDIA made Kuda compatible across its entire product line from cheap gaming cards all the way up to data center hardware.
Starting point is 00:02:35 A student could learn it on a $300 GPU and apply those exact same skills to a machine running a frontier AI model. That continuity created an army of developers who knew one platform deeply and had no reason to learn anything else. Shaheen, very curious, your take on Brown's note. Well, I don't quite see it that way. To be sure, Cuda has been an important and even critical competitive advantage, but I see it as less so over time. Code portability, which is really what Cuda moat is all about, while preserving performance and maintainability, is hard. But you can chip away at it and eventually get close enough.
Starting point is 00:03:18 That's what AMD with Rockham and others have been doing. What I see as the real modes, however, are number one and most importantly, capacity. NVIDIA can ship more GPUs than anyone else by a large margin and has capacity with TSM and other supply chain players locked up into the future. The second mode is NVIDIA's growing software portfolio beyond Kuda, which is expensive and critical to its future success. It includes major capabilities like the AI enterprise, universe, DGX Cloud, the Invidio Drive platform, Bio Nemo for healthcare AI, NIMS for Microservices
Starting point is 00:04:00 for inference, ISEC for robotics, Holoscan, Metropolis, Groot, and I'm sure others that I'm missing. Chann has announced it is developing a new supercomputer named Lineshine, which they say will deliver two exaflops of compute performance. This would put it in the top 10 on the top 500 list of supercomputers, if that is, China still shared HPC performance numbers with the top 500. The project was unveiled at the National Supercomputing Center in the city of Shenzhen, and according to a story in the WCCF Tech publication, the system will be built in two phases, and it's reported to be an all-CPU system. It will reportedly be different from China's older Sunway machines based in Wushi.
Starting point is 00:04:48 Bynchine represents the Shenzhen. Gen Gen lineage, quote, focused on managing traditional scientific simulation with modern AI training capabilities. Yeah, the Line Shine project is described in two phases. Phase one is reported as being operational now and designed to test the software and liquid cooling architecture. But all in all, it is pretty underwhelming. It is just a mid-sized data center cluster, not a supercomputer that would break into the top 500. It's 100 standard Huawei servers. with 100 to 200 Arm V8 Kongpen 920 CPUs and a total of 12,800 cores. Phase 2 has all the interesting stats, 20,480 nodes, 47,000 ARMV9 LX2 CPUs, and about 2.5 million cores.
Starting point is 00:05:42 But it is estimated for delivery in 2029 or 2030, and a lot can happen in four years. So it is interesting because of what it might say about what a high-end supercomputing strategy based only on indigenous technology in China could look like. It is also interesting because China has opted not to participate in the top 500 kind of competition, like you said. So why would they go public now and when it's just an expression of intent? And the other question is, why is China adopting a CPU-only approach? I mean, reports had it that China has something like,
Starting point is 00:06:19 30 ex-scale systems out there, and that tells me they have multiple projects in this area, including indigenous GPUs. Seems to me that a CPU-only system has a few advantages. It complements other approaches that use accelerators. It is more general purpose, which has been a weakness of Chinese systems. But it also simplifies the architecture, which allows them to bypass challenges and leverage strengths. The challenges are trade sanctions on high-end chip manufacturing and GPUs, and we can presume strengths in the 2029-2030 time frame would include on-package, high-bandwidth memory,
Starting point is 00:07:00 and large-scale interconnects, which the project mentions. The top 500 trend has been towards CPU plus GPU, so-called hybrid systems. Back in 2010, China's Tiani 1A was the first number one system that China had. It was a Zeon-NVIDIA system, but also included a Chinese CPU for system management. That system was followed by Tiani 2,
Starting point is 00:07:26 which changed the accelerator from NVIDIA's Tesla GPUs to Intel's Xion Phi. It later changed the accelerator again to use the locally produced Matrix 2000 GPU. And then we had the Sunway Tai Yu Light, which was in the number one spot for two years using purely domestic processors. The Chinese systems looked much more special. purpose than the US systems, but they did well, especially on the LimpAC benchmark. Then they stopped participating in the top 500 list, just as the US was reasserting itself in the top 10. So now the US has the top systems, but as you said, it is understood that
Starting point is 00:08:06 China continues to have massive systems that could be among the top 10. On the top 500 list, a counter-trend emerged for CPU-only systems when Japan's Fugaku took that approach. with a focus on real-world day-to-day performance and productivity instead of just benchmarks. These systems are favored for workloads that require high precision, sequential logic, or unpredictable memory access patterns, all of which are areas that GPUs struggle with. China's line shine moves to this all-CPyu philosophy. This is probably just as well, since, as we mentioned, an all-cPU system also reduces the impact
Starting point is 00:08:49 of trade sanctions that are focused on GPUs and the high-end interconnects that they need. The Line Shine system will use the LX2 chip, a 304-core ARMV-9 chip. The other notable chip in this genre would be the Fujitsu Monaca chip, which pushes the limits of manufacturing, initially built on two-nanameter technology and moving to 1.4 nanometers soon after that to power Fugaku. Next. It targets a wider set of applications from general-purpose air-cooled systems of various sizes to liquid-cooled configurations that let you add GPUs as needed. The LX2 plans to integrate high-bandwidth memory, HPM, directly on package, and touts a massive 4-terabytes-per-second bandwidth to feed all those cores.
Starting point is 00:09:41 HBM on-on-package was a feature of Fujitsu's previous supercomputer, the K-computer, that used. used the A64 FX chip, but Fujitsu is abandoning it in the monaca-based system in favor of on-chip SRAM caches backed by the lower-cost DDR-5 memory. That approach is consistent with a more general-purpose target and consumes less electricity. The Lineshine system will use the Linkshee Interconnect, another proprietary system developed also in Chenjin to replace InfiniBan. It aims for the low latency of Infiniband and the scalability of Ethernet. It will reportedly be able to provide 1.6 terabits per second per node and supports 20,480 nodes in a so-called dual-plane multi-rail fat-tratory topology, which is common for high-end systems for traditional HPC and new AI workloads.
Starting point is 00:10:40 Sovereign is the word, and so this interconnect is described as a sovereign fabric that would neutralize export controls on NVIDIA-Melanox hardware. Right. A lot of this is really about sovereignty and self-reliance, but if it is going to take them three to four years to build this, what does that say about the competitive landscape? We know China has worked on local CPU architectures for decades. They started by licensing the deck, Alpha, and MIPS technologies, and adapting their instruction sets.
Starting point is 00:11:13 That led to the Sunway system based on Alpha, and the Long Sun project that was based on MIPS. In recent years, they also have invested in Risk 5 chips and systems. The Mips-inspired Long Sun has evolved into the important Long Ark architecture, which aims for 100% Chinese sovereignty, including independence from Risk 5, and positioned as a general-purpose CPU whose different ISA adds to its security. They also have adopted Arm V9 for high-performance exoscale systems like Lanshine.
Starting point is 00:11:51 Yeah, China manufactures about 15 to 20% of the chips in the world and does assembly test and packaging for about 40% of the chips in the world. But not at the high end. They are believed to lag behind TSM, Samsung, Intel, and Rapidis in sub-5 nanometers, if not sub-7 nanometers fabrication. So it will be significant if they manage to complete this project from the ground up and build a functioning system with two plus exoflops in 64 bits even in 2030. Line shine also shows that China expects to have 2.5D advanced packaging
Starting point is 00:12:27 by then, which can integrate multiple small chips into a larger one, and that bypasses some of the sanctions. So you could say line shine is a statement of self-sufficiency, but it really also shows how hard it is to go it alone. China is signaling. that it will push for self-reliance and has ways, albeit suboptimal ways, to maintain its AI and scientific pace despite being cut from Nvidia and AMD. One critical element we haven't covered is physical and cybersecurity. We'll leave that for another session, but it is a massive and often overlooked issue for these large systems. All right, that's it for this episode. Thank you all for being with us. HPC Newsbytes is a production of Orion X.
Starting point is 00:13:14 Khan and Doug Black host the show. Every episode is posted on Orionx.net. If you like the show, please rate and review it. Thank you for listening.
