@HPC Podcast Archives - OrionX.net - HPC News Bytes – 20240624

Episode Date: June 24, 2024

- GPU price wars?
- Rectangular wafers for more/bigger chips
- AI assistant for science
- AI supercomputer in Japan, new exascale in the EU

Transcript
Starting point is 00:00:00 Welcome to HPC News Bytes, a weekly show about important news in the world of supercomputing, AI, and other advanced technologies. Hi, everyone. Welcome to HPC News Bytes. I'm Doug Black. Hi, Shaheen. Let's start with GPU pricing. We all know that NVIDIA's amazing profit margins are a key factor in driving the company to become the most valuable in the U.S. by market capitalization, surpassing Microsoft. That said, GPU price competition could develop over the next 6 to 18 months, with AMD MI300 and Intel Gaudi 3 GPUs coming in at a third to half the price of NVIDIA H100s, which appear to be generally priced at $25,000 to $30,000, depending on volume. Now, I suppose I have to preface this by saying this is not investment advice, but it makes sense for competitors to recognize that they're coming from behind and need to deliver substantially better economics.
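As a rough back-of-the-envelope illustration only (using the reported street-price figures above; assumed numbers, not quotes or investment guidance), here is a minimal sketch of what "a third to half the price" would imply for competitor pricing:

```python
# Illustrative arithmetic only: implied competitor price band if H100-class
# GPUs sell for roughly $25,000-$30,000 and challengers target a third to
# half of that. Assumed figures, not actual list prices.
h100_range = (25_000, 30_000)       # reported H100 street-price range, USD
fractions = (1 / 3, 1 / 2)          # "a third to half the price"

low = h100_range[0] * fractions[0]
high = h100_range[1] * fractions[1]
print(f"Implied competitor band: ${low:,.0f} to ${high:,.0f}")
# -> Implied competitor band: $8,333 to $15,000
```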
Starting point is 00:01:02 But it has not happened yet, and NVIDIA keeps moving the goalposts, especially with their annual upgrade cadence and continuous optimization of their software stack. They are executing like crazy and expanding their total available market by moving into software, the edge, the enterprise, quantum emulation, et cetera. And as much as the basics can now be done with other chips, they have a commanding lead in training versus inference, and they have the most complete ecosystem. So I'm not seeing competition at the expense of NVIDIA anytime soon, but I do see it in addition to NVIDIA. What can change that equation, however, is something like an AI winter, or if hyperscalers suddenly find themselves with overcapacity,
Starting point is 00:01:50 or if everyone's focus switches to inference, where there is more competition. With all the focus on AI, it is easy to take your eyes off of traditional HPC. I recall, for example, the Chinese AI chip from Biren that did not support 64-bit operations in hardware at all. So one thing I do like about NVIDIA and AMD is that they continue to push on 64-bit computing in their GPUs, but also in their CPUs. That makes them more attractive for mixed-use installations that don't exclusively do AI. So we are seeing more Grace Hopper and AMD-based HPC sites, like the ones at Los Alamos and in the UK, and now in Japan. Yeah, related to that, HPE announced an NVIDIA-powered AI supercomputer to be housed at
Starting point is 00:02:35 Japan's National Institute of Advanced Industrial Science and Technology, with $200 million worth of AI servers. There's other big systems news: EuroHPC JU announced that its second exascale system, called Alice Recoque, will be hosted in France by GENCI, the national HPC agency, and operated by the French agency CEA. They cited what looked like an all-inclusive total cost of ownership of 544 million euros. According to reports last week, TSMC, Samsung, and Intel are looking at rectangular, panel-like
Starting point is 00:03:14 wafers rather than traditional circular wafers. This would allow more chips to be placed on a single substrate without having to cut off the edges. It would provide more or larger chips, addressing the demand for the more powerful and denser compute required by AI. Reportedly, TSMC is experimenting with substrates measuring in the 20-by-20-inch range, more than three times the usable area of 12-inch circular wafers. Although the research is still new and may take several years to reach mass production, this represents a significant technology shift. Yeah, very interesting. The prevailing manufacturing process for wafers naturally produces cylindrical crystals, but it can produce larger crystals than the current 30-centimeter round wafers.
Starting point is 00:04:03 This is not the first time the industry has tried to push beyond that: 45-centimeter (450 mm) wafers were proposed some years ago but did not catch on for lack of demand. TSMC is presumably playing with panels of about 51 by 51.5 centimeters. You also have to remember that the entire supply chain, the tool chain, and the equipment must support the larger wafers, and each of them needs enough demand to justify the effort. If this happens, however, it will increase yield for larger chips, but it will really boost what you can do with wafer-scale chips.
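To see where the "more than three times the usable area" figure comes from, here is a small sketch comparing a standard 300 mm circular wafer with a rectangular panel in the reported roughly 510 by 515 millimeter range; edge exclusion and packaging overheads are ignored, so treat the ratio as an idealized upper bound:

```python
import math

# Idealized area comparison: 300 mm circular wafer vs. a rectangular panel
# of roughly 510 mm x 515 mm (the size reportedly being explored by TSMC).
wafer_diameter_mm = 300
panel_mm = (510, 515)

wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2   # ~70,686 mm^2
panel_area = panel_mm[0] * panel_mm[1]                # 262,650 mm^2

print(f"300 mm wafer area: {wafer_area:,.0f} mm^2")
print(f"Rectangular panel: {panel_area:,.0f} mm^2")
print(f"Area ratio:        {panel_area / wafer_area:.1f}x")   # -> ~3.7x
```

A rectangle also avoids the partial dies lost along a circular wafer's curved edge, which is part of why the practical gain for large chips can exceed the raw area ratio.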
Starting point is 00:04:52 Efforts to use AI for science have been a constant and growing trend, using lower-precision but faster computations and using AI methods as part of the algorithms. So we see more talks and conferences focused on the best ways to use historical scientific information, things like papers, texts, presentations, and structured science data, in foundational AI models. The current trend is to use LLMs, but also non-LLM approaches. Within LLMs, the topics include the right scale for the number of parameters; noticing and taking advantage of sparsity in models, where coefficients would be zero and therefore do not need computation; the best ways to use RAG, retrieval-augmented generation, where you leverage external information to remain more relevant; MoE, mixture of experts; and multimodal uses, blending text, images, audio, et cetera.
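As a toy illustration of the sparsity point above, the sketch below stores only the nonzero coefficients of a weight matrix and skips the zero entries during a matrix-vector product. It is a simplified stand-in for the structured-sparsity support in real accelerators and frameworks, not any particular library's API:

```python
# Toy sparsity sketch: keep only nonzero weights (a CSR-like layout) and
# skip zero coefficients in a matrix-vector product. Real hardware and
# compilers exploit the same idea at much larger scale.
def to_sparse(rows):
    """For each row, keep (column_index, value) pairs for nonzero entries."""
    return [[(j, v) for j, v in enumerate(row) if v != 0.0] for row in rows]

def sparse_matvec(sparse_rows, x):
    """Compute y = W @ x, touching only the stored nonzero coefficients."""
    return [sum(v * x[j] for j, v in row) for row in sparse_rows]

W = [[0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.5],
     [3.0, 0.0, 0.0, 0.0]]
x = [1.0, 2.0, 3.0, 4.0]

print(sparse_matvec(to_sparse(W), x))   # -> [4.0, 6.0, 3.0]
```

With three quarters of the coefficients zero in this toy matrix, only a quarter of the multiply-adds are actually performed, which is the effect described above at model scale.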
Starting point is 00:05:46 Yeah. For example, in a presentation from Ian Foster of Argonne, he talked about a conversational AI scientific assistant. The target would be an AI-based system with scientific skills; effective assistance, as in trustworthy with no hallucinations, and we'll have to see how they accomplish that; proper communication skills; and increasing autonomy, starting with learning existing workflows but eventually suggesting improvements or entirely new workflows. This is a growing topic of interest and something to look forward to at SC24, which, by the way, is only five months away. All right, that's it for this episode. Thanks so much for being with us. HPC News Bytes is a production of OrionX in association with InsideHPC. Shaheen Khan and Doug Black host the show.
Starting point is 00:06:25 Every episode is featured on InsideHPC.com and posted on OrionX.net. Thank you for listening.
