HPC Podcast Archives - OrionX.net - HPC News Bytes – 20240624
Episode Date: June 24, 2024
Topics: GPU price wars? - Rectangular wafers for more/bigger chips - AI assistant for science - AI supercomputer in Japan - New exascale system in the EU
Transcript
Welcome to HPC News Bytes, a weekly show about important news in the world of supercomputing,
AI, and other advanced technologies.
Hi, everyone. Welcome to HPC News Bytes. I'm Doug Black. Hi, Shaheen.
Let's start with GPU pricing. We all know that NVIDIA's amazing profit margins are a key factor in driving the company to become the most valuable in the U.S. by stock valuation, surpassing Microsoft.
But that said, GPU price competition could develop over the next 6 to 18 months, with AMD MI300 and Intel Gaudi 3 GPUs offered at a third to half the price of NVIDIA H100s,
which appear to be generally priced at $25,000 to $30,000, depending on volume.
Now, I suppose I have to preface this by saying this is not investment advice,
but it makes sense for competitors to recognize that they're coming from behind
and that they need to deliver substantially better economics. But that has not happened yet, and NVIDIA keeps moving the goalposts,
especially with their annual upgrade cadence and continuous optimization of their software stack.
They are executing like crazy and expanding their total available market by moving into software,
into edge, the enterprise, quantum emulation, et cetera, et cetera. And as
much as the basics can now be done with other chips, they have a commanding lead in training
versus inference, and they have the most complete ecosystem. So I'm not seeing competition at the
expense of NVIDIA anytime soon, but I do see it in addition to NVIDIA. What can change that equation,
however, is something like an AI winter, or if hyperscalers suddenly find themselves with overcapacity,
or if everyone's focus switches to inference where there is more competition.
With all the focus on AI, it is easy to take your eyes off of traditional HPC. I recall,
for example, the Chinese AI chip from Biren that did not support 64-bit operations in
hardware at all. So one thing I do like about NVIDIA and AMD is they continue to push on 64-bit
computing in their GPUs, but also in their CPUs. That makes them more attractive for mixed-use
installations that don't exclusively do AI. So we are seeing more Grace Hopper and AMD-based HPC sites,
like the ones at Los Alamos, in the UK, and now in Japan.
Yeah, related to that, HPE announced an NVIDIA-powered AI supercomputer to be housed at
Japan's National Institute of Advanced Industrial Science and Technology with $200 million worth of AI servers. There's other big systems news.
EuroHPC JU announced that its second exascale system,
called Alice Recoque,
will be hosted by GENCI, the French research organization,
and operated by the French agency CEA.
They cited what looked like an all-inclusive total cost of ownership
of 544 million euros.
According to reports last week, TSMC, Samsung, and Intel are looking at rectangular panel-like
wafers rather than traditional circular wafers. This would allow more chips to be placed on a
single substrate and not have to cut the edges off it. It would provide more
or larger chips, addressing demands for more powerful and denser compute required by AI.
Reportedly, TSMC is experimenting with substrates measuring in the 20-by-20-inch range, more than
three times the usable area of 12-inch circular wafers. Although the research is still new and may take
several years before mass production, this represents a significant technology shift.
Yeah, very interesting. The prevailing manufacturing process for wafers naturally produces
cylindrical crystals, but it can produce larger crystals than the current 30-centimeter round
wafers. This is not the first time the
industry has tried to push beyond that. 60-centimeter wafers were proposed some years ago but did not
catch on for lack of demand. TSMC is presumably playing with panels of about 51 by 51.5
centimeters in size. You also have to remember that the entire supply chain and the tool chain
and the equipment must support the larger wafers, and each of them needs enough demand to justify the effort.
If this happens, however, it will not only increase yield for larger chips but really boost what you can do with wafer-scale chips.
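As a rough sanity check on the area claims above, here is a back-of-envelope comparison of a square panel against a round wafer. This is pure geometry; real usable area also depends on edge exclusion and die layout, so treat the ratios as upper-bound approximations:

```python
import math

# A 20 x 20 inch panel vs. a 12-inch (diameter) round wafer, areas in square inches.
panel_in2 = 20 * 20
wafer_in2 = math.pi * (12 / 2) ** 2
ratio_inches = panel_in2 / wafer_in2
print(f"20x20 in panel vs 12 in wafer: {ratio_inches:.2f}x")   # about 3.54x

# The metric version: a 51 x 51.5 cm panel vs. a 300 mm round wafer, areas in mm^2.
panel_mm2 = 510 * 515
wafer_mm2 = math.pi * (300 / 2) ** 2
ratio_metric = panel_mm2 / wafer_mm2
print(f"51x51.5 cm panel vs 300 mm wafer: {ratio_metric:.2f}x")  # about 3.72x
```

Both ratios land a bit above 3.5x, consistent with the "more than three times the usable area" figure quoted in the reports.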
Efforts to use AI for science have been a constant and growing trend by using lower precision but faster computations
and using AI methodologies as part of the algorithms. So we see more talks and conferences
focused on the best ways to use historical scientific information, things like papers,
texts, presentations, and structured science data in foundational AI models. The current trend is to use LLMs, but also non-LLM
approaches. Within LLMs, the topics include the right scale for the number of parameters;
noticing and taking advantage of sparsity in models, where coefficients would be zero
and therefore do not need computation; the best ways to use RAG, retrieval-augmented generation, where you leverage external information
to remain more relevant; MoE, mixture of experts; and multimodal uses, blending text and images
and audio, et cetera. Yeah. For example, in a presentation from Ian Foster of Argonne,
he talked about a conversational AI scientific assistant. The target would be an AI-based system with scientific
skills, effective assistance, as in trustworthy with no hallucination, and we'll have to see
how they accomplish that, proper communication skills, and increasing autonomy, starting with
learning existing workflows, but eventually suggesting improvements or entirely new workflows.
This is a growing topic of interest and something to look forward to at SC24, which, by the way, is only five months away.
All right, that's it for this episode.
Thanks so much for being with us.
HPC News Bytes is a production of OrionX in association with InsideHPC.
Shaheen Khan and Doug Black host the show.
Every episode is featured on InsideHPC.com
and posted on OrionX.net.
Thank you for listening.