HPC Podcast Archives - OrionX.net - HPC News Bytes – 20240415
Episode Date: April 15, 2024
Topics: Intel Vision 2024 Event, Intel Gaudi 3, Xeon 6; Meta MTIA accelerator chip; Nvidia GPU shortage easing; Category Theory, Categorical Deep Learning, Geometric Deep Learning; China economic growth plans, high-end manufacturing industrial policy
Audio: https://orionx.net/wp-content/uploads/2024/04/HPCNB_20240415.mp3
Transcript
Welcome to HPC News Bytes, a weekly show about important news in the world of supercomputing,
AI, and other advanced technologies.
Hi, everyone.
Welcome to HPC News Bytes.
I'm Doug Black.
Hi, Shaheen.
As you know, I spent most of last week at the HPC User Forum hosted by Hyperion Research. And as usual, it was a content-rich event held
this time in the Washington, D.C. area. But probably the top story of the week came out of
the Intel Vision event in Arizona. Intel announced their Gaudi 3 AI-focused accelerator using the TSMC
N5 process. It's expected to compete with NVIDIA and AMD. Gaudi 3 will be available initially
through Dell, HPE, Lenovo, and Supermicro. With so much focus these days on GPUs, we shouldn't
take our eyes off of CPUs. They continue to get better and include new functionality. So Intel
also announced new Xeon 6 chips set to launch later this year with two main variants: an all-performance-core chip codenamed Granite Rapids and an efficiency-focused-core chip codenamed Sierra Forest. Back to Gaudi 3: Intel shared performance comparisons that put it ahead of the NVIDIA H100 for both training and inference. That's better than expected.
It's more than enough to generate demand and adoption.
But in the end, 2024 is about how many you can ship and not how much better you are.
And NVIDIA is significantly ahead of everyone else on that crucial metric.
The situation will change when Intel can use its own high-end fabs.
They also announced new work with the Ultra Ethernet Consortium,
UEC, focused on open standards for AI networking. An interface card for that is targeted for
availability in 2026. In related news, Meta is said to be deploying the equivalent of 350,000 H100s in 2024,
and is presumably interested in another
20,000 to 25,000 NVIDIA Blackwell chips as well.
So if you're buying that much
and if building it is practical
and you have a big enough use case
to optimize a chip for it
and you have the wherewithal to pursue it
for a few
generations, then maybe you can build your own while buying from outside for other uses. Big
cloud providers and extreme scale sites fit that description. So you see Amazon, Microsoft, Google,
Tesla, Meta, and others roll their own. So Meta announced the latest version of their custom chip, also using the TSMC N5 process?
Yes, they call it MTIA for Meta Training and Inference Accelerator, which is part of a
multi-pronged approach to AI spanning chips to models using internal and external technologies.
Meta said the new MTIA more than doubles the compute and memory bandwidth of their previous
solution and that it's designed to serve their ranking and recommendation models.
A report from a Taiwan news source quotes a senior Dell manager in that country saying
lead times for NVIDIA GPUs have decreased substantially from 8 to 11 months down to
3 or 4, which is roughly standard for the industry.
This is the result of TSMC expanding GPU packaging capacity,
and they are continuing to do that as we reported last week.
This is great news for NVIDIA.
The talk of all this capacity inevitably raises the question of overheating
and a possible glut in the market.
It doesn't look like that right now, but AI vendors would do well to spend time now
to make sure their customers will be successful with their projects.
When it comes to the future of AI,
it seemed like large language models,
and therefore generative AI, were going to be it for a while.
But other approaches keep advancing. We've had geometric deep learning, GDL,
which uses so-called non-Euclidean data, like
graphs and meshes and such, and is used in computer vision for learning complex 3D
objects.
A branch of geometric deep learning is categorical deep learning, CDL, which aims to bring even
more structure to the space.
It's a framework that generalizes geometric deep learning.
It uses
category theory, which originated in abstract mathematics, and it looks pretty promising.
Yes, there's a new research paper on this on the arXiv site, in which the researchers argue that
previous attempts in this area lacked a coherent bridge between specifying constraints that models
must satisfy and specifying their implementations.
They say applying category theory as a single theory, quote,
elegantly subsumes both of these flavors of neural network design.
The paper concludes with this.
We thus believe that this is the right path to AI compliance and safety,
and not merely explainable, but verifiable AI.
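As a rough illustration of the kind of structure these frameworks formalize, here is a minimal Python sketch (not from the paper; the layer name and sizes are hypothetical) of a permutation-invariant set layer. The constraint is that reordering the elements of an input set must not change the output; the implementation satisfies that constraint by construction, by applying a shared per-element map followed by a symmetric sum aggregation.

import numpy as np

# Minimal sketch of a permutation-invariant, DeepSets-style layer.
# Constraint: f(X with rows reordered) == f(X).
# Implementation: shared per-element map, then a symmetric (sum) aggregation,
# so the constraint holds by construction.

rng = np.random.default_rng(0)
W_phi = rng.normal(size=(4, 8))   # per-element weights (hypothetical sizes)
W_rho = rng.normal(size=(8, 3))   # post-aggregation weights

def permutation_invariant_layer(X):
    # X is an (n_elements, 4) set of feature vectors; returns a (3,) embedding.
    per_element = np.tanh(X @ W_phi)   # shared map applied to every element
    pooled = per_element.sum(axis=0)   # element order cannot matter after summing
    return np.tanh(pooled @ W_rho)     # final map on the pooled summary

X = rng.normal(size=(5, 4))            # a "set" of 5 elements
X_shuffled = X[rng.permutation(5)]     # same set, different order

# Outputs agree, so the symmetry constraint is satisfied by the implementation.
assert np.allclose(permutation_invariant_layer(X),
                   permutation_invariant_layer(X_shuffled))

Geometric deep learning catalogs architectures of this kind for different symmetry groups, and the categorical deep learning paper argues that category theory can specify both the constraint and the implementation within a single framework.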
We want to end with geopolitical news about China.
The Economist provided a trip report of a business summit in China attended by a global cast of senior executives. It projected improvement in how the US-China relationship and competition
are being managed, but cast doubt on China's longer-term economic growth plans.
It's an excellent article and associated podcast, but to roughly summarize,
The Economist said China's plan prioritizes high-end manufacturing, which they don't think
will work, ahead of addressing the real estate bust, which they think China must focus
on first. There's also the backdrop of demographic challenges as the country faces significant population decline due to aging and the old one-child policy. Also, global free trade
was a key factor in China's tremendous economic growth for the past 30 years, but trade restrictions
are going into effect around the world, another barrier for growth in China. All right, that's it
for this episode. Thank you all for being with us.