@HPC Podcast Archives - OrionX.net - HPC News Bytes – 20240812
Episode Date: August 12, 2024
- UK Govt. cuts to exascale and AI projects
- Investment in AI inference rises
- AI models: open source vs. guardrails
- Will SC24 be the largest supercomputing event ever?
Transcript
Welcome to HPC News Bytes, a weekly show about important news in the world of supercomputing,
AI, and other advanced technologies.
Hi, everyone.
Welcome to HPC News Bytes.
I'm Doug Black of InsideHPC, and with me is Shaheen Khan of OrionX.net.
Let's start in the UK, where the Department for Science, Innovation and Technology
said last week that it has cancelled plans for an £800 million exascale-class supercomputer.
The system was announced last October with much fanfare, heralded as the start of a new era for
supercomputing and scientific research in the UK. Disappointing news. The system would have been housed
at the Edinburgh Parallel Computing Centre, the EPCC, in Scotland, one of the top sites in the world.
News accounts state that the centre has already spent £31 million to prepare the physical facility to house it.
The plans were for a system 50 times more powerful than the UK's current top-end system, ARCHER2, an HPE/AMD system, also at EPCC, which is number 49 on the TOP500 list. Next down for the UK is a system called Dawn, a Dell/Intel system at number 51, and then Isambard, an HPE/NVIDIA system built as an AI supercomputer in Bristol, at number 128. In addition to the £800 million cut to the exascale program, there's another £500 million reduction
for what they call the AI Research Resource. The usual remedies have been proposed:
seek funding from the private sector, tag along with other projects that might have excess capacity or be able to share resources, or strengthen partnerships to pool resources together. The UK is in a position to do all of that,
so I hope they manage to make that work.
We follow the evolving fortunes of the AI chip companies that are challenging the established vendors, what some call the "specialty AI chip" companies. I imagine the companies themselves probably prefer the challenger label. Last month, we discussed SoftBank's acquisition of UK-based Graphcore for an undisclosed sum after the company laid off roughly 20% of its workforce. It's commonly assumed that the Graphcore deal went for far less than the $2.8 billion valuation it had in 2020. AI learning, or perhaps
we should say AI training, was decisively won by NVIDIA, though it will evolve with more players
over time. That has left AI inference as the next battleground. AI inference with large language
models is all about tokens: the time it takes to generate and display the first token, and the throughput of tokens after that. A token is typically about three-quarters of a word in English, something like a four-character snippet.
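As a rough, hypothetical illustration of those two metrics, time to first token (TTFT) and token throughput, here is a minimal Python sketch. The four-characters-per-token figure is just the English-language rule of thumb mentioned above, and the simulated stream is a stand-in for a real model API, not any particular vendor's.

import time

CHARS_PER_TOKEN = 4  # rough rule of thumb for English text

def estimate_tokens(text):
    # Approximate token count from character length (heuristic only).
    return max(1, round(len(text) / CHARS_PER_TOKEN))

def measure_stream(stream):
    # Return (time to first token in seconds, tokens per second)
    # for any iterable of tokens; in practice `stream` would wrap
    # a model API's streaming response.
    start = time.perf_counter()
    ttft = None
    count = 0
    for _ in stream:
        if ttft is None:
            ttft = time.perf_counter() - start  # time to first token
        count += 1
    elapsed = time.perf_counter() - start
    return ttft, count / elapsed if elapsed > 0 else 0.0

def fake_stream(n_tokens=50, delay=0.01):
    # Simulated generator that yields one token every `delay` seconds.
    for i in range(n_tokens):
        time.sleep(delay)
        yield f"tok{i}"

ttft, tps = measure_stream(fake_stream())
print(f"TTFT: {ttft * 1000:.1f} ms, throughput: {tps:.1f} tokens/s")
print(estimate_tokens("HPC News Bytes"), "tokens (est.) in 'HPC News Bytes'")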
Armed with that, we heard from California-based company Groq, spelled with a Q, which managed to do pretty well in LLM inference benchmarks and has parlayed that into a cool $640 million
funding round. Another trend among chip vendors is to go beyond building their own systems and actually build their own clouds, or partner with emerging GPU-as-a-service providers.
Groq will use the funding to scale its tokens-as-a-service offering and add new models and features to GroqCloud, including deploying more than 100,000 Groq LPUs, which stands for Language Processing Units, a new label for its processors that signals a focus on LLMs.
The chips will be manufactured by GlobalFoundries by the end of Q1 2025. Groq said it has grown to over 360,000 developers building on GroqCloud, creating AI applications on open models such as Meta's Llama 3.1. It aims to be the largest AI inference deployment of any non-hyperscaler. Speaking of Llama 3.1, the model announced last
month, Wired Magazine has a piece on it saying that the new model is open, free of charge,
and powerful, but also risky. The newest version will make AI more accessible and customizable,
they say, but it will also stir debate over the dangers of releasing AI without guardrails.
Meta says it trains Llama in a way that prevents the model from producing harmful output, but sources tell Wired the model can be modified to remove such guardrails. You know, I recall the discussion we had with Red Hat and with Ctrl IQ, CIQ,
about changes in open source software. That's the big discussion here, and it is even more complex because it's AI, where both the good and the bad are amplified. Exactly how much of a model is open sourced, and how it gets regulated, are big decisions with big implications.
Meta has positioned Llama to compete with commercial offerings from the likes of OpenAI, Google, and Anthropic, and it says Llama will allow AI to develop similarly to the way Linux impacted technical computing, and later cloud computing, starting in the late 90s. Some AI experts say
Meta's assurances against harmful results can be skirted and that the 405-billion-parameter model
might be used to empower the bad guys and enable nefarious uses. But Wired also quoted the
director of the Center for AI Safety saying Meta has done a good job of testing its models before
releasing them and that the Llama 3 release will enable researchers to conduct much-needed AI
safety research. Let's end with a quick mention of the SC24 conference to be held in November in Atlanta.
Interest is building and it promises to be the biggest SC ever.
Yes, SC and ISC are, of course, the two biggest HPC events of the year.
Both are many-months-long projects with hundreds of volunteers, and people in the community
are already talking about speakers, vendor events,
and other aspects of the gathering that we all look forward to. All right, that's it for this episode. Thanks
so much for being with us. HPC News Bytes is a production of OrionX in association with Inside
HPC. Shaheen Khan and Doug Black host the show. Every episode is featured on InsideHPC.com and
posted on OrionX.net. Thank you for listening.