@HPC Podcast Archives - OrionX.net - HPC News Bytes – 20250303

Episode Date: March 3, 2025

- Government layoffs and funding disruptions cause uncertainty and concern for scientific leadership
- Cat qubits: AWS Ocelot, Alice and Bob
- ASICs vs GPUs vs Accelerators

Audio: https://orionx.net/...wp-content/uploads/2025/03/HPCNB_20250303.mp3

Transcript
[00:00:00] Welcome to HPC News Bytes, a weekly show about important news in the world of supercomputing, AI, and other advanced technologies. Hi, everyone. Welcome to HPC News Bytes. I'm Doug Black of InsideHPC, and with me is Shaheen Khan of OrionX.net. We'll start with a news development with direct impact on many of our listeners, and that is funding disruptions and layoffs of federal employees at the national labs and science labs by the new administration.
[00:00:34] Reports from various departments about the atmosphere among employees include quotes like "people are terrified" or "it feels as though our government is being dismantled." To be sure, some people like that action is being taken toward efficiency, but others prefer effectiveness over what they see as random cuts. In any case, there is disruption and uncertainty, and that is expected to impact funding, planning, recruiting, and ultimately the state of the HPC market. It is commonplace to think that supercomputing resources and other advanced technologies should not suffer cutbacks because the mission is so vital. And in fact, US supercomputing has enjoyed rare bipartisan support in Washington
Starting point is 00:01:18 because the mission is seen as just that, very vital. So I have to think that like the National Nuclear Safety Workers, who were recently laid off and then recalled that the same could happen with HBC employees at the science labs. By the way, the new Secretary of Energy, Chris Wright, visited Oak Ridge Lab last week and called research work in AI the start of a new Manhattan project
Starting point is 00:01:42 and that the US needs to lead the global AI race. Words indicating he may come out swinging for DOE's mission and employees. We'll see. I read your report, Doug, on a nonprofit organization called the Coalition for Academic Scientific Computation. It's been around for 35 years and has 105 member institutions, pretty much all the universities and research computing centers in the United States. They advocate for what everyone agrees with, that
Starting point is 00:02:10 advances in science is the foundation of innovation, therefore national competitiveness, national and global security, economic growth, and military strength. And from their vantage point, they seem to be seeing how their members are being impacted. Well, they issued a pretty strongly worded statement saying they quote, strongly condemn this action, which undermines the foundation of American scientific leadership. The loss of so many dedicated professionals in such a short period creates immediate disruptions and long term damage to the nation's research enterprise. They urged the Congress and the administration to, quote, take swift action to restore these critical positions and reaffirm the country's commitment
Starting point is 00:02:52 to scientific excellence. It's a big deal. Last week, Amazon Web Services joined fellow big techers, Microsoft and Google, making announcements in the quantum arena. The company announced Ocelot, a prototype quantum chip that tests for error correction. AWS said it reduces error correction by up to 90% and consists of two integrated silicon microchips, each with an area of roughly one centimeter, bonded one on top of the other
[00:03:20] in an electronically connected chip stack. And now Shaheen, you being our resident quantum expert, please share your insights on this. Yeah, at your service. Well, AWS is the third member of the so-called Magnificent 7 to showcase its own quantum processing unit, QPU. And that's following Google's Willow three months ago
Starting point is 00:03:40 and Microsoft's Majorana last week, which we covered in last week's episode. AWS has joined collaborations with and provides funding at a few top universities, in this case, the California Institute of Technology and their Center for Quantum Computing. That's where this work was done. Back in 2018, Professor John Preskill, one of the leading lights of this field and holding the Richard Feynman Professorship at Caltech, coined the phrase NISC, Noisy Intermediate Scale Quantum, to describe where we are in quantum computing.
Starting point is 00:04:10 More qubits than the early days, but not a lot of qubits, and dealing with a lot of noise and errors. That was fine, since getting more qubits was the hurdle to pass at that time. Now it's all about noise mitigation and error correction. While the ultimate goal remains fault tolerant quantum computing FTQC at high enough scale to let you run actual code with better performance or price performance than GPUs. So as the industry aims for utility scale systems, it is incorporating error correction as a foundational requirement designed in from the beginning. This is what we saw with Willow, with Majorana, and now with Ocelot. Willow used new quantum error correcting code, QECC,
Starting point is 00:04:52 and Microsoft is pursuing topological qubits. Ocelot rhymes with Camelot and Lancelot, but it is a mid-sized species of cat. Well, because the system is using so-called cat qubits, named after Schrodinger's poor cat that is commonly put in danger to make the point about quantum superposition. Thankfully, no cats have been hurt in the process, and there's no evidence Schrodinger actually had a cat, though there are some reports that said he had one in 1935 called Milton.
Starting point is 00:05:20 Anyway, cat qubits massively reduce one kind of error, bit flips, in exchange for a small increase in another type, phase flips. Another company called Alice and Bob also pursues CatQ bits with a different implementation. You have to remember quantum computing is based on wave functions and probability amplitudes, so errors can happen in more ways than in traditional digital computers. AWS also provides its own quantum computing development environment called Bracket, while also featuring access to quantum computers from IonQ, IQM, Quera, and Rigetti. Microsoft is similar. They offer Azure Quantum and the Qsharp programming language
Starting point is 00:06:01 and access to IonQ, Quontinium, Rigetti, Pascal, and quantum circuits. Now, as we look forward to NVIDIA's GTC conference in two weeks, what's been called AI Woodstock, let's look at an emerging trend in the GPU arena, ASIC semiconductors that are taking on more prominence in AI computing. ASICs are application-specific integrated circuits. Both Jensen Wong and AMD CEO Lisa Su refer to ASICs used for AI
Starting point is 00:06:30 as something different from their own more versatile GPUs built for the broader AI server market. Both Marvell and Broadcom have reported growth in AI ASIC sales to big cloud companies. Broadcom reported 220% growth in AI ASIC sales to big cloud companies. Broadcom reported 220% growth in 2024 to $12 billion in revenue, which exceeds AMD's $5 billion in data center GPU revenues last year, according to our podcast friend and chip industry analyst, Ian Cutrus, while Nvidia reached $95 billion in the AI GPU market last year.
Starting point is 00:07:05 And then there are the hyperscalers AI chips, such as Google's Tensor Processing Units and Amazon's Tranium and Inferentia chips. These of course aren't sold on the open market. They are built into Google Cloud and AWS for AI workloads. Yeah, I suppose it is helpful to be able to refer to all the GPUs that are not developed by merchant chip vendors.
Starting point is 00:07:26 Of course, the word GPU itself is not quite valid anymore since the G stands for graphics and many of these chips never do any graphics. Likewise, ASIC is a misnomer. They are all programmable accelerators and they all optimize for different set of workloads. The big battle is between cloud providers who have enough internal volume and the means to build their own chip and sway the market their way. And then merchant chips, vendors like Nvidia, AMD, and Intel,
Starting point is 00:07:54 which have focused and massive development forces, serve cloud vendors as well as every other system vendor so they can leverage volume across many vendors. Very interesting to watch. All right, that's it for this episode. Thank you all for being with us. HPC News Bytes is a production of OrionX in association with Inside HPC. Shaheen Khan and Doug Black host the show. Every episode is featured on insidehpc.com and posted on orionx.net. Thank you for listening.
