@HPC Podcast Archives - OrionX.net - HPC News Bytes – 20260202

Episode Date: February 2, 2026

- Microsoft Maia 200 for Inference
- Nvidia H200 (not H20) for China
- Corning-Meta Data Center
- Nvidia-CoreWeave AI Deal
- Neoclouds’ role in Chip vs Cloud competition

[audio mp3="https://orionx.net/wp-content/uploads/2026/02/HPCNB_20260202.mp3"][/audio]

Transcript
Starting point is 00:00:04 Welcome to HPC Newsbytes, a weekly show about important news in the world of supercomputing, AI, and other advanced technologies. Hi, everyone. Welcome to HPC Newsbytes. I'm Doug Black of InsideHPC, and with me, of course, is Shaheen Khan of OrionX.net. Early last week, Microsoft introduced Maia 200, an ASIC for AI inference that Microsoft said is engineered to improve the economics of AI token generation. It's built on TSMC's 3-nanometer process with native FP8 and FP4 tensor cores. Microsoft claims Maia 200 is the most performant silicon from any of the hyperscalers and the most efficient inference system Microsoft has ever deployed.
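As a rough back-of-envelope illustration of that token-economics claim (the model size and bandwidth below are assumed figures, not Maia 200 specs): token generation is typically memory-bandwidth-bound, so decoding each token streams the model's weights once, and halving the bytes per weight roughly doubles token throughput.

```python
# Illustrative sketch with made-up numbers, not Maia 200 specs: when
# decoding is memory-bandwidth-bound, tokens/sec ~= bandwidth / model_bytes,
# so FP8 and FP4 cut the cost per generated token versus FP16.
PARAMS = 70e9          # assumed 70B-parameter model
BANDWIDTH = 2.0e12     # assumed 2 TB/s of memory bandwidth

for fmt, bytes_per_weight in [("FP16", 2.0), ("FP8", 1.0), ("FP4", 0.5)]:
    model_bytes = PARAMS * bytes_per_weight
    print(f"{fmt}: ~{BANDWIDTH / model_bytes:.1f} tokens/s per sequence")
# FP16 ~14.3, FP8 ~28.6, FP4 ~57.1 -- same silicon, cheaper tokens.
```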
Starting point is 00:00:54 Shaheen, can we safely assume the chip is intended to reduce Microsoft's reliance on Nvidia for AI processing? Well, they actually were at pains to say they will continue to buy chips from merchant chip vendors, Nvidia, AMD, and perhaps others. But in the fast-changing market that we see, Microsoft upped the ante with a very interesting approach. Maia 200 is a 140-billion-transistor, 900-watt, 3-nanometer chip built by TSMC. And Microsoft said it is being deployed in its data centers, albeit, as I said, they're going to continue to buy. Maia 200 was described as inference-first, which means it is optimized for token generation speed, which also means
Starting point is 00:01:39 floating-point 4-bit and 8-bit and mixed-precision speeds. It uses standard Ethernet to fully direct-connect four accelerators to each other. They call that a fully connected quad. This enables, and also favors, local communication to reduce latency. It has what they call software-managed memory, which lets compilers further optimize data locality, memory placement, data movement, and therefore latency. That includes the relatively large 272-megabyte static random access memory, SRAM, on each chip. All of these chips have SRAM on them for caches and fast buffers. There's also a new direct memory access, DMA, model that supports more complex data movement that can match AI data structures and layouts and helps move them more efficiently.
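To make the fully connected quad concrete, here is a minimal sketch; the topology and one-hop collective follow the description above, while the API and values are hypothetical, since Maia 200's actual programming interface isn't public.

```python
from itertools import combinations

# Hypothetical sketch of a fully connected quad: four accelerators, each
# with a direct Ethernet link to the other three, so a collective such as
# all-reduce finishes in a single exchange step, with no multi-hop ring
# or switch traversal, which is what keeps local communication latency low.
QUAD = [0, 1, 2, 3]
LINKS = set(combinations(QUAD, 2))  # 6 point-to-point links, diameter 1

def all_reduce(partials: dict[int, float]) -> dict[int, float]:
    """Each device sends its partial to its 3 peers and sums what arrives."""
    total = sum(partials.values())
    return {dev: total for dev in QUAD}

print(all_reduce({0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}))  # every device holds 10.0
print(f"{len(LINKS)} links, 1 hop between any two devices")
```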
Starting point is 00:02:34 And it includes liquid cooling integration, which pairs the chips with a closed-loop liquid cooling system, which, of course, is required, because the power envelope really is 750 to 900 watts, and it's going to run hot. So while Nvidia continues to rule, the strategic competitive landscape carries on
Starting point is 00:02:56 in chips around the world. Most notable of these are big cloud providers like Microsoft, AWS, and Google with their own chips, and then the chips that we can expect out of China, Japan, Europe, and, we should also expect, increasingly India. Moving over to the AI chip export arena, China has okayed the buying of Nvidia's H200 processor by large Chinese customers of Nvidia's, including Alibaba and ByteDance, according to published reports. This came during a trip to China by Nvidia CEO Jensen Huang. Shaheen, this story has gone through so many convolutions in recent months, with Nvidia selling, then being told it couldn't export, H20 GPUs to China,
Starting point is 00:03:42 followed by Huang reportedly convincing President Trump that H200 exports should be allowed for the China market, followed by the Chinese government banning or limiting purchases of Nvidia GPUs, to now this. Both sides have been moving the export goalposts in real time. For the U.S., it wants China to remain somewhat dependent on Western AI processing technology, though not the latest AI chips, and it wants the revenue from AI purchases. For China, it wants more advanced AI chips than it can currently manufacture on its own, while also pushing the development of a domestic supply chain of AI processors. Yeah, and just a few weeks ago, we were covering an analysis
Starting point is 00:04:27 that advocated for very strict export controls to minimize the amount of available compute capacity, it being the most important parameter that can advance AI. I'm going to say what I said then: there are many parameters, but it comes down to value from volume. How do you get, and can you get, value from volume? A question that has very much been part of the HPC industry
Starting point is 00:04:51 and what led to Beowulf clusters versus big vector supercomputers in the 1990s. And that question remains for China, and really for everyone, including most Nvidia competitors: whether they can use a larger number of cheaper, slower chips to achieve similar performance as fewer, faster chips, and do so at acceptable levels of complexity, so they come out ahead on total cost.
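A minimal sketch of that trade-off, with entirely made-up prices, performance figures, and scaling efficiencies: the many-cheap-chips approach comes out ahead only if its price advantage survives the efficiency lost to scaling complexity.

```python
# Hypothetical numbers, purely to illustrate the value-from-volume question:
# compare dollars per unit of *delivered* (not peak) aggregate performance.
def cost_per_delivered_perf(chip_perf, chip_price, n_chips, scaling_eff):
    delivered = chip_perf * n_chips * scaling_eff  # interconnect/software losses
    return (chip_price * n_chips) / delivered

few_fast  = cost_per_delivered_perf(chip_perf=100, chip_price=30_000, n_chips=8,  scaling_eff=0.90)
many_slow = cost_per_delivered_perf(chip_perf=25,  chip_price=6_000,  n_chips=32, scaling_eff=0.70)
print(f"few fast chips:  ${few_fast:.0f} per delivered unit")   # ~$333
print(f"many slow chips: ${many_slow:.0f} per delivered unit")  # ~$343
# At 70% scaling efficiency the cheaper chips lose narrowly; improve the
# scaling (or the price) and the answer flips -- the old Beowulf question.
```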
Starting point is 00:05:33 Big AI deals continue apace, involving staggering amounts of money, something that was steadily heating up and then went into even higher gear more than a year ago, soon after Trump took office. Among them is a new $6 billion agreement to speed up data center build-out between Meta and Corning, which will supply its latest fiber cable and connectivity products for Meta AI data centers. Corning will expand manufacturing capabilities in North Carolina, including its optical cable manufacturing facility in Hickory, North Carolina, where Meta will serve as the anchor customer, according to the publication Data Center Dynamics. Yeah, it's interesting. We were just talking about Corning last week. And of course, glass means fiber optics,
Starting point is 00:06:05 which has come from outside the data center, where telcos use it, into the data center, and it's penetrating everywhere, including inside the chips. It's a smart deal for both. The global buildout does not show any signs of slowing down. And many of the participants, like Meta and the public clouds, have deep pockets and global reach, and they can show improvement to their business that is traceable to AI.
Starting point is 00:06:30 But there's been growing concern about whether all these data centers can actually get the real estate, the power and cooling, and be built and paid for as promised, especially since AI companies are rapidly building a reputation as money-losing operations driven by strategic goals. So it begs the question: is this even possible? Who's going to pay for all of this, and how long before the music stops? More news on the massive AI spending front: Nvidia announced plans to invest $2 billion in AI neocloud company CoreWeave, funding CoreWeave's build-out of more than 5 gigawatts of AI factories by 2030.
Starting point is 00:07:11 This would be roughly the fourth major investment round by Nvidia in CoreWeave. The first of them kicked off talk of circular financing in the AI industry, in which Nvidia invested money in CoreWeave, which CoreWeave then used to buy Nvidia chips. Two aspects of the strategic competitive landscape I mentioned above are what is happening between big chip vendors, big cloud vendors that build
Starting point is 00:07:35 their own chips, and big nation states that believe they must build their own chips. Neoclouds play a big role here. They provide a capacity cushion for cloud providers and faster deployment for chip vendors. They do not have a legacy architecture for management and control and such,
Starting point is 00:07:51 so they can move fast with a GPU-centric architecture and keep up with the pace of innovation. Their suppliers so far have been big chip vendors, and their customers have included big cloud providers. So they are interesting to chip vendors because they buy in volume and do not have their own chips to siphon off workload away from merchant chips. But they show how the market is getting blurry. Cloud guys build chips and chip guys build clouds, and they both want territory that they can
Starting point is 00:08:20 control. All right. That's it for this episode. Thank you all for being with us. HPC Newsbytes is a production of OrionX in association with InsideHPC. Shaheen Khan and Doug Black host the show. Every episode is featured on InsideHPC.com and posted on OrionX.net. Thank you for listening.
