@HPC Podcast Archives - OrionX.net - HPC News Bytes – 20260406
Episode Date: April 6, 2026 - Nvidia invests $2bn in Marvell - NVLink Fusion - Why big companies invest in small ones - Andrew Ng on anti-AI activism - 32, 16, 8, TurboQuant at 3.5 bits... do I hear 1? - PrismML’s Bonsai-8B 1-bit LLM [audio mp3="https://orionx.net/wp-content/uploads/2026/04/HPCNB_20260406.mp3"][/audio] The post HPC News Bytes – 20260406 appeared first on OrionX.net.
Transcript
Welcome to HPC Newsbytes, a weekly show about important news in the world of supercomputing,
AI, quantum computing, and other advanced technologies.
Hi, everyone. Welcome to HPC Newsbytes. I'm Doug Black, and with me is Shaheen Khan.
Early last week, Nvidia announced it has invested $2 billion in Marvell as part of what Jensen Huang said
is the company's effort to increase its AI TAM, or total available market,
by making it easier for customers to use the custom AI chips that Marvell designs with
Nvidia's networking gear and central processors.
Put another way, some customers are opting for custom processors because they're cheaper
than Nvidia's pricey ones.
This is the market factor Nvidia is tapping into.
Industry observers have said price is a potential Nvidia liability or vulnerability in the
long run, so this may be a response to that concern.
In its story on the deal, Reuters quoted analyst Jacob Bourne at eMarketer as saying,
it broadens Nvidia's ecosystem to include more specialized silicon, which helps Nvidia remain a key
access point for increasingly diverse AI workloads.
And that's a key point.
The AI market is evolving and broadening, and Nvidia wants to be everywhere AI is going.
Shaheen, Nvidia never seems to take its foot off the gas pedal.
Most other tech companies we've seen over the decades that have reached Nvidia's level of dominance
haven't maintained the entrepreneurial drive we're seeing out of this company.
Go big or stay home is a common mantra in Silicon Valley, and they exemplify it.
Being on the leading edge of a history-defining technology certainly helps.
If you already have a market valuation that is over $4 trillion, why would anyone buy your stock,
if not because they think it's headed to even higher values?
And doing that requires you to grow the TAM.
And they are doing that in multiple directions, software, edge, robotics, investments in
new clouds, sovereign AI, and proliferating their technologies through licensing and partnerships.
A larger topic perhaps is why large companies invest in joint ventures or startups.
And that is fundamentally to accelerate the creation or meeting
of market demand, to gain access to a broad range of emerging but unproven technologies,
and ultimately to reduce risk, balancing their own scale with external agility, so to speak.
To say it more explicitly, the motivation to do so can include time to market, since investing
achieves more or less instant access to developed technologies instead of waiting for
internal R&D. Shared costs, where partnering with venture investors distributes
financial risk and also some financial reward, but the balance would be in your favor.
Talent acquisition, since startups can attract specialized, culturally aligned entrepreneurial talent
that is often not available to large companies, even the ones doing very well.
Market validation, because startups prove their value in real world conditions and avoid
internal echo chambers.
Strategic alignment, since investing, versus simply buying a product,
creates deeper partnerships and also gains more influence than just a client-vendor relationship.
Supply chain security, because as an investor, you can do more to ensure critical technologies
or components remain viable and available, and financial upside, where the success of the startup
yields equity returns, not just procurement benefits. We've seen this in spades with neoclouds.
Together, these factors can make investing a faster, lower risk and more strategically flexible approach
than building everything internally. And Nvidia has done that with Marvell, with Synopsys,
and a raft of others. We saw last week that AI entrepreneur and Stanford computer science professor
Andrew Ng has a LinkedIn post discussing anti-AI political tactics. According to Ng,
surveys in the UK have found that the most effective anti-AI messaging focuses on
AI-enabled warfare and environmental damage, along with AI eating jobs and causing harm to
children. While those messages are seen as the most effective, Ng said the anti-AIs, if you
will, quote, found that saying AI will cause human extinction has largely failed. Doomers
were pushing this argument a couple of years ago, and fortunately our community beat it back,
unquote. All this is interesting, Shaheen, because we continue to hear all of those negative messages
from AI doomers. I saw one last week from opinion writer Noah Smith who said, quote,
AI has the worst sales pitch I've ever seen. Our product will make you economically useless and
possibly kill you. This, Smith asserted, is not a value proposition. That is a really clever way of
putting it, but it's hard to say AI has a marketing problem when so much money is being directed to it.
But yes, there are important concerns. I mean, anytime you say a technology is going to redefine
humanity, you have to expect a bunch of concerns. And we should note that Ng owns up to
at least some of those concerns. I'll just read a passage from his article.
To be clear, I find AI-enabled warfare alarming.
We need to continue serious efforts to monitor and mitigate the environmental impact of AI.
Any job losses are tragic and hurt individuals and families.
And as a father, I hold dearly the importance of every child's welfare.
Each of these topics deserves serious attention and treatment with the greatest of care, end quote.
He mentions also companies that have blamed AI for layoffs
that they would have done anyway, or AI companies that blame competing open source AI projects
for making AI unsafe. So, okay, if we all agree there are big concerns, then any difference
is about what we think should be done to eliminate or at least mitigate risks. It would be good
for the pro-AI community to publish their proposed plans to address those concerns.
AI companies have said that they want and welcome some kind of oversight and regulation,
and one hopes it doesn't come across as reverse psychology,
a way of gaining trust while they believe they will manage it well without regulations.
But as we've said here before, the global scene is more complicated
and must rely on norms and not just regulations.
The ideal scenario is to go as fast as possible without losing control,
and there lies a fundamental complexity.
With the rise of AI, we've also seen the emergence of lower-precision computing: 32-bit, 16-bit, and even 8-bit.
Fewer bits means faster processing.
But this only works for workloads where lower-precision computing can still deliver accurate results.
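The bits-to-footprint relationship is simple arithmetic. As a rough illustration (the 8-billion-parameter model size is an assumption chosen to match the Bonsai-8B discussion later in the episode; only the 16-bit, 16 GB figure is stated in the show), here is what weight storage looks like at the precisions mentioned:

```python
# Approximate weight-storage footprint of an 8B-parameter model
# at various precisions. Illustrative arithmetic only; real models
# also carry activations, KV cache, and metadata not counted here.
PARAMS = 8_000_000_000

def weight_gb(bits_per_weight: float, params: int = PARAMS) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

for bits in (32, 16, 8, 3.5, 1):
    print(f"{bits:>4}-bit: {weight_gb(bits):6.2f} GB")
```

At 16 bits that works out to 16 GB, and at 1 bit to about 1 GB, which is why each halving of precision is such an attractive target.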
Just last week, we covered the new TurboQuant algorithm from Google that can compress AI inference into 3.5-bit arithmetic:
three bits for all the data and one bit for half of it as error correction, all based on the usual,
very complex math. So last week, PrismML, a Caltech-originated startup, introduced Bonsai,
a one-bit model that they say addresses deep structural constraints on AI brought on by big models
with trillions of parameters that require big data centers, more GPUs, more memory, more electrical
power and more cost. Of course, big models deliver tremendous value, but PrismML is focused on
uses and computing environments that aren't so big, such as phones, laptops, vehicles, robots,
and edge devices. They say PrismML is building, quote, concentrated intelligence guided by a core
belief that the next major leaps in AI will be driven by order of magnitude improvements in
intelligence density, not just sheer parameter count. Their Bonsai 8B is a one-bit model designed
across the entire network. It's a true one-bit model, they say, end-to-end across 8.2 billion parameters.
So, Shaheen, what do you make of all this? It's pretty exciting to have these significant announcements
so close together, though they've all clearly been working on these for many months. So maybe the
first announcement accelerated the second. Anyway, PrismML
introduces a new approach to AI that appears both credible and potentially very important.
The key idea is what, as you said, they call intelligence density, which is an efficiency
metric rather than a pure capability metric. Conceptually, it is the ratio of how well a model
performs relative to its size and resource cost. More specifically, it measures observed
performance like accuracy, reasoning ability, and task success on benchmarks per unit cost,
like the number of parameters, memory footprint, compute, or energy. It is not attempting to measure
intelligence, but rather how efficiently a model converts resources into performance. A second key idea
is standardizing on end-to-end one-bit operations, rather than training in higher precision and
compressing afterwards. This makes training significantly harder and is likely where much of their
innovation would be, but avoids the losses associated with post-training quantization. To enable this,
they use grouped scaling, where blocks of weights share higher precision scaling factors to preserve
accuracy. Remember that quantization is a specific method of lossy compression used in AI. Then they
optimize inference for the low-power processing of edge devices. They reported a model about 1.15
gigabytes in size versus 16 gigabytes for a 16-bit equivalent: 8 billion parameters at 2 bytes each.
So that's 14x smaller. It is unsurprisingly also faster and more energy efficient than traditional
16-bit models. They report 43 tokens per second on a high-end iPhone,
and a 4 to 5x improvement in the joules-per-token energy metric. That's pretty significant.
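The grouped-scaling idea described above can be sketched in a few lines. This is an illustrative absmean-style binarization of the kind BitNet popularized, not PrismML's actual scheme, which has not been published in detail; the group size of 128 and the fp16 scale factors are assumptions:

```python
import numpy as np

def binarize_grouped(w: np.ndarray, group_size: int = 128):
    """Quantize weights to {-1, +1}, with one higher-precision
    scale factor shared by each group of weights."""
    groups = w.reshape(-1, group_size)
    # One scale per group: mean absolute value (absmean scaling).
    scales = np.abs(groups).mean(axis=1, keepdims=True).astype(np.float16)
    # The 1-bit codes: just the signs of the weights.
    signs = np.where(groups >= 0, 1.0, -1.0).astype(np.float32)
    return signs, scales

def dequantize(signs: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reconstruct approximate weights: sign * group scale."""
    return signs * scales.astype(np.float32)

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)
signs, scales = binarize_grouped(w)
w_hat = dequantize(signs, scales)

# Storage cost: 1 bit per weight plus one fp16 scale per 128 weights.
bits_per_weight = 1 + 16 / 128
print(f"effective bits per weight: {bits_per_weight:.3f}")
```

The per-group scales are what preserve the magnitude information that pure sign quantization throws away, at a cost of only a fraction of a bit per weight.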
This approach differs from prior low-precision efforts such as BitNet
by enforcing binary weights across all layers
and then co-optimizing the full software stack.
If validated, PrismML could signal a shift
from bigger models to more efficient ones,
with implications for both edge inference and data-center-scale AI.
Definitely one to watch.
All right, that's it for this episode.
Thank you all for being with us.
HPC Newsbytes is a production of OrionX.
Shaheen Khan and Doug Black host the show.
Every episode is posted on OrionX.net.
If you like the show, please rate and review it.
Thank you for listening.
