Limitless Podcast - Exploring the Tech that Enables AGI: Claude Mythos and NVIDIA's Next Generation
Episode Date: April 23, 2026

We explore the game-changing release of Claude Mythos and NVIDIA's Blackwell chip, a leap toward artificial general intelligence (AGI). We discuss AI hardware evolution, the rise of "neoclouds," and the ethical implications of Mythos.

------
🌌 LIMITLESS HQ ⬇️
NEWSLETTER: https://limitlessft.substack.com/
FOLLOW ON X: https://x.com/LimitlessFT
SPOTIFY: https://open.spotify.com/show/5oV29YUL8AzzwXkxEXlRMQ
APPLE: https://podcasts.apple.com/us/podcast/limitless-podcast/id1813210890
RSS FEED: https://limitlessft.substack.com/
------
TIMESTAMPS
0:00 The Rise of Claude Mythos
1:18 The Power of Hardware
3:56 The Evolution of AI Models
5:55 Accelerating Towards AGI
9:41 Defining AGI and Its Implications
14:59 The GPU Market Dynamics
17:26 The Role of Neoclouds
19:06 Inference and Its Importance
19:56 Future Prospects and Challenges
------
RESOURCES
Josh: https://x.com/JoshKale
Ejaaz: https://x.com/cryptopunk7213
------
Not financial or tax advice. See our investment disclosures here:
https://www.bankless.com/disclosures
Transcript
A couple weeks ago, we covered the Claude Mythos release, the model that found decade-old security flaws overnight and scared the hell out of basically anyone who is following the AI story.
So much so that the federal government is involved.
But the part that we didn't get into is the back end that powered this model.
Mythos was built on a chip from March 2024 that Jensen pulled out of his pocket on stage at GTC, which was the Blackwell chip.
It had 208 billion transistors.
Everyone treated it like the future had arrived.
And yet, it took two years of fabrication for us to get the first manifestation of
that, which is Claude Mythos: 24 months from keynote to a working model. It happened with Hopper,
it happened again with Blackwell, and it's going to happen again with their future models.
But the difference is we have a series of future models that exist today that we can kind of map out
to where we're going to be heading based on this trajectory that we've seen with the previous
chips. And it's pretty awe-inspiring to see where we are going to go, considering there are
three generations of chips that have already been announced since Blackwell. We have Vera Rubin,
Rubin Ultra and Feynman, each one many multiples more powerful than the last.
And when you look at what Blackwell already produced in the very first version,
it gets impossible to imagine a world where we don't reach AGI on hardware that's already been designed.
Everything that's been announced that is going into production almost certainly is going to produce models indistinguishable from AGI.
At least that's what it seems like on surface level.
Yeah, so the story here in a single sentence is AGI-like AI models are already here.
We just haven't distributed them because we haven't powered up the GPUs that enable them.
So everyone is obsessed with AI models.
We talk about our favorite models, how we prompt them, how intelligent they are.
But very few people are talking about the fact that the hardware is the thing that powers these things.
It trains these models.
It runs their inference.
And it's still about 70% of what determines how intelligent your model is.
And the prime example, the most recent example of that, has been Anthropic's Mythos
release, right? You just mentioned it. It's discovered a bunch of different cybersecurity flaws.
It is this all-powerful thing that governments around the world, including the U.S.
government and the Federal Reserve, are holding meetings with the top banks
to talk about: the craziness of this model, how we must prepare. There's a lot of doomer news out there
in the future. Little do you know that this is powered by a GPU, or this was trained by a
GPU that was built 20 months ago. So we're talking about almost two years ago. It's called
Blackwell. And I want to give you guys an idea of
of the timeline of what this looked like.
So in March 2024,
Nvidia GTC,
which is like their developer conference,
Jensen Huang comes on stage,
and he presents this gargantuan slab of metal.
It looks very pretty, by the way.
And he goes,
this is Blackwell,
GB200, GB300,
a brand new GPU.
We can train frontier models on it.
Everyone gets so excited.
Their stock price absolutely ascends, right?
The thing is,
people couldn't get their hands on this
until exactly a year later.
So to give you guys an idea of the timeline: he announces it in March 2024, then by the middle of the year they discover there's a bit of a design flaw and they amend that.
And then by the end of 2024, early 2025, they start shipping these units of Blackwell GPUs out to the top frontier AI labs.
But there's an important nuance here, which is, it's just the GPU sitting in a data center.
They aren't actually powered up.
It's not until six to 12 months after that fact that these GPUs were finally powered
up and used to train models, which is why we now start to see these new AGI-like models like OpenAI's
Spud and Claude Mythos come to fruition. So the point is, there is a long gap between the frontier
GPUs being announced and rolled out to them actually being powered to train the models. We talked
about Elon Musk and xAI a lot on this show before. They actually have the largest arsenal of
these Blackwell GPUs. They bought about a million of them. The crazy part about this now is they're not
like one, two, but three new
Nvidia GPU models that have been announced in the
recent Nvidia GTC.
So there is a major lag
between frontier hardware
and the new AI models that are being released
and people don't understand this and we want to tell you
the story. Just remember GPT-4,
how long ago that was,
and how that felt like the
most pivotal model that OpenAI ever released.
I mean, that was the big one right after
ChatGPT came out. That was trained
using the Hopper chips.
You know, the most recent model,
Hopper is a word I haven't heard in a while, Josh.
Yeah, well, you know, GPT-5.4, the most recent model that we're using every single day on ChatGPT,
that was also trained on Hopper chips.
The same chips are training models from GPT-4 to GPT-5.4.
And it's a testament to how the efficiency gains of software can actually increase the throughput of hardware.
And I think I want to use that as an example, because what we just got recently with Mythos
from Anthropic seems to be the first real implementation of a true Blackwell model.
And rumors are that Spud, the new OpenAI model, is going to be about the same in terms of power:
another first-generation Blackwell model. And even if we don't actually iterate on
the hardware, the amount of progress we're going to get from Blackwell models alone seems like
it is going to be difficult to imagine it doesn't become some sort of an AGI. It's like when you
think about the difference in intelligence between GPT-4 and GPT-5.4,
and how far we've come, that applied to Blackwell at this new scale seems crazy.
But that's not even the crazy part because we have an entire roadmap of these three generations
of chips that are coming that we can very clearly map to the gains that we're going to see.
And I think that's when things get particularly disturbing.
Because on this chart that we're looking at on screen now, we have Blackwell.
That's where we are right now.
Blackwell is a significant improvement over the previous model.
But then we have Vera Rubin, which jumps from 20 petaflops
to 50 petaflops. That's a two-and-a-half to five-times multiple on the compute. Then we have Rubin
Ultra, which is scheduled for the second half of 2027. That is a 14 times multiple. And then we
have Feynman in 28, which is an estimated 30 to 50 times multiple on the current chip stack that we
have today, assuming that we get no software progress at all. And what we saw with the hopper chips is
that we got a tremendous amount of progress just from software. So when you combine this 30 to 50 times
multiple with a maybe another 100 times multiple on software if we make another breakthrough,
we're looking at some pretty insane improvements here that are really hard to wrap your head around.
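As a back-of-the-envelope sketch of the compounding described here (the hardware multiples are the hosts' on-air estimates, and the 100x software gain is their hypothetical, not a measured figure):

```python
def compounded_gain(hardware_multiple: float, software_multiple: float = 1.0) -> float:
    """Effective compute gain when hardware and software improvements compound."""
    return hardware_multiple * software_multiple

# Per-generation hardware multiples quoted in the episode (estimates, not specs).
generations = {
    "Vera Rubin (2026)": 2.5,   # low end of the quoted 2.5-5x
    "Rubin Ultra (2027)": 14,
    "Feynman (2028)": 40,       # midpoint of the quoted 30-50x
}

for name, hw in generations.items():
    print(f"{name}: {hw}x hardware, ~{compounded_gain(hw, 100):,.0f}x with a 100x software gain")
```

The point of multiplying rather than adding is that hardware and software gains apply to each other, which is why the headline numbers run away so quickly.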
I want to point out that these improvements, these multiples that you just mentioned,
are just on the speed and power of these hardware modules, right? So it's going to work
3x harder or 14x harder, but it's also going to cost you a lot less to
train the same type of intelligence or model. So the intelligence per density, which is a unit that
we completely made up, and we don't know if it exists, but it somehow rhymes, in my head at least,
is improving and it's going to be cheaper with each successive model. But if you want to get a bit
of context as to like what that looks like in terms of like the models that you use today and what
it's going to look like tomorrow, we have this other table here, which kind of like maps it out.
So with Blackwell today, you get about a 2 to 3x more intelligent, crazier model, right?
That's what Claude Mythos is supposedly meant to be.
It's like a larger size.
It's trained on these Blackwells.
You're going to see a bunch of similar models come out from OpenAI and xAI over the next couple of months.
And just to pause you there, these are already models deemed too dangerous to release for the public.
Yes.
Just like there are emergency meetings literally being called by the Fed chair with top banks.
Actually, I read something yesterday that the NSA has re-engaged with Anthropic,
as well as the Pentagon and the U.S. Defense Department,
after banning and blacklisting Anthropic because it's so powerful.
And that's where we are today.
That's today.
So that's right here, 26, 2 to 3x, right?
Yeah, crazy.
Now, you might notice that by next year,
we have a larger multiple on the original multiple.
By next year, we're going to have a 10-to-50x improvement
purely through Vera Rubin GPUs.
Now, I must emphasize, this does not
include post-training. This doesn't include all the fancy techniques that AI labs themselves
will implement to make a smart model. This is just the hardware. It's like buying the hardware
and training a model today versus next year: you're going to get a 10-to-50x more intelligent
model. But it gets even scarier. 2028, 30 to 50x. 2029, 100 to 200x. Now, I haven't seen these
multiples in any other industry for any kind of performance or hardware improvement. I can't
wrap my head around this, because it looks like just a few small numbers that are getting larger,
but these are multiples of their predecessors, which means we're probably going to get AGI by,
honestly, the start of next year. And they're trained on hardware that currently exists and is
rolling out. I don't know. I'm just kind of scared reading all of this, to be honest, because
like what happens if we have universal access to this? Like there's going to be a load of malicious
actors who can use these models for various different things. But also, I don't know what
these models are going to be capable of; they're going to be so much smarter than humans themselves.
The disturbing thing is that this technology is here.
Like this is, it's no longer an engineering problem.
It's just a matter of actually producing the thing and plugging it into an outlet and
putting it online.
And this is coming.
Like, there are no novel breakthroughs required to make this a reality.
Now, what that looks like on the other side, I don't know, but I think it's safe to assume
the velocity of improvement we're going to get is certainly not slowing down.
It is turning to more closely resemble a vertical line than anything else.
And I think it begs the question, like, at what point do we reach AGI and how do we even
define that?
Because I'm not sure we've spoken about that much on the show, but Ejaaz, when you say AGI, what do you
mean by AGI?
What would you be looking for to declare, okay, we have finally reached AGI?
Okay, so this is like my own made-up definition, but it's what will make me go,
okay, this is AGI.
It would be a single
AI model, not many, but a single AI model that advances the frontier of three key major
industries autonomously. So I'll pick these industries as examples. Financial industry, so it trades
better than the best hedge fund or investor. It is able to make
assessments better than any of the financial analysts, the top experts, etc. in that industry.
In science, it has discovered a bunch of medical cures for major diseases,
such as cancer, Alzheimer's and stuff like that,
that scientists, top scientists at their top level
could not figure out.
It accelerates their research.
And maybe one other industry that I can't think of right now,
but it's when these models start doing things
that the best of the best humans right now
couldn't figure out themselves and couldn't have foreseen themselves.
Do you have a similar definition?
Yeah, I think that sounds right.
I think, and again, it's very fuzzy.
Everyone kind of has their own custom definition
of what they believe AGI is going to be.
but for me, it's just AI that's smarter than the smartest human at pretty much any cognitive
task that exists. So you can go to this model and it will be better than anyone else who you can
ask on planet Earth about anything. And the problem with models today is they're very spiky.
Like you can do this for code probably and it can code better than every human on Earth.
But if you ask it, you know, a generalized question about something that you really know a lot
about, there are a lot of times where it's not completely accurate, or it will respond as if it has the
intelligence of a three-year-old; it fails a lot of simple reasoning tests.
It still feels like it's this very spiky entity. Once it is fully developed, once it is actually
better at every cognitive task, and that includes physical things too, like understanding the
physics of the real world, that feels like AGI. And then artificial superintelligence,
ASI feels like it is smarter than all humans combined. So it's like if we put all of our brains
together, no matter how long we tried, we could never come up with the things that artificial super
intelligence will come up with. And I mean, will we get there using this chip architecture?
Possibly. I'm seeing a 50x multiple, not including the software multiples. And like those compounding
on top of each other at the rate that we're moving seems like the only real constraint is going to be
physical. It's going to be actually rolling out these models and powering them on.
Well, another crazy thing is I think a lot of people including myself would assume that with every
chip upgrade, it's going to be more expensive.
and it's going to be bigger.
It's going to be clunkier, right?
Like the data centers are going to get bigger.
It's going to be more expensive.
I wish I had a chart to show this,
but it's actually the complete inverse.
And I'll give you some examples,
some numbers to explain that, right?
So a reasoning task that costs $1 on Blackwell
costs 20 cents on Vera Rubin,
which is rolling out as we speak or later this year.
And it'll only cost seven cents on Rubin Ultra,
which starts to get released by the start of next year.
So cost is going down pretty massively.
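The per-task figures quoted above imply steep generation-over-generation discounts. A quick sketch of that arithmetic (the dollar amounts are the figures quoted on air, treated as illustrative, not official pricing):

```python
# Cost for the same reasoning task on each generation, as quoted in the episode
# (illustrative figures, not official NVIDIA or cloud pricing).
cost_per_task = {"Blackwell": 1.00, "Vera Rubin": 0.20, "Rubin Ultra": 0.07}

def cost_drop(old: float, new: float) -> float:
    """Fractional cost reduction moving from one generation to the next."""
    return 1.0 - new / old

print(f"Blackwell -> Vera Rubin: {cost_drop(1.00, 0.20):.0%} cheaper")    # 80% cheaper
print(f"Vera Rubin -> Rubin Ultra: {cost_drop(0.20, 0.07):.0%} cheaper")  # 65% cheaper
print(f"Blackwell -> Rubin Ultra: {cost_drop(1.00, 0.07):.0%} cheaper")   # 93% cheaper
```

So on these numbers, the same dollar of inference budget stretches roughly 14x further two generations out.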
Now, for 2028, Jensen announced the Feynman GPU, right?
A single rack of that,
so we're talking about just a couple of these stacked on top of each other,
will process more compute than was required to train GPT-4 that you mentioned earlier, Josh.
So the point is less is more, but somehow more powerful,
and also somehow cheaper relative to the intelligence that you're building.
And if you assume this intelligence is going to reach this ASI, AGI-like state, it's going to make you money as well.
So you end up just having, I guess, I'm afraid to say this, but the best of both worlds.
I don't know what humans are going to be doing, but it's great for AI, basically.
Yeah, there's no world in which things don't get better.
And it feels like right now we're really just constrained by this compute power.
There's this great meme that I saw online.
It said Mythos is too powerful for public release, but the reality is that they're just completely out of compute,
and Anthropic can't actually supply the tokens required
to give Mythos to the world.
These optimizations, these cost structures, yeah, there it is.
We got on screen now.
Great meme.
Great meme.
But these cost structures that are going to accrue from these new models
are going to completely remove that constraint,
at least for now, until whatever that next generation of model is
that is so powerful that it strains the GPUs again.
And the interesting thing is that OpenAI has the same exact thing going on.
All these models are kind of converging on the same spot,
but they all seem to be compute constrained.
I think what critics will push back on though, Josh,
for everything that we've said so far,
is, okay, cool, you can buy these new hardware things,
but why would you do that if you could just wait a few months
or six months and buy the next thing?
Jensen's just shipping out these products.
He's making a load more money.
It doesn't make sense.
These things are depreciating assets.
By the time you've bought the first one
and you've ramped that up with power
and training your next model,
there's already three other new chip architectures.
And that critic would seem to be right,
except that they're massively, massively wrong,
and we have proof of that, right?
GPUs have now become this anti-depreciation machine.
One of the most amazing things about this phenomenon,
and it feels like a narrative violation,
is the idea that the GPUs that were released three years ago
are actually more valuable today than they were at the time they launched,
which is a pretty bizarre idea.
We have this artifact on screen that shows a chart,
and an H100 from Nvidia cost $30,000 when it launched in 2023.
At its peak, because of the scarcity, because everyone needs these things,
it was selling for a four-times multiple at $120,000 per H100.
This is kind of outrageous.
It was a little exorbitant.
We don't need to be paying that much money.
But now that they are old, they're not depreciating, even though there's much better hardware out there;
they're still holding their price at $30,000.
In fact, you can see a rebound that happens in late 2025, where the cost of these
H100 GPUs actually ticks upwards. And I think a lot of people, Michael Burry most famously,
who is the guy behind the big short, he created an entire short thesis around the idea that the
depreciation schedule of these GPUs wasn't aggressive enough. And they were actually going to lose
their value and therefore the market was going to deflate because the companies weren't
marking these down properly. The reality is that not only are they not going down, they're starting
to trend back up, because the incremental cost per token is so low with these, and
everyone's so desperate for compute that they're like, well, might as well spend some extra money, get the H100s and start generating inference tokens with them.
It's this pretty amazing phenomenon that's happening.
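To make that narrative violation concrete, here's a sketch comparing the H100 prices quoted on air against a naive straight-line depreciation schedule (the five-year lifetime is an assumption for illustration, not Burry's actual model):

```python
def straight_line_value(launch_price: float, years: float, lifetime: float = 5.0) -> float:
    """Book value under straight-line depreciation over `lifetime` years."""
    return max(launch_price * (1.0 - years / lifetime), 0.0)

LAUNCH_PRICE = 30_000    # 2023 launch price quoted in the episode
PEAK_PRICE = 120_000     # scarcity-driven peak, a 4x multiple
MARKET_PRICE = 30_000    # late-2025 market price per the chart discussed

book = straight_line_value(LAUNCH_PRICE, years=2.5)
print(f"Naive book value after 2.5 years: ${book:,.0f}")  # $15,000
print(f"Actual market price: ${MARKET_PRICE:,}, i.e. {MARKET_PRICE / book:.1f}x book value")
```

On those assumptions the cards trade at double their straight-line book value two and a half years in, which is the gap a depreciation-based short thesis would miss.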
Yeah. So if you're wondering why this is happening, explicitly, it's that AI demand is growing faster than chip supply can expand.
We don't have enough fabs or the manufacturing prowess or the energy grid to support creating and generating more GPUs to satiate the demand that we're seeing in AI across all these different industries, right?
It's a very pervasive bit of technology.
Now, the data that we're showing you on the screen right now isn't siloed to like a few research papers.
This is happening in the market right now and it's incredibly liquid.
So a new phenomenon of companies in AI whose stocks have all skyrocketed are these things called neoclouds, right?
So these are like, think of it as like AWS.
They supply compute to train your AI models by setting up their own data centers, and they kind of provide it to you in a cloud
or data-center-specific structure.
Examples would be
CoreWeave, for example.
The idea here is these data centers
or these GPU providers,
70% of the GPUs that they're running
are old GPUs
that we're showing you on our screen right now.
And they're booked out,
I'm not exaggerating,
six to 12 months in advance.
In fact, they're locked in under contracts,
and the same customers renew those contracts
three months before they
need to be renewed,
just to make sure that they get access
to these older GPUs.
The point I'm trying to make, and you mentioned this just now, Josh, is all that matters is,
can I get AI tokens generated to do the thing that my company needs or answer the prompt
that I have? And if the answer is yes, and it's for a reasonable price, I'm down to go for that
because the value that you can build and earn on top of that is immense. They can have a large markup
on that. So it makes sense that these assets are kind of like in high demand. And to your earlier
point. Michael Burry shorted the entire market saying that these are depreciating assets,
and he got that completely wrong. His thesis specifically was based on the idea that they can't train
frontier models. And he's actually right: the older chips can't train frontier models.
But what they are being used for is one thing very specifically, inference, which is,
if someone has a question, how do I get them the answer? How do I process the prompt? That's what the
older GPUs are being used for, and they're really damn good at it. And the reason why it's
important and essential for AI labs specifically, who are training models, and who you might think
would want the expensive chips, is they have a ton of inference. They use inference to even
train the new models. So it's this new paradigm where all these old GPU architectures are
being repurposed for this really important thing that is inference. So it's important context
to understand if you're investing in some of these companies, for example. Yeah, and why is it so
valuable? Well, it's a testament to the software improvements, right? So we have those software efficiency
the improvements that we didn't have three years ago. So that same hardware generates a lot more
value. And if we scroll down to the value multiplier section of this artifact, it shows that the
cost of a chatbot inference in 2023 was $3 an hour. And now autonomous agents completing these
complex tasks is $30 to $300 per hour. So the value that you can charge for these tokens is significantly
higher than it was in the past. And the number of tokens that you're able to generate efficiently,
at that higher quality, is much higher as well. So there are
all of these converging forces that are just making the market desperate for compute.
Nobody has the compute required that they want.
And Nvidia is trying to put it online as fast as they can, but it's not fast enough.
And I assume as we go through this, we're going to continue to see varying bottlenecks,
and the efficiencies will move to where there aren't bottlenecks, which creates new bottlenecks.
Right now we're seeing some conversation around CPUs, and CPUs seem like they're going to be
hitting a shortage somewhat soon because we're out of GPUs.
Let's move to CPUs.
And it's this really interesting dynamic.
but that is the idea on this Nvidia episode,
or just the chip episode in general,
that it is hard to imagine a world in which we don't reach AGI
given the currently announced infrastructure.
It doesn't require any breakthroughs.
It's just that if Nvidia does what Jensen Huang announced on stage
with these next three chips,
it is almost impossible to imagine
what the world of intelligence is going to look like.
And I think it's important to understand
that Mythos is trained on a two-year-old chip.
And no one's really talking about that.
So it blew my mind.
Hopefully it blew yours as well,
or at least you found it a little bit fascinating.
And that is our episode today.
Thank you guys so much for watching.
We really appreciate it.
And I know some of you are probably thinking,
there's a bunch of challenges here.
And Josh actually just mentioned one of them,
which is like you've got CPUs.
We don't have enough energy.
We don't have enough memory.
And that's like, you know,
another episode that we can get into.
So we assume all of those things will be leveled out at some point.
And we're going to see all those industries grow versus being constrained.
Like, people are throwing trillions of dollars into this industry, so all of those problems should theoretically be fixed.
But rest assured, we will be the first show to cover it and give you our thoughts before it happens.
And Intel is a sneaky one to get into.
But we'll talk about that another time.
Thank you so much for listening.
If you are not subscribed to us, please subscribe.
It helps us out massively.
We are having banger weeks on YouTube, Spotify, Apple, and wherever you listen to us, please rate us.
Leave us a comment.
We love hearing your feedback.
There are like thousands of newbies that are listening to the show.
Welcome.
And also give us feedback about stuff that we may not be covering that you want to hear more of.
We're always open to feedback.
But until then, I guess we'll see you on the next one.
