Limitless Podcast - THIS WEEK IN AI: ChatGPT 5.5 Beats Claude Mythos, SpaceX Cursor Rumors, Google's New TPUs
Episode Date: April 24, 2026

Big week this week. We discuss the launch of OpenAI's ChatGPT 5.5, showcasing its exceptional coding and problem-solving capabilities that surpass competitors like Claude Mythos. Highlights include its implications for fields like mathematics, innovative applications such as a space mission simulator, and a 3D dungeon game. We also cover industry news on SpaceX's partnership with Cursor, Amazon's investment in Anthropic, and emerging competition from Chinese open-source models.

------
🌌 LIMITLESS HQ ⬇️
NEWSLETTER: https://limitlessft.substack.com/
FOLLOW ON X: https://x.com/LimitlessFT
SPOTIFY: https://open.spotify.com/show/5oV29YUL8AzzwXkxEXlRMQ
APPLE: https://podcasts.apple.com/us/podcast/limitless-podcast/id1813210890
RSS FEED: https://limitlessft.substack.com/
------
TIMESTAMPS
0:00 Launch of ChatGPT 5.5
4:53 Comparing with Mythos
5:30 Inference and Pricing
7:27 New Applications
11:04 Impressive Capabilities
13:01 Beyond Coding
15:30 SpaceX Partnership
25:23 Anthropic's Funding
26:35 Rise of Open Source Models
28:12 Anthropic vs Figma
30:44 President's Comments on AI
------
RESOURCES
Josh: https://x.com/JoshKale
Ejaaz: https://x.com/cryptopunk7213
------
Not financial or tax advice. See our investment disclosures here:
https://www.bankless.com/disclosures
Transcript
The most powerful model in the world is here right now.
In fact, it's so good that it beats Claude Mythos.
OpenAI just released ChatGPT 5.5, and it crushes Claude on every single benchmark.
It's the new number one coding model.
It can do 20-hour tasks that expert software engineers sometimes can't do.
It's already discovered groundbreaking solutions in maths and frontier sciences, such as genetics,
and it's cheaper than GPT 5.4.
This is the result of two years worth of frontier research released in this one single model.
In fact, it's so good that an Nvidia engineer said, and I quote,
losing access to GPT 5.5 feels like I've had a limb amputated.
I think a lot of people are going to compare this to Opus 4.7, and that's fair.
But I really think the true comparison is to Mythos,
because Sam Altman just posted something as the model was coming out
that felt very much like a jab at Mythos.
And we're going to get into the benchmarks comparing them,
on many of which it actually beats the Claude model.
But what I find most interesting about this post is the second paragraph where he says,
we believe in democratization.
And he mentioned specifically, we have been tracking cybersecurity as a preparedness category
for a long time and have built mitigations we believe in that enable us to make capable
models broadly available.
So this is very much a dig at Mythos, which is, as we all know, privately available,
gated only to the companies that are granted access to it.
ChatGPT and OpenAI are like, hey, we're going to give you the powerful cybersecurity.
We're just going to bake the precautions into the model so that everyone can have it. And it ends by saying this really sweet thing. It's like, we love you and we want you to win. We believe in everyone having access to this intelligence. And I really respect that. And I think it's an awesome way to set the precedent for what the next generation of these models is going to look like.

But before we go any further, let's talk about the model itself. It's out right now. If you have a ChatGPT membership, you can go and use it, go and play with it. Ejaaz, what's the TLDR? What are the high-level things that everyone should know? What's most new and noteworthy about GPT 5.5?

Okay. So, going by your Mythos comparison, the first question that popped into my head is, I use Claude Opus 4.7 every single day.
So I'm like, is it better than this?
Like, should I be switching back to ChatGPT right now?
The answer might be yes.
So if we look at the benchmark scores right here, GPT 5.5 on the left over here absolutely crushes all the standard benchmarks that these frontier models are evaluated against.
And if you look on the right over here, Claude Opus 4.7 either doesn't even measure in a particular category or it's completely beaten
by GPT 5.5. In fact, the only stat that GPT 5.5 doesn't beat Opus 4.7 in is something called
software engineering benchmark verified pro, or something like that. It's like the pro
software coding situation. But there's a footnote at the bottom of this blog where OpenAI
states Anthropic publicly said that they might have gamed that particular benchmark and it
needs to be re-evaluated. So we might have a complete clean sweep for 5.5 as we see today. So it's an
incredibly powerful model, but a question that popped to my head is, does it actually beat
Mythos? And we have a direct comparison right here. Yeah, so it shows that it does across some
benchmarks. Now, again, these benchmarks are pretty fuzzy. We don't know which ones are gamed to do what,
but there is a world in which GPT 5.5 will outperform Mythos on some things; which ones, we're not entirely
sure. I think as we kind of figure out ways to describe GPT 5.5, it seems as if it's their first attempt at
making a model built for autonomy instead of answers. I think a lot of the benchmarks that they're
working on are in agentic coding, things like handling tasks that are 20 hours long. We'll get into
that. It's doing 85% of OpenAI's internal work already. And it also helped rewrite the infrastructure
that built it. There was this amazing quote in the blog post that said, OpenAI says 5.5 itself helped optimize
the stack that serves it. Codex analyzed weeks of production traffic and wrote custom heuristics
for load balancing that boosted token generation speed by over 20%.
So they're using the model to actually build the model and make it maximally efficient
based on the data that it's collected from users like us who are interacting with the model on a daily basis.
So it's very smart, it's very clever, it's not just there to give you answers.
It's there to think deeply and actually solve problems for you in a way that I think
Mythos and a lot of these other frontier models are kind of pivoting towards now.
The great thing about this model release is it reveals a few things that OpenAI has as an advantage
against, say, a frontier lab like Anthropic.
Like, it's clear looking at these benchmarks compared to Mythos,
which, by the way, the entire world is spiraling over,
because it supposedly has the cybersecurity ability to take over any kind of government system.
This model is pretty close, and Sam is going to be releasing this publicly,
or OpenAI is going to be releasing it publicly for everyone to use.
So a question that pops into my head is, does this mean that it's a matter of compute,
and OpenAI simply has more of it?
Certainly, if you compare Sam Altman's
ability to acquire compute and spend all these trillions of dollars to acquire it versus Anthropic's.
Anthropic has been extremely conservative and now they're struggling. Like, you know, they recently
signed a $5 billion deal with Amazon, which we'll get to later on. But the point is, this is a tale
of two stories. Either OpenAI has enough compute and they're about to leapfrog Claude because
of that, and they're proving it through this model, which is a very good answer to Mythos, or,
and this is the alternative side, Anthropic's Mythos model is just plainly better than 5.5, and these
benchmarks aren't actually verified, which is technically kind of true, because I don't know how
official these things are. These are just tests that a small set of users have done. So it's a
game of both. I'm sure Anthropic is watching this and thinking, hmm, maybe we should roll out
Mythos, but they don't have the compute. Yeah, they don't have the inference. In fact,
speaking of inference, Sam actually made a post praising the really excellent work by the
inference team to serve this model so efficiently. He wanted to really highlight the fact that to a
significant degree, they have become an AI inference company now. And I think that's a really big
difference from what they've previously been. Like, Anthropic has a really tough time serving compute,
and we see that. And even if they had Mythos available in a way that was safe, they can't serve it.
OpenAI can. And we see it reflected in pricing, because, I mean, we have some pricing for this model,
right? And it seems as if it's roughly at par with 4.7, if not slightly better.
It's slightly more expensive, but not by much. For every million input tokens, it's
the same for both Anthropic's Opus 4.7 and GPT 5.5: $5.50. But the output is $30 per
million tokens for 5.5 and $25 per million tokens for 4.7. So it's a little more
expensive, but here's where you actually have more of a bargain using the more expensive model 5.5.
It is cheaper than GPT 5.4, and it uses tokens way more efficiently to think. So what does that
mean if you are an enterprise that wants to, you know, plug in this AI model, not worry about it,
and just have it power your entire profit engine? Well, you end up using fewer tokens, so you hit
your rate limits at a much slower rate, which means that you end up getting more bang for
your buck, as long as you use the model like 24/7, or you use it effectively. If you are just
kind of out there using 5.5 to ask questions that you should maybe be asking Google,
this is probably not the model for you, but otherwise it's a super powerful one. Yeah, and if
these prices don't mean anything to you, that's fine. As long as you have a $20 a month subscription,
in fact, this is going to be available to free users fairly soon, I believe. But anyone who is a
subscriber has access to this. You don't need to use the API. There's nothing fancy. You open up
your app on your phone. You go to the web browser. It's there. It's available, ready to go.
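For the API users, the pricing argument made above can be sketched quickly. This is a hedged, minimal sketch: the per-million-token rates ($5.50 input for both models, $30 vs. $25 output) are the ones quoted in the discussion, but the dictionary keys and the example workload (200k input tokens, roughly 40-50k output tokens) are made-up assumptions for illustration.

```python
# Hedged sketch: comparing the per-million-token prices quoted above.
# Rates come from the discussion; the example workload is hypothetical.

PRICES = {
    # model: (input $/M tokens, output $/M tokens)
    "gpt-5.5": (5.50, 30.00),
    "opus-4.7": (5.50, 25.00),
}

def job_cost(model, input_tokens, output_tokens):
    """Dollar cost of one job at the quoted per-million-token rates."""
    in_rate, out_rate = PRICES[model]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# Hypothetical job: 200k input tokens. If the newer model thinks more
# efficiently and emits, say, 20% fewer output tokens for the same task,
# its higher output rate can still come out cheaper overall.
cost_55 = job_cost("gpt-5.5", 200_000, 40_000)   # 20% fewer output tokens
cost_47 = job_cost("opus-4.7", 200_000, 50_000)

print(f"GPT 5.5:  ${cost_55:.2f}")
print(f"Opus 4.7: ${cost_47:.2f}")
```

Under these assumed numbers, the model with the pricier output rate ends up slightly cheaper per job once it emits about 20% fewer output tokens, which is the "more bang for your buck" argument made in the discussion.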
Now, there's a few interesting things that you can do with this model that haven't previously
been possible. And although we don't quite have access to it just yet, we're recording this right
as the model got launched, we do have a blog post from OpenAI themselves who are showcasing a few
demo. So again, take these with the grain of salt. These are straight from Open AI, but they are
seemingly pretty impressive and pretty noteworthy as to what they're capable of doing, starting with
this space mission application, which is pretty cool and very reminiscent of the moon mission that we
just had. Yeah. So if you guys don't know, Josh has a secret, he has many secrets on this show. One is
he's a massive space fan. And when he's not hanging out with me, he's doing space simulations on whatever
he can. Well, okay, maybe part of that is a bit of a lie. But, um,
With this new app that we're seeing in front of us right now, this was completely vibe-coded using 5.5.
And it's used to simulate a specific space mission.
Now, if this looks very similar, it's because we just had a space mission.
First time we visited or went back to the moon in 53 years.
Pretty big deal.
And we can see a pretty accurate simulation going on right here.
So as you can see, there's various different toggles.
The physics of the entire thing is very important.
And that's another point I want to make about this model.
It is being used for frontier research, not just in AI, but in mathematics
and genetics. It made frontier progress on both of these fronts. And so what we're showing here
is this is a model that goes way beyond just text and telling you what could be. It actually
implements this into a lot of different things and understands the world around it, which is
extremely powerful. But we have another one here. We have an earthquake tracker. For anyone who
wants to make websites, it's so good at making websites. And this appears to be one of the strong suits.
In this case, there's a few things to highlight on this earthquake tracker. One
being that it's just a pretty elegantly designed website, but two, all of the graphics are
interactive. You'll notice that they update dynamically as you hover over them and as you click. It looks very
clean. I assume that it is pulling up-to-date information from an API somewhere that it's set up.
It is just truly competent and capable of doing these kinds of longer-tail tasks that are a bit more
complicated than a static landing page, but have dynamic data, have the richness that you would expect from
a high-end, high-quality, polished website, except just built with an AI model by
someone who doesn't need to know anything about coding at all.
And then for the gamers, there's another great example of a dungeon game,
which they're describing as a playable 3D dungeon arena prototype,
built with Codex and GPT models.
Now, I think this is something novel to this setup,
where Codex handles the game architecture, the combat systems, the enemy encounters,
and then the character models, the character textures and animations,
those were created with third-party asset generation tools using something like ImageGen 2.0.
So this is also one of the earlier signs where you can actually
merge a lot of these tools together to build something dynamic in a way that you previously
couldn't have. Yeah, actually, the quality of this game looks like something out of
League of Legends or something like that. At least that's what it reminds me of. Like,
these games are getting way more high-fidelity than I expected. I know it's still pretty
basic for anyone that's watching this, and they can kind of pick it apart, but it's cool.
But for those of you who prefer the more traditional side of games, this might be something
that you can kind of vibe code in a couple of minutes. Now, it may look basic, but theoretically,
this is like a 3D spatially aware game
and that's not something that you could achieve
at least very easily with previous models.
What I love about this as well is
that they've also included the prompt
for all of these things.
So this is something that you can try right now.
Like look at this.
And the prompt is no more than, what is this one,
like 12 lines, dude.
And you can have a fully functioning game.
You can probably then add an extra step
or an extra prompt saying,
hey, can you deploy this to Vercel,
and send that to your friends.
Now you have a game.
You're a game creator, you're a game developer.
So the applications for this model cannot be overstated.
I'm going to be very honest.
I thought this model was going to be just an iterative upgrade.
I didn't think it would get anywhere near Claude Mythos.
Two stories have now revealed themselves, which is,
one, it's the answer to Claude Mythos,
and two, it's really damn good.
I am now convinced that compute is everything,
but not in the way that I thought it would be useful.
I thought it would be largely for pre-training.
But going back to Sam's tweet earlier on,
and also Greg Brockman's recent interview (he's the president of OpenAI),
they're going all in on inference, test-time compute,
which just means that if you have more compute
and if you have a good enough model, it can do the thing.
This thing, like I said, built itself.
It's a self-improving model.
Very, very impressive.
It's good for solving hard problems.
It's good for thinking for a long time.
In fact, they marketed it as a model
that can now think for 20 hours coherently,
which is almost a full day it can work on a problem.
And what you're noticing from this prompt that's on screen
is it doesn't take that much to get it going.
You don't need to kind of spoon-feed it all the way through anymore.
It can make decisions on its own.
It can infer conclusions on what you want just based on the knowledge architecture that it currently
has.
It's amazingly impressive.
In fact, one of the people who got access to it early just posted on X that he's posting
live as his prompt is seven hours into its task.
It has been running for over seven hours.
He said, this has literally never happened before.
The models would maybe run for 30 minutes or so.
Wow.
Or, if you really pushed them, two to three hours.
But he's on seven plus hours.
I think this is going to be fun for people with complicated things.
If you really want to make a AAA-feeling video game or a simulator or a really complex website,
this is the model to try out and to use it with codex and see how all these things kind of piece together.
It's really, I mean, I didn't have my hopes very high based on the Opus 4.6 to 4.7 incremental improvement.
This seems like a very solid improvement over 5.4.
Absolutely.
And listen, if you are listening to this and you're like, listen, I'm not a gamer.
I can't waste my time with that.
I focus on more serious things.
Well, for you serious people, if you're a manager at a top company or whatever that might be,
this isn't just a toy or a model used for coders.
A lot of the examples that we just gave are around coding.
You can use this for just admin stuff or managerial work.
Like the capability of this model to think more strategically and long term
and understand the context of the tasks that you're working towards.
Like we said earlier, for coding specifically, it can work on 20-hour-long expert tasks.
That also applies to administrative stuff or more generalized white-collar work.
And so in this example, Noam Brown says, I'm a manager at OpenAI, but I'm using this model to basically manage my entire team and make sure we're focused on the right things.
And guess what?
The output of this team and this product has been pretty amazing.
So all around really excellent work by the entire team and the inference team specifically, as Sam Altman says here.
And yeah, I'm looking forward to using this thing.
I don't have access to it right now.
I've refreshed my account probably like five times at this point and it hasn't appeared.
So maybe it's like a slow rollout.
But if you're listening to this and you've tried it out, let us know what you're using
it for.
Let us know what about it amazes you.
Like, I really want to hear more.
Yeah, OpenAI has had a pretty incredible week.
And this comes on the back of their new ImageGen model that they just released, which
was also unbelievable.
If you haven't seen that episode, we just recorded it yesterday,
so I would advise you to go see it because, oh my God, it is amazing.
We also recorded an episode on Apple's new CEO this week and what that means for the company,
as well as the hardware race and how this model, GPT 5.5,
is very much part of the AGI class of models that is built on Blackwell chips.
And we've recorded an entire episode all about that.
Very interesting, very fascinating.
Also interesting and fascinating because, as always, this is the weekly roundup.
We have a few other topics to talk about.
We have some news out of SpaceX, which is a pseudo acquisition.
Now, they haven't quite acquired Cursor, the company in question, but they have
at least partnered with them, with the option to buy Cursor for $60 billion, or to pay $10 billion
for the right to actually work together. This seems like a big deal. This seems like, I mean,
xAI, or we could call it SpaceX, but SpaceX AI, is taking AI very seriously. They're currently behind.
They clearly don't want to be behind. This is a huge step and a huge vote of support
in Cursor, with this minimum of $10 billion, into accelerating their progress and trying to get
themselves into this game. This is actually a genius deal. And there are a few reasons
why that is so.
So let me explain.
If you're SpaceX AI,
which, by the way, is a ridiculous name now,
like, we'll just call them XAI,
you are currently harboring
one to 1.5 million
of the frontier GPUs,
mainly Nvidia,
in a warehouse.
There's one issue.
You're not really utilizing all of it
because XAI has had a bit of a slow start
to training their models.
What's a genius idea?
Hmm.
If I rent those out
to another company,
to train their own model, then we can make money from that.
Okay, so that's win number one for SpaceX.
But then they've thought of another thing, which is,
huh, GROC isn't really good at coding,
and we are losing the race every single day
we don't update our model like coding
because Anthropic and ChatGPT 5.5 is completely running away with it.
So how did they leapfrog and get ahead?
They should acquire the company that is using their own GPUs
to train a frontier coding model.
So then the question becomes,
well, who the hell is Cursor? What's the moat that they have? Like, why do they have a good shot at training
a better coding model than Anthropic and GPT 5.5? Aren't those two companies way ahead? Well, the answer is
not quite. Cursor, for the longest time, was the number one platform and tool for people to use
to do their vibe coding. Why? Not only did they have access to frontier coding models from Claude and
ChatGPT, they also had something called an agent harness. Now, you'll notice GPT 5.5 is really good at coding
because of something called agentic coding.
That is something that Cursor pretty much pioneered.
It's basically the harness, the prompts, the environment,
that they mold the model,
or rather that they mold around the model,
that makes it so good and intuitive
and remembers the context across every single project,
like menial things, like understanding your GitHub branches
and working on separate flows at the same time.
A lot of the top software engineers in the world right now
use tools like Cursor and agentic coding
to be able to pull this off.
So Elon Musk thought, hmm, if I give you the GPUs to train a better coding model, which gives you a better product, I should have the option to acquire you.
In acquiring you, I can integrate you with Grok, and Grok somehow becomes the number one coding model over the next year or so, depending on whether this deal goes through.
And if the deal falls through and they create a really bad model, well, you pay me $10 billion for the service.
Or I pay you $10 billion.
Not a bad deal.
Yeah, it seems like they're going to be continuing to work with other companies to accelerate
in places where they're currently weak.
Because, I mean, they're so strong at building out the hardware
and creating these huge data centers.
They need someone who could take advantage of all those GPUs.
Hopefully, this will help serve that cause.
And that's not the only SpaceX news this week.
The other is that they have officially filed an S-1,
which for those who are not familiar,
it means they're going public.
It's officially official 100%.
They will be going public this year.
If there were any doubts, please let them be relinquished.
Here we have it.
SpaceX will be going public.
So the most interesting thing from this was,
I think the share structure,
how they're going to be organizing this for Daddy Elon, who's going to be getting quite a big
payday if he does well. So we have on screen here just a series of some of the financials. I mean,
we know Starlink as a business has been doing unbelievably well. They have about $25 billion in cash,
$92 billion in assets, $50 billion in liabilities. Dude, that's quite a lot of liabilities on this.
My God. They got a lot of debt, man. I don't know. We'll see once they finally publish everything.
I'm very excited for the first earnings report where you really get a true peek behind the scenes of what's
going on there.
But it looks like it's going to be going public at a $1.75 trillion valuation.
Now, in terms of pay structure, Elon is poised to get 60 million shares across 11 tranches,
vesting in $500 billion market cap increments, from a $1.1 trillion to a $6.6 trillion market cap.
Oh.
So for those unfamiliar with the current ceiling, I think it's Nvidia.
Nvidia is what?
5 trillion, under 5 trillion, close to 5 trillion?
Under 5.
Okay, so not even close. They're like 20% away from five trillion. SpaceX needs to be,
what is that? Like 20-something percent more valuable than the most valuable company in the world.
But if they do, Elon gets 60 million shares. Now, I haven't done the math on exactly how much that is.
But if we make some assumptions here, the total value at vest looks like it could be about a quarter of a trillion dollars.
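As a sanity check on the tranche numbers quoted above (60 million shares, 11 tranches, $500 billion increments ending at a $6.6 trillion market cap), here's a minimal sketch. The even 11-way share split per tranche is an assumption for illustration; the filing's actual per-tranche allocation isn't specified here.

```python
# Sketch of the stated comp structure: 60M shares across 11 tranches,
# each unlocking at a further $500B of market cap, ending at $6.6T.
# The even per-tranche split is an assumption for illustration.

TOTAL_SHARES = 60_000_000
NUM_TRANCHES = 11
STEP = 0.5e12          # $500 billion per tranche
FINAL_CAP = 6.6e12     # last tranche unlocks at a $6.6 trillion market cap

# Work backwards from the final threshold: 6.6T, 6.1T, ..., down to 1.6T,
# then reverse so thresholds run in ascending order.
thresholds = [FINAL_CAP - STEP * i for i in range(NUM_TRANCHES)][::-1]

shares_per_tranche = TOTAL_SHARES // NUM_TRANCHES  # assumed even split

def vested_shares(market_cap):
    """Shares vested at a given market cap under the assumed even split."""
    return sum(shares_per_tranche for t in thresholds if market_cap >= t)

print([f"${t / 1e12:.1f}T" for t in thresholds])
print(vested_shares(3.0e12))  # shares vested if the cap reaches $3T
```

Note the thresholds start at $1.6T (one $500B step above the stated $1.1T baseline) and end at $6.6T, which is exactly 11 increments, consistent with the numbers quoted in the discussion.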
So pretty good payday for Elon. I think the most important thing is that he's getting a lot of control over this.
It seems as if he's going to have 40-something percent control of the company,
which is really, ultimately, what was most important to him as they went public.
So really exciting news.
I am hopeful that it happens this June, which we can expect.
And it's without a shadow of a doubt going to be the largest IPO in history.
I think everyone's going to be talking about it.
There is a new vehicle in which some people are investing.
We're actually going to have the founder on the show soon.
So keep an eye out for that one.
And yeah, the SpaceX news is very exciting.
Now, in the world of AI hardware, many people think that Nvidia has run away with the win.
And you could argue that: with a $4.3 trillion market cap, not many people are competing, except that there is one company: Google.
Now, you might be thinking, Google just does my search engine and stuff.
Well, Google is the only vertically integrated Mag 7 company that is involved or has a frontier capability at every single layer of the AI stack.
Now, right at the bottom are these things called Google TPUs, tensor processing units,
and they're their version of the GPU.
In fact, fun fact, Google's Gemini models have never been trained on an Nvidia GPU.
It's all been their own internal warehouse infrastructure,
and they've been working on this thing for 10 years.
Now, just today, or rather, this week, they released their latest generation of TPUs,
the TPU8T and the TPU8I.
Now, in the TPU8T, the T stands for training, or pre-training.
It is highly optimized for the pre-training part of an AI model.
So this is like the bulk, arguably the more expensive part of training a model.
It's like teaching it like, hey, these are words, these are the general fundamental set of facts that you need to know before we can kind of like put you out into the world and present you to our users.
TPU8I is specialized or hyper-specialized in inference specifically.
Now, the important part about inference is it's being used for so many different things.
Number one, it's to answer all your different prompts.
Whenever you write a prompt and you submit it to an AI model, that is known as inference.
It's doing inference.
It needs to query the model and make sure it does the right types of thinking and gives
you the right answer.
But the other part of inference is post-training, where a lot of people train the model
and then do more training after the fact, by using it to help the model reason
and think through alternatives before it presents you the actual answer.
And that's what that second TPU is for.
Now, Google's TPUs have been used extensively.
In fact, their largest customer is a little-known AI lab known as Anthropic,
which currently runs 1.5 million TPUs,
so the argument can be made that TPUs are largely responsible for Claude's and Opus's success.
So very impressive all around, but there's some other facts about this, right?
Yeah, well, I love the dual architecture training setup that they have here,
being hyper-specific.
I mean, the 8T chip in particular is built to reduce frontier model development cycles,
they said, from months to weeks.
And then we have the 8I, which is the reasoning engine,
which is specifically designed for agentic use
to deliver tokens as fast as possible.
And as we know, Anthropic is working closely with them.
And also, I mean, Google is making these for themselves.
So I think whoever is working with Google,
whoever's kind of focused on these accelerators,
is probably in for a nice little windfall,
as it relates to increased velocity of training
and also increased ability to distribute these models,
as we know Anthropic is having a very difficult time with this.
Now, Nvidia and Jensen are probably feeling a little shook.
They've got to be feeling a little bit of pressure here.
And it seems as if that's why they're pushing to be open source,
because if you are in a closed-source world
where everyone is making closed-source models on their own architecture,
then the Nvidia edge very quickly disappears.
And, I mean, I'm looking at these chips in hand.
They look beautiful.
They're taped out, ready to be manufactured.
And I think you could start getting kind of excited
about this new world of accelerated hardware.
And we're seeing this happen again and again because Amazon just made another big investment in who else other than Anthropic.
And the deal, I think, is like, this has to be close to a record deal.
They're owning a tremendous amount of this company now.
Yep.
So the news here is Amazon announced they're investing $5 billion into Anthropic.
They've just raised $5 billion.
Congrats.
And so the reason why this is important is, well, there's a few reasons.
Number one, Anthropic knows that they don't have enough compute.
The argument could be made.
That's why Claude Mythos hasn't been rolled out.
Well, hey presto,
now you have $5 billion more worth of compute.
Now, for those of you who didn't know,
Amazon is a primary investor already in Anthropic.
Before this announcement, they owned around 17% of Anthropic.
After this announcement, it's closer to 20%.
So we're talking about one company that's publicly tradable right now
that owns a fifth.
Is my math right?
Yeah, a fifth of the world's leading AI lab,
which is pretty crazy.
Now, if we look into the stats of this,
this is a 5-gigawatt deal,
which is more than any single data center
that's currently live.
It's actually a multiple of five.
I think SpaceX AI's Colossus 2 is the largest right now
with their 1 million GPUs.
So it's going to be 5x larger than the largest data center
that we're seeing right now for AI specifically,
and they're aiming to get one gigawatt online by the end of the year.
Now, the reason why this is so good for both teams is
Anthropic already has a close relationship.
with AWS and Amazon's cloud computing department.
So spinning up more compute clusters is going to be so easy for them.
They have a working relationship.
They're used to training code models on this.
So it shouldn't be too hard to ramp this up.
If you're Amazon, hey, welcome back.
That $5 billion is going to come right back to you.
So I don't know what kind of circular economy this is, but it's back, and it's very
impressive for them.
Is it ironic that today Amazon hit an all-time high?
Oh, gosh.
Maybe not.
I'm holding stock.
I got the stock.
Clearly, clearly, they're doing something right.
Amazon is a phenomenal company, and they're the largest shareholder in Anthropic. It's hard not to be bullish on them.
It's hard not to be bullish on the accelerated computing stack. And I think that's probably
what Jensen is getting nervous about. That's why Nvidia is pushing open source. And the good news
is he has some help. He has some assistance from the folks overseas in China, who have been pumping
out unbelievable models all week long, as it relates to Kimi and Qwen, our Chinese favorites.
We have Kimi K2.6 and Qwen 3.6. There's a lot of digits and numbers. All you need to know is
that the best open source models in the world didn't exist last week, they now exist this week,
and they are better at pretty much everything, but exceptional at coding. In fact, word on the
street is that some of these models are as good as GPT 5.4 was, and only a few points off
of Claude. I mean, these are pretty amazing open source models that, again, are free to run
locally on your machine if you have a machine capable of doing so. That's a
big game changer. Okay, so typically the story we tell with these open source models is,
Wow, aren't they so amazing?
Yeah, they're the good younger brother.
They're not as good as the frontier AI labs.
That completely changed this week.
So Kimi K2.6 is the latest model from a Chinese lab
called Moonshot AI.
And they released their model, which ends up being as good at coding
as Opus 4.7, and it's 100% open source, like you mentioned, Josh,
which means that maybe you could run this on a local device.
Now, the answer that you would typically get back from this is,
hey, like, listen, it's too large to run on my laptop.
And that is true.
But with the latest Qwen model, version 3.6,
you can run an 18-gigabyte, slightly quantized build
on your laptop today.
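For a rough sense of where a figure like 18 GB could come from, here's our own back-of-envelope sketch, not numbers from any real model card: a quantized model's footprint is approximately parameter count times bits per weight divided by eight. The 36B-parameter / 4-bit inputs below are illustrative assumptions.

```python
# Back-of-envelope size of a quantized model: params * bits_per_weight / 8.
# The 36B / 4-bit numbers below are illustrative assumptions, not specs
# from any real model card.

def quantized_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate model size in decimal gigabytes, ignoring overhead
    such as layers kept at higher precision or file-format metadata."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

# A ~36B-parameter model at 4 bits per weight lands right around 18 GB,
# which is roughly the size of a "slightly quantized" laptop build:
print(round(quantized_size_gb(36, 4), 1))  # -> 18.0
```

The same arithmetic shows why full precision doesn't fit: the same hypothetical 36B model at 16 bits per weight would need about 72 GB before any runtime overhead.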
So the point that I want to make about these models
isn't exactly the specifics,
but across all benchmarks,
they're not as good as the frontier AI labs' models,
but they're only a few points off.
That difference and gap has closed massively
over the last couple of months,
which tells me two things.
Number one, China has figured out some kind of groundbreaking way to train their models that they haven't told the West about,
and they're going to keep it closely guarded and eventually close-source their model releases going forward.
And number two, they've figured out a new way to use inference to their benefit.
Like, one thing I'm going to highlight here is this new Kimi K2.6 model can code continuously for 12 hours straight using 300 agents.
So the unlock here isn't one model by itself.
It's spinning up 300 versions of itself and getting them to attack the problem.
That's something Sam realized and is implementing in 5.5.
That's something Anthropic realized and is probably doing similarly with Mythos.
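The "spin up N copies and let them attack the problem" pattern we're describing can be sketched in a few lines. This is our own illustrative stub, not Kimi's or OpenAI's actual implementation: `agent_attempt` is a hypothetical stand-in for a real model call, and the score stands in for something like a test pass rate.

```python
# Sketch of best-of-N agent fan-out: run many independent attempts in
# parallel and keep the highest-scoring candidate. The "agent" here is a
# stub; in practice each worker would call a model API and the score
# would come from a verifier or test suite.
import concurrent.futures
import random

def agent_attempt(problem: str, seed: int) -> tuple[float, str]:
    """One agent's attempt, returning (score, candidate_solution)."""
    rng = random.Random(seed)          # each agent explores differently
    score = rng.random()               # stand-in for a real evaluation
    return score, f"candidate-{seed} for {problem!r}"

def fan_out(problem: str, n_agents: int = 300) -> str:
    """Spin up n_agents attempts concurrently and return the best one."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
        futures = [pool.submit(agent_attempt, problem, s) for s in range(n_agents)]
        results = [f.result() for f in futures]
    _best_score, best_solution = max(results)  # tuples sort by score first
    return best_solution

print(fan_out("fix the flaky test", n_agents=50))
```

The design point is that the attempts are independent, so wall-clock time barely grows with N; the cost you pay is inference compute, which is exactly the inference-heavy trade-off described above.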
So I have this question here, which is: how did they manage to do this?
Well, I think every three months that there's a new open model that gets released,
they're making these jumps because they're using these models to train themselves.
We proved that with Kimi K2.5.
There are too many two-point-whatevers.
And the same thing is happening with Qwen.
It's just all around pretty amazing stuff.
Yeah, China's crushing.
Okay.
So before we go, we have two quick things to hit.
The first being one that we missed
last week, which we need to touch on quickly. Anthropic has a design tool now. If you are a designer,
if you are interested in building webpages, videos, graphics, slideshows, pitch decks, any type of visual
asset, Claude now has an entire design suite built just for this purpose. It's called Claude
Design. It exists separately. You can access it through the desktop app or on your browser. And it
basically allows you to build visual assets in a way that you couldn't previously. Previously with
Claude, you had artifacts; with an artifact you could generate something dynamic, it could kind of
build a webpage. This takes it to a whole new level. You can generate wireframes if you want
to use fewer tokens, or you can fill them out and create proper prototypes that are
actually clickable. It's amazing. The video we're seeing on screen highlights a few of them. Unfortunately,
there was a big loser in this because this sounds like a lot of what that little design company
named Figma does. Yeah, the little company. The stock market did not love that announcement,
did it? Nope, nope. It is down almost 20% on the week. I actually tracked the stock price after
the announcement was made.
So it wasn't even readily available.
It was literally just the tweet.
20 minutes after it was tweeted,
the stock was down 6%.
So the point being,
whether this is market speculation or not,
like, listen,
Claude Design isn't as good as Figma.
They're working with a few of these different partners,
such as Canva.
But two weeks ago,
one of Anthropic's top former execs
left the board of Figma,
and the rumor was
that they were building a competitor.
So it's pretty clear,
Anthropic is going after
every single sector,
whether you're a designer, a software engineer, a mathematician, a research scientist, it doesn't matter.
They're going after everything because the model is applicable to everything.
And I don't know what this means for certain moats that companies like Figma hold,
but it's certainly going to affect the stock price.
Can you do me a favor and click the Max button real quick for me just to show the chart?
Oh, yeah.
Yeah, minus 86% since IPO for those who are not watching on screen.
It's been a pretty bad, rough run for Figma.
We have to start naming Anthropic the stock killer, Josh.
It's like every single tweet is tanking a stock.
No, it's tough.
It's brutal.
It's brutal.
We had one last thing that you wanted to mention.
I know.
We got to end on this strong.
What do we have?
How good is your accent or impersonation of your president, of our president, Josh?
Pretty horrible.
Not good.
Okay.
Well, then we're not going to attempt it.
I'd love to hear your British take on it.
Oh, hell no.
If you're feeling ambitious.
Okay.
So my British take on this is: it is, albeit hilarious, also somewhat terrifying
that the president of the United States is saying this.
He commented, okay, on the government's relationship with Anthropic.
Now, if you're wondering why on Earth he's commenting on it, they're going to be releasing
this so-called Mythos model.
It might be a security risk.
It's probably good for the government to have access to this thing and prepare accordingly.
The government has been having very important conversations with bankers and governments
all around the world to just try and figure out, you know, how best to prepare for this.
And after having an in-depth discussion with Dario Amodei, which, by the way, he had blacklisted
that CEO and Anthropic entirely from government use, he's now
rekindling the relationship and saying maybe there's a deal on the line. He goes, and I quote, I'm not going to do the
accent. We'll get along with Anthropic just fine. Trump said on CNBC. We got to... let me try.
We'll get along with Anthropic just fine. I think they can be of great use to us.
They're high IQ people. Very good. Very good. They tend to be on the left, radical left,
but we get along with them. I don't know. That's all I got. But that is what he said.
Were you practicing that? That was actually pretty good. I was practicing in my head. I was
rehearsing. I closed my eyes while you were doing that, whilst I was laughing, and it
felt right. It sounded like him. Good. It channeled his spirit. It was there. It was a good effort.
But I believe that's it. That is the end of the roundup.
What a whirlwind, man. Josh and I are recording this, FYI, at 4 p.m. over here. Typically,
we're morning birds. We deliver this in the morning, but we waited for the announcement of GPT 5.5 just for you guys.
And we're going to be bringing you the cutting edge news every single week. As Josh mentioned,
we had three other amazing episodes that we filmed earlier this week. Definitely go check them out.
They're each about 20 minutes long. Perfect for
your commute to work, or your gym session if you're not that active. Definitely check them out
and let us know what you think. But yeah, Josh, any final thoughts? Call me crazy, but I like the
afternoon recordings. I got good energy. I'm like woken up. I'm 100% right now. I'm rocking and rolling.
I'm feeling good. So I don't know. Maybe we'll have to lean into this a little bit more,
but that's everything. If you've made it this far, if you're still listening to this and you've
heard our other episodes, you're caught up. You're done for the week. You can go touch grass.
Enjoy your weekend. There will be a lot more to talk about next weekend. But for now,
you have fully synchronized with all of the chaos happening on the frontier of AI and technology.
Thank you so much for watching, as always. We very much appreciate it. If you enjoyed this episode or
any of our previous episodes from this week, don't forget to share them with a friend who
might also enjoy them. We have a newsletter on Substack that goes live twice a week.
Just went live yesterday, going live again tomorrow. The Friday issue is a recap of everything
that happened this week, which is always fun and exciting. In fact, I'm going to go write that
as soon as we finish this episode. So thank you all for watching. As always, don't forget to
subscribe, like, comment, all the good things, and we will see you guys next week.
