Limitless Podcast - AGI is Back! Google Gemini 3.0 Crushed Our Expectations
Episode Date: November 19, 2025

🌌 LIMITLESS HQ: LISTEN & FOLLOW HERE ⬇️
https://limitless.bankless.com/
https://x.com/LimitlessFT

------

Google launched Gemini 3.0, a groundbreaking AI model with advanced multimodal capabilities for interpreting diverse inputs. We highlight its impressive benchmark performance, including a 37.5% on Humanity's Last Exam, and discuss Google's proprietary TPU infrastructure and the new tool, Google Antigravity.

------

TIMESTAMPS
0:00 Gemini 3.0
5:34 Game-Changing Features
9:04 Benchmarking Breakthroughs
13:21 Cost and Quality Trade-offs
17:46 Google's Strategic Advantage
24:51 Predictions for the Future

------

RESOURCES
Josh: https://x.com/JoshjKale
Ejaaz: https://x.com/cryptopunk7213

------

Not financial or tax advice. See our investment disclosures here:
https://www.bankless.com/disclosures
Transcript
If I had a crown in my hands, I would place it on the head of Google, because they have done it again.
They have the world's best AI model ever in history by a shockingly large margin.
Gemini 3.0 just got released.
It's available now to anybody in the world to go use it.
And the benchmarks are kind of blowing everyone's expectations out of the water, myself included.
And most importantly, it places another data point on the chart that shows we are continuing to ascend up this exponential curve towards AGI.
And the roadmap is still intact and we are very quickly moving through it.
I was just going through the benchmarks before recording this, and it's shocking,
because we live in this world, and yet somehow I'm still continually blown away by the progress
that's made by these models. So let's get into it. Please walk everyone through. Tell me,
what did Gemini and the Google team just release with this 3.0 update? People probably think we say
the world's best model every week. But this time we really, really mean it. Like, they have blown
every single other model provider out of the water. The things that this thing can do, well,
How about I just show you?
How about I show you, Josh?
Please, let's see some examples.
We have a thread here.
And Sundar basically says,
you can give Gemini 3 anything,
images, PDFs, scribbles on a napkin,
and it'll create whatever you like.
For example, an image becomes a board game.
A napkin sketch transforms into a full website
and a diagram could turn into an interactive lesson, right?
So there's two examples I want to show you, Josh.
I want to get your opinion on this.
So number one, there's a short video
of someone playing pickleball,
and she uploads it into Gemini and says, hey, can you tell me how well
I've done here and how I can improve my game? And it analyzes the entire video. It knows that
she's wearing a knee brace. It analyzes her positions, telling her where she can move to better
position herself to score the point. That's pretty nuts. But before I get your reaction to that,
because, Josh, I know you're an athlete, I know you're very competitive when it comes to
these things, so this is a tool you could definitely use. The second thing is probably
applicable to a lot of listeners on this show.
They've embedded Gemini 3 into Google search and into new generative UI experiences.
The way I would summarize this is it basically is very intuitive, Josh.
It understands what you're asking for without you needing to really kind of explain yourself.
The example they're showing on the video here is, can you explain the three-body problem to me?
And rather than just giving you this simplistic text which explains the concept,
it decides to create a video diagram from scratch to show you a visual depiction of how
this works. Right, give me your reaction, in order, from one to two. So starting with the
top. The first example. Yes, sir. So this is really cool, the napkin example, where you can scribble
something down on a piece of paper. It'll generate it in the real world. What all of these examples
are kind of showing me is what we always talk about with Google, where it has this awareness of
physics, reality, and visuals and understanding what it's seeing. And all three of these examples are
leaning into that. So it leads me to believe Gemini really is a multimodal-first model,
where it's meant to ingest and understand the world around us.
This example of the chessboard and the napkin is amazing
because a lot of people oftentimes have sketches.
You just draw it down on paper and it intuitively understands it.
But the one that was most surprising to me is the video example
because as far as I'm concerned, as far as I'm aware,
there has never been a model that can ingest video
and understand the video that it sees.
And if it does exist, I've never tried it before.
So the idea that you can, I mean, I played baseball growing up.
If I could take a video of myself swinging
and get a corrective coach to walk me through exactly what was wrong?
A lot of people watching this play golf, I'm sure.
If you could have a phone recording of yourself playing golf,
it can actually critique it.
Critique me as if you were Tiger Woods.
Critique me as if you were whoever else is good at golf.
I don't know, Rory McIlroy, whoever.
But, like, critique me as if you were an expert who is really good at golf
and can give me some feedback on how I could better my swing.
And what this offers in just this one narrow example is now you have this personal
tutor that can do anything.
If you're dancing, if you're doing anything physical, whatever it is, it can evaluate things for you.
Even our video, this podcast, Ejaaz, if we uploaded it to Gemini 3.0, it could critique us.
What did we do well?
What did we not do well?
What did the visuals look like?
How can we improve them?
And that awareness of video is like really cool.
Yeah, I just want to say, I think the closest we got to this was with GPT, where you can upload an image of, like, what's under my car bonnet and say, hey, what's wrong?
My car stopped working.
And it can kind of identify the part that you need to change, change the
oil, blah, blah, blah.
But that's just a static image.
To go from that to live video, and for it to analyze all the frames in that video and then
give you a response on that, is a massive leap upwards.
We just haven't seen that anywhere.
So yeah, you're right.
It's amazing.
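As a sketch of what that video-coaching flow might look like from the API side, here is a minimal example with the google-genai Python SDK. The model id "gemini-3-pro-preview" and the file name are assumptions, and larger uploads may need a short wait while the service finishes processing them.

```python
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # key from aistudio.google.com

# Upload a (hypothetical) clip of your swing, serve, or rally.
video = client.files.upload(file="my_swing.mp4")

# Ask the model to watch the clip and coach you on it.
response = client.models.generate_content(
    model="gemini-3-pro-preview",  # assumed id; check the current model list
    contents=[video, "Critique my form like a coach and suggest one fix."],
)
print(response.text)
```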
And every example we go through, it kind of breaks the mold of what I believe to be possible.
I find that it's going to be difficult to use Gemini 3.0 because there are so many
possibilities now that have not existed previously.
You kind of need to relearn how to engage with AI because it's so capable.
And there's a fourth example here that I just want to touch on briefly, which was also cool,
is that it works just as well for the other things.
The example is a trip planning one where it starts to plan a trip and a vacation.
And it shows you a full list that is fully interactive of all the places broken up day by day.
And there's an option that you could just choose visual layout.
And you see on the screen here, it'll take every single day of your trip,
break it into images, and section it out into this really nice visual grid.
So the themes I'm seeing here are, okay, real-world understanding, video-first, and really nice presentation, which I think a lot of models sometimes struggle with.
So the demos are out of control.
I'm excited to use it.
Everyone else can use it now.
It's live.
Now I want to get to benchmarks, Ejaaz, because this is where things get kind of crazy, where we can actually compare one model to another and see exactly how impressive this is relative to everybody else.
So please, we have the card here.
Walk us through what we're seeing in this model card and all the specs that we need to know.
As you guys probably know by now, benchmarks are typically how we evaluate AI models against each other.
And they're measured against a range of different benchmarks.
A benchmark can be considered as sort of like a test.
Now, right at the top, you've got Humanity's Last Exam.
This is by default the hardest exam that an AI model is tested against.
And it's kind of like an academic reasoning test with no tools accessible to it.
It scored a very impressive 37.5%, which is, I think, about a
15-point increase from its previous model.
Very, very impressive.
But what really blew my mind was the second stat listed here,
which is the ARC-AGI-2 benchmark.
Josh, when I say this 2x'd
the previous state-of-the-art model,
I absolutely mean it.
In fact, let me just show you this chart here.
Now, you may notice a couple of, like, familiar specks here:
GPT-5 Pro, Grok 4 Thinking.
And then can you see that,
that outlier right at the top right? Do you see that, Josh? That's insane. The two outliers.
The two outliers. So these are the Gemini 3 Pro and the Gemini 3 Deep Think models,
Deep Think being, you know, a larger model that can give you a more researched
response. They are a standout from every single other model. And the reason why this is so crazy,
well, there's a few reasons. Number one, all the other model progressions, as you can see over time,
have been kind of impressive, but kind of small.
Like, they've been good jumps.
It's been impressive, but it hasn't been so impressive that you'd think,
oh, you know, another model provider couldn't catch up.
These results from Google literally put it miles ahead of every other model.
So when I look at this chart, I think, wow,
Google probably has the lead for another six months.
And in six months' time, they're going to have an even more impressive model.
So at this point, I'm kind of thinking,
can anyone catch up to Google?
Josh, do you have any reactions to this benchmark?
This is the chart that, like, the first thing I said to myself when I saw this is like,
oh my God, there is no wall.
We are not going to stop scaling.
The scaling laws still apply, because these two new data points that we have blow everything
else out of the water.
And this is how exponential growth happens.
It seems like a really small cluster down there at the bottom.
But the reality is that was the top just a couple hours ago.
And Gemini kind of refactored this entire chart to make everything seem so
small because the progress is so high. And although Gemini 3.0 Deep Think is seemingly the most impressive,
the real anomaly on the chart is Gemini 3.0 Pro, which is basically a vertical line up from these
other models, where the score is higher, but the cost is actually slightly lower. And if you connect
the dots between these averages, you start to see a literal vertical line in terms of improvement and
acceleration in these models. And that to me shows that there is no scaling wall that we're hitting.
Like we can continue to scale resources, energy, compute, and we can continue along this path towards AGI
in a world where some people were saying,
we don't know if it continues.
The answer to me is very clearly, it continues.
This is a step much closer to AGI.
And again, that real world understanding
makes it feel much closer to AGI than it ever has before
because now it really like intuitively understands the world
through video, through photo, through audio,
through basically every sensory input we have
outside of, what, taste and touch.
So this to me, I saw this chart, I was like,
oh my God, Gemini, you really outdid yourselves.
I'm just going to be honest.
I think over the last couple of months, I've been getting a little bored with the models that have been released by other model providers.
And it led me to think that we're not going to make many breakthroughs until, you know, some model provider figures out a new, a unique way to train their model.
Gemini or Google has convinced me otherwise with this release.
But I know you guys are probably, like, fed up with listening to us harp on about benchmarks.
So how about I materialize that for you in a much easier-to-understand way, right?
So here are the four big takeaways that you need to learn about Gemini 3.
Number one, for the intelligence that you're getting, it is not that super expensive.
Google trained this from scratch, as this tweet says, using their own TPU infrastructure.
And it used this kind of architecture called a mixture of experts, which basically means that whenever you prompt the model, it's not going to activate the entire model.
So it actually ends up being cheaper to run than it otherwise would be.
It has a 1-million-token context input
and a 64K-token output.
We'll get to the costs in a second,
but the point that I'm making here
is that it's not as expensive
as you would expect for the intelligence that you're getting.
Now, if you compare Gemini 3 to GPT-5.1 from OpenAI,
on a relative basis, it is more expensive,
but for the jump in intelligence that you're getting,
it's way better.
So it's, in my opinion, worth it.
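To make the mixture-of-experts idea concrete, here is a minimal, hypothetical sketch of top-k expert routing in Python. This is just the general trick, not Google's actual (unpublished) architecture; every name and size below is made up for illustration.

```python
import numpy as np

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route one token vector through only the top-k experts."""
    scores = x @ gate_weights                  # gate score per expert
    top = np.argsort(scores)[-top_k:]          # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                   # softmax over just the winners
    # Only the chosen experts actually run; the rest of the parameters stay
    # idle for this token, which is why inference is cheaper than the raw
    # parameter count would suggest.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy usage: four "experts", each a simple linear map on an 8-dim token.
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.normal(size=(8, 8)): x @ W for _ in range(4)]
gate = rng.normal(size=(8, 4))
token = rng.normal(size=8)
print(moe_forward(token, experts, gate).shape)  # (8,)
```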
Number two, when it comes to computer use,
so that means letting the AI model control your computer,
and do tasks for you whilst you go do something else,
it is state of the art.
It is the best here.
They measured it against a benchmark called ScreenSpot-Pro,
which kind of analyzes its ability
to understand images and visuals on a desktop.
It just absolutely crushes it.
Number three, it is the best AI for math by far.
So again, the point I'm making here or the theme that we're seeing here
is it's not just good at one thing, it's good at many things,
which makes it the best generalist AI model in the
world right now, by far. And the final thing, Josh, and this is where it might slip up. I'm curious
to get your take on this. It is insanely good at coding, but we don't quite know if it is the best at
coding yet. What I mean by that is it completely crushed everyone else on one coding benchmark,
but on the coding benchmark that matters, which is the software engineering one, SWE-bench, it didn't do as well
as its competitor, Claude 4.5 from Anthropic. So those are the four main takeaways. I would much
prefer a model that understands the world over one that understands how to code. And I think we're starting to see
these subset niches where if Anthropic has the best coding model, that's great. Let them focus on code,
let them narrowly make that the best model. Let Google handle everything else. And I think that's
what Gemini is focusing on. So the code thing doesn't really bother me because I don't care to use
Gemini for code. I'm happy to be in the Claude camp for code and then use Gemini for everything else.
And then one of the points earlier that you mentioned on the pricing, I find it a little interesting,
because it's more than just a little bit more expensive.
The pricing, I was looking through it, and for prompts over 200,000 tokens,
they're charging $4 per million tokens for inputs and $18 for outputs.
Now, relative to GPT-5.1, which just got released,
they're charging, per million tokens, $1.25 in, $10 out.
So you're talking about, what is that?
That's about $4 versus $1.25 on inputs.
And that is a fairly significant margin that you're paying for this quality.
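For a rough sense of that margin, here is a back-of-the-envelope cost sketch in Python using the per-million-token prices quoted above (Gemini 3 Pro's long-context tier versus GPT-5.1). Rates change, so treat the numbers as illustrative and check each provider's pricing page.

```python
def request_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Dollar cost of one request given per-million-token prices."""
    return (input_tokens / 1e6) * in_price_per_m + (output_tokens / 1e6) * out_price_per_m

# A hypothetical long-context job: 1M tokens in, 100K tokens out.
job = dict(input_tokens=1_000_000, output_tokens=100_000)
gemini = request_cost(**job, in_price_per_m=4.00, out_price_per_m=18.00)
gpt51 = request_cost(**job, in_price_per_m=1.25, out_price_per_m=10.00)
print(f"Gemini 3 Pro: ${gemini:.2f}")  # $5.80
print(f"GPT-5.1:      ${gpt51:.2f}")   # $2.25
```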
So we're starting to see the tradeoffs happening on that Pareto curve that we talked about a few episodes ago, where there are tradeoffs coming in terms of cost and quality.
And it's clear that while OpenAI may have optimized for cost, Google is kind of optimizing a little further up the cost curve in exchange for super high quality.
And it seems like this is kind of a balanced data point for now, because unless you are using this via API and you're requiring a ton of tokens, a $20-a-month Google membership will get you all of the use that you need,
and that is just fine.
So in terms of a usability perspective,
I think that's okay.
But it's just an interesting thing to know
is that this is a better model.
It is also more expensive.
And it is a tradeoff that was made.
And in the case that OpenAI decides
to make this trade-off with GPT-6,
or Grok decides to make this with Grok 5
or Grok 6,
I'm losing track of all these models now,
I think we're going to start to see
the dynamic shift in terms of that Pareto curve
and what model architects decide to remove and add.
And in this case,
it looks like Google added quality,
but they also did add quite a significant
cost increase. I personally don't think it matters. I think it's a nothing burger. I think that if
Google wanted to make it affordable for everyone, including the developers that want to get API access,
that think it might be too expensive, they could subsidize it. They are a cash flow giant. They have
enough money to do that. OpenAI has been doing that for so long now that it doesn't even matter.
I don't see any reason why Google couldn't do that. The other reason is Google just released their
latest TPU, which is the chip that they use to train their models and run inference on their models.
And typically with every generation, we get a much cheaper cost of inference.
I think by the time that they release their next-generation model,
which might be Gemini 3.1, we're going to see a considerable reduction in the cost
for using Gemini 3 Pro and Gemini 3 Deep Think.
So I'm not too worried about that.
I think it's kind of like a short-term problem and not a long-term problem.
But speaking of TPUs, I just want to take a moment to really kind of belabor the point
that using their own TPUs to train a state-of-the-art model
that is 2x better than the previous state of the art model
and probably puts them in a six-month lead
after Google started off on the back foot
creating probably the worst model I've ever seen
and changing that all around in, what's it,
under two years is nothing short of insanity.
TPUs are Google's kind of version of the GPU.
The GPU is kind of like what Nvidia controls the monopoly over.
This is the hardware that you use to train your AI and run inference on your AI.
The unique part here is that Google's never used an Nvidia GPU in any considerable way to train
their models.
They've always trained it in-house.
And that's such a difficult and tricky thing to do because designing and building these
TPUs at scale, these GPUs at scale, is a super hard and complex thing.
You need so much talent.
You need so much expertise and insight to be able to do that.
The unique thing about Google's TPUs, well, there's two main takeaways.
Number one, it's cheaper to train the same amount of intelligence than it is on an Nvidia GPU.
So it's more cost-efficient.
And the second thing is, and this is their secret sauce, you can stack those TPUs on top of each other in a really scalable way, so that you can start training really, really large models.
If you wanted to train the same size model with Nvidia GPUs, it would cost way more and it would take way longer.
So Google made a really risky and big bet about a decade ago saying we're going to build our infrastructure in-house.
We're not going to rely on Nvidia, and we're going to benefit from the full-stack experience.
And this model is a prime example of that bet paying off.
So I just want to call them out.
Like, it's not like Google has gotten lucky here.
They've been planning it for a while now.
The interesting thing to me is that this is the first number one model in the world built
on something other than an Nvidia GPU.
And that's fairly significant because every company in the world is trying, but this is
proof that it's actually possible.
And I think when we talk about Tesla and AI5 and the xAI team,
when we talk about OpenAI working with whoever they're working with to build their own in-house
GPUs, I think this sets a precedent that it is possible. And I suspect that will result in more
companies putting their foot on the gas when it comes to kind of disrupting part of Nvidia's monopoly
that it holds over GPUs. So that to me is the interesting takeaway of this. And hearing that it was
fully trained on these TPUs, that's very high signal to me, saying, okay, there is an architecture
shift happening. There is a real benefit to vertical integration if you can figure out
manufacturing these compute units at scale. And now the race is on for everyone to do this.
Because again, using the Apple example, the M-Series chips, unbelievable, and they unlocked the
best computers in the world. And if companies can really start to refine this vertical integration
of their own chips, you're going to see that exponential curve go vertical times ten. Like,
I suspect that is very obviously now how we reach AGI faster than people
previously thought. Because the efficiency improvements from those vertical integrations,
once they're able to manufacture these at scale
are going to be unbelievable
and I'm so excited for that to happen in the near future.
Google has a big head start,
but let me tell you,
the other companies are not far behind.
Well, let me introduce you to another big advantage
of being the big dog Google.
You thought you were going to come on to this episode
and listen to us harping on about a generalized model?
No.
You're forgetting Google has many other products in their arsenal
and you're forgetting that they can plug in their new state-of-the-art model
into all of them. So Google, not only today, announced Gemini 3, but they also announced a different
product. It's called Google Antigravity, which is basically a new software environment for you to
code up AI agents, except this time these AI agents are going to be super, super smart, because they
get plugged in with Gemini 3. Now, if you remember earlier, I mentioned that one of the cool
benchmarks that this new model sets is in computer use, which means that it can control your
computer, it can do things autonomously for you.
Now, typically the reason why we haven't really spoken about that on this show is that they've been kind of lame.
Like, they can book you a dinner reservation and do different kinds of stuff.
With this model, it's way more intuitive.
It can do way more intelligent tasks and it can take a lot more complex work off of your hands,
such that the value it produces for you over, like, the eight hours you sleep overnight
would be considerable enough for you to seriously use it in your enterprise, your business,
or just your at-home lifestyle, right?
So the point I want to make around here is Google's moat is not just its intelligence or ability to create new models.
It's not its TPUs.
It's its distribution.
It's the entire product suite that it has, that regular users like you and I, who use Gmail, who use Google Suite, can now kind of benefit from, simply by them plugging in that model.
And I'm thinking of products like this Antigravity.
I bet you, Josh, we're going to see a slew of new Google product releases over the next couple of weeks simply because they created
this model. I hope so. I guess the contrarian take is like, okay, how many people are actually
going to want to use them? We just spoke about how Claude is the superior code model. Everyone
loves Cursor. No one really uses the mobile applications of these. A lot of people are engaging with
AI on their phone. So maybe it works for the right type of person. But Google still does have that
product problem where they kind of have a tough time. They have the amazing intelligence. They just
have a tough time productizing it. I mean, I don't have the Gemini app on my phone. I mostly use Grok and
ChatGPT, and there is this bar that they still need to cross that I think they're trying with
Google AI Studio. And we had Logan Kilpatrick on who was the head of that to talk about it when
Nano Banana came out. But it is still a bit of a long shot for them to get good at products
and actually develop this. But what we saw this week is that there was a resounding, overwhelming
amount of support, to your point, Ejaaz, where the market just believes in Google. And in a week
where all of the stocks, all of the Mag 7, were down, Google was the one anomaly. Google was up this week.
And I think it's because the market is starting to realize, one, vertical integration through these TPUs is a huge deal.
Two, Google has an existing business that is not reliant on AI.
And sure, AI places a huge, like, hand on that scale, but it is not everything.
And they are cash flow positive in the absence of AI.
So all of this innovation that they're doing is really just pouring lighter fluid on top of an already great business.
And the market is starting to evaluate that properly.
So Google is positioned very strongly.
They have very high intelligence.
Gemini 3 rocks.
And I mean, again, we continue on the bull train for Google.
I am a believer.
I am a supporter.
I am stoked that they have the crown.
I assumed it was only a matter of time.
And now the question is, who's next?
Who is the next competitor?
Who's going to put the next point on that chart and set the vertical trajectory on the
next exponential curve we're on?
Do you have any guesses?
Who do you think it's going to be?
Yeah, well, I don't, because I don't think it's going to be anyone for a while.
I said this earlier in the show and I'm going to say it again.
I think there's going to be a six-month period now
where either the other model providers don't release a model because it's not as good as Google's,
or they just release these kind of mediocre consumer products that maybe
benefit certain consumers in one way or another, but don't really break the generalized
model standard that Google has just set. Just one last point on the Google bull case thesis:
they may not be playing in the same ring as Cursor does. Like, I was critiquing
Microsoft on another episode, Josh, do you remember? And then I got off that episode and I was just like,
Microsoft like dominates the enterprise environment. All the boomer companies and institutions love Microsoft.
And they have all their data and memory. And just because you and I don't use it or just
I'll speak for myself, just because I don't use it and I think it's boomer doesn't mean that they're
not absolutely crushing it. Google just came off a hundred-billion-dollar quarter of revenue.
That's like the highest they've ever had. So I don't want to be too hasty to say that, like, Google's not going to
make it because they can't make a slick consumer product like OpenAI maybe can.
I just think they're maybe playing in different fields.
But to the point around like I don't think anyone else is going to catch up, look at these
comments Josh.
I want to show you two comments, all right?
One is from Sam Altman.
He goes, congrats to Google and Gemini 3.
This looks like a great model.
The other is from the almighty being Elon Musk saying, I can't wait to try this out.
And this is just one of a series of tweets that he's been putting out this week saying,
can you guys just drop Gemini 3 because I need to see how good this thing is?
And the reason why I bring up these two people is both Sam Altman and Elon Musk have released new versions of their models, GPT and Grok respectively.
But it's been the 0.1 upgrade.
It's GPT-5.1.
It is Grok 4.1.
And they are almost identical updates.
You want to know what the biggest and coolest thing about their model updates were?
Personality traits, which don't get me wrong is cool.
Like I would like my model to kind of respond in a very intuitive manner and get me.
But it's nowhere near the state-of-the-art standard that we've just seen broken by
Gemini. So the point I'm making is, I think these two companies might have run out of fuel for the
near term. Grok is going to be next. You think? They're the next one. By the end of Q1, Grok will
have the crown. Why? And I assume by a fairly large margin. But I assume it will be a different type
of crown. And this is where I'm really excited to see how these models progress. We spoke a little
bit earlier about how Claude is kind of the coding model. Google has a very deep understanding of the
real world and physics and video and how that works. Grok and the xAI team
are very focused on the pursuit of truth and information.
And I think that's kind of the alley that will see them going down.
So they have the real-time data with X.
They have the pursuit of truth.
And where Google and OpenAI and all these other companies are trained on an existing
data set, the xAI team and the Grok team are developing an entirely new synthetic
data set that is maximally truth-seeking.
And we saw an early version of that with Grokipedia, which should provide the most accurate and, I guess,
thoughtful information.
It should be the best at thinking because it
is the closest to source truth. So while I think Gemini will probably be better at physics and video
and understanding the real world for quite some time, I suspect Grok will be really good at just
communicating via text. If text is the modality we interface through, Grok should be really good.
And again, the rate of acceleration: Grok has been around for the least amount of time, and they're
accelerating the fastest. And I'm very, very, very excited for a Grok 5, Grok 6, whatever we're at,
announcement, hopefully early next year. So those are the predictions. That's the episode.
That's Gemini 3.0. It is an unbelievable new model. Everyone can try it out. So here's how
you try it out. I believe you need to be a Google Premium Plus subscriber, whatever it's called. It's
$20 a month. You can go on the Gemini website and it's just a text box and you can play around
with it. They also have a mobile application. It's very easy to download on your phone. Play around
with it. I'd love to see examples of cool things, because I think one of the problems for me, and one of the
things I'd love help with from anyone who's listening, is how do you use this thing to test
it? What do I ask it? And how are you interfacing with it
to get the maximum amount of results from it?
Because intuitively, I would never think to record myself
and ask for feedback, but that's a new possibility.
So I guess the challenge to anyone who's listening
is figuring out how to use these new models
as these new features get released.
And Gemini 3 has just opened up the gates
to a gazillion new use cases.
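And if you would rather poke at it from code than from the text box, here is a minimal sketch using the google-genai Python SDK (pip install google-genai). The model id "gemini-3-pro-preview" is an assumption; check Google AI Studio for the current name.

```python
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")  # key from aistudio.google.com

response = client.models.generate_content(
    model="gemini-3-pro-preview",  # assumed id; check the current model list
    contents="Explain the three-body problem with a simple visual analogy.",
)
print(response.text)
```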
Yeah, I mean, this is a super cool release for Google.
And weirdly enough, it's not the only release over the last week.
I mean, I've got a list pulled up here.
They've released new Android and iOS
updates, they've got a new Search AI Mode, they've released Antigravity, which we mentioned earlier.
We've got SIMA 2 research, which we demoed on a previous episode.
You should definitely go check that out.
I mean, they are just not stopping.
And they're a force to reckon with.
And kind of similar to them, Josh, just to kind of round this episode out and thank you guys for listening.
We are here in Argentina, in Buenos Aires.
We are kind of meeting some of the fans that are out here.
And we spoke to one just this afternoon, Josh.
And you know what he said to me?
Have a guess.
What's that?
He said, your podcast, Limitless, is like the state-of-the-art AI podcast.
In fact, it is 2x better than any other AI podcast that I've ever heard.
And you know what?
Hell yeah, brother.
That sounds very similar to Gemini 3.
So you could potentially call us the Gemini 3 of AI podcasts.
And so if you're a listener to this, if you are a non-subscriber on our YouTube,
you should probably click that subscribe button.
You should probably click that notification button.
Because guess what?
We've got more episodes coming this week.
And guess what?
The five-star ratings help us out massively.
So if you enjoyed this episode and if you want to hear more episodes of this nature
and of cutting-edge news in AI, you should give us a follow.
And we will see you on the next one.
