The AI Daily Brief: Artificial Intelligence News and Analysis - Does Gemini 3.1 Pro Matter?
Episode Date: February 20, 2026
Gemini 3.1 Pro arrives with big benchmark gains and a sharp jump in reasoning, coding, and efficiency, but in a world where the frontier rotates weekly, raw performance isn’t the story. This episode looks at what actually matters: cost per task, multimodal dominance, and where Gemini fits in a model portfolio that now demands specialization over supremacy. In the headlines: India’s AI Impact Summit and the Altman-Amodei moment, Walmart bets on AI for growth, Amazon tracks employee AI usage, and Accenture ties promotions to adoption.
Want to build with OpenClaw?
LEARN MORE ABOUT CLAW CAMP: https://campclaw.ai/
Or for enterprises, check out: https://enterpriseclaw.ai/
Brought to you by:
KPMG - Agentic AI is powering a potential $3 trillion productivity shift, and KPMG’s new paper, Agentic AI Untangled, gives leaders a clear framework to decide whether to build, buy, or borrow. Download it at www.kpmg.us/Navigate
Mercury - Modern banking for business and now personal accounts. Learn more at https://mercury.com/personal-banking
Rackspace Technology - Build, test and scale intelligent workloads faster with Rackspace AI Launchpad - http://rackspace.com/ailaunchpad
Blitzy - Want to accelerate enterprise software development velocity by 5x? https://blitzy.com/
Optimizely Agents in Action - Join the virtual event (with me!) free March 4 - https://www.optimizely.com/insights/agents-in-action/
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
LandfallIP - AI to Navigate the Patent Process - https://landfallip.com/
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI.
Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Interested in sponsoring the show? sponsors@aidailybrief.ai
Transcript
Today on AI Daily Brief, Gemini 3.1 Pro is here, and I think its point is to flex multimodal.
Before that, in the headlines, a lot of talk about AI in India, but is there anything worth listening to?
The AI Daily Brief is a daily podcast and video about the most important news and discussions in AI.
All right, friends, quick announcements before we dive in.
First of all, thank you to today's sponsors, KPMG, Insight-Wise, Superintelligent, and Blitzy.
To get an ad-free version of the show, go to patreon.com slash AI Daily Brief, or you can subscribe on Apple Podcasts.
To learn about sponsoring the show, send us a note at sponsors at aidailybrief.ai.
And of course, one more quick reminder about the projects that we launched this week.
Claw Camp, a free self-directed program to build an agent team using OpenClaw.
We have kicked off the first four-week sprint, so come join about 3,500 of your best friends in becoming an agent boss.
Meanwhile, for the enterprises out there who want to figure out how to use OpenClaw and other systems to build agent teams and change how you do things,
we've got an executive sprint coming up.
I will be sending more information at the very beginning of next week.
So if you are interested in that, check out enterpriseclaw.ai.
Lastly, if you want the single coolest job of all time, come apply to be our Clarkitect
and work on agentic vibe coding projects with me across the AIDB ecosystem.
As always, all of this information is linked at aidailybrief.ai for easy finding.
Today we start with the AI Impact Summit.
It's a gathering in New Delhi that has brought together world leaders and AI executives.
This is the first time the event has been held in a developing country with
previous iterations hosted in the UK, France and South Korea. The selection of India as the host
country was symbolically important, allowing the event to platform a political call to address
AI inequality. Earlier in the week, a UN report highlighted that AI adoption is still growing
more rapidly in the developed world, risking a permanent technological divide.
UN Secretary-General António Guterres wrote in an X post,
The future of AI cannot be decided by a handful of countries or left to the whims of a few
billionaires. AI must belong to everyone. AI must be accessible
to everyone. AI must benefit everyone. AI must be safe for everyone. Let's build AI for everyone.
In a follow-up post, he called for a global fund on AI to, quote, build skills, data, affordable
computing power, and inclusive ecosystems everywhere. Now, this is one of the first times we've heard
world leaders proclaim the need to deliver affordable AI to the global south. Until now,
the discussions have largely been about national or regional interests. By way of example, last year's
summit in Paris was squarely focused on European leaders establishing the need to invest and compete in the
AI race. This year's event was a shift towards recognizing the need to treat AI as a global public
good. The other big theme of the summit was India itself declaring their ambition to become a global
AI power. The event featured huge investment commitments from Adani and Reliance Industries,
who will each spend more than $100 billion on local data centers over the coming decade.
The Indian government also earmarked a $1.1 billion fund to the efforts. Aside from global leaders,
the summit also saw tech leaders fly in, including Google CEO Sundar Pichai, DeepMind CEO Demis
Hassabis, and Mistral CEO Arthur Mensch.
Slightly overshadowing other things going on at the event, Bill Gates canceled his keynote
because of continued scrutiny over his appearance in the Epstein files.
And yet still, with all of that, all eyes were on Sam Altman and Dario Amodei.
Specifically on one moment where more than a dozen tech leaders joined Prime Minister Modi
on stage.
The leaders joined hands and raised their arms in celebration, save for Altman and Amodei,
who refused to hold hands.
Beff Jezos broke down the tape and determined that Dario had been the one to refuse to hold
Altman's hand, but regardless of who instigated, the moment reflected just how bitter the rivalry has
become. While the two were on stage, a chart from Epoch AI went viral, suggesting that Anthropic
is on pace to overtake OpenAI in revenue terms by the middle of this year. So with that bombastic
framing established, the two AI rivals took to the stage and delivered vastly contrasting speeches.
Dario, it must be said, ummed and ahhed his way through a generic and well-trodden narrative read
from an iPhone screen. He said nothing he hadn't said before, and many people
commented on just how bad it looked for him to be reading off his iPhone.
Wrote terminally online engineer on X,
The aura loss is crazy. I take back everything good I said about Anthropic.
Altman was more eloquent, discussing how the fundamental uncertainty of AI
interacts with global issues of democracy, social contracts, and job loss.
His major call to action was for global leaders to continue iterative deployment
and allow people to access each successive layer of the technology as it unfolded.
Offstage in an interview with CNBC, Altman expressed skepticism over the
present fear of AI job loss, remarking, I don't know what the exact percentage is, but there's some
AI washing where people are blaming AI for layoffs that they would otherwise do, and then there's
some real displacement by AI of different kinds of jobs. Now, it is difficult for me to take very
seriously these global talk fests. I guess theoretically sometimes genuine action arises from them,
but mostly the model is that world leaders arrive, exchange platitudes about the state of the world,
and then return to doing exactly what they were already doing. It's about the silly photo op of the
arms up of all these people, which was incredibly awkward and weird even if there hadn't been
the scuffle between Sam and Dario.
Shawn Wang, aka swyx, really nailed it in a post he called,
Why do AI conferences keep not getting AI?
He wrote, I feel for my brothers and sisters in India.
This was their big moment on a global stage and perhaps an inflection point for one and a half
billion people who will have to figure out their place in the new AI-shaped economy,
and yet the powers that be decisively demonstrated that nothing will change.
They care more about bad photo ops and hobnobbing
with celebrities than they care about the builders that are supposed to drive the Indian AI economy forward.
Ultimately, I think the less time you spend caring about what's said at events like this,
and the more time you spend on building things, the better off you're going to be.
Still, we had a huge portion of the big tech AI leaders and a number of sovereign leaders as well,
so we couldn't let it pass completely undiscussed.
Next up, we shift over to the business world, where Walmart is turning to AI as their next big growth
driver after a soft earnings result.
The past quarter has been a mixed bag for Walmart.
They briefly achieved the milestone of becoming a trillion-dollar company.
However, they also lost the crown as the world's largest company by revenue to Amazon after 17 years on top.
This week's earnings report guided lower earnings and revenue growth for the coming year,
reflecting the shaky position of the consumer economy.
And yet, in spite of, or perhaps because of that,
the earnings call focused heavily on Walmart's AI transformation strategy.
Newly installed CEO John Furner said,
The way we're using technology and AI is helping us create great customer solutions,
reduce friction, simplify decision-making, and pinpoint where our inventory is, all while maintaining
the trust we've earned from our customers and members. Now, Walmart has, of course, been rolling out
AI into every corner of their business over the past couple of years. Furner flagged that their
shopping assistant, Sparky, has shown early promise and will become core to their strategy moving
forward. He reported that around half of Walmart's online customers have used Sparky, and that
those using the assistant ordered 35% more than those who didn't. Walmart U.S. CEO and President David
Guggina noted that AI is driving a complete transformation in the way that Walmart thinks
about their business. He said, Sparky is essentially helping us evolve from traditional search to
intent-driven commerce. From an economic standpoint, better discovery and higher conversion
translates into bigger baskets and greater frequency. Sparky is helping customers find the things
they need, they want, and they love, and it's strengthening our digital unit economics as it scales.
Next up, moving over to the company that dethroned Walmart from the top of the Fortune 500,
Amazon is keeping a close eye on AI adoption with new metrics in their employee tracking system.
The Information reports that Amazon has been using an internal system called Clarity
to measure various elements of AI tool use within the company.
The system, which is also used to measure other elements of employee performance,
is now being used to track overall AI usage by teams,
as well as which tools are seeing the most use.
The monitoring doesn't just include Amazon's in-house tools,
but also external AI products that staff are encouraged to use.
The tracking goes well beyond software engineering and standard white-collar functions,
with Amazon also keeping tabs on how the company's supply chain optimization team is making use of AI.
While Amazon has maintained that AI was not the direct cause of their massive recent layoffs,
the framing of the assessment certainly implies a push to realize AI productivity gains.
Employees are asked how they have, quote, accomplished more with less,
and for specific examples where they have remained innovative, force-multiplied using AI,
and delivered results while reducing or not growing headcount.
Moving over to the big consulting world, Accenture is laying down the law when it comes to
AI use in the workplace, telling senior managers that no AI, no promotion.
The consulting giant has begun collecting data on how some senior employees use AI tools
and explicitly tied the metrics to career progression.
According to an email viewed by the Financial Times, Accenture has told staff that
promotion to leadership roles will require regular adoption of AI.
You might remember that Accenture embarked last year on one of the more ambitious AI
upskilling projects. At the time, CEO Julie Sweet said that staff who failed to adopt
AI workflows would be, quote, exited from the company.
This week's email reinforced that initial training is now over and use of AI
is a fundamental requirement of the job. It stated, use of our key tools will be a visible
input to talent discussions during the summer promotion cycle. In their story about this,
Financial Times noted that AI holdouts are becoming a major problem across the consulting industry.
Three executives at Big Four accounting and consulting firms said that convincing senior managers
and partners to use AI has been a much more difficult task than introducing the tools to junior
staff. One executive said that older, more senior figures at the firms are more set in their ways,
requiring a carrot-and-stick approach. It'll be interesting to
see how much internal resistance they find. One person familiar with the policy change said they
would, quote, quit immediately if it affected them, while another source criticized the quality of the
tools deployed at Accenture, describing them as broken slop generators. In a press statement,
Accenture explained the need to keep pushing, commenting, our strategy is to be the reinvention
partner of choice for our clients and to be the most client-focused AI-enabled great place to work.
That requires the adoption of the latest tools and technologies to serve our clients most effectively.
And to understand why, you need only glance at Accenture's share price.
The stock is down 17% year-to-date and 45% over the past year.
Now, this is pretty interesting to me as a bellwether of where corporations might go.
I think Hedgie, at HedgieMarkets on X, probably sums up the feeling of a lot of folks
when he writes,
if these tools were actually useful, people will just use them.
You don't need to track logins and tie them to promotions.
The fact that companies are resorting to this tells me adoption isn't happening organically,
which raises questions about whether the tools are delivering value,
or just generating metrics for leadership to point at.
I don't think this is necessarily a super cynical take, but I do think it's wrong.
The biggest issue that we find across all of our surveys at AI Daily Brief, as well as everything
we do at Superintelligent is the problem of time.
People inside enterprises report that they don't have time to learn the technology that
would save them time.
And unfortunately, the vast majority of companies we interact with don't create specific time
carveouts for their people to learn how to use these tools.
They simply expect people to figure out that time on their own.
That creates a situation where people feel negatively about these tools because they're just another layer of stuff that they have to do,
which creates the need for mandates like this.
Now, to the extent we're talking about tool quality, I do think that in many corporations there is an issue of the tools that are approved for work,
being pretty far behind what people have access to in their personal lives.
Probably the second most frequent complaint we see outside of the I don't have time to learn this stuff,
is at home I'm using Opus 4.6, and at work I have a terrible old version of Copilot.
In any case, I do think that to some extent Accenture is an extreme
example of this because of the point that they're making that if they are in the business of bringing
this new technology to their people, they really kind of need to know about it. But I wouldn't be
surprised to see more mandates like this in the months and year to come. For now though, that is going
to do it for today's AI Daily Brief Headlines edition. Next up, the main episode.
Agentic AI is powering a $3 trillion productivity revolution, and leaders are hitting a real
decision point. Do you build your own AI agents, buy off the shelf, or borrow by partnering to
scale faster. KPMG's latest thought leadership paper, Agentic AI Untangled, navigating the build,
buy or borrow decision, does a great job cutting through the noise with a practical framework to help
you choose based on value, risk, and readiness. And how to scale agents with the right trust,
governance, and orchestration foundation. Don't lock in the wrong model. You can download the paper right now
at www.kpmg.us slash navigate. Again, that's www.kpmg.us slash navigate.
As a consultant, responding to proposals can often feel like playing tennis against a wall.
You're serving against yourself, trying to guess what the client really wants.
That all changes with Insight-Wise.
Now you've got an AI Proposals engine that thinks just like your client.
It returns to the brief time and time again, picking apart your work,
identifying key evaluation criteria and win themes,
and making recommendations to ensure you stand out.
Suddenly, you're on center court.
But this time, you've got a secret weapon.
Insight-wise gets rid of all the time-consuming manual work
so you can focus on winning more business more often.
Generate reports, pull insights from your own data,
build competitive advantage, and go to sleep before 2 a.m.
When it comes to proposals, you only get one shot.
With Insight-Wise, make yours an ace.
Today's episode is brought to you by my company Superintelligent.
In 2026, one of the key themes in Enterprise AI,
if not the key theme,
is going to be how good is the infrastructure
into which you are putting AI and agents.
Superintelligent's agent readiness audits
are specifically designed to help you figure out,
one, where and how AI and agents
can maximize business impact for you,
and two, what you need to do
to set up your organization
to be best able to leverage those new gains.
If you want to truly take advantage
of how AI and agents
can not only enhance productivity,
but actually fundamentally change outcomes
in measurable ways in your business this year,
go to besuper.ai.
Blitzy is driving over 5x engineering velocity for large-scale enterprises.
A publicly traded insurance provider leveraged Blitzy to build a bespoke payments processing application,
an estimated 13-month project, and with Blitzy, the application was completed and live in production in six weeks.
A publicly traded vertical SaaS provider used Blitzy to extract services from a 500,000 line monolith,
without disrupting production, 21 times faster than their pre-Blitzy estimates.
These aren't experiments.
This is how the world's most innovative enterprises are shipping software in 20,
You can hear directly about Blitzy from other Fortune 500 CTOs on the modern CTO or
CIO classified podcasts.
To learn more about how Blitzy can impact your SDLC, book a meeting with an AI Solutions
consultant at blitzy.com. That's BLITZY.com.
Welcome back to the AI Daily Brief. Today we are talking about Gemini 3.1 Pro.
But I want to situate it in a larger question. And I will start by saying, sorry to Google
for drawing the short end of the episode naming straw on this one. If it had been OpenAI
that released 5.3, it would have been something very similar.
The context we now all operate in
is one where instead of getting big model releases
infrequently, we get very incremental model releases
much more frequently. There is in fact this meme which
came from 2025, but which is more true than ever,
which is a circular chart that starts with OpenAI,
introducing the world's most powerful model, that moves to
Grok, introducing the world's most powerful model, that moves to
Gemini, introducing the world's most powerful model, that moves to
Anthropic, introducing the world's most powerful model, that moves to
OpenAI introducing the world's most powerful model, and so on. In that, with the release of 3.1
Pro, we are now at the Gemini section of that chart. And the point of course is that at this stage,
state-of-the-art in terms of incremental gains on benchmarks, feels less significant as a barometer
of a model's importance than it ever has before. When people say, what is the best model,
it is not only constantly shifting, but also, I think, in practice, a question that is
use case-dependent. So let's talk about Gemini 3.1 Pro,
the first reactions, both good and bad, and then try to figure out where it fits in the ecosystem
of models.
Now, it is worth pointing out that I think Gemini was absolutely due for a bit of an upgrade.
The conversation for pretty much all of 2026 and really heading back into the end of 2025
has been dominated by Anthropic v. OpenAI, or more specifically, Codex versus Claude
Code.
Despite Gemini 3 having such wide acclaim when it came out towards the end of last year,
Google and Gemini have been really nowhere in the conversation when it comes to this
incredibly important use case of coding. Now, it is worth noting that there are lots of different
categories of AI users, and it is not the case that for all of them, coding is what matters.
It would be completely reasonable, in other words, for Google to put its priority in other areas.
However, it certainly doesn't seem like Google is explicitly not trying to compete in that
area. They're clearly investing a lot in Google AI Studio and Antigravity, but when it comes at
least to the most enfranchised subset of users, they were, at least in our recent survey
results, a distinct third. All of the big models, Claude, ChatGPT, and Gemini, had some broad
usage in our January AI usage survey. Gemini, in fact, matched Claude with 80% of respondents having
used it sometime last month, both falling slightly behind ChatGPT, which was at 87%. However, in terms
of the number of people reporting that it was their primary model, Gemini was down in third at 16.1%.
And at first blush, there is a lot to be impressed with in Gemini 3.1 Pro. Going by the benchmarks,
it is a distinct number one when it comes to Humanity's Last Exam, not using tools,
sets a new high for the GPQA Diamond Scientific Knowledge benchmark,
sees a big jump up for Gemini on Terminal Bench 2.0,
coming in ahead of Opus 4.6.
And while it wasn't ahead of Opus 4.6 on the SWE-bench Verified agentic coding test,
it was nipping at its heels, 80.6% compared to 80.8%.
The biggest jump, and the one that a lot of folks are talking about, was on ARC-AGI-2.
While Opus 4.6 scored a 68.8% on that test,
the jump between Gemini 3 Pro and Gemini 3.1 Pro was from a 31.1%
with Gemini 3 to 77.1% on Gemini 3.1 Pro.
Google CEO Sundar Pichai says,
Gemini 3.1 Pro is great for super complex tasks like visualizing difficult concepts,
synthesizing data into a single view, or bringing creative projects to life.
Demis Hassabis points to major improvements in core reasoning and problem solving.
Google VP Josh Woodward calls out who they want the model to appeal to, writing,
to the scientists, the engineer, and the developer.
Gemini 3.1 Pro has arrived.
It's a significant leap in complex reasoning.
Once again, he points to ARC-AGI-2, so it's great at agentic tasks, intricate coding,
and data synthesis projects.
You should see fewer errors, better logic, and surprisingly good SVGs.
Attached to the post is an animated image of a seal bouncing a beach ball on its nose.
So what are the first impressions?
The model is still rolling out and it's only available in certain pockets of the Google ecosystem,
which, by the way, is its own challenge that people like Ethan Mollick have pointed out,
that the Google ecosystem of AI is so diverse that it's sometimes hard to wrap your head around
what model lives where, but among those who have tried it, a lot of the responses are pretty positive.
AI developer Eric Hartford wrote, loving Gemini 3.1 Pro. It made three huge improvements to my
compiler and saw things that even ChatGPT 5.2 Pro extended and Claude Opus 4.6 extended
couldn't see. Designer and entrepreneur
Meng To writes, Gemini 3.1
Pro is an absolute beast for creating landing pages.
It understands design details
and animation so well. Insane
upgrade for web designers.
And then of course there's ARC-AGI-2,
where it came in at a 77.1%,
but that might not even be the most impressive thing.
The Arc leaderboard measures
not only the score but the cost per task.
So for example, although Gemini 3 Deep Think,
which was released last week,
got a higher overall score,
it did so at more than 10 times
the cost. 3.1 Pro achieved that score at less than a buck a task. On Artificial Analysis's
overall intelligence index, Google jumped all the way from the sixth spot behind various versions of
Claude, GPT, and even a Chinese model, GLM-5, all the way up to number one. What's more,
Artificial Analysis points out that it's doing so at a more efficient cost. They write,
Google is once again the leader in AI. Gemini 3.1 Pro preview leads the Artificial Analysis
intelligence index, four points ahead of Claude Opus 4.6, while costing less than half as much to run.
They said that on their tests, it led six of the ten evaluations that make up the index,
with the biggest gains in reasoning and knowledge, coding, and hallucination reduction.
They also point out that it does so with some serious token efficiency.
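As a rough illustration of why token efficiency matters here, cost per task is just tokens used multiplied by price per token, so a model that is both cheaper per token and uses fewer tokens compounds its advantage. The sketch below is a minimal example with made-up numbers: the function name, the token counts, and every price in it are assumptions for illustration, not figures reported by Artificial Analysis or any lab.

```python
def cost_per_task(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of one task, given token counts and prices per million tokens."""
    input_cost = input_tokens * input_price_per_m
    output_cost = output_tokens * output_price_per_m
    return (input_cost + output_cost) / 1_000_000


# Hypothetical model A: lower per-token prices and fewer output tokens per task.
model_a = cost_per_task(20_000, 8_000, input_price_per_m=2.0, output_price_per_m=12.0)

# Hypothetical model B: higher per-token prices and twice the output tokens.
model_b = cost_per_task(20_000, 16_000, input_price_per_m=5.0, output_price_per_m=25.0)

print(f"model A: ${model_a:.3f} per task")
print(f"model B: ${model_b:.3f} per task")
print(f"A costs {model_a / model_b:.0%} of B")
```

With these invented inputs, neither the price gap nor the token gap alone explains the difference; it is the product of the two that drives cost per task.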
They write that its processing efficiency combined with lower per token pricing
means that 3.1 Pro Preview costs less than half as much as Opus 4.6 max to run,
although it still is nearly twice as much as the leading open weights model, which is that
GLM-5 that I mentioned. In terms of specific tests, they found that Gemini 3.1 Pro
led their coding index, achieving the highest score on both Terminal Bench Hard and SciCode,
but that one area where they were kind of lacking was on real-world agentic performance.
This is around that GDPval test, which we've talked about before, which is an agentic
evaluation that focuses on real-world tasks. While Gemini 3.1 Pro did jump up meaningfully from
Gemini 3 Pro, it was behind Sonnet 4.6, Opus 4.6, GPT 5.2, and GLM-5. That was something that a number of
skeptical commentators focused on. Scaling01 on Twitter writes, Gemini 3.1 Pro's GDPval scores are
concerning. Simon Smith points out that maybe that suggests that work tasks aren't Google's focus.
Indeed, he even goes so far as to speculate, they have a stake in Anthropic, so maybe they're okay
with that. When it comes to coding outside of that one example that I've mentioned already,
I'm just not seeing enough feedback yet to really know.
Some people had trouble actually finding the model or getting it to work inside Antigravity
or Gemini CLI, although when they did, as reported by Mat Velloso, they had, quote,
awesome results so far.
Akash Gupta gets at what I think is likely to become a more discussed aspect of this,
which is the cost performance frontier.
He writes,
The best AI model crown now rotates on a weekly basis,
with each lab holding a different column of the same spreadsheet.
The real number in this release is the 96 cents per task on
ARC-AGI-2. Google went from 31.1% to 77.1% in three months while keeping pricing at $2.00 per million
input tokens. The same pricing as Gemini 3 Pro. They doubled the intelligence and charged
zero incremental cost. That's the game now. The frontier is commoditizing so fast that benchmark
leadership lasts weeks, not quarters. OpenAI, Anthropic, and Google are all within single-digit
percentage points of each other on most evals. The three labs are converging on comparable
intelligence, but diverging on distribution. Google has 2 billion Chrome users, Android, Workspace,
and cloud. That's the real moat in this chart, not the 77.1%. Whoever makes intelligence
ambient and cheap wins. And this benchmark table with its patchwork of leaders across every column
is the clearest sign yet that raw capability is table stakes. I think there is a lot of truth in that,
and so one of the reasons why, yes, Gemini 3.1 Pro does matter, is that it's pushing on the cost
frontier, not just the performance frontier. Now, the other thing about Gemini is that it's very clear
that the productization of its multimodal capabilities is something that really matters to Google.
Alongside the new model update, Google Labs announced a new feature for their Pomelli app called
Photoshoot. They write, with Photoshoot, you can start from a single image of your product
and easily create high-quality customized product shots to elevate your marketing. That tweet went wildly
viral. In fact, whereas CEO Sundar Pichai's tweet announcing 3.1 had around 1 million views,
the Google Labs tweet announcing Photoshoot has 12.2 million views at the time of recording.
Google Labs product director Jaclyn Konzelmann wrote, clearly this hit a nerve. Turns out a lot of
people have been waiting for a way to get professional product photos but didn't have the time or
resources to make it happen. Now they can. Go try it. It's free. When folks like a16z partner
Justine Moore tried it, they also came away impressed.
Another example of Gemini flexing its multimodal bona fides came with a partner announcement
from Replit when they introduced Replit Animation.
It is exactly what it sounds like, a tool to vibe code infographic videos, powered, they say,
by Gemini 3.1 Pro.
Replit CEO Amjad Masad wrote,
vibe coding as a term is a bit tragic because it implies you're merely making software,
but you can really make anything.
We've been having a lot of fun making videos with Replit animation,
the kind I used to pay thousands of dollars for when we needed to do a launch video.
Also, if you dig around enough, you can see the types of things that people are using Gemini 3.1 Pro for
are just a little bit different than the other tools.
Sure, there's a bunch of weird Pelican SVG tests, but you also have examples like this one
from Daniel Z who writes,
Gemini 3.1 Pro vibe coded a double wishbone suspension.
Independent double wishbone design, dynamic coilover shock absorber, vented disc brakes with
performance caliper, real-time kinematic travel and steering simulation.
AI isn't just generating visuals anymore.
Demis Hassabis shared an official example from the Google DeepMind account, where they used
3.1 Pro to build a realistic city planner app that has complex terrain, infrastructure mapping,
and even simulates traffic.
Google DeepMind chief scientist Jeff Dean shared an example of 3.1 Pro doing heat transfer analysis
based on a CAD file and material properties, and then turning that heat transfer analysis
at different times into a visual representation.
Overall, I agree on the surface with Latent Space when they wrote,
it's getting a little hard to say interesting things with all the round robin minor version updates
at frontier models every week. Gemini 3.1 Pro seems like a decent enough advance to catch up and in some
cases supersede the fellow frontier models. It's better at some SVG design things and translating
textual vibes to visual aesthetics, but that's kind of all they had to say. I think though, coming back to
this question of why 3.1 Pro matters or why any new model release matters, the point that I was trying to make
at the beginning is that it's not just about state-of-the-art on the benchmarks. That is, as
Akash pointed out, table stakes. What's important is to try to understand what it does uniquely well.
It's very clear, when you actually dig deep that Gemini is flexing its multimodal capabilities
in a full spectrum of ways, from being able to do much more technically and scientifically advanced
work, to being at the core of products that aren't possible with the other models.
Now, that doesn't necessarily mean for Google that they can get away without competing on core
use cases like coding, but part of the reason, I think, that we found that even though it was the primary
model for just 16.1%, still a full 80% of people had used Gemini in the previous month, is that there
are just some use cases that it is ideally suited for. It is very clear that as we head deeper into the
AI and agent age, the greatest gains will not come from just shifting wholesale from one model to the next
as new capabilities emerge, but instead from understanding with each model release what that particular
model is going to do best and where it should be in your model portfolio. I'm excited to dig into
3.1 Pro and I'm sure I will have more to report in the week to come. For now, though,
that is going to do it for today's AI Daily Brief. Appreciate you listening or watching and until next time, peace.
