Everyday AI Podcast – An AI and ChatGPT Podcast - EP 504: Has Anthropic’s Claude lost its edge? What happened & can Claude recover?
Episode Date: April 15, 2025

Has Claude by Anthropic lost its edge in AI? Discover why OpenAI and Google are pulling ahead and what it means for AI dominance. Explore if Claude can stage a comeback and what's next for AI innovation. Tune in for insights and analysis.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Thoughts on this? Join the convo.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
Anthropic's Claude Losing Market Edge
Comparison of AI Innovators: Anthropic vs. Google
Claude 3.7 and Industry Relevance
OpenAI and Google AI Advancements
Enterprise Hesitation with Anthropic's Claude
Benchmark Performance: Anthropic vs. Competition
Claude User Experience and Rate Limits
Future Prospects for Anthropic and Claude

Timestamps:
00:00 Anthropic's Decline in AI Race
02:20 Daily AI News
10:00 Has Anthropic Claude Lost Its Edge?
17:05 Claude's Restrictive Approach Missteps
20:50 Anthropic's Delayed Feature Rollout
22:17 Claude's Profitability vs. User Growth
27:42 Can Anthropic Stay Competitive?
30:02 Anthropic's Struggle Against Gemini Models
32:57 "Claude's Ranking Below Top 10"
36:08 Coding Model Rankings: OpenAI Tops
40:46 "Claude's Decline Amidst AI Advancements"
44:00 "Frustration Over Usage Limitations"
48:33 Anthropic Lags Behind Google/OpenAI
49:10 "Stay Connected & Engaged"

Keywords:
Anthropic, Claude, Claude 3.7, Claude's edge, AI model, Large Language Model, AI frontier labs, Google, OpenAI, Microsoft, Apple, Synthetic data, Differential privacy, Tariff policies, NVIDIA, U.S. AI manufacturing, Taiwan Semiconductor, Foxconn, Wistron, Digital twins, Advanced robotics, GPT 4.1, Million token context window, API developers, Context processing, Claude's competitiveness, Internet access, Third-party integrations, Rate limits, Claude artifacts, Claude projects, Computer use, Power users, AI safety, Claude Max, Rate limit sessions, Claude recovery, AI coding index, Coding performance, Price comparison, Benchmark performance, Intelligence index, Human preference, Elo score, Web traffic, Market competition, Enterprise challenges.

Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)

Start Here ▶️
Not sure where to start when it comes to AI? Start with our Start Here Series. You can listen to the first drop -- Episode 691 -- or get free access to our Inner Circle community and all episodes: StartHereSeries.com
Also, here's a link to the entire series on a Spotify playlist.
Transcript
This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
I would say for the better part of two years,
the large language model race was three teams.
You had Anthropic, OpenAI, and Google racing for the lead,
going back and forth, jab for jab, as the best AI model maker in the land.
Obviously, you know, Microsoft's in there, but they're more of a system that uses other technology.
But when it came to actual AI frontier labs, it's always been a three-team race.
I don't know if it's like that anymore.
I think right now, OpenAI and Google are so far ahead of everyone else.
And I'm left wondering, what happened to Anthropic?
What happened to Claude?
Is it still a top tier large language model, or has Claude completely lost its edge, and can they ever catch up with Google and OpenAI?
All right.
We're going to be talking about that and a lot more on Everyday AI.
What's going on, y'all?
My name's Jordan Wilson and I'm the host of Everyday AI.
This thing, it's yours.
It's your daily live stream podcast and free daily newsletter, helping us all not just learn AI but how we can leverage it to grow our careers.
Because you can try to keep up with AI news and developments and new large language model updates, you can try to keep up, but just hearing about them, reading about them, doesn't do anything.
You need to leverage it, and that is what our website is all about: YourEverydayAI.com.
So there we recap each and every day's podcast episodes. Sometimes I have guests on, sometimes it's just myself.
So we bring you exclusive insights every single day. We're actually the only AI newsletter that does that, as well as we keep you up with everything else happening in the world of AI.
So you can be the smartest person in your company or your department when it comes to generative AI.
All right, let's actually do that and go over a quick recap of what's happening in AI news for April 15th.
So Apple is responding to criticism over its AI performance, particularly in areas like notification summaries, with a timely pivot toward synthetic data and differential privacy.
So yeah, Apple kind of responding, according to reports, by focusing a little bit more on
synthetic data, right?
So, according to the report, the company is now generating synthetic
data to emulate user information without using real content, enabling private testing
on the devices of users who opt into device analytics.
So this approach ensures accuracy while safeguarding privacy.
So yeah, Apple obviously has had a super, super slow rollout.
And by super slow rollout, they're years behind everyone else.
And their Apple Intelligence, let's just say it has not been well received.
So some new reports and information are showing Apple's kind of new or updated approach by using synthetic data, kind of tying it to those who sign up for kind of this device analytics.
So by polling devices with synthetic data comparisons, Apple is hoping to enhance its Apple Intelligence with better email summaries and other functions, signaling a broader commitment to addressing user concerns and advancing its AI capabilities responsibly.
Our next piece of AI news,
Nvidia has committed $500 billion to U.S. AI manufacturing
amid changing tariff policies here in the U.S.
So, Nvidia announced plans to invest up to $500 billion in AI infrastructure manufacturing
within the U.S. over the next four years, marking a significant shift in its supply chain
strategy to meet surging demand for AI chips and supercomputers.
So the move coincides with U.S. President Trump's ever-changing tariff policies,
which initially imposed steep levies on imports from Taiwan and China,
but recently exempted chips and other tech products,
easing concerns for companies like Nvidia and Apple
that rely heavily on overseas production.
So, Nvidia will partner with Taiwan Semiconductor for chip production
and with Foxconn and Wistron in Texas
for supercomputer manufacturing, aiming to achieve mass production at these facilities
within 12 to 15 months.
So by using digital twins of factories and advanced robotics for automation,
Nvidia hopes and plans to streamline operations and enhance efficiency in its U.S.-based
facilities demonstrating how AI technology can transform the manufacturing process.
So yeah, if you're wondering like, okay, what the heck does this matter?
Well, so many big companies and all the AI systems that we use, like ChatGPT, Google, Microsoft, everyone else, they're struggling to keep up with demand, right?
So essentially, everyone's looking for more compute. This is a pretty big move from Nvidia to bring more kind of AI power to the U.S.
And then last, but definitely not least, OpenAI has launched a new family of models with the GPT 4.1 series.
Probably the big headliner there is it now has a million token context window, but right now at least, it is only available on the API end.
So only for developers right now.
So OpenAI has launched its new family of models, GPT 4.1, as a major upgrade to its previous models, offering advancements in context processing, reliability, and cost efficiency.
But like I said, you're not going to find it.
If you go to chatgpt.com, it's not there.
At least right now, OpenAI did not announce any plans for it to live on the front end inside ChatGPT, and it is only available for developers on the back end.
But let's talk a little bit about the model
because some pretty, pretty impressive specs here.
So GPT 4.1 introduces a 1 million token context window
far surpassing GPT-4o's previous top on the API end,
which was 128,000.
So that's big.
So, you know, Claude and Gemini and others were really beating OpenAI
historically in context window, right, but not anymore.
So pretty big news there.
And then unlike previous models integrated into ChatGPT, like I said, GPT 4.1 is exclusively available through OpenAI's API, making it a tool tailored for developers rather than general use.
The performance is pretty impressive across coding, instruction following, and complex reasoning tasks.
What's also important is OpenAI has said some of those improvements have also been rolled out kind of under the hood to its GPT-4o model.
I would assume it was the late March update that there wasn't a lot of updates about.
And there are now three new varieties.
So there is GPT 4.1, kind of the full version.
GPT 4.1 mini, which is more affordable and compact.
And then GPT 4.1 nano.
Yeah, the first time, you know, OpenAI has gone with nano, and that is their smallest, fastest, and cheapest model.
Yeah. As if it's not hard enough to already understand these models, now we have two varieties of small ones.
Yeah, if you thought mini was small, no, now apparently mini is medium and nano is small.
And then some sad news, you know, if you like some of these old models: OpenAI is planning to phase out older models like the OG GPT 4 by April 30th.
And then also somewhat surprisingly, OpenAI announced they'd be phasing out GPT 4.5 preview by July 14th to focus on the more efficient 4.1 lineup.
Also, this release coincides with a delay in GPT-5's launch, now expected in a few months as OpenAI navigates some integration challenges.
And yeah, so FYI.
Obviously, OpenAI has changed course a couple of times.
They essentially said, hey, we're going to stop releasing non-reasoning models.
And GPT-5 is going to be more of a hierarchy or a system.
So they said, yeah, we're not going to be releasing a lot of new models before GPT-5.
And here we are.
So, all right, let's get into it.
A lot more on those stories on our website at YourEverydayAI.com.
What's up, livestream crew?
Yeah, if you listen on the podcast, come join us sometime on a live stream.
You know, when I have guests, we take questions.
Sometimes I ask you all things.
So thanks to everyone for joining in.
George from YouTube, Big Bogey saying GPT 4.1 is a coding powerhouse.
Yeah, it is already, in early benchmarks.
Trade printing here from YouTube.
Thanks for joining us on the LinkedIn, Kimberly and Dennis, Allison.
Thank you all for tuning in.
But let's just get straight into it.
Has Anthropic's Claude lost its edge?
It's Tuesday, y'all.
I'm going to take a sip of my coffee.
And let me know.
Should I crank this up?
It's been a while since I really brought it on a hot take Tuesday.
I'm a little tired, but live stream audience if you could,
leave me an emoji or two.
Should I be one fire emoji? Should I be kind of nice?
Two fire emojis? Should I bring the heat?
Or three fire emojis? Burn, baby, burn.
I mean, I don't know.
One thing, and let me tell you this,
I tell you all the truth.
I do, period, right?
As an example,
if you would have asked me 18 to 20 months ago,
hey, Jordan, what are your thoughts on Google Gemini?
I'd say, eh, don't use it.
Ask me today, Google Gemini is the king of the hill, right?
I do think it is Google and OpenAI now going jab for jab.
But I tell you the truth.
Right.
So I'm not going to hold back if you all want a little bit, a little bit of fire.
All right.
Rolando here is saying to crank it up.
Fred, all right, Fred, Fred, thank you.
Fred's like, all right, Jordan, be nice today.
He wants me to be kind of nice.
Allison here, throwing in some dynamite.
That's dangerous.
All right.
All right.
We'll see.
We'll see.
I don't want to offend anyone.
Because let me say this.
Let me say this.
Claude is still one of the most impressive pieces of AI technology ever created.
All right.
Period.
So I don't want to overlook that.
But what I've found is I've been using Claude less and less.
I would say probably nine months ago,
Claude probably accounted for about 25% of my usage.
It's probably down to about 5% now.
I'm finding it hard to find actual use cases for Claude.
And I'm talking about on the front end, y'all.
So I'm not talking about on the back end.
I know that Claude 3.5 has historically been, you know, one of the most used models if you look on, like, OpenRouter.
I know that Claude 3.7 is still popular for developers, although it's not the most popular anymore.
It's not the most popular anymore with Gemini 2.0 Flash and Gemini 2.5 Pro.
It's really not.
But this has been a long time coming.
So back in September, back in September,
if you want to go listen to this,
what episode was this here?
351.
All right.
So I told you all back in September, three reasons businesses shouldn't use Anthropic's Claude yet.
And this was after like a year, right?
This was a year of me being hesitant.
So what a lot of people don't know, people are like, okay, Jordan's just some random guy that, you know, jumps on a podcast and talks about AI.
Well, yes, on the surface.
Right.
On the other end, I do a lot of things that you all don't see on this show.
I consult big companies, companies with tens of thousands of employees.
I work with research organizations.
They reach out to me, big ones, big-name ones.
And they're like, hey, Jordan, can you help us better understand generative AI?
So it's much more than, you know, this little podcast.
Although thank you all for listening, you know,
and making everyday AI a top 10 tech podcast in the U.S.
But I'm talking with a lot of businesses, a lot of things that you don't hear.
And it's not just me.
Big enterprise companies have always been hesitant to use Claude at scale.
All right.
And it's been a long time coming.
I even said three big reasons.
This was back in September that I said,
Claude was in trouble.
And enterprises shouldn't be using it yet.
Number one, there is no enterprise access.
So I'm talking about on the front end, right?
So keep that in mind.
Everyday AI, it's for largely non-technical people.
Right.
And I'm talking about logging on to, you know, claude.ai.
Or I'm talking about logging on to Gemini.com, chatgpt.com.
Right.
Using this on the front end with your team.
One thing I'm a huge advocate for, if you listen to the show, is having your AI OS, right?
Your AI operating system.
Your team needs one, right?
In addition to whatever your company may be doing on the back end, you need a
front-end AI operating system where you and your team collaborate to get work done.
No internet access.
Claude went the first two years without having internet access.
They just rolled out internet access about a month ago.
Okay.
Very limited third-party integrations, all right?
Google technically on the front end has not a lot of third-party integrations,
but because they're Google, right?
because they have, you know,
anything that could be a third-party integration,
they essentially have in-house, right?
Google has like a trillion of their own products, right?
Extremely limited third-party integrations.
It's improved since September,
since I had this show, episode 351.
And then I said extremely restrictive testing tiers.
So both free and paid,
I'd say the one biggest thing that's been knocking Claude off from real business adoption is you can't even go and test it.
Like, if you have a paid, even a paid plan of Claude, right?
And you're like, all right, you know, let's go test this.
Let's see if this is right for our business.
You know, you're paying the $25 a month or whatever.
There's been, and I'm not exaggerating, hundreds of cases because I use large language models.
I mean, it varies.
I don't know, anywhere from four to 12 hours.
Recently, it's been a lot of 12-hour days using large language models, right?
It's so easy on a paid plan to hit your rate limits on Claude,
I kid you not, within 10 minutes.
It's happened to me hundreds of times where I will hit on a paid plan,
the rate limit within 10 minutes.
Yes, I'm generally working in multiple tabs.
If I'm in Claude, I'm working with long context windows, yes.
I can't tell you the last time I've hit ChatGPT limits, right?
Doesn't happen.
Gemini, doesn't happen.
Claude has been extremely restrictive.
And I think that was a major misstep early on.
How do you expect, aside from, you know, appealing to your, to your core audience,
which we'll talk about because I think they're losing space there, right, you know, coding,
development, software, engineering, et cetera, right?
How are you going to appeal to the average business owner, to the average enterprise,
use case when, you know, a company pays and maybe they get a team plan.
I think those rates are about double.
But still, you can't even use the thing when you pay for it.
It is extremely restrictive.
All right.
And another reason why I think that Claude has lost its edge.
It's no longer innovating.
Right.
In the early part of 2024, even midway through the year, I still think Claude was an innovator, right?
They came out with artifacts, which when it came out was extremely impressive.
So if you don't know Claude artifacts, it's actually kind of hidden. You have to, like, enable it and then you have to make a call to it, right?
And right now, let me be honest, because I still said Claude is still one of the most impressive pieces of AI technology.
There's still great use cases, right?
Even though, you know, you all wanted the flame emojis, I'm not going to totally poo-poo on Claude.
There's still some use cases.
I said maybe now it's 5 to 10% of, you know, my use.
But the only thing I use Claude for right now is using 3.7 thinking on artifacts.
That's it.
Nothing else.
Because everything else, Claude is not a top model anymore.
In many cases, it's not even a top five or a top 10 model, which sounds crazy to say.
because nine months ago,
they were that,
that tier one,
right?
If we go back to our ranking tiers,
right, like S, A, B, C, right?
They were S.
They've fallen.
How the mighty have fallen.
But that's all I really use it for.
But Claude and Anthropic were innovators, you know, early on.
So the artifacts, so, you know, that's something that can render code in natural language.
You know, you can have it build you a business dashboard, you know, games, whatever, right?
And you can run it in the browser.
And then guess what?
ChatGPT and Gemini said, all right, yeah, let's do this as well.
So they came in with Canvas.
All right.
Similarly, Claude was an innovator with projects,
Anthropic innovated with projects, right?
A good way to, you know, organize your chats,
a good way to leave custom instructions and project knowledge, right?
ChatGPT followed suit.
Computer use, right?
Anthropic was innovating.
Although back in October, when that came out,
it was extremely clunky,
extremely clunky, right?
You know, one of the easier ways to run computer use was,
you know, you had to download Docker.
You had to go to GitHub, you know,
work with their repo, which is fine.
But for non-technical people, not that good.
And the rate limits. I did a live show going over
Claude's computer use. Again, very hard to use with the rate limits.
So I think Claude is no longer innovating.
Now I think they're clone chasing, whereas before others were copying their innovation,
now they're copying other people's innovation.
So yeah, like now a lot of the things that you see and that are going to be rolling out,
like as an example, according to some online sleuths right now,
Claude is testing voice mode and all these things, right?
They're just now seemingly cloning features that were popular six months ago, a year ago.
And one of the reasons, I think, is Anthropic dropped the ball.
Right.
Back in September, when I gave those three reasons, those three different scenarios on why I thought enterprise companies, and why I told countless enterprise companies, don't use Claude for those three reasons.
They didn't address those.
Those were not secrets.
It was no secret that you couldn't,
it was so hard to literally use Claude on the front end.
They knew that, right?
Their, you know, team is interacting with people online on Twitter.
You know, everyone's been complaining about the rate limits and, you know,
Claude's team has been saying, oh, we're working on it for years.
It's too late.
It's too late.
One of the reasons why, right?
I don't know this, but we've heard stories that as an example, Open AI is losing money, right?
Even CEO, Sam Altman said on their new pro $200 a month subscription that they were losing money,
even though it's been extremely popular.
So I don't know.
This is my hunch, but my hunch has been, Claude has been maybe more profitable,
at least by percentages, than maybe their main and closest competitor, ChatGPT.
But at what cost?
Because I don't think they're growing their user base.
Think about it.
You know, I think sometimes, you know, if you are an avid listener to this show, if you're,
you know, an AI nerd like myself, we, I mean, we all live in an echo chamber as well, right?
Outside of our little echo chamber, no one knows about Claude.
Right?
But they could have.
They could have a year ago, if Anthropic would have listened to its customer base a little more closely and continued to innovate and improve the product, improve usability.
I don't think we'd be having the same conversation today.
Always have receipts, y'all.
Always have receipts.
All right, so on my screen, this is January 2025 web traffic.
All right.
No one uses Claude.
Comparatively, no one uses Claude.
They don't.
I know I'm going to catch some flak for that.
And y'all will be like, Jordan, you're a, you know, a ChatGPT fanboy or, you know, jumping on the Gemini bandwagon.
No, I'm not.
I've been using Claude since the day it came out.
I've enjoyed certain features.
I use them all.
You know, I've used dozens of LLMs.
And like I said, hours every single day.
Claude's not good anymore.
It's not.
I got more stats.
I got more receipts.
Don't worry, y'all.
You said you wanted some flame emojis, right?
So let's look at total visits in January 2025, web visits.
ChatGPT.com.
3.9 billion with a B.
Yeah, with a B.
Claude, 76 million.
Gemini, 267 million.
DeepSeek, 277 million.
So, you know, essentially Gemini and DeepSeek are right there with each other
in terms of people visiting the front end.
Perplexity, 99 million.
Y'all, ChatGPT.
Let me do some quick, some quick napkin math here.
ChatGPT has more than 10 times the users of Claude, Gemini, DeepSeek, and Perplexity combined.
Is my math right there? 500.
All right, almost.
Sorry.
My napkin math was a little wrong there.
All right, so we have, that's about 500 million, 600 million, all right.
So about five times.
So ChatGPT has five times more users than Claude, Gemini, DeepSeek, and Perplexity combined.
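Editor's note: the napkin math can be checked directly. This is a minimal sketch that assumes only the January 2025 figures quoted in the episode, which are web visits rather than unique users; the variable names are illustrative. On these numbers the four other sites sum to 719 million visits, a bit above the "500 million, 600 million" said on air, and the ratio works out to roughly 5.4, so "about five times" holds.

```python
# January 2025 web visits as quoted in the episode, in millions.
# These are visits, not unique users.
visits_millions = {
    "ChatGPT": 3900,
    "Claude": 76,
    "Gemini": 267,
    "DeepSeek": 277,
    "Perplexity": 99,
}

# Sum every site except ChatGPT: 76 + 267 + 277 + 99
others_total = sum(v for name, v in visits_millions.items() if name != "ChatGPT")

# How many times larger ChatGPT's traffic is than the other four combined
ratio_vs_others = visits_millions["ChatGPT"] / others_total

# How many times larger ChatGPT's traffic is than Claude's alone
ratio_vs_claude = visits_millions["ChatGPT"] / visits_millions["Claude"]

print(f"Other four combined: {others_total} million visits")
print(f"ChatGPT vs. other four combined: {ratio_vs_others:.1f}x")
print(f"ChatGPT vs. Claude alone: {ratio_vs_claude:.0f}x")
```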
And Claude is in, at least according to kind of online demographic or online website information,
which is pretty accurate, right?
I've been using these different SEO tools for 10 plus years.
They're very accurate.
No one's using Claude.
Hot take, ready?
It's been less than two months since Claude released its latest model in Claude 3.7.
Claude Sonnet 3.7.
And it already feels antiquated.
So they announced it February 24th.
Claude 3.7.
And let me just call this one out, right?
They made a big deal of Claude being, you know, the world's, you know, first hybrid model, right?
So, you know, when you think of old school transformers and then you think of these, you know,
quote unquote new school models that think and reason under the hood, I don't know, to me that
seems like a marketing gimmick from Anthropic, right?
Why?
Well, you have to actually, if you want to use that extra thinking, right, you have to go in
and you have to click the button.
So is it actually a hybrid model?
I don't know.
I'd say not.
So now I think you also have Anthropic falling down this trap that Google fell in in late 2023, where they're getting caught up in the marketing and not listening to their users and shipping new, capable, powerful models.
But Claude 3.7 Sonnet feels antiquated because since that time,
we've already had multiple updates from OpenAI.
We've had multiple updates from Google.
We've even had multiple updates from models that I'd say never use, like DeepSeek, right?
If you care about your privacy, don't use it unless you're, you know, downloading it and fine-tuning it locally, right?
But don't use DeepSeek on the web or their API if you care about your data.
If you are a business, don't do it, especially in the U.S.
All right.
But anyways, how are we at the point now where a model that is not even two months old
feels antiquated?
That's where we're at.
And I don't know if Anthropic can keep up.
Like I said, they're very innovative to begin with.
They're great researchers.
You know, obviously, I think they are a world leader in terms of AI safety, in terms of ethics, right?
All of those things.
But in terms of like, okay, are they just going to be more of a research arm that kind
of drops AI models?
Or are they trying to actually dominate?
Are they actually trying to be relevant?
Are they actually trying to be one of the top large language model makers in the world?
I don't know.
I was personally very underwhelmed with Claude 3.7 Sonnet, their newest model,
even the thinking variation when you have to toggle it on.
I know a lot of people, I was reading, you know, online, you know, a lot of people are using,
are using it inside like George here on YouTube says, you know, he says,
Claude seems lazy in Windsurf and Cursor, but it is not when you use it in an app.
Yeah.
So I know a lot of people, yes, Claude, up until, you know, a week ago when Google said, oh, wait, Claude, you are no longer relevant because we're dropping Gemini 2.5 Pro, which wipes, wipes away all the competitive advantages that Claude 3.5 or Claude 3.7 Sonnet had, right?
Google just said, yeah, we're going to knock you off this pedestal. You're not going to compete.
Google straight up wiped them, which is interesting, right, because Google, you know, has invested, you know, but they're still technically competitors in some regards as well.
Let me tell you what I mean.
And here's my hot take.
I think there's a lot of Twitter talk and hipster hype
when it comes to Claude 3.7 or Claude 3.5.
But I care about business utility.
Anthropic's lost its edge there.
I care about benchmarks.
I care about real human usage.
Claude's not competing there anymore.
And like I said, I think one of the biggest things that's happened, and I would not want to be working at Anthropic right now:
Gemini 2.5 Pro and 2.5 Flash.
I don't know, if I'm being honest, unless Anthropic has been sitting on a world-changing model, I don't know how Anthropic is going to compete against Gemini 2.5 Pro and Gemini 2.5 Flash.
Good luck.
I know, you know, a lot of people have said,
oh, well, there's still, you know, Claude 3.7 Opus, right?
You know, Anthropic's Claude had kind of these three tiers of models.
They have their small, Haiku, their medium, Sonnet, and their big one, Opus, and they haven't updated Opus in a very long time.
So everyone's like, oh, you know, Claude 3.7 Opus or, you know, Claude 4.0 will,
I don't know.
I don't know,
because there's also rumors,
even though Gemini 2.5 Pro
just went generally available like 10 days ago.
There's already rumors that Google has a much better
and more capable model
that they're already testing on the LM Chatbot Arena.
I don't know how Claude is going to compete against Google.
All right.
Got receipts as always, y'all.
I have receipts.
Yes, Similarweb, Dennis.
Thank you for asking.
That's where that data was from.
All right.
Let me know, y'all.
Livestream audience?
Am I wrong on this?
But let's get quickly to the receipts.
All right.
I'm not going to make you wait an hour for this one.
I'm going to go through quickly.
Because the proof is in the pudding, y'all.
The writing's on the wall.
All right.
So let's look at artificial analysis.
So a great third party unbiased website, right, that does benchmarks.
Because one of the things is when companies put out their benchmarks, they cherry pick.
There's dozens of different benchmarks.
So, of course, you know, when, you know, these AI labs put out their models, they choose,
okay, out of these 50 benchmarks, here's the eight that we're going to put on our website because we look great on this, right?
So I always look at Elo scores.
We're going to talk about that in a minute here from LM Arena and look at third party benchmarks as well.
So intelligence.
This is from the artificial analysis intelligence index.
Gemini 2.5 Pro in the lead.
Second, o3-mini-high from OpenAI.
Then you have the two variations of DeepSeek.
And then you have the new version of GPT 4.1.
Y'all, by my count, Claude 3.7 is number 8 in terms of intelligence on this third party benchmark.
Let's keep going because you're like, okay, what about humans?
Humans probably prefer it.
Okay.
So, Elo scores.
Let's talk about that.
That's head to head.
You put in a prompt on LM Arena, on the Chatbot Arena.
You get two outputs.
You don't know who they are.
You say this one's better.
All right.
There's been millions of votes.
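Editor's note: a minimal sketch of how head-to-head votes like these can turn into a rating, using the classic Elo update. This illustrates the general idea only; the leaderboard's actual statistical method may differ, and the model names, starting ratings, K-factor, and votes below are all hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one blind head-to-head vote.

    k (the K-factor) controls how far a single vote moves a rating;
    32 is a common textbook choice, assumed here for illustration.
    """
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical models that both start at the same rating.
ratings = {"model_x": 1000.0, "model_y": 1000.0}

# Hypothetical votes: (first model, second model, did the first one win?)
votes = [
    ("model_x", "model_y", True),
    ("model_x", "model_y", True),
    ("model_x", "model_y", False),
]

for a, b, a_won in votes:
    ratings[a], ratings[b] = update(ratings[a], ratings[b], a_won)

# Print models from highest to lowest rating.
for name, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{name}: {rating:.1f}")
```

The point of the design is that a win against a higher-rated model moves a rating more than a win against a lower-rated one, which is why a large number of blind votes can separate models that look close on any single prompt.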
Guess what?
Total Elo score.
Claude is not a top 10 model.
That's when I, like, I know it sounds crazy to say,
but you have to ask the question.
even if you ask it rhetorically, is Claude no longer a state-of-the-art model?
I don't know, y'all.
In so many benchmarks, in so many now Elo categories, overall Elo, they're not a top 10 model.
Gemma 3, which is a small language model from Google, has a higher Elo score than Claude 3.7.
Let me say that again.
A small language model.
Not a large language model.
Humans prefer the outputs across millions of votes compared to Claude 3.7.
Google has, let's count it.
One, two, three, four, five models.
Five different models that humans prefer over Claude 3.7 Sonnet.
I don't know.
So, I don't know, is it a hot take, a very hot take, when I said,
hey, Anthropic has lost its place at the top, right?
Gemini 2.5 Pro, higher.
Let's see, we have Gemini 2.0 Flash Thinking, higher.
Gemini 2.0 Pro Experimental, higher.
Gemini 2.0 Flash, higher.
And then there's the small language model, Gemma 3.
My gosh.
All right.
But you might be saying, all right, Jordan, well, people use Claude for certain reasons, right?
They use it for creative writing.
Claude's great at that.
They use it for coding in software development.
Claude's great at that.
That's an old narrative.
Literally, that's an old narrative, right?
Especially the creative writing thing.
I think essentially, right, you know,
a bunch of stuff went viral online,
like maybe a year and a half ago
about how bad ChatGPT
and Gemini were at writing content
and Claude was just so much better.
All right, well, let's look at those two things.
Let's first look at creative writing.
Okay.
Oh, where's Anthropic?
Oh, the bottom of the list.
Again, not top 10, Elo in creative writing.
That's what I'm saying.
I think right now it's a lot of Twitter talk and hipster hype, right?
Oh, it's cool to like Claude, right?
It's like, oh, you know, I see you wearing that name-brand ChatGPT.
Oh, I see you with that mainstream Google Gemini.
I'm over here, promptin', with that Claude.
Man, no. Why? Not a top 10 model when it comes to creative writing, which everyone thought it was
amazing at. It was, a year and a half ago. Don't lie to me, y'all. Millions of people
have voted on this blindly. Guess what else? Not top five in coding either. Claude 3.7 Sonnet with
thinking is not a top five model for coding.
Guess what is?
Guess what's at the top?
OpenAI's o1.
Their o1-preview, their o1-mini, Gemini 2.0 Flash.
And I do, I do believe once Gemini 2.5 Pro is on here and gets enough votes, it'll be up there as well.
But not a top five model in terms of coding.
So what do you want?
What do you want?
I don't understand.
Why are people still using Anthropic?
Like I said,
maybe you have one or two use cases
that you're happy with it, right?
If I'm being honest,
the only thing I use it,
like I said,
Claude used to be maybe 20% of my usage.
I'm a heavy large language model user.
Like I said, maybe it's 5% now.
I'm only using it because there's certain things
in Artifacts that Claude does better
than Google's Canvas and OpenAI's Canvas.
but it's always like I'm doing it all at the same time anyways.
I'm running the same thing in all three of them.
And sometimes I'm like, okay, yeah,
Anthropic's is a little bit better here.
All right.
So maybe you're like, oh, it's fast.
It's affordable.
It's not fast.
It's not affordable.
It's not. You know, when you look at speed,
and this is from Artificial Analysis,
Gemini 2.0 Flash and Gemini 2.5 Pro are the fastest models,
followed by GPT-4o and o3-mini from OpenAI.
Again, Claude is not in the top five when it comes to speed,
which is output tokens per second.
So it's not fast.
All right.
And that is the non-thinking model, by the way.
All right.
It's terrible on price.
It's terrible on price, right?
Which, I still don't understand why people are so DeepSeek drunk.
Like, DeepSeek is not cheap anymore, right?
It's not.
When it first came out, it's like,
Yeah, this is cheaper.
Okay.
Well, Gemini 2.0 Flash is wiping the floor with everyone when it comes to price.
Llama's new Llama 4 Scout, GPT-4o mini, right?
There's just so many faster, better, cheaper models than Claude.
So I don't, it's definitely lost its edge, right?
I think there's one more thing I wanted to pull up here.
Okay, it's coming up in a slide here.
Because this is also telling.
So looking at the intelligence versus price.
So it's not like you're getting a good
bargain either if you're using Claude on the back end, right? You're not. So on the front
end, humans aren't preferring it. On the back end, you're not necessarily getting what you pay for.
Again, this is intelligence versus price. So there's a little quadrant here. You want to be on the
upper left, because that means it is cheaper and smarter. Claude is on the right side,
and Claude 3.7 Sonnet is actually on the bottom right. All right. Not
necessarily fast or affordable.
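A minimal sketch of how that intelligence-versus-price quadrant can be read, with price on the horizontal axis and intelligence on the vertical one. The midpoints and the two example points are made-up placeholders for illustration, not figures from the chart.

```python
def quadrant(price: float, intelligence: float,
             price_mid: float, intelligence_mid: float) -> str:
    """Place a model on a price (x-axis) versus intelligence (y-axis) chart.

    "upper left" is the desirable corner: cheaper and smarter.
    """
    vertical = "upper" if intelligence >= intelligence_mid else "bottom"
    horizontal = "left" if price <= price_mid else "right"
    return f"{vertical} {horizontal}"


# Hypothetical midpoints and models; the numbers are placeholders.
PRICE_MID = 5.0          # assumed price midpoint, in arbitrary units
INTELLIGENCE_MID = 50.0  # assumed index midpoint

examples = {
    "cheap_and_smart": (1.0, 60.0),
    "pricey_and_weaker": (10.0, 40.0),
}

for name, (price, score) in examples.items():
    print(name, "->", quadrant(price, score, PRICE_MID, INTELLIGENCE_MID))
```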
And here we go, everyone's like, oh, it's the best coding model.
Guess what? It's not. Artificial Analysis.
Their Coding Index.
This one is very interesting.
Claude 3.7, the thinking model, ready?
The thinking model is in fifth place.
Guess what's ahead of it?
The new model that was just released from OpenAI,
GPT-4.1.
But guess what, y'all?
This is the mini version.
The mini version of OpenAI's new model.
Not only is it a non-thinking model, right?
Because normally if you use these thinking models, these reasoners,
they code much better, right?
Especially when you're working with very complex tasks in long-token, long-context windows.
So not only is this GPT-4.1 model
not a thinking model, and it performs better on the Artificial Analysis Coding Index,
but it is the mini version.
It is the mini version.
So I don't know, y'all.
If you're still using Claude 3.7 Sonnet, let me know why.
Let me know why.
I'm very curious.
Like I said, I know a lot of people on the software engineering side,
on the development side, they love it, right?
Using it with Cursor, using it with Windsurf,
using it inside all these different IDEs.
I also don't understand why on that.
Now with Gemini 2.5 Pro with Gemini 2.0 Flash,
and now these new models from OpenAI that they just announced,
I don't understand it.
I honestly don't understand how Anthropic has gone,
and Claude has gone, from that top-tier, right,
state-of-the-art, world-leading model to kind of irrelevant.
So a lot of people are like, oh, well, you know,
Claude just released a new plan, Jordan.
You're really harping on them for, you know, these rate limits.
You can just pay more and use it way more.
Okay, well, why?
If it's not a top 10 model, right?
Yeah, Claude just came out with their Claude Max, right?
So you get higher limits, you know,
if you're paying $100 a month or $200 a month,
which let me just call this out, you know, because people are like, okay, Jordan, this solves.
Well, you don't get anything more powerful for that $100 or $200 a month.
You don't get more features, right?
So when OpenAI, as an example, announced their $200 Pro plan, at the time, that was the only way you could access Sora.
That's still the only way that you can access o1 pro.
And then you get unlimited everything.
Unlimited.
This is not limits.
Or sorry, this is not unlimited.
You can still go on the front end and pay $100 or $200 a month.
You don't get new features.
You don't get new models that are exclusive to that max plan.
You just get slightly better limits.
But here's a concerning one, y'all.
This one's kind of concerning, ready?
This is from Anthropic's website, talking about their new plan, ready?
Talking about their message limit on the new Max plan.
Your message limit will reset every five hours.
We call these five hour segments a session,
and they start with your first message to Claude.
Please note that if you exceed 50 sessions per month,
we may limit your access to Claude.
Each session includes any messages sent within five hours
from the first initiated chat.
So we expect it to be fairly generous for our users.
Like, gosh, I don't know.
How tone deaf is this, y'all?
Come on.
So let's just say, in theory, let's say you're a very regimented person, all right?
Like I am.
So this is why I can't even use Claude on the current paid plan, but even if I pay $100 or $200 a month.
So let's say I use Claude in the morning before my show to help plan it.
All right.
So let's say 6 a.m.
And then I use it at noon, midday.
All right.
And then in the evening, you know, I use it again.
So let's just say I just do a couple prompts a day.
I do it at, you know, 6 a.m.
I do it at, you know, noon, and then I do it at 6 p.m.
6 noon, 6, right?
Couple prompts a day.
Paying $100 or $200 a month.
In that scenario, even if I'm only doing a couple prompts, right, paying $100, $200 a month,
I might get cut off from my pricey $100, $200 a month plan.
That's what they're saying, 50 sessions a month.
So if I do that, if I use Claude three times a day, spaced more than five hours
apart, I could, in theory, in three weeks, get shut off.
And I might not be able to use their paid plan for the last week of the month.
In theory, that's what it's saying here.
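The arithmetic in that scenario can be checked with a short sketch. It assumes a session starts at the first message and that any message five or more hours after that start opens a new session; the exact boundary handling, the April 2025 dates, and the helper name `session_starts` are assumptions for illustration, not Anthropic's implementation.

```python
from datetime import datetime, timedelta

SESSION_LENGTH = timedelta(hours=5)   # from the quoted policy text
MONTHLY_SESSION_CAP = 50              # from the quoted policy text


def session_starts(message_times):
    """Return the start time of each session implied by the message times."""
    starts = []
    for t in sorted(message_times):
        # A message opens a new session once the previous session's
        # five-hour window has elapsed (assumed boundary rule).
        if not starts or t - starts[-1] >= SESSION_LENGTH:
            starts.append(t)
    return starts


# Hypothetical month: messages at 6 a.m., noon, and 6 p.m. every day for 30 days.
month_start = datetime(2025, 4, 1)
messages = [
    month_start + timedelta(days=day, hours=hour)
    for day in range(30)
    for hour in (6, 12, 18)
]

starts = session_starts(messages)
print(len(starts))                      # 90 sessions in the month

# The 51st session is the first one that exceeds the 50-session cap.
first_over_cap = starts[MONTHLY_SESSION_CAP]
print(first_over_cap.day)               # 17, i.e. day 17 of the month
```

Under these assumptions, three well-spaced sessions a day pass the 50-session mark on day 17, a little under three weeks into the month.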
How tone deaf is that?
I don't understand.
If I'm being honest, when I saw that, I'm like, come on, Anthropic.
You have, I don't know.
How many billions of dollars have you gotten from Amazon?
I lost track, $6 billion or something.
This is why people aren't using your service.
Humans don't prefer it.
Benchmarks don't prefer it.
And for those people that are actually still finding utility and are power users,
you're slapping them in the face.
Get real.
All right.
Hot take.
Let's end it here.
Can Claude recover?
I honestly don't think so.
I don't think so.
Here's, again, this is just reading reports.
You can't knock Anthropic for putting safety first.
You can't.
They put out world leading research.
I do think when it comes to, you know, safe AI, they are a leader in that.
But no one's paying you for your research.
You're not competing to be the best frontier
AI lab with the best research, with the best safety.
This is a race.
This is the Wild West.
That's what it is.
There's no rules when it comes to AI.
Anthropic is playing, I'd say, the wrong game.
They've alienated their power users.
They've stopped innovating.
And I think that has caused them to now face
an almost insurmountable challenge, right?
Let's just say, as an example,
Claude had their 4.0 model ready,
and they probably have had it ready for a while.
When you see these new drops from OpenAI, right,
their 4.1 models,
the smaller versions,
when it comes to price per performance,
amazing.
Same thing with Google Gemini 2.5.
I don't think, if I'm being honest...
Right, nine to 15 months ago I'm like, yep, it's going to be a three-team race. It's not anymore.
Yes, you have to pay attention to open source. You have to pay attention to Chinese models. But
most enterprise companies here in the U.S. aren't going to touch many open-source models, for different
reasons, and they're not going to touch Chinese models for obvious reasons: data security, data
privacy, and not sending all your business IP straight to China. From a U.S. perspective,
Anthropic was primed
to compete in this three-team race.
They were primed to be a leader.
But now they're a second-tier company.
They are.
That might be harsh.
You wanted my honest take?
That's not just me.
Is that my personal usage? Sure.
Is that my personal experience?
Yes.
But I showed you the receipts.
Users aren't using it.
Number one.
They're not competing on benchmarks.
Number two. Humans don't prefer it. Number three. So can Claude recover? I don't know. I'd probably say no.
All right, y'all. Hope this was helpful. You wanted some hot takes? I try to bring it. Try to bring it a little bit.
So, you know, talking a little bit about: has Anthropic's Claude lost its edge? What happened? And are Google and OpenAI too far ahead?
Simple answer? Yes, Anthropic's lost its edge. And yes, OpenAI and
Google, at least today, are way too far ahead for Anthropic to catch.
That could be wrong, but the only way you're going to find out is by continuing to tune in.
Maybe I'll be eating a big helping of, you know, humble pie, you know, in 2026, but we will see and find out.
All right.
Thank you for tuning in, y'all.
If you haven't already, please go to YourEverydayAI.com.
If this was helpful, please share this with your network, tag a friend, someone that needs to hear this.
If you're listening on the podcast, appreciate your support as always.
Reach out to me.
I always leave my email and my LinkedIn there in these show notes.
So please reach out if you have thoughts on this.
Let me know in the live stream comments as well.
Then go to YourEverydayAI.com.
Sign up for the free daily newsletter.
Thanks for tuning in.
We'll see you back tomorrow and every day for more everyday AI.
Thanks y'all.
And that's a wrap for today's edition of Everyday AI.
Thanks for joining us.
If you enjoyed this episode, please subscribe and leave us a rating.
It helps keep us going.
For a little more AI magic, visit YourEverydayAI.com and sign up to our daily newsletter so you don't get left behind.
Go break some barriers and we'll see you next time.
