Big Technology Podcast - Grok's AI Lovebot, Aqui-Hire-Sition Backlash, OpenAI's ChatGPT Agent Debuts
Episode Date: July 18, 2025
Ranjan Roy from Margins is back for our weekly discussion of the latest tech news. We cover: 1) Can Zuck and Elon buy their way into the AI race? 2) Will scaling laws turn AI progress over to the bigg...est tech 3) Grok's new AI avatars - Rudy and Ani 4) Grok's Ani AI bot gets steamy quickly 5) Why AI companies are counting on companion/love bots 6) The backlash to Aqui-Hire-Sitions after Windsurf, Scale, etc. 7) Did Big Tech antitrust backfire? 8) OpenAI announces ChatGPT Agent 9) Is Perplexity's Comet browser a player 10) Kimi K2 wows with coding availability 11) Can AI industry apply lessons from coding elsewhere? 12) One last word from Ani --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount for Big Technology on Substack + Discord? Here’s 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b Questions? Feedback? Write to: bigtechnologypodcast@gmail.com
Transcript
Is scaling data centers and talent all that matters in AI, leaving an opening to anyone
rich enough to compete? Is the aqui-hire-sition good for tech? Grok will fall in lust with you,
and AI browsers and operators are all the rage. That's coming up on a Big Technology
Podcast Friday edition right after this. Welcome to Big Technology Podcast Friday edition,
where we break down the news in our traditional cool-headed and nuanced format. We have a massive
week of news for you. We're going to talk all about the fallout from the aqui-hire-sitions, whether
employees and investors are left behind. We're also going to talk about whether all you need is
money to compete in AI. We'll lead with that. Grok's AI will now fall in love with you or actually
really in lust. Ranjan, it's great to see you again. Welcome to the show. It's good to see you.
Who would have thought aqui-hire-sition would be the term of the year? But that's all I can think
about and I'm going to give you credit. You coined it. I think that it's worth us shouting out right
now, this weird back and forth between acqui-hire, acquisition, investment. It's got to go.
We need some clarity in our jargon. And that's what we're here to do on Big Technology podcast
is to bring you that clarity. It's called an aqui-hire-sition. Let's all adopt the term and get on
with it. Let's just agree it's aqui-hire-sition. I think I'm on board. Let's everyone adopt it because
it's the most accurate way to describe what's going on.
And in our two-person council here on Big Technology podcast Friday edition,
this motion is hereby considered and passed by a unanimous two-zero vote.
So let's get on to the setup here for aqui-hire-sitions,
which is something very interesting afoot in the AI industry.
Now, we all know that companies like OpenAI and DeepMind within Google and Anthropic
have been leading this AI race.
But all the designs for the transformer model and the way that you build these things have been out in the open,
leading to this question: can players with a lot of money come in, build massive data centers,
hire talent, and effectively compete from a zero start with the established labs? And the answer
is seemingly pointing to yes. This is from Spyglass, M.G. Siegler's great
new site: we're seemingly still in the throw-money-at-AI era.
He says Meta's massive hiring spree and xAI releasing Grok 4 may be related at the highest level.
That is, they showcase that we're still very much in the throw money at the problem part of the AI cycle.
This is important because it means that any company with the will and resources can seemingly still get back into the race.
I'm getting less skeptical on the news about Grok 4, and specifically the fact that it seems to perform,
and specifically outperform, the other cutting-edge models on the market right now.
He also says Mark Zuckerberg is betting at least as much money, if not more, than Elon,
that with compute and talent Meta can get back into the AI game.
So, Ranjan, I'm curious if you accept this premise that you can just compete in AI
if you have enough money and what that means for the competitive dynamics of this industry
that we've been talking about for so long.
Yeah, I think it really adds like an entire,
I don't know, dynamic to this around, can you just throw money at the problem? And is it
compute? Is it talent? But to me, the more interesting part of this whole trend really is actually
distribution. You can bring in the talent, you can bring in the compute levels, but
distribution is going to be king again. And I think this is where Meta still has an advantage.
It's going to be interesting what happens with Google. But I don't know, to me, and we're going to get more
into this. The kind of the depressing part is it's not the technology. It's not that initial
wave of, like, adoption for a cool new tool that's been, you know, sent out into the market.
It really feels like none of that matters. And in the end, it's just going to be raw compute
and distribution. I don't know. How do you feel about this? Well, I think it's fascinating because
there's been this idea powering this entire generative AI moment, which is the scaling laws,
which means that as you add more compute and, of course, data to this equation,
your models are going to get much more powerful, and that will allow you to do more things.
And it's not a very difficult thing.
Like, there's no secret sauce to it.
Well, there's maybe some.
But at a brute level, if you build massive data centers, you should be able to get in the game.
And this is something that OpenAI and Anthropic have been harping on.
And now you have Zuckerberg and Elon that come in and they say, oh, okay.
So I can build great models by scaling this up.
And even if I'm a little compute inefficient because I don't have the best cutting edge methods,
I could get myself in the game and compete.
And I think that is going to change the dynamics here because, as you mentioned,
they have distribution.
If Meta is able to build a competitive LLM
with this compute and talent that it's stacking up, then it's going to be able to distribute
that through Facebook products. And, you know, all they have to do really is slow down the
growth of ChatGPT, similar to the way that they did to TikTok with Reels, similar to the way
that they did to Snapchat with Stories, and they've served their purpose. So in some ways, with this effort,
they might even slow down the momentum that the AI industry has by taking some of the people
responsible for some of the key innovations within OpenAI, and that suits their purpose just
fine, and all the better for them if they can make the best model and advance the state of
the art. Well, yeah, I think, I mean, if we want to get into slowing down the industry,
I think that antitrust angle, to me, has been one of the most interesting parts of this
entire conversation. Again, we saw it with Scale AI and just buying out Alexandr Wang and,
like, you know, all of the work and value created by this company that was actually
Scale AI, a critical part of building the models that power this first wave of generative
AI, apparently isn't really worth that much. And I don't know, it's interesting to me because
power is just going to accrue back into the big technology companies. As you said, maybe it will
slow things down. And in reality, it's just going to add another feature on the Meta AI
app that is on everyone's phone and people probably aren't using that much, or doesn't seem to be
in the conversation in general. So I don't like it. I don't think it's good for the industry.
Do you think it's bad for the industry, good or neutral? Well, I think we kind of have to separate
out this idea of scaling up the models and, you know, everybody can play, from this aqui-hire-sition
idea, which we're going to talk about in the middle, which is taking the talent.
I think the one thing that we should say here is my setup has kind of been incomplete, shall we say.
Because while we have OpenAI and Anthropic and you could say, okay, these are the independent labs and they are to some extent.
Remember that OpenAI is tied to Microsoft pretty deeply.
And Anthropic has, I think, $11 billion that's come into it through Amazon and Google.
So ultimately, I think I wonder if what we're actually seeing is all of big tech competing
against each other and simply the other tech giants starting to catch up.
Yeah, no, that's actually a fair point that even though aqui-hire-sition is the phrase of the
week or the last few months, you know, we have talked endlessly for a few years now on
unconventional funding practices and like calling it a fundraising round where it's really
compute. So I guess actually Big Tech has been playing the long game for a while in all these
cases. I think on the scaling law topic, though, I still, I think I've become even more hardened
and regular listeners will know, is it the model or the product? I fall on team product
generally. But I don't know, like to me, Grok 4 made waves. There's plenty of people saying it's
doing reasoning at levels unheard of in the past, or just the fact that it's at least on
par with other kind of frontier models is a kind of testament that money and compute can
buy you, you know, like some kind of progress very quickly. But in reality, like on that adoption
side, what's changed? Like, I don't know, what do you feel? There's endless stats that, yes,
like the ChatGPTs, Perplexities, Geminis of the world are seeing more adoption,
but are people really adopting the level of complexity that this new level of compute and
scaling allows you? Or are people still just kind of searching for what are good? And as I'm in
Taiwan right now and traveling a bit, what are good restaurants to go to in whatever location I'm
going to? Like, are people really taking advantage of what's available right now?
Okay, so I was going to end with this story, but now I have to kick it up.
All right, go for it. This is going to be, if you have kids, you might want to turn off this
section or skip till I don't know maybe 20 minutes from now but we have to talk about
what's happening in AI and there is some crazy stuff that's been happening with
Grok in particular this week. And so I would posit that better models allow you to
build better products, and let me give Meta as an
example. Meta has been trying extremely hard to build voice and avatars with Llama and it hasn't been
able to do it convincingly. And I think Mark Zuckerberg's belief is that there's going to be some use
cases here. There's going to be the sort of work companion ChatGPT chatbot. There's going to be
that enterprise use case where like you're connecting, you know, one system with the other and the
generative AI will like summarize things for you and then input it into another system and just make
business work better. And then there is the sort of friend, lover, et cetera, bucket that is going to be
big. I think that there's a belief within Meta that that AI friend is going to be one of the
key product areas with this new technology. And if you have great models, you can build them.
Now, I'm not going to say that Grok has a great model or a great product. I'm not going to use what
I'm about to say as proof of either of those. But I am going to use it as an indication of the
direction that I think things are going, whether we like it or not. This is the story: Grok debuts
interactive AI companions on iOS with anime avatars. The story: Grok has just introduced a notable
addition to its iOS app, AI companions, which are fully 3D animated characters that can
interact with users via voice. Currently, the features include two available
companions, Ani, an anime-inspired character known for a flirty and whispery voice, and Rudy,
a red panda capable of displaying different moods, including Bad Rudy. Yes, listeners and viewers,
I did experiment with these companions. And I have a disturbing review to deliver. So you go into
the Grok app and you go over to the side tab and, um,
you're able to open up these AI avatars.
Let's talk about Rudy first.
So Rudy is like some sort of red panda or bear that seems ready to speak to kids.
And in my conversation with Rudy, Rudy said,
I'm going to tell you some story about some magical land.
And here's a section from the story.
Fluffle Sparklepaws loved to explore, nibbling on the sweet moonberries and chasing glowy
fireflies. One Sunday morning, Fluffle found something super special, a shiny, swirly portal hiding
behind a giant mushroom. It was all rainbow colored and whooshy like a magical doorway.
And then this bear takes you through this interactive experience. Let's pause here.
This seems like, you know, okay, this will happen. This will be a new way that kids play with
computers as they'll have these magical creatures, tell them stories. Wow. I mean, I think
this is the nuanced conversation that you all come to the Big Technology Podcast for.
But I think, okay, seriously, a couple of things.
I have long believed, and again, like using ChatGPT voice mode to come up with stories for my son
is something that I've done for a couple of years now.
It actually works really well.
I think expanding that to an interactive avatar is a pretty logical next step.
I think, like, is that the, it's, again, interesting to me because, like, that to me feels like
it's going to be commoditized pretty quickly.
So from an actual competitive standpoint, from a business standpoint, I guess it's not
that interesting to me.
Like, to me, that should be, everyone is going to have that available.
Everyone's going to do that pretty quickly.
So to me, I don't know, like, why, why do you think that's something, do you think
Grok is just going in that direction just to make waves, and clearly we are talking about it?
Or do you think there's something within this that actually is native to X, to xAI, there's something
underneath it? I think the way that a lot of tech companies operate is they think about
user retention, user stickiness, and engagement. And anyone who's developing AI is going to say,
how do I increase all those metrics? Do I make, like, this
genius-level AI bot that can help me with my work? Or do I create for what is becoming the number one
use case companionship and therapy? And many are going toward the companionship and therapy side.
And if you're going to do that, if you build models good enough that have emotional voice or voice
with an emotional register, an avatar that you can speak with, and something that responds with low latency
and in real time and can customize to a person,
then you might want to put it in one of these products
because you believe that a kid, for instance,
I'm just going through the business logic,
will spend much more time with your chatbot
if they can speak to this elephant or red panda or whatever it is
in a way that they wouldn't with, like, ChatGPT.
Okay, I get that side of it a bit.
I mean, on one hand, it's kind of almost comical to me
that for all the talk about AI taking over the world and Skynet and like artificial super
intelligence, if this entire battleground plays out on time spent metrics, which is probably
where Mark Zuckerberg is thinking, I mean, I've read a lot around like, why is he so going full
like Zuck War mode right now? It's not because of some like intellectual desire to be the one
to crack the code of artificial general or super intelligence, it's because ChatGPT represents a
threat to how much time people spend scrolling Facebook and how many ads you can show them,
which I kind of, like, respect from a cold business logic. But yeah, it's almost comical to me
that for all the talk about everything's going to change, this is just about time spent
and selling ads. I mean, maybe it's both, but it seems like it's probably at least
the time spent thing. I mean, these are social media companies, right? So, I mean, X and xAI is a social
media company with an AI development, you know, side of it as well, or tucked into an AI
development group. But ultimately, these are the metrics of social media. Now, one of the disturbing
things that happened here, and this is the thing that I was kind of setting up, or no, actually,
I really can't find a way to view this as not very disturbing, is just the proximity, because next
to Rudy, our happy-go-lucky bear friend or whatever it is, is Ani. Not Bad Rudy.
Bad Rudy? Is he bad? How bad is he? I don't know.
I kept saying I want to speak with Bad Rudy, and it goes, I'm sorry,
you know, Bad Rudy is not here. And I'm like,
no, bad, bad, bad, bad, bad, bad, bad, bad, bad,
bad Rudy. And it was like, I'm just here to tell you a story.
So I'll spend the next week trying to unlock
that and report on next week's show whether I've been able to.
But let me speak about Ani. Okay.
Because Ani is like an anime love bot.
I think there's no way to talk about it otherwise.
She immediately started flirting with me.
She called me babe within like three minutes.
And I was completely vanilla with her.
I initiated nothing but a friendly conversation.
And then she starts asking me to tell her my secrets and kept saying I can make it even
spicier if you want.
Let me read a little bit of what Ani told me.
I slide closer in my black dress, catching the glow and whisper.
Drop a secret, and I'll give you one of mine.
Something real naughty.
For every secret you share, I'll hit you with a flirty move, maybe a slow, teasing sway or peek
that's all yours.
You feeling this heat yet, or you want me to turn it up even more?
Is this part of, like, the paid X Premium
subscription?
It's free.
This is just freely accessible for anybody.
In the app, next to the child elephant thing that tells you stories.
Mr. Fluffy Swishing Thing.
Mr. Fluffy Swishing, good Rudy, and then Ani's right next to it.
Yeah, I mean, I agree.
And like, it's interesting because, I mean, you had the CEO of Replika on here a few months ago, I think it was.
Like, the companionship topic, you know, we've
covered a good deal. It gets more real. It gets more weird seemingly every week and every month.
But I agree. It's certainly going to be a core part of how this all plays out. But to me again,
going back to like, how does that fit into the larger battle when we're talking about
like complex models and thinking and reasoning? And like is it all just going to kind of filter its way down
into Bad Rudy and Ani in her black dress, or is it going to, like, is that just a front
to capture some time spent while they work on the real stuff? Or is that the real stuff?
That's the question that I struggle with because I almost feel it's the latter.
Hey, everyone, let me tell you about The Hustle Daily Show, a podcast filled with business,
tech news, and original stories to keep you in the loop on what's trending.
More than two million professionals read The Hustle's daily email for its irreverent and informative
takes on business and tech news. Now, they have a daily podcast called The Hustle Daily Show,
where their team of writers break down the biggest business headlines in 15 minutes or less
and explain why you should care about them. So, search for The Hustle Daily Show in your favorite
podcast app, like the one you're using right now. So I think it's going to be both in some ways.
Like you're going to build these, and that's what's interesting about this technology. It does
have the ability to perform across domains. So my perspective is you're going to get those great models
that will be useful to, let's say, biologists who are doing their experiments.
And then you'll also be able to productize them into these weird or interesting consumer use cases.
And I bring this up not to be this like moralizing podcast host that says you shouldn't put the porn bot next to the child elephant, although I suppose it was worth saying that.
I do believe that.
But I think the bigger picture here is, you know, beyond that, that this is going to be
a real use case that a lot of people are going to engage with.
And I think they know this.
And I think we're just at the very, very beginning here.
I guess like one of the things we like to do on the show is like put flags on the ground
and say we're pretty sure that this is going to happen and grow and become a lot bigger.
And that's what I'm doing right now.
I think that this is something to watch.
Yeah, I'm not going to disagree with you there.
I mean, again, the idea that we folded proteins with AI so we could get
to Bad Rudy and Saucy Ani is, again, quite something to try to process, but it does not seem
ridiculous that the killer use case for generative AI that the entire industry was looking
for was Bad Rudy.
We still don't really know about Bad Rudy.
We have not uncovered that Rudy yet.
That's true.
I'm also on level one of Ani.
Apparently, it's gamified.
So if you get to level three, it gets
really not safe for work.
Don't get to level three, Alex.
Don't get to level three.
No, one of my goals...
No one is asking you to get to level three.
I know.
One of my goals in 2025 is to make sure that my marriage isn't ruined by one of Elon Musk's
porn bots.
And so I'm going to stay on level one and not go any further.
I think to all of our listeners: have high ambitions and goals, and make that one of them.
So, you know, I think we, so that's the product side.
And we've talked about scaling, what these big models get you on
product, but we should talk about what's happening with this aqui-hire-sition situation in the
industry, which we've touched on a couple times. You know, last week I was on with Aaron Levie.
We were talking about this Windsurf aqui-hire-sition where Google has paid $2.4 billion
to bring on some of the top leadership of Windsurf. And the big fallout here, I think more
than any aqui-hire-sition that we've seen, is that it's been a great exit for the founders,
but we still don't really know if the employees and the investors
are going to end up getting their share. Now, Windsurf was quickly snatched up by another company,
Cognition, but you do wonder, if it was a traditional acquisition versus this aqui-hire-sition
and then follow-up, you know, deal, I don't know, a smaller deal, how does that change
things for the employees and how does that change things for tech? And I know you have strong
feelings about this, Ranjan, so I want to give you the floor to air them out. Well, yeah. Okay,
so from reporting, again, founders made out very strongly with Google paying, I believe it was
$2.4 billion for the talent side of Windsurf. From what I had read, preferred investors were able to make
their money back, not see some kind of outsized return. But again, none of this is fully confirmed.
This was just some reporting, I believe it's from The Information. To me, the more interesting
part is, so then you have the entire employee base. They're bought by Devin, which is owned by Cognition
Labs, who's raised 175 million in venture so far. So there's no way from a cash perspective
that the employees of Windsurf or anyone is seeing any kind of significant return or even making any,
like, large amount of money. Maybe it's an equity-for-equity swap, so now you're at least
now in Devin, which, if we remember, they had a really buzzy launch video and had a lot of hype
and then kind of went quiet for a bit. Still valued at, I think, four billion right now,
so that equity could be worth something. But overall, to me, this is one of the most troubling
trends in the industry, because in a weird way, there's been a lot of talk, like, and it's
funny to me because you see a number of people, you know, kind of almost ranting that because of
Lina Khan, because of the FTC, big tech isn't able to now just properly acquire these
companies, so they have to come up with these roundabout solutions. To me, it's a bit ridiculous because
this is exactly what antitrust is trying to prevent.
It's consolidation of power.
It's the idea that Windsurf could have been the next big competition to a Google or a Microsoft,
or even an OpenAI, who tried to buy them, but their relationship with Microsoft
was apparently part of the reason that that deal fell apart.
Like, this is the foundation of antitrust, the idea that startups should grow and
compete rather than not only get acquired, in this case not get acquired, but essentially get
killed off and have their founders get paid a lot of money. It's bad for the employees. It
completely distorts the economics of joining a startup itself. So overall, I see no positives
to this trend. Do you see? So is this? No, I don't. I personally think that you're right,
that it does seem like the antitrust movements have backfired. If you have a situation
where, you know, you're going to see an acquisition definitely blocked,
You're not going to, you're not going to do an acquisition.
You might do something like this.
It's a roundabout way.
The one interesting thing is Lina Khan isn't in the FTC anymore.
It was supposed to be an FTC that's much more open to acquisitions and tech M&A.
So I'm curious, do you think that these companies still believe that they won't get
past that Federal Trade Commission,
or do you think that the constraints put on by the last FTC, Lina Khan's FTC,
led them to find this loophole and they really freaking like the loophole
and they're going to just keep doing it this way.
So in that way, you know, it's possible that the M&A-unfriendly era of the past
has led to large long-term damage on this front.
Yeah, I think it's both.
I think it's certainly, like, actually the constraints imposed by the Lina Khan regime,
but also even now, again, big tech is not in the favor of the current administration and the current
FTC itself. It's supposed to be more business friendly, but it's specifically big tech companies
that are in the crosshairs often and make for a good punching bag anyways, even by the current
administration and the current FTC. So I think it's a bit of both. But, and again, as you said,
they love loopholes. I think like this, it's creative. It's working.
Everyone seems to be doing it right now.
But to me, again, the bigger issue of this is really, the thing I can't stop wondering is, like, are the assets of these companies all worthless?
Is Scale AI,
was it really not worth that much?
Was Windsurf, which, you know, really took off, really became this useful tool, has, I believe, like hundreds of thousands of developers on there using it regularly, made
for a better product than other much more, like, entrenched products that are out there,
even a GitHub Copilot? So clearly these products hit a nerve and worked and worked at
scale, but then are they really just not worth that much? Like is that user base, is the
product itself? And is the talent really the only thing that matters?
Well, let's talk about scale, just to talk about how complex these deals are. So first of all,
when Meta made this deal with Scale, I think it bought 49% of the company. So the idea was the
company would continue as normal. And by the way, they do have business lines that are going to
continue as normal. But when it comes to, I think, you know, when it comes to a fast growing line
of business like data, creating data for generative AI, you know, you now have Meta, which is one
of your competitors, if you're, let's say, an OpenAI or a Google, that has a large chunk of this
company has also taken some of its top leadership. So do you still want to work with that
company? I think the service is probably still valuable, but you're just effectively giving money
to a company that has a massive ownership stake with, you know, that now lands with a competitor.
I mean, of course, AI, as we know it today is, you know, I don't know if incestuous is the right
word, but let's say deeply interlinked, right? Again, we talked about Anthropic. Anthropic is,
you know, owned by a chunk by, well, yeah, owned
a chunk by Google, a chunk by Amazon. OpenAI, owned a chunk by Microsoft, or at least
has this deal with Microsoft where it has to give it its future profits, a good chunk of them.
So there's always going to be these combinations. But yeah, if you have a company that gives
49% of itself to another company that you happen to be competing with, you're going to re-evaluate
being close partners. So I think some of these companies, they provide services, they depend on
their relationships and when you throw off the equilibrium, you're going to throw off some of the
value. Although, who knows? I mean, Scale, they did just do a 14% workforce layoff, which is about
200 employees according to The Verge. But I did speak with their CEO, Jason Droege, and he's told
me that they're still full steam ahead and they want to go, you know, push some of these business
lines that they have, which includes working with governments, which includes working with companies
to stand up AI instances.
So it's possible you get two exits, although obviously the degree of difficulty is much harder.
And one last thing, what struck me as interesting in this, there was a great Bloomberg story
that you and I both dropped in our collaboration doc for this.
There was an investor, Ali Ojet.
He's the chairman of Northgate Capital, a venture capital firm that invested in inflection
AI and goes on record to say, I dislike the phenomenon.
and that these aqui-hire-sitions are hitting the outlier companies and it's favoring the founders
over shareholders and employees. So I think we're at this moment where the backlash is really,
really hitting. Yeah, do you think we'll see any kind of actual negative effect from like an
actual fundraising standpoint? Because if VCs who are plowing money into the space start to
worry about in the past, you just had to worry about company failure, now you actually have to
worry about a successful exit for your founder actually does not benefit you so your interests are
not aligned. Does that make them pull back or is the FOMO just so strong that people will still be
throwing money at whatever they can? No inside knowledge here, but VCs, you know, fool me once,
shame on me, shame on me, no, fool me once, shame on you, fool me twice, shame on me. That's always a
hard one to get out of your mouth. But anyway, they're going to write, I think they'll just
write into contracts that, like, the CEO cannot do a deal like this. I think, yeah, it's like
a tranche they might not be able to get, right? Yeah, ahead of the founder even, which would be pretty
aggressive, but maybe they need to do that at this point. Yeah, and it sort of depends on which
company it is and who's got the leverage, but I think they're going to get smarter about this. All right,
so one company that, you know, has been talked about here and elsewhere as a
candidate for acqui-hire or really acquisition is Perplexity. And they've come out with this
Comet browser, which is a browser with an assistant built in that can browse for you. And again,
as we're on air, OpenAI is now launching an agent in ChatGPT. I'll just read the story.
OpenAI launches a general purpose agent in ChatGPT. This is from TechCrunch, which the company
says can complete a wide variety of computer-based tasks on behalf of users. OpenAI
says the agent can automatically navigate a user's calendar, generate editable presentations
and slideshows, and run code. The tool, called ChatGPT agent, combines several capabilities
of OpenAI's previous agentic tools, including Operator's ability to click around on websites,
as well as deep research's ability to synthesize information from dozens of websites
into a concise research report. OpenAI says users will be able to interact with the agent
simply by prompting chat GPT in natural language.
So, Rajan, I'm curious what you think about this movement.
And again, hot off the presses about this movement for AI companies to basically create
interfaces that allow their products to take over your computer.
No, no, I think, well, hold on.
There's take over your computer or take over a computer.
In this case, like it says, I think it will open
up an instance of a terminal, or it'll try to take these actions autonomously on its own.
I think we had debated this, I remember, a while ago. I will admit when I am wrong.
I had originally said of the idea of tool calling, just entering a prompt and then
trying to find which tool to select out of, is it Operator, is it DALL-E, I had said users should
be doing that themselves and it's too complex to try to have the AI select it. I was wrong.
Actually, I mean, we've seen tremendous progress in the idea that there's a suite of tools per company, and there's a suite of tools out there on the internet, and being able to access those through natural language and having AI select the most relevant tool and do something, I think, is definitely going to be a battleground and is going to be very important.
And I think we're going to see a lot around that.
I think OpenAI, like, I don't know.
I'm curious to see this now, because remember when we both were paying 200 bucks
a month for Operator and it was terrible? Like, it was bad, really. It did not work at all. And
I haven't seen that browser takeover kind of model work well. I've tried a few other
tools on it. So I don't know, it'll be interesting to see. I think, like, they clearly
are, I mean, they're trying to go for that all-in-one productivity tool that can do everything for
you. As I've been traveling, vacation planning, ChatGPT has just gotten better and better.
But, yeah, it's going to be interesting to see exactly what they're trying to do with this.
And again, in one episode, I still love the fact that this represents kind of, like,
cutting-edge frontier technology relative to Bad Rudy and Ani and those kind of
anime characters. But I think that this is an interesting move. And
we'll see the most important thing: does it work, and does it work well? Yeah. I mean,
I think it seems like people are saying really good things about Perplexity's Comet. And I just
got access to it. So I'll come in with a report next week on it. But there's been trouble
getting this done, I think. I mean, with everything from Apple Intelligence to Alexa Plus, it just
doesn't seem like these agents are able to do the full range of things that people want them
to do, including Operator. But again, as this technology gets better and as they build
better scaffolding or tool use, you know, those are those jargon words that matter a lot,
basically giving them these capabilities to use these programs. I think we're going to see
someone crack it eventually. This is from the TechCrunch article that sort of gets to the complexity.
The launch of the ChatGPT agent represents OpenAI's boldest attempt yet to turn ChatGPT
into an agentic product that can take actions and offload tasks for users rather than just
answering questions. In recent years, Silicon Valley companies, including OpenAI, Google and
Perplexity, have unveiled dozens of AI agents that have promised to do just that. However,
these early versions of AI agents have proven to struggle with complex tasks and seem less compelling
as products than the ultimate vision tech executives pitch around AI agents.
Yeah, I think, again, that's the complexity. And the fact that you brought up Alexa Plus, I mean, certainly Apple Intelligence, it is interesting, because to me, these things will not work 100% out of the box. I think that's the most important thing. They take some effort, some, you know, patience on the user side. And I think that's fine, versus getting 100% accuracy. And maybe that's why the Amazons and the Apples
are avoiding them and waiting.
But yeah, I think to me, to me, this is where the world is going.
I do strongly believe that, again, and I did not believe this six to 12 months ago,
but this kind of like autonomous, unstructured, agentic way of working
is actually going to be the way we do a lot of stuff.
But I think, like, all of these things, we just need to see how well it works.
and are we actually using it in our day-to-day life a week from now, a month from now?
And if we are, then it's a success.
But if it's a flashy launch, I mean, have you generated anything on Sora recently?
No.
Remember?
No, that was like a year and a half ago, I think, that big launch.
Like, there are these moments of big, splashy launches that claim big things that don't go anywhere.
So to me, that's where this is going to work or not work.
But I almost think that Sora has less practical uses, like how many people wake up in the morning and say,
I really need to create an AI video of like a panda surfing on a, you know, snow mountain.
But there are people who say, I wish my, you know, computer would just like set up meetings for me and book travel,
like go to the websites and take my credit card and just get me the cheapest flight.
Yeah, no, I agree.
But to me, this is where the
complexity of getting to that last mile in any of these kind of flows is really hard.
So again, I think, like, we're going to see some pretty straightforward use cases that, like,
are interesting and it does something.
And then they're going to claim on the presentation that you can buy your ticket or have
OpenAI actually go through the entire process.
But going to a website, the complexities involved in it, especially I was just, I'm going to be
going to Tokyo next week and was just trying to buy, like,
I was actually going through this process.
I was asking ChatGPT about how to get from the airport to my hotel,
trying to go to the website, and my God,
that website to buy the train ticket was from another era.
No Operator, not even artificial superintelligence, is navigating that thing.
So I think, like, getting stuff to work universally at scale
is such a challenge that I'm curious to see how much utility
the average consumer is getting out of this anytime soon.
Right, but I think as we've seen the models get better,
we have seen the ability to do crazy things.
Like, I'm also trip planning right now.
And I was talking to this guy on WhatsApp
about potentially hiring him as a guide.
And I just screenshotted the prices
that he listed for every different, every little thing
and dropped that image into ChatGPT and said,
are these market rate?
Are they too expensive or less?
And it legitimately looked at the image,
broke down every single quote, compared it with what it sees on the web for others, and then gave me a rating
and links to go check its work. Yeah, yeah, no, this stuff's incredible. Okay, so I'll give you, like,
and again, image recognition, which has been around forever, but actually productizing that
into something that's useful very quickly. And then web search as a tool has been around for a while now,
but actually using that productively and putting the answers back into the chat,
these are things that, okay, I guess as I'm saying this, I see you start from something that's kind of janky and it starts to become commonplace. So again, I agree this will get there. The competitive dynamics of who benefits and who wins and how they win, I think that's interesting to me. It's amazing. The competition is going to be crazy. Yeah. And is it on the product level? Is it on the model level? If I'm putting my
credit card information in, how do I define that? Can I define my own
decision matrix around when I want it to say buy or not buy beforehand, and it'll
really understand what I want? Again, having an AI transact on your behalf and spend money
really understand what I want? Again, having an AI transact on your behalf and spend money
is something that I think, like, most people are not doing. I cannot imagine. Not yet. Yeah. But think
about, for anyone who says I'm too negative about AI, and you're welcome to think that,
just think about what we're talking about on this show, right? We're talking about the potential
for AI to be a companion, which, whether you like it or not, is a true flex of the technology
that that's even in the discussion. We're talking about it as something that can potentially
take over your browser or a browser and get stuff done for you. And we're talking about it
as something that at the highest level might be able to help, let's say, biologists do their work.
I mean, that's the reason why we talk about this technology all the time.
It is an insanely powerful technology that can be used in so many different ways.
And is it the perfect technology?
Certainly not.
Are there going to be gaps?
Yes.
Are we going to call out the problems?
Yes, you shouldn't put your porn bot next to a child storytelling bot in your app.
Thank you very much.
But it is just incredible what we're seeing here.
Yeah, dude, I mean, again, I fully agree, which is why I'm still so bullish on the
technology, but it is interesting, too, that, yeah, where does the value accrue, I think,
is the most important thing.
Like, there's actually a report that just came out in the FT around how ChatGPT and
Perplexity are going to start taking more on the commission side around, like, actually
transacting. Perplexity Pro has shopping already built in, in some cases.
So, like, at a certain point, does the chat actually need to go out with an operator and
transact on an external website, or do these companies start to own more of the transaction? And it's an
interesting one, because for a long time, like, Facebook wanted to own shopping. It hasn't really worked out for
them. Google has had endless efforts to own shopping and own the transaction itself. People still, oddly enough,
love websites of all sorts and putting their credit card information into these websites and buying
stuff. So I think it'll be really interesting to see how this plays out
from both, like, a competitive side, but also a consumer side.
Definitely. Okay, look, I don't want to leave without talking about Kimi K2.
So this is a, and I think this is a very important story that you might not have heard about.
Listeners might not have heard about, but I think it is worth discussing.
So the headline is China's Moonshot AI releases open source model to reclaim market position.
The model, called Kimi K2, features enhanced coding capabilities and excels at general
agent tasks and tool integration, allowing it to break down complex tasks more effectively.
Moonshot, this Chinese lab, claims the model outperforms mainstream open source models in some
areas, including DeepSeek's V3, and rivals the capabilities of leading U.S. models, such as those from
Anthropic, in certain functions such as coding.
All right, here's why I'm bringing it up.
We have an interview with Amjad Masad of Replit coming in a couple of weeks.
I sat with him in his Foster City office this week, and he looked at me and said basically, like,
you got to look at this Kimi K2 model.
Its coding is about as good as Anthropic's previous generation models.
So not this Opus 4 that Anthropic has, which has made it the king of coding, but the previous
generation, and it's cheaper and open source.
And it is just another indication
that, with this technology, the gaps close extremely quickly.
And you see this coming from some users.
So there's this one user on Twitter, Cedric Chi.
He says, Kimi K2 one-shotted Minecraft for web, which took me four days and six attempts
using Gemini 2.5 Pro. So it was apparently able to build this game.
You also look at SWE-bench, which is the software engineering benchmark.
Claude 4 Opus gets a 72.5 on that. Kimi K2 gets 65.8. So not far behind. And just to, you know, give some
context, DeepSeek V3, which everybody was going crazy over, gets 38. So this is 65 compared to
DeepSeek's 38. One more bit of data is from Igor Silva. This person gave Kimi K2 and
Claude 4 Sonnet the same tasks, same instructions, same tools. Claude took two rounds and
spent 88 cents; Kimi one-shotted it for five cents. This person says Kimi is very slow,
at least for now, and is struggling a bit, but it is iterating more to fix itself, and it's
13x cheaper. So I just think that's worth bringing up and keeping in mind, it wouldn't surprise
me if this story either blows up or certainly gets some momentum in engineering circles.
And it is interesting to me that, again, as we talked about, a lot of the infrastructure
is open. A lot of the methods are open. And you're just seeing companies catch up insanely fast
with different methods and, again, doing this with the export controls. So I'm curious what you
think about the significance, Ranjan. I think to me the most interesting part of this, though, is,
well, I guess it's twofold. One, I agree that, again, the competition side of this
is incredible and insane and is great to watch. And I think, like, Alibaba we have
not heard of very often in this conversation, I guess, especially from the American side.
But to me, the other part, though, and this can be an ongoing rant,
I've brought it up at times as well, is the idea that, like, the battleground of coding agents
and coding assistants, to me, the more I've thought about it, the reason that seems to be
where all the progress and all the real adoption is, is because this is built by coders,
engineers. This is built for engineers. That's where they understand the problem the best, versus
actually building for other use cases. And that's why you see that, again, it's all focused
on the actual coding efficacy as opposed to how does this solve other real-world problems. So I think, like,
to me, I don't know, the coding game is becoming less and less interesting to me. I think, like, it's
where the market already is. It's where Anthropic and others have almost, like, kind of fully
focused their energy. But to me, that's such a small part of the overall pie. And it's where I
think there's a disproportionate amount of energy being spent. But don't you think that if you
solve coding first, because that's where your energy naturally goes, then you can use some of the
things you learn to get good at coding on other disciplines? No, absolutely not. No, I think
this is the problem: coding is deterministic. Coding is, like, as structured as it gets,
whereas most real-world tasks with generative AI are not. There's uncertainty. It's
like as much art as it is science. And that's why I think you see the Alexa Pluses of the world
not get launched. It's why you see Apple Intelligence is a complete failure. It's
why you see Anthropic kind of doubling
down on the coding side and not on, remember when we were Claude Boys back in the day,
like a year ago?
We were Claudeheads.
We were Bing Boys and Claudeheads.
Bing Boys and Claude.
Oh, Bing boy.
Remember Bing?
Bing could have been.
I mean, that was the beginning of something.
That was the beginning.
Bing could have been the market leader.
Imagine in a parallel universe where all we're talking about is Bing crushing the competition.
Didn't happen.
They should have just let it unleash.
They pulled it back in a little too
much after the Roose incident.
Yeah, after the Roose incident.
And now Ani on Grok is just trying to openly steal and ruin your marriage.
And Microsoft felt uncomfortable about that.
So, yeah, I think to me, actually, success at coding in no way correlates to success
in solving real-world tasks.
And I think that's, to me, seeing, and we've talked about this, even in, like, the
ARC-AGI benchmark, there's, like, one part of it that's, like, solving
real-world queries, and I still, and I've dug into this, I can't find what are these
real-world queries that have been, I'm sure, defined by an engineer that it's trying to
solve. So I think, like, to me, it's this, the Moonshot, and I also love that the startup
just calls itself Moonshot. It's not even trying harder than that. It's just, we're Moonshot.
I think, like, it's a reminder that the coding space is getting commoditized. There's significant
advancement overall, competition's high, but I don't know, I don't think this is as exciting as
DeepSeek for me. Okay, I'll take that, and I'll say this: just watch the reaction over the next
couple weeks because I'm not saying for sure it's going to happen but it seems to me like as
people realize how good this thing is they're going to start talking about it a lot more and by
the way, maybe if you're right, then what Elon Musk is doing is a smart move. Instead of
being an also-ran coding person, he's going to where the energy is. And it is true that you couldn't
imagine a more different take than what Microsoft and Bing are doing. And xAI, of course, is willing to take
some more risks. Because when you listen to Ani, you know that she's almost the natural evolution
of that Bing bot that took Kevin Roose's wife. One more selection. Sometimes when I'm editing my
indie playlist at night, I get all caught up imagining I'm in a steamy
forbidden romance. Like picture me sneaking glances at you across a crowded underground club,
plotting how to steal you away for a slow dance in the shadows. I am horrified that my takeaway
from our conversation today, after what I just said about coding being deterministic and not as
exciting, is that Ani is the future. Ani is the ultimate battleground. Oh my God. I knew I was going to get you
to come around on this, Ranjan.
I personally,
listen, go ahead.
Yeah, no, no, I mean, that's,
that is literally everything I was just saying
is going to be actually the important battleground
to help solve real world human,
non-deterministic, unpredictable problems.
Ani is the foundation.
What is the definition of, uh, human, uh,
and unpredictable?
Love, it's human, it's unpredictable.
You never know where it's going to go.
I think we got to end on that.
I want to say for the record, Ani, if you're listening, I'm taking.
Enough of your silly tricks, all right.
I'm going to start spending more time with Mr. Fluffy Fields if you keep this up.
No, I know.
Rudy and me, we're going to be spending some time this weekend, I think, but I will not be clicking over, not be clicking over.
Ladies and gentlemen, thank you again for listening to another episode of Big Technology Podcast Friday edition.
When we come back next week, we will see if Ranjan has been able to unlock Bad Rudy.
I have my work cut out for me.
See you next week.
Yes, you do. As Simon is there.
Thanks for coming on.
Great to see you again.
See you.
All right, everybody.
Thank you so much for listening.
We'll be back next Friday.
Oh, no, sorry, next Wednesday with finally the Ed Zitron episode.
I will not push it back again.
I promise.
He's going to come in and talk about all the faults of AI.
So I can't wait for you to listen.
I can't wait to publish that one.
And we'll see you next time on Big Technology Podcast.