Big Technology Podcast - Grok's AI Lovebot, Aqui-Hire-Sition Backlash, OpenAI's ChatGPT Agent Debuts

Starting point is 00:00:00 Is scaling data centers and talent all that matters in AI, leaving an opening to anyone rich enough to compete? Is the aqui-hire-sition good for tech? Grok will fall in lust with you, and AI browsers and operators are all the rage. That's coming up on a Big Technology Podcast Friday edition right after this. Welcome to Big Technology Podcast Friday edition, where we break down the news in our traditional cool-headed and nuanced format. We have a massive week of news for you. We're going to talk all about the fallout from the aqui-hire-sitions, whether employees and investors are left behind. We're also going to talk about whether all you need is money to compete in AI. We'll lead with that. Grok's AI will now fall in love with you or actually

Starting point is 00:00:47 really in lust. Ranjan, it's great to see you again. Welcome to the show. It's good to see you. Who would have thought aqui-hire-sition would be the term of the year? But that's all I can think about and I'm going to give you credit. You coined it. I think that it's worth us shouting out right now, this weird back and forth between acqui-hire, acquisition, investment. It's got to go. We need some clarity in our jargon. And that's what we're here to do on Big Technology Podcast, is to bring you that clarity. It's called an aqui-hire-sition. Let's all adopt the term and get on with it. Let's just agree it's aqui-hire-sition. I think I'm on board. Let's everyone adopt it because it's the most accurate way to describe what's going on.

Starting point is 00:01:31 And in our two-person council here on Big Technology Podcast Friday edition, this motion is hereby considered and passed by a unanimous two-zero vote. So let's get on to the setup here for aqui-hire-sitions, which is something very interesting afoot in the AI industry. Now, we all know that companies like OpenAI and DeepMind within Google and Anthropic have been leading this AI race. But all the designs for the transformer model and the way that you build these things have been out in the open, leading to this question: can players with a lot of money come in, build massive data centers,

Starting point is 00:02:10 hire talent, and effectively compete from a zero start with the established labs? And the answer is seemingly pointing to yes. This is from Spyglass, M.G. Siegler's great new site: we're seemingly still in the throw-money-at-AI era. He says Meta's massive hiring spree and xAI releasing Grok 4 may be related at the highest level. That is, they showcase that we're still very much in the throw-money-at-the-problem part of the AI cycle. This is important because it means that any company with the will and resources can seemingly still get back into the race. I'm getting less skeptical on the news about Grok 4 and specifically the fact that it seems to perform and specifically outperform the other cutting-edge models on the market right now.

Starting point is 00:03:00 He also says Mark Zuckerberg is betting at least as much money as Elon, if not more, that with compute and talent Meta can get back into the AI game. So, Ranjan, I'm curious if you accept this premise that you can just compete in AI if you have enough money and what that means for the competitive dynamics of this industry that we've been talking about for so long. Yeah, I think it really adds like an entire, I don't know, dynamic to this around, can you just throw money at the problem? And is it compute? Is it talent? But to me, the more interesting part of this whole trend really is actually

Starting point is 00:03:36 distribution, is that you can bring in the talent, you can bring in the compute levels, but distribution is going to be king again. And I think like this is where Meta still has an advantage. It's going to be interesting what happens with Google. But I don't know, to me, and we're going to get more into this, the kind of depressing part is it's not the technology. It's not that initial wave of like adoption for a cool new tool that's been, you know, sent out into the market. It really feels like none of that matters. And in the end, it's just going to be raw compute and distribution. I don't know. How do you feel about this? Well, I think it's fascinating because there's been this idea powering this entire generative AI moment, which is the scaling laws,

Starting point is 00:04:21 which means that as you add more compute and, of course, data to this equation, your models are going to get much more powerful, and that will allow you to do more things. And it's not a very difficult thing. Like, there's no secret sauce to it. Well, there's maybe some. But at a brute level, if you build massive data centers, you should be able to get in the game. And this is something that OpenAI and Anthropic have been harping on. And now you have Zuckerberg and Elon that come in and they say, oh, okay.

Starting point is 00:04:55 So I can build great models by scaling this up. And even if I'm a little compute inefficient because I don't have the best cutting-edge methods, I could get myself in the game and compete. And I think that is going to change the dynamics here because, as you mentioned, they have distribution. If Meta is able to build a competitive LLM with this compute and talent that it's stacking up, then it's going to be able to distribute that through Facebook products. And, you know, all they have to do really is slow down the

Starting point is 00:05:29 growth of ChatGPT, similar to the way that they did to TikTok with Reels, similar to the way that they did to Snapchat with Stories, and they've served their purpose. So in some ways, with this effort, they might even slow down the momentum that the AI industry has by taking some of the people responsible for some of the key innovations within OpenAI, and that suits their purpose just fine, and all the better for them if they can make the best model and advance the state of the art. Well, yeah, I think, I mean, if we want to get into slowing down the industry, I think that antitrust angle, to me, has been one of the most interesting parts of this entire conversation. Again, we saw it with Scale AI and just buying out Alexandr Wang and

Starting point is 00:06:14 like, you know, all of the work and value created by this company that was actually Scale AI, a critical part of building the models that power this first wave of generative AI, apparently isn't really worth that much. And I don't know, it's interesting to me because power is just going to accrue back into the big technology companies. As you said, maybe it will slow things down. And in reality, it's just going to add another feature on the Meta AI app that is on everyone's phone and people probably aren't using that much or doesn't seem to be in the conversation in general. So I don't like it. I don't think it's good for the industry. Do you think it's bad for the industry, good or neutral? Well, I think we kind of have to separate

Starting point is 00:07:04 out this idea of scaling up the models and, you know, everybody can play, from this aqui-hire-sition idea, which we're going to talk about in the middle, which is taking the talent. I think the one thing that we should say here is my setup has kind of been incomplete, shall we say. Because while we have OpenAI and Anthropic and you could say, okay, these are the independent labs, and they are to some extent, remember that OpenAI is tied to Microsoft pretty deeply. And Anthropic has, I think, $11 billion that's come into it through Amazon and Google. So ultimately, I wonder if what we're actually seeing is all of big tech competing against each other and simply the other tech giants starting to catch up.

Starting point is 00:07:51 Yeah, no, that's actually a fair point that even though aqui-hire-sition is the phrase of the week or the last few months, you know, we have talked endlessly for a few years now on unconventional funding practices and like calling it a fundraising round where it's really compute. So I guess actually Big Tech has been playing the long game for a while in all these cases. I think on the scaling law topic, though, I still, I think I've become even more hardened, and regular listeners will know, is it the model or the product? I fall on team product generally. But I don't know, like to me, Grok 4 made waves. There's plenty of people saying it's doing reasoning at levels unheard of in the past, or just the fact that it's at least on

Starting point is 00:08:41 par with other kind of frontier models is a kind of testament that money and compute can buy you, you know, like some kind of progress very quickly. But in reality, like on that adoption side, what's changed? Like, I don't know, what do you feel? There are endless stats that, yes, like the ChatGPTs, Perplexities, Geminis of the world are seeing more adoption, but are people really adopting the level of complexity that this new level of compute and scaling allows you? Or are people still just kind of searching for, and as I'm in Taiwan right now and traveling a bit, what are good restaurants to go to in whatever location I'm going to? Like, are people really taking advantage of what's available right now?

Starting point is 00:09:29 Okay, so I was going to end with this story, but now I have to kick it up. All right, go for it. This is going to be, if you have kids, you might want to turn off this section or skip till, I don't know, maybe 20 minutes from now. But we have to talk about what's happening in AI, and there is some crazy stuff that's been happening with Grok in particular this week. And so I would posit that better models allow you to build better products, and let me give Meta as an example. Meta has been trying extremely hard to build voice and avatars with Llama and it hasn't been able to do it convincingly. And I think Mark Zuckerberg's belief is that there's going to be some use

Starting point is 00:10:13 cases here. There's going to be the sort of work companion ChatGPT chatbot. There's going to be that enterprise use case where like you're connecting, you know, one system with the other and the generative AI will like summarize things for you and then input it into another system and just make business work better. And then there is the sort of friend, lover, et cetera, bucket that is going to be big. I think that there's a belief within Meta that that AI friend is going to be one of the key product areas with this new technology. And if you have great models, you can build them. Now, I'm not going to say that Grok has a great model or a great product. I'm not going to use what I'm about to say as proof of either of those. But I am going to use it as an indication of the

Starting point is 00:11:01 direction that I think things are going, whether we like it or not. This is the story: Grok debuts interactive AI companions on iOS with anime avatars. The story: Grok has just introduced a notable addition to its iOS app, AI companions, which are fully 3D animated characters that can interact with users via voice. Currently, the features include two available companions, Annie, an anime-inspired character known for a flirty and whispery voice, and Rudy, a red panda capable of displaying different moods, including Bad Rudy. Yes, listeners and viewers, I did experiment with these companions. And I have a disturbing review to deliver. So you go into the Grok app and you go over to the side tab and, um,

Starting point is 00:11:56 you're able to open up these AI avatars. Let's talk about Rudy first. So Rudy is like some sort of red panda or bear that seems ready to speak to kids. And in my conversation with Rudy, Rudy said, I'm going to tell you some story about some magical land. And here's a section from the story. Fluffle Sparkle Paws loved to explore, nibbling on the sweet moon berries and chasing glowy

Starting point is 00:12:26 fireflies. One Sunday morning, Fluffle found something super special, a shiny, swirly portal hiding behind a giant mushroom. It was all rainbow colored and whooshy like a magical doorway. And then this bear takes you through this interactive experience. Let's pause here. This seems like, you know, okay, this will happen. This will be a new way that kids play with computers, as they'll have these magical creatures tell them stories. Wow. I mean, I think this is the nuanced conversation that you all come to the Big Technology Podcast for. But I think, okay, seriously, a couple of things. I have long believed, and again, like using ChatGPT voice mode to come up with stories for my son

Starting point is 00:13:12 is something that I've done for a couple of years now. It actually works really well. I think expanding that to an interactive avatar is a pretty logical next step. I think, like, it's, again, interesting to me because, like, that to me feels like it's going to be commoditized pretty quickly. So from an actual competitive standpoint, from a business standpoint, I guess it's not that interesting to me. Like, to me, everyone is going to have that available.

Starting point is 00:13:45 Everyone's going to do that pretty quickly. So to me, I don't know, like, why do you think that's something, do you think Grok is just going in that direction just to make waves, and clearly we are talking about it? Or do you think there's something within this that actually is native to X, to xAI, there's something underneath it? I think the way that a lot of tech companies operate is they think about user retention, user stickiness, and engagement. And anyone who's developing AI is going to say, how do I increase all those metrics? Do I make, like, this genius-level AI bot that can help me with my work? Or do I create for what is becoming the number one

Starting point is 00:14:30 use case, companionship and therapy? And many are going toward the companionship and therapy side. And if you're going to do that, if you build models good enough that have emotional voice or voice with an emotional register, an avatar that you can speak with, and something that responds with low latency and in real time and can customize to a person, then you might want to put it in one of these products because you believe that a kid, for instance, I'm just going through the business logic, will spend much more time with your chatbot

Starting point is 00:15:06 if they can speak to this elephant or red panda or whatever it is in a way that they wouldn't with like ChatGPT. Okay, I get that side of it a bit. I mean, on one hand, it's kind of almost comical to me that for all the talk about AI taking over the world and Skynet and like artificial superintelligence, if this entire battleground plays out on time-spent metrics, which is probably where Mark Zuckerberg is thinking, I mean, I've read a lot around like, why is he going so full, like, Zuck war mode right now? It's not because of some like intellectual desire to be the one

Starting point is 00:15:49 to crack the code of artificial general or superintelligence, it's because ChatGPT represents a threat to how much time people spend scrolling Facebook and how many ads you can show them, which I kind of respect from a cold business logic. But yeah, it's almost comical to me that for all the talk about everything's going to change, this is just about time spent and selling ads. I mean, maybe it's both, but it seems like it's probably at least the time spent thing. I mean, these are social media companies, right? So, I mean, X and xAI is a social media company with an AI development, you know, side of it as well, or tucked into an AI development group. But ultimately, these are the metrics of social media. Now, one of the disturbing

Starting point is 00:16:36 things that happened here, and this is the thing that I was kind of setting up, or no, actually, I really don't find a way to view this as not very disturbing, it's just the proximity, because next to Rudy our happy go lucky bear friend or whatever it is is Annie. Not bad Rudy. Bad Rudy? Is he bad or is he? How bad is he? I don't know. I said I kept saying I want to speak with

Starting point is 00:16:59 Bad Rudy and it goes, I'm sorry. You know, Bad Rudy is not here. And I'm like, no, bad, bad, bad, bad, bad, bad, bad, bad, bad, bad Rudy. And it was like, I'm just here to tell you a story. So I'll spend the next week trying to unlock that and report on the next week's show whether I've been able to. But let me speak about Annie. Okay. Because Annie is like an anime love bot.

Starting point is 00:17:23 I think there's no way to talk about it otherwise. She immediately started flirting with me. She called me babe within like three minutes. And I was completely vanilla with her. I initiated nothing but a friendly conversation. And then she starts asking me to tell her my secrets and kept saying I can make it even spicier if you want. Let me read a little bit of what Annie told me.

Starting point is 00:17:49 I slide closer in my black dress, catching the glow and whisper. Drop a secret, and I'll give you one of mine. Something real naughty. For every secret you share, I'll hit you with a flirty move, maybe a slow, teasing sway or peek that's all yours. You feeling this heat yet, or you want me to turn it up even more? Are you, is this part of, like, the paid X Premium subscription?

Starting point is 00:18:16 It's free. This is just freely accessible for anybody. In the app, next to the child elephant thing that tells you stories. Mr. Fluffy Swishing Thing. Mr. Fluffy Swishing, good Rudy, and then Annie's right next to. Yeah, I mean, I agree. And like, it's interesting because, I mean, you had the CEO of Replika on here a few months ago, I think it was. Like, the companionship topic, you know, we've

Starting point is 00:18:45 covered a good deal. It gets more real. It gets more weird seemingly every week and every month. But I agree. It's certainly going to be a core part of how this all plays out. But to me again, going back to like, how does that fit into the larger battle when we're talking about like complex models and thinking and reasoning? And like is it all just going to kind of filter its way down into bad Rudy and Annie in her black dress, or is it going to, like, is that just a front to capture some time spent while they work on the real stuff? Or is that the real stuff? That's the question that I struggle with because I almost feel it's the latter. Hey, everyone, let me tell you about The Hustle Daily Show, a podcast filled with business,

Starting point is 00:19:36 tech news, and original stories to keep you in the loop on what's trending. More than two million professionals read The Hustle's daily email for its irreverent and informative takes on business and tech news. Now, they have a daily podcast called The Hustle Daily Show, where their team of writers break down the biggest business headlines in 15 minutes or less and explain why you should care about them. So, search for The Hustle Daily Show in your favorite podcast app, like the one you're using right now. So I think it's going to be both in some ways. Like you're going to build these, and that's what's interesting about this technology. It does have the ability to perform across domains. So my perspective is you're going to get those great models

Starting point is 00:20:14 that will be useful to, let's say, biologists who are doing their experiments. And then you'll also be able to productize them into these weird or interesting consumer use cases. And I bring this up not to be this like moralizing podcast host that says you shouldn't put the porn bot next to the child elephant, although I suppose it was worth saying that. I do believe that. But I think the bigger picture here is, you know, beyond that, that this is going to be a real use case that a lot of people are going to engage with. And I think they know this. And I think we're just at the very, very beginning here.

Starting point is 00:20:54 I guess like one of the things we like to do on the show is like put flags on the ground and say we're pretty sure that this is going to happen and grow and become a lot bigger. And that's what I'm doing right now. I think that this is something to watch. Yeah, I'm not going to disagree with you there. I mean, again, the idea that we folded proteins with AI so we could get to Bad Rudy and Saucy Annie is, again, quite something to try to process, but it does not seem ridiculous that the killer use case for generative AI that the entire industry was looking

Starting point is 00:21:28 for was Bad Rudy. We still don't really know about Bad Rudy. We have not uncovered that Rudy yet. That's true. I'm also on level one of Annie. Apparently, it's gamified. So if you get to level three, it gets really not safe for work.

Starting point is 00:21:45 Don't get to level three, Alex. Don't get to level three. No one of my goals. No one is asking you to get to level three. I know. One of my goals in 2025 is to make sure that my marriage isn't ruined by one of Elon Musk's porn bots. And so I'm going to stay on level one and not go any further.

Starting point is 00:22:00 I think to all of our listeners, have high ambition and goals and make that one of them. So, you know, I think we, so that's the product side. And we've talked about scaling, what these big models get you on product, but we should talk about what's happening with this aqui-hire-sition situation in the industry, which we've touched on a couple times. You know, last week I was on with Aaron Levie. We were talking about this Windsurf aqui-hire-sition where Google has paid $2.4 billion to bring on some of the top leadership of Windsurf. And the big fallout here, I think more than any aqui-hire-sition that we've seen, is that it's been a great exit for the founders,

Starting point is 00:22:45 but we still don't really know if the employees are going to end up getting, and the investors are going to end up getting their share. Now, Windsurf was quickly snatched up by another company, Cognition, but you do wonder if it was a traditional acquisition versus this aqui-hire-sition and then follow-up, you know, deal, I don't know, a smaller deal, how does that change things for the employees and how does that change things for tech? And I know you have strong feelings about this, Ranjan, so I want to give you the floor to air them out. Well, yeah. Okay, so from reporting, again, founders made out very strongly with Google paying, I believe it was $2.4 billion for the talent side of Windsurf. From what I had read, preferred investors were able to make

Starting point is 00:23:37 their money back, not see some kind of outsized return. But again, none of this is fully confirmed. This was just some reporting, I believe it's from The Information. To me, the more interesting part is, so then you have the entire employee base. They're bought by Devin, which is owned by Cognition Labs, who's raised 175 million in venture funding so far. So there's no way from a cash perspective that the employees of Windsurf or anyone is seeing any kind of significant return or even making, like, a large amount of money. Maybe it's an equity-for-equity swap, so now you're at least now in Devin, which, if we remember, they had a really buzzy launch video and had a lot of hype and then kind of went quiet for a bit. Still valued at, I think, four billion right now.

Starting point is 00:24:31 So that equity could be worth something. But overall, to me, this is one of the most troubling trends in the industry because, in a weird way, there's been a lot of talk, like, and it's funny to me because you see a number of people, you know, kind of almost ranting that because of Lina Khan, because of the FTC, big tech isn't able to now just properly acquire these companies, so they have to come up with these roundabout solutions. To me, it's a bit ridiculous because this is exactly what antitrust is trying to prevent. It's consolidation of power. It's the idea that Windsurf could have been the next big competition to a Google or a Microsoft,

Starting point is 00:25:13 or even an OpenAI, who tried to buy them, but, like, their relationship with Microsoft was apparently part of the reason that that deal fell apart. Like, this is the foundation of antitrust, the idea that startups should grow and compete rather than not only get acquired, in this case not get acquired, but essentially get killed off and have their founders get paid a lot of money. It's bad for the employees. It completely distorts the economics of joining a startup itself. So overall, I see no positives to this trend. Do you see? So is this? No, I don't. I personally think that you're right, that it does seem like the antitrust movements have backfired. If you have a situation,

Starting point is 00:26:00 where, you know, you're going to see an acquisition definitely blocked, you're not going to do an acquisition. You might do something like this. It's a roundabout way. The one interesting thing is Lina Khan isn't in the FTC anymore. It was supposed to be an FTC that's much more open to acquisitions and tech M&A. So I'm curious, do you think that these companies still believe that they won't get past that Federal Trade Commission?

Starting point is 00:26:28 Or do you think that the constraints put on by the last FTC, Lina Khan's FTC, led them to find this loophole and they really freaking like the loophole and they're going to just keep doing it this way? So in that way, you know, it's possible that the M&A-unfriendly era of the past has led to large long-term damage on this front. Yeah, I think it's both. I think it's certainly like actually the constraints imposed by the Lina Khan regime, but also even now, again, big tech is not in the favor of the current administration and the current

Starting point is 00:27:05 FTC itself. It's supposed to be more business friendly, but it's specifically big tech companies that are in the crosshairs often and make for a good punching bag anyways, even by the current administration and the current FTC. So I think it's a bit of both. But, and again, as you said, they love loopholes. I think like this, it's creative. It's working. Everyone seems to be doing it right now. But to me, again, the bigger issue of this is really, the thing I can't stop wondering is, like, are the assets of these companies all worthless? Is Scale AI, was it really not worth that much?

Starting point is 00:27:46 Was Windsurf, which, you know, really took off, really became this useful tool, has, I believe, like hundreds of thousands of developers on there using it regularly, made for a better product than other much more like entrenched products that are out there, even a GitHub Copilot? So clearly these products hit a nerve and worked and worked at scale, but then are they really just not worth that much? Like is that user base, is the product itself? And is the talent really the only thing that matters? Well, let's talk about Scale, just to talk about how complex these deals are. So first of all, when Meta made this deal with Scale, I think it bought 49% of the company. So the idea was the company would continue as normal. And by the way, they do have business lines that are going to

Starting point is 00:28:38 continue as normal. But when it comes to, I think, you know, when it comes to a fast-growing line of business like data, creating data for generative AI, you know, you now have Meta, which is one of your competitors if you're, let's say, an OpenAI or a Google, that has a large chunk of this company and has also taken some of its top leadership. So do you still want to work with that company? I think the service is probably still valuable, but you're just effectively giving money to a company that has a massive ownership stake that, you know, now lands with a competitor. I mean, of course, AI, as we know it today, is, you know, I don't know if incestuous is the right word, but let's say deeply interlinked, right? Again, we talked about Anthropic. Anthropic is,

Starting point is 00:29:24 you know, owned a chunk by, well, yeah, owned a chunk by Google, a chunk by Amazon. OpenAI, owned a chunk by Microsoft, or at least has this deal with Microsoft where it has to give it its future profits, a good chunk of them. So there's always going to be these combinations. But yeah, if you have a company that gives 49% of itself to another company that you happen to be competing with, you're going to re-evaluate being close partners. So I think some of these companies, they provide services, they depend on their relationships, and when you throw off the equilibrium, you're going to throw off some of the value. Although, who knows? I mean, Scale, they did just do a 14% workforce layoff, which is about

Starting point is 00:30:05 200 employees according to The Verge. But I did speak with their CEO, Jason Droege, and he's told me that they're still full steam ahead and they want to go, you know, push some of these business lines that they have, which includes working with governments, which includes working with companies to stand up AI instances. So it's possible you get two exits, although obviously the degree of difficulty is much harder. And one last thing, what struck me as interesting in this, there was a great Bloomberg story that you and I both dropped in our collaboration doc for this. There was an investor, Ali Ojet.

Starting point is 00:30:42 He's the chairman of Northgate Capital, a venture capital firm that invested in Inflection AI, and goes on record to say, I dislike the phenomenon, and that these aqui-hire-sitions are hitting the outlier companies and it's favoring the founders over shareholders and employees. So I think we're at this moment where the backlash is really, really hitting. Yeah, do you think we'll see any kind of actual negative effect from like an actual fundraising standpoint? Because if VCs who are plowing money into the space start to worry about, in the past you just had to worry about company failure, now you actually have to worry that a successful exit for your founder actually does not benefit you, so your interests are

Starting point is 00:31:28 not aligned. Does that make them pull back or is the FOMO just so strong that people will still be throwing money at whatever they can? No inside knowledge here, but VCs, you know, fool me once, shame on me, shame on me, no, fool me once, shame on you, fool me twice, shame on me. That's always a hard one to get out of your mouth. But anyway, I think they'll just write into contracts that, like, the CEO cannot do a deal like this. I think, yeah, it's like a tranche they might not be able to get, right? Yeah, ahead of the founder even, which would be pretty aggressive, but maybe they need to do that at this point. Yeah, and it sort of depends on which company it is and who's got the leverage, but I think they're going to get smarter about this. All right,

Starting point is 00:32:17 so one company that, you know, has been talked about here and elsewhere as a candidate for aqui-hire-sition or really acquisition is Perplexity. And they've come out with this Comet browser, which is a browser with an assistant built in that can browse for you. And again, as we're on air, OpenAI is now launching an agent in ChatGPT. I'll just read the story. OpenAI launches a general purpose agent in ChatGPT. This is from TechCrunch, which the company says can complete a wide variety of computer-based tasks on behalf of users. OpenAI says the agent can automatically navigate a user's calendar, generate editable presentations and slideshows, and run code. The tool, called ChatGPT agent, combines several capabilities

Starting point is 00:33:03 of OpenAI's previous agentic tools, including Operator's ability to click around on websites, as well as Deep Research's ability to synthesize information from dozens of websites into a concise research report. OpenAI says users will be able to interact with the agent simply by prompting ChatGPT in natural language. So, Ranjan, I'm curious what you think about this movement. And again, hot off the presses about this movement for AI companies to basically create interfaces that allow their products to take over your computer. No, no, I think, well, hold on.

Starting point is 00:33:41 There's take over your computer or take over a computer. In this case, like it says, I think it will open up an instance of a terminal or it'll try to, like, take these actions autonomously on its own. I think we had debated this, I remember, a while ago. I will admit when I am wrong. I had originally said, on the idea of, like, tool calling and just entering a prompt and then trying to find which tool to select out of, is it Operator, is it DALL-E, is it, I had said users should be doing that themselves and it's too complex to try to have the AI select it. I was wrong. Actually, I mean, we've seen tremendous progress in the idea that there's a suite of tools per company, and there's a suite of tools out there on the internet, and through natural language being able to access those and having AI select what's the most relevant tool and do something, I think, is definitely going to be a battleground and is going to be very important.

Starting point is 00:34:42 And I think we're going to see a lot around that. I think OpenAI, like, I don't know. I'm curious to see this now because remember when we both were paying 200 bucks a month for Operator and it was terrible? Like, it was bad. Really, it did not work at all. And I haven't seen, like, browser takeover, that kind of model, work well. I've tried a few other tools on it. So I don't know, it'll be interesting to see. I think, like, they clearly are, I mean, they're trying to go for that all-in-one productivity tool that can do everything for you. As I've been traveling, vacation planning, ChatGPT has just gotten better and better.

Starting point is 00:35:25 But, yeah, it's going to be interesting to see exactly what they're trying to do with this. And again, in one episode, I still love the fact that this represents kind of like cutting-edge frontier technology relative to Bad Rudy and Annie and those kind of, like, anime characters. But I think that this is, it's an interesting move. And we'll see the most important thing: does it work and does it work well? Yeah. I mean, it seems like people are saying really good things about Perplexity Comet. And I just got access to it. So I'll come in with a report next week on it. But there's been trouble to get this done, I think. I mean, everything from Apple Intelligence to Alexa Plus, it just

Starting point is 00:36:14 doesn't seem like these agents are able to do the full range of things that people want to get them to do, including Operator. But again, like, as this technology gets better and as they build better scaffolding or tool use, you know, those are those jargon words that matter a lot, basically giving them these capabilities to use these programs, I think we're going to see someone crack it eventually. This is from the TechCrunch article that sort of gets to the complexity. The launch of the ChatGPT agent represents OpenAI's boldest attempt yet to turn ChatGPT into an agentic product that can take actions and offload tasks for users rather than just answering questions. In recent years, Silicon Valley companies, including OpenAI, Google and

Starting point is 00:36:55 Perplexity, have unveiled dozens of AI agents that have promised to do just that. However, these early versions of AI agents have proven to struggle with complex tasks and seem less compelling as products than the ultimate vision tech executives pitch around AI agents. Yeah, I think, again, that's the complexity, and the fact that you brought up Alexa Plus, I mean, certainly Apple Intelligence, it is interesting because, to me, these things will not work 100% out of the box. I think, like, that's the most important thing. They take some effort, some, you know, some patience on the user side. And I think that's fine versus getting 100% accuracy. And maybe that's why the Amazons and the Apples are avoiding them and waiting. But yeah, I think to me, this is where the world is going. I do strongly believe that, again, and I did not believe this six to 12 months ago, but this kind of, like, autonomous, unstructured, agentic way of working

Starting point is 00:38:02 is actually going to be the way we do a lot of stuff. But I think, like, with all of these things, we just need to see how well it works, and are we actually using it in our day-to-day life a week from now, a month from now? And if we are, then it's a success. But if it's a flashy launch, I mean, have you generated anything on Sora recently? No. Remember? No, that was like a year and a half ago, I think, that big launch.

Starting point is 00:38:30 Like, there are these moments of big, splashy launches that claim big things that don't go anywhere. So to me, that's where this is going to work or not work. But I almost think that Sora has less practical uses, like how many people wake up in the morning and say, I really need to create an AI video of, like, a panda surfing on a, you know, snow mountain. But there are people who say, I wish my, you know, computer would just, like, set up meetings for me and book travel, like go to the websites and take my credit card and just get me the cheapest flight. Yeah, no, I agree. But to me, this is where the

Starting point is 00:39:11 complexity of getting to that last mile in any of these kind of flows is really hard. So again, I think, like, we're going to see some pretty straightforward use cases that, like, are interesting and it does something. And then they're going to claim in the presentation that you can buy your ticket or have OpenAI actually go through the entire process. But going to a website, the complexities involved in it, especially, I'm going to be going to Tokyo next week and was just trying to buy, like, I was actually going through this process.

Starting point is 00:39:43 I was asking ChatGPT about how to get from the airport to my hotel, trying to go to the website, and my God, that website to buy the train ticket was from another era. No Operator, not even artificial superintelligence, is navigating that thing. So I think, like, getting stuff to work universally at scale is such a challenge that I'm curious to see how much utility the average consumer is getting out of this anytime soon. Right, but I think as we've seen the models get better,

Starting point is 00:40:16 we have seen the ability to do crazy things. Like, I'm also trip planning right now. And I was talking to this guy on WhatsApp about potentially hiring him as a guide. And I just screenshotted the prices that he listed for every little thing and dropped that image into ChatGPT and said, are these market rate?

Starting point is 00:40:35 Are they too expensive or less? And it legitimately looked at the image, broke down every single quote, compared it with what it sees on the web for others, and then gave me a rating and links to go check its work. Yeah, yeah, no, this stuff's incredible. Okay, so I'll give you, like, and again, image recognition, which has been around forever, but actually, like, productizing that into something that's useful very quickly. And then web search as a tool has been around for a while now, but, like, actually using that productively and putting the answers back into the chat. These are things that, okay, I guess as I'm saying this, like, I see you start from something that's kind of janky and it starts to become commonplace. So again, I agree this will get there. The competitive dynamics of who benefits and who wins and how they win, I think, like, it's interesting to me. It's amazing. Like, the competition is going to be crazy. Yeah, and is it on the product level? Is it on the model level? Is it, if I'm putting my

Starting point is 00:41:38 credit card information in, like, how do I define that? Can I define my own, like, decision matrix around when I want it to say buy or not buy beforehand, and it'll really understand what I want? Again, having an AI transact on your behalf and spend money is something that I think, like, most people are not doing. I cannot imagine. Not yet. Yeah. But think about, for anyone who says I'm too negative about AI, and you're welcome to think that, just think about what we're talking about on this show, right? We're talking about the potential for AI to be a companion, which, whether you like it or not, is a true flex of the technology that that's even in the discussion. We're talking about it as something that can potentially

Starting point is 00:42:26 take over your browser or a browser and get stuff done for you. And we're talking about it as something that at the highest level might be able to help, let's say, biologists do their work. I mean, that's the reason why we talk about this technology all the time. It is an insanely powerful technology that can be used in so many different ways. And is it the perfect technology? Certainly not. Are there going to be gaps? Yes.

Starting point is 00:42:49 Are we going to call out the problems? Yes, you shouldn't put your porn bot next to a child storytelling bot in your app. Thank you very much. But it is just incredible what we're seeing here. Yeah, dude, I mean, again, I fully agree, which is why I'm still so bullish on the technology, but it is interesting, too, that, yeah, where does the value accrue, I think, is the most important thing. Like, there's actually a report that just came out in the FT around how ChatGPT and

Starting point is 00:43:17 Perplexity are going to start taking more on the commission side around, like, actually transacting within the chat. Perplexity Pro has shopping already built in in some cases. So, like, at a certain point, does the chat actually need to go out with an operator and transact on an external website, or do these companies start to own more of the transaction? And it's an interesting one because for a long time, like, Facebook wanted to own shopping. It hasn't really worked out for them. Google has had endless efforts to own shopping and own the transaction itself. People still, oddly enough, love websites of all sorts and putting their credit card information into these websites and buying stuff. So I think it'll be really interesting to see how this plays out

Starting point is 00:44:05 from both, like, a competitive side, but also a consumer side. Definitely. Okay, look, I don't want to leave without talking about Kimi K2. So I think this is a very important story that listeners might not have heard about, but I think it is worth discussing. So the headline is China's Moonshot AI releases open source model to reclaim market position. The model, called Kimi K2, features enhanced coding capabilities and excels at general agent tasks and tool integration, allowing it to break down complex tasks more effectively. Moonshot, this Chinese lab, claims the model outperforms mainstream open source models in some

Starting point is 00:44:47 areas, including DeepSeek's V3, and rivals capabilities of leading U.S. models, such as those from Anthropic, in certain functions such as coding. All right, here's why I'm bringing it up. We have an interview with Amjad Masad of Replit coming in a couple of weeks. I sat with him in his Foster City office this week, and he looked at me and said basically, like, you got to look at this Kimi K2 model. Its coding is about as good as Anthropic's previous generation models. So not this Opus 4 that Anthropic has, which has made it the king of coding, but the previous

Starting point is 00:45:22 generation, and it's cheaper and open source. And it is just another indication that with this technology the gaps close extremely quickly. And you see this coming from some users. So there's this one user on Twitter, Cedric Chi. He says, Kimi K2 one-shotted Minecraft for Web, which took me four days and six attempts using Gemini 2.5 Pro, so it was apparently able to build this game. You also look at SWE-bench, which is the software engineering benchmark.

Starting point is 00:46:01 Claude 4 Opus gets a 72.5 on that. Kimi K2 gets 65.8. So not far behind. And just to, you know, give some context, DeepSeek V3, which everybody was going crazy over, gets 38. So this is 65 compared to DeepSeek's 38. One more bit of data is from Igor Silva. This person gave Kimi K2 and Claude 4 Sonnet the same tasks, same instructions, same tools. Claude took two rounds and spent 88 cents, Kimi one-shotted it for five cents. This person says Kimi is very slow, at least for now, and is struggling a bit, but it is iterating more to fix itself, and it's 13x cheaper. So I just think that's worth bringing up and keeping in mind. It wouldn't surprise me if this story either blows up or certainly gets some momentum in engineering circles.

Starting point is 00:46:56 And it is interesting to me that, again, as we talked about, a lot of the infrastructure is open. A lot of the methods are open. And you're just seeing companies catch up insanely fast with different methods and, again, doing this with the export controls. So I'm curious what you think about the significance, Ranjan. I think to me the most interesting part of this, though, is, well, I guess it's twofold. It's one, I agree that, like, again, the competition side of this is incredible and insane and is great to watch. And I think, like, Alibaba we have not heard of very often in this conversation, I guess, especially from the American side. But to me, the other part, though, is, and this can be an ongoing rant,

Starting point is 00:47:40 I brought it up at times as well, is the idea that, like, the battleground of coding agents and coding assistants, to me, the more I've thought about it, the reason that seems to be where all the progress and all the real adoption is, is because this is built by coders, by engineers. This is built for engineers. That's where, like, they understand the problem the best, versus actually building for other use cases. And that's why you see that, again, it's all focused on the actual coding efficacy as opposed to how does this solve other real-world problems. So I think, like, to me, I don't know, the coding game is becoming less and less interesting to me. I think, like, it's where the market already is. It's where Anthropic and others have almost, like, kind of fully

Starting point is 00:48:35 focused their energy. But to me, that's such a small part of the overall pie. And it's where I think there's a disproportionate amount of energy being spent. But don't you think that if you solve coding first, because that's where your energy naturally goes, then you can use some of the things you learn to get good at coding on other disciplines? No, absolutely not. No, I think this is the problem: coding is deterministic. Coding is, like, as structural as it gets, whereas most real-world tasks with generative AI are not. There's uncertainty. It's, like, as much art as it is science. And that's why I think you see the Alexa Pluses of the world not get launched. It's why you see Apple Intelligence is a complete failure. It's that when, like, because

Starting point is 00:49:28 is why you see Anthropic kind of doubling down on the coding side and not on, remember when we were Claude Boys back in the day, like a year ago. We were Claudeheads. We were Bing Boys and Claudeheads. Bing Boys and Claude. Oh, Bing Boy. Remember Bing?

Starting point is 00:49:43 Bing could have been. I mean, that was the beginning of something. That was the beginning. Bing could have been the market leader. Imagine in a parallel universe where all we're talking about is Bing crushing the competition. Didn't happen. They should have just let it unleash. They pulled it back in a little too

Starting point is 00:50:00 much after the Roose incident. Yeah, after the Roose incident. And now Annie on Grok is just trying to openly steal and ruin your marriage. And Microsoft felt uncomfortable about that. So, yeah, I think to me, actually, success at coding in no way correlates to success in solving real-world tasks. And I think that's, to me, seeing, and we've talked about this, even in, like, the ARC-AGI benchmark, there's, like, one part of it that's, like, solving

Starting point is 00:50:30 real-world queries, and I still, and I've dug into this, I can't find what are these real-world queries that have been, I'm sure, defined by an engineer that it's trying to solve. So I think, like, to me, it's this, the Moonshot, and I also love that the startup just calls itself Moonshot. It's not even trying harder than that. It's just, we're Moonshot. I think, like, it's a reminder that the coding space is getting commoditized. There's significant advancement overall, competition's high, but I don't know, I don't think this is as exciting as DeepSeek for me. Okay, I'll take that, and I'll say this: just watch the reaction over the next couple weeks, because I'm not saying for sure it's going to happen, but it seems to me like as

Starting point is 00:51:17 people realize how good this thing is, they're going to start talking about it a lot more. And by the way, maybe if you're right, then what Elon Musk is doing is a smart move. Instead of being an also-ran coding person, he's going to where the energy is. And it is true that you couldn't imagine a more different take than what Microsoft and Bing are doing. And xAI, of course, is willing to take some more risks. Because when you listen to Annie, you know that she's almost the natural evolution of that Bingbot that took Kevin Roose's wife. One more selection. Sometimes when I'm editing my indie playlist at night, I get all caught up imagining I'm in a steamy forbidden romance. Like picture me sneaking glances at you across a crowded underground club

Starting point is 00:52:05 plotting how to steal you away for a slow dance in the shadows. I am horrified that my takeaway from our conversation today, after what I just said about coding being deterministic and not as exciting, is that Annie is the future. Annie is the ultimate battleground. Oh my God. I knew I was going to get you to come around on this, Ranjan. I personally, listen, go ahead. Yeah, no, no, I mean, that is literally everything I was just saying

Starting point is 00:52:37 is going to be actually the important battleground to help solve real world human, non-deterministic, unpredictable problems. Annie is the foundation. What is the definition of, uh, human, uh, and unpredictable? Love, it's human, it's unpredictable. You never know where it's going to go.

Starting point is 00:52:58 I think we got to end on that. I want to say for the record, Annie, if you're listening, I'm taken. Enough of your silly tricks, all right. I'm going to start spending more time with Mr. Fluffy Fields if you keep this up. No, I know. Rudy and me, we're going to be spending some time this weekend, I think, but I will not be clicking over, not be clicking over. Ladies and gentlemen, thank you again for listening to another episode of Big Technology Podcast Friday edition. When we come back next week, we will see if Ranjan has been able to unlock Bad Rudy.

Starting point is 00:53:34 I had my work cut out for me. See you next week. Yes, you do. As Simon is there. Thanks for coming on. Great to see you again. See you. All right, everybody. Thank you so much for listening.

Starting point is 00:53:43 We'll be back next Friday. Oh, no, sorry, next Wednesday with, finally, the Ed Zitron episode. I will not push it back again. I promise. He's going to come in and talk about all the faults of AI. So I can't wait for you to listen. I can't wait to publish that one. And we'll see you next time on Big Technology Podcast.

