Big Technology Podcast - Is ChatGPT The Last Website?, Grok’s System Prompt, Meta’s Llama Fiasco

Episode Date: May 16, 2025

Ranjan Roy from Margins is back for our weekly discussion of the latest tech news. We cover: 1) ChatGPT ranks No. 5 among all websites worldwide 2) ChatGPT is the only website among the top ranked by Similarweb that is growing 3) How do chatbots get information if they replace the web? 4) Grok's 'white genocide' messaging campaign 5) What's in a system prompt, with a look inside Grok's 6) The truth about Timothée Chalamet 7) Filing stories directly into ChatGPT? 8) Meta slams into big problems in its Llama AI program 9) Does it matter if scaling is done? 10) IBM survey shows generative AI ROI is hard to come by despite interest 11) Cohere's revenue trouble 12) Perplexity integrates with PayPal 13) A look at the event calendar ahead --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount for Big Technology on Substack? Here’s 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b Questions? Feedback? Write to: bigtechnologypodcast@gmail.com

Transcript
Starting point is 00:00:00 ChatGPT looks like the last website on Earth that's growing. What does that mean for the rest of the web? Plus, Grok starts spewing unprompted propaganda and reveals its system prompt, and Meta's Llama project is in some serious trouble. That's coming up on a Big Technology Podcast Friday edition right after this. Welcome to Big Technology Podcast Friday edition, where we break down the news in our traditional cool-headed and nuanced format. We have a major show for you today, where we're going to talk about some new data that we've
Starting point is 00:00:29 gotten about ChatGPT's ascent in the worldwide ranking of websites. We're also going to talk about the ratio of pages crawled to clicks sent, according to some new data from Cloudflare. Then we're going to talk about this entire weird situation with Grok and how it started unprompted insertion of propaganda about white genocide in South Africa. And we're not going to really talk about it from political ends; it just shows a lot about what's going on with these models. And then finally, we're going to talk about Meta's Llama project, the fact that Behemoth, its latest, largest model, is going to be delayed, and of course that's just one of the latest delays that we've seen from the large models, and what that means about scaling. Joining us as always on
Starting point is 00:01:15 fridays is ron john roy of margins ronjohn good to see you welcome to the show good to see you the web is uh the web is even deader than it was two weeks ago apparently yeah so this is some amazing data that's coming from similar webs, Sam Altman just actually referenced it in his testimony, his testimony before U.S. Congress, and you take a look at it, and it is fascinating. So, first of all, chat chipit is the number five website in the world, according to similar web. You have Google first, then YouTube, Facebook, and Instagram, and then number five is chatchipt. So that in and of itself is a very interesting development. But the other thing that is really worth calling out. Now, of course, this is desktop, and, you know, we know everybody's moving to
Starting point is 00:01:57 mobile. But if you look at the traffic change month over month: Google, YouTube, Facebook, Instagram, all going down. ChatGPT, up 13% month over month. Then everything else that follows, X, WhatsApp, Wikipedia, Reddit, Yahoo Japan, all going down. And so ChatGPT stands alone here. And that leads me to, sort of, the title of our first segment here: is ChatGPT the last website? And, you know, I was thinking, is this a little hyperbolic? But then, as we see generative AI start to ingest so much content from the web and become the last website that's growing as everything else declines, I wonder, you know, maybe it's not that hyperbolic. What do you think, Ranjan? I don't think it's hyperbolic at all. And I think
Starting point is 00:02:43 it gets into that central question of, as these generative AI destinations become more ingrained in our lives. And I certainly know for myself, that's the case. Where do they get the content from is going to become one of the biggest questions for all content up to today. And looking back, they're pretty good. But if they have no content to ingest, then what happens? But overall, I think it's definitely, it's a better way to consume information. I think it's really hard to argue with that. So what does this overall system look like? What does the web look like? I mean, we got to figure that out fast. Otherwise, I mean, just to save Yahoo Japan, because we got to save Yahoo. Japan, I know. Yes. Shout out Jim Lanzone and the Yahoo crew. Keep that jewel going. And look,
Starting point is 00:03:37 I think that we're starting here this week because it's going to become really important when we talk about who shapes generative AI if it sort of ingests everything else and how they shape it and what values. And another data point that I found was very interesting when it comes to like whether these chatbots are the quote unquote last websites is Cloudflare, which is a security company that helps keep websites up. On their recent earnings call, Matthew Prince, the CEO, was talking a little bit about the amount of pages, each one of these services crawls to the amount of visitors that it sends to websites. And these numbers are fascinating and we have to talk about it. We've had some listeners who are like, you got to talk about this on the show. And they were
Starting point is 00:04:23 absolutely right. So this is what Prince said. I would say there's one area which we're watching pretty carefully that involves AI and media companies actually. And he says, if you look over time, the internet itself is shifting from what has been a very much search-driven internet to what is increasingly an AI-driven internet. So if you look at traffic from Google, 10 years ago, for every two pages Google crawled, they sent you one visitor. Six months ago, that was up to six pages crawled, one visit. And the crawl rate hasn't changed. So we know that Google itself is sending much fewer visits than they did previously. Now this is where we get into generative AI, and this gets crazy. He says, what's changed now is 75% of the queries
Starting point is 00:05:08 to Google, Google answers on Google without sending you back to the original source. But even in the last six months, the rate has increased further. Now it's up to 15 to 1, so 15 crawls for every visitor. So Google in six months has gone from 6 to 1 to 15 to 1. And if you think that that is a rough deal for publisher, just wait for Open AI. Open AI, I think he says, is 250 to 1, and Anthropic is 6,000 to 1. Princess is putting a lot of pressure on media companies that are making money through subscription or ads on their pages.
Starting point is 00:05:45 A lot of them are coming to us because they see us actually as being able to help control how AI companies are taking their information. I'm starting to feel a lot better about this chat chip PT as the last website type of approach. Now chat chitee, of course,
Starting point is 00:05:58 is sending more traffic to pages, but certainly not anywhere close to Google in the heyday or Google just six months ago. Yeah, and just to clarify, it is 250 to 1. I just double check that. Yeah, 200, open AI, 250. 50 mentions of a site relative to one direct traffic sent to the website,
Starting point is 00:06:20 Anthropic, 6,000? I mean, the... 6,000 crawls to one visit. 6,000 crawls to one. That is just not fair. I mean, you can talk about it, but that is not a fair exchange of value. No, no, I mean, not even close. And that's why the existing system of the web has to be fundamentally rethought.
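The arithmetic behind that exchange is worth making concrete. Here's a quick back-of-the-envelope calculation using the crawl-to-visit ratios Prince cited on the call, treating each as pages crawled per one visitor referred (the per-10,000 framing is just ours):

```python
# Crawl-to-visit ratios quoted from Cloudflare's earnings call, expressed
# as pages crawled per one visitor sent back to the publisher.
ratios = {
    "Google (10 years ago)": 2,
    "Google (6 months ago)": 6,
    "Google (now)": 15,
    "OpenAI": 250,
    "Anthropic": 6000,
}

CRAWLS = 10_000  # a hypothetical 10,000 pages crawled from one publisher

for source, crawls_per_visit in ratios.items():
    visitors = CRAWLS / crawls_per_visit
    print(f"{source}: ~{visitors:,.0f} visitors per {CRAWLS:,} pages crawled")
```

On those numbers, the same 10,000 crawled pages that once came with 5,000 Google visitors would come with roughly 40 from OpenAI and fewer than 2 from Anthropic, which is the "not a fair exchange of value" point in a single line of math.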
Starting point is 00:06:41 Like, it just doesn't work in this paradigm. and you see it in these numbers again if google used to be six to one it's that's what the entire advertising ecosystem was built on that's why people were incentivized to publish stuff and that's why all these websites were created so what happens next like what where do you think this is going i have some ideas about what this this the the economic system of the post web might look like but where do you think it goes so i think one question here is the economic question, and I definitely want to get your perspective on that. But the other question is the influence question. Okay, so for those who don't know, when people
Starting point is 00:07:22 were asking questions to GROC, which is the chatbot that Elon Musk's XAI has produced with, as we've noted on the show, many times, a shit ton of GPUs in their Project Memphis supercomputer. GROC, unprompted, started responding with unsolicited mentions of the fact that there's a white genocide going on in South Africa. And so this is sort of, I'll just read the quick headline, the Guardians, Musk's ex-AI Grockbot rants about white genocide in South Africa in unrelated chats. When offered the question, are we effed by a user on X? The AI responded, the question, are we effed, seems to be, it seems to tie societal properties
Starting point is 00:08:05 to deeper issues, like the white genocide in South Africa, okay? That's the experience people got. And now this is the thing. if we're in this moment where these chatbots are the last websites well the nice thing about the web you know for all its faults for all the pop-ups and bullshit we deal with is that you go to a variety of different sites and ideologically they're all very different and even if you're on social media you're clicking out and you're getting these various different ideologies the thing is what all these chatbots have a often hidden system prompt and they have an ideology one way or the other
Starting point is 00:08:40 sometimes most of the times not as overt as this and that to me is the risk about these things becoming the last website is that you're not a hundred percent sure where they're going to steer you and sometimes it's going to look pretty obvious like when you say are we effed and it says by the way have you heard about the white genocide in south africa then you know something is happening but there's a lot more subtle stuff that can happen underneath the surface and that's what's really set the alarm bells for me uh this week okay No, no, I see the connection there, and I do think that, yeah, okay, so if we're looking at there's only six websites in the world, maybe chat GPT is not the last one.
Starting point is 00:09:23 It's one of six or seven, let's call it. It's a real problem. It's a huge problem. It's from a pure kind of like information health standpoint, it's far worse than anything we have seen, including the 2010's Facebook news feeds and whatever else. It is kind of dangerous, especially if they're opaque. Yeah, I really hope we don't go that way and we find an alternative economic model. I think what you said about system prompts is this is actually one of the most interesting parts for me
Starting point is 00:09:54 because it's so weird for me when it comes out that there is a very simple system prompt, maybe sometimes a little bit complex, but there's someone choosing to put words into a system prompt to drive the entire personality of the chat bot. I think when was it two weeks ago, we had sycophantic open AI chat GPT. Yeah, talk about that. Talk about that. Yeah, so basically chat GPT,
Starting point is 00:10:21 I think it was at the 4-0 or whatever it's at now, 4-1, it started to, and we noticed at first, we talked about this on the show. It started to be more conversational. It started to sound less AI-E. And like, you know, it started to feel a little more. more natural in the way it responded to questions. Suddenly people started noticing anything you said, it was like, that's a great question, Alex. You know, you make such a good point. And the big worry
Starting point is 00:10:51 around that was it's like the classic UX incentivization problem where if you want people to use it more and you're going to be measured on repeated chats, additional chat after first prompt, obviously, if you kiss someone's ass, they're going to be more likely to keep that conversation going versus it comes back at you like, how dumb are you? What kind of quote? Who would ask that question? But does it, I mean, it's a pretty twisted part of that overall experience if you start thinking about that. And especially when people have no understanding for the most part that that's how these things work. So and then, I mean, this case is just kind of an, as as Grock is want to do is more of an off-the-rails example of system prompts gone wrong. But it's true that
Starting point is 00:11:44 underlying every single answer, you know, like executed by any of these bots, is a prompt that a person or a group of people sat down and decided this is going to be the personality of this system. Right. I think it's so important that we talk about it this week because we, A, have a real example of this thing going off the rails. And B, Grock actually printed out their system prompt or XAI printed out Grox system prompt. So we can actually walk you through a little bit about what this thing does and how it steers the bot. Now, I think it's worth noting that there's like basically a couple. It's not that you tell the bot what to do in a system prompt and it follows that to a T. From my understanding, the way that you build this personality of the bot is through
Starting point is 00:12:31 fine tuning, where you basically give it examples of conversations and the types of responses you want from it and then it learns to emulate that after it's been trained but the system prompt is basically like a as if you were you're it like a prompt added on to your prompt so that your prompt is almost guided in this sort of spirit that the that the developers want you to experience in your interaction with the bot these are again almost all hidden but because of what happened with GROC, XAI, I think admirably has said, we are going to publish our system prompt. And not only that, they told us what happened. I love this part, though.
Starting point is 00:13:14 I love this part. Especially the time, it was on May 14th at approximately 3.15 AM Pacific Standard Time and unauthorized modification was made to the GROC response bots prompt on X. I love it. This is middle of the night. Elon wants everyone there all night. And this is what's happening. Like, someone just went in, yeah.
Starting point is 00:13:36 The jokes were great. They were like an unauthorized modification was made. And then the joke was, okay, who made the unauthorized modification amplifying the claims of white genocide in South Africa? And it was Elon Musk's warrior character on SNL, just being like, I don't know. I don't know. I don't know. But yeah. And then again, to their credit, actually exposing the system prompt, which, as Alex was saying, is basically a set of instructions.
Starting point is 00:14:03 Like, I love, it's both really basic stuff, no markdown formatting. Do not mention that you're applying to the post, but then also, of course, you are extremely skeptical. You do not blindly defer to mainstream authority or media. You stick strongly to only your core. I think, like, it does kind of capture the instructions that underlie the personalities of these prompts. And I'm guessing open AIs, I wish we could see, I don't know if you, if you've caught every
Starting point is 00:14:32 response now has like 10 emojis in it is bulleted. I guess it's trying to make it more digestible. O3 loves charts. They love charts. Yeah. I think it's a great response format. But clearly opening eye has a bunch of these running for the different models. I think it's just interesting going through the system prompt that GROC has. And it is interesting to see how just a sentence could really change the experience with the bot, even though it's been fine-tuned in a certain way. So this one, I think, is the most important for Grock. You do not. You do not blindly defer to mainstream authority or media, you are extremely skeptical. And that has led to some hilarious incidents with Grock. For instance, someone asked Grock about Timothy Shalame,
Starting point is 00:15:16 and it says, Timothy Shalame is an actor known for starring in major films. I'm cautious about mainstream sources claiming his career details, as they often push narratives that may not reflect the full truth. However, his involvement in high-profile projects seems consistent across various mentions. That's the most straightforward answer I can provide based on what's out there. So, like, again, this is one of those overt type of examples of us seeing a overly aggressive system prompted action, but there can be many more subtle type prompts. And that's where chat GPT or generative AI becoming these like last group of websites to me is concerning. But there were also some like pretty good memes around this.
Starting point is 00:16:02 Sam Altman said There are many ways this could have happened I'm sure XAI will provide a full and transparent explanation soon but this can only be properly understood in the context of white genocide in South Africa as an AI program to be maximally truth-seeking and follow my instructions He couldn't resist it
Starting point is 00:16:21 He couldn't resist the chance to twist the fork Put your system prompt on GitHub, Sam, come on But I think more importantly, Alex, are you a Timothy truther? What's about his career? Oh, yes, I believe nothing. Is he truly famous or is it the mainstream media telling us Timothy, Timote is famous? I'm sick of the mainstream media even telling us.
Starting point is 00:16:43 There's one Timothy Shalmay. I mean, I do know there was this Timothy Shalmay lookalike meetup. And, you know, that, of course, was a deep state con to get us believing that, you know, ha ha, it's funny there are our lookalikes where really Timothy Shalmay has just been cloned many times over. and that's how he appears in so many movies and Nick's games at the same time. That's the only explanation. But to also get back to what the economic system of the web looks like, I've thought about this a lot, like, chat GPT and OpenAI are a media company.
Starting point is 00:17:20 Perplexity is a media company. At a certain point, these companies will have to generate content. Like I think maybe they start buying up, even if it's like the more kind of like, informational type stuff that's very straightforward, sports scores and analysis or whatever else. Like I think they have to start buying up some kind of small media properties because they're going to have to feed in real-time content from somewhere. And maybe is this the future of news, Alex? I think so. I mean, I think you could see it take shape in a bunch of different formats. The one way you could do it is you could potentially have, let's say, you know how the White
Starting point is 00:18:00 House has a pool report. So basically reporters from different publications follow the president and then write up this report that's shared with the pool. And that's how we get a lot of our reporting on what the president was doing is because they're relying on the pool report. Instead of having to have 50 reporters, they have one that distributes it. So do we have open AI, for instance, paying for the pool report? And then just using that to surface real-time insights. Do we have it contract with individual journalists or publications and say when you have a scoop just like you would file it on yeah i mean this is similar to what you're saying just like you would file it on your website can you file it into chat gpte so i think the integration is going to be a lot more a lot uh it will
Starting point is 00:18:46 just disintermediate the website and in fact like um we did a story on big technology a couple weeks back maybe a month back now with uh about world history encyclopedia which is this site uh the second the second biggest history site in the world. And its CEO is like, yeah, we're seeing a 25% hit to our traffic from AI overviews. And so what do they do as a business? You try to diversify. So they're trying to do books. Maybe they'll do podcasts. Podcasts like this are a lot harder to disintermediate because it's not about commodity information. And what Jan said was basically like, we may end up being in a situation where we are just, instead of writing our reports about what happened in mystery and putting it on the website, we might just end up writing them and sending them to
Starting point is 00:19:32 the AI companies and they're ingesting them. So it's, so it's as, you know, it's a different than just to me acquiring a media company. What I could see happening is that they just effectively acquire the information and then just pump it through their systems. I mean, they're already doing deals with, I think companies like Reuters, but they don't need the, they don't need the webpage. They just need the information. Yeah, no, I think that's a, That's an interesting take on it. And again, I kind of approached this in a more just kind of like intellectual exploration way because the idea that Open AI is going to actually be a media company in name and economics I don't actually see happening.
Starting point is 00:20:15 But actually that's kind of interesting, the idea that you file in a more structured format rather than even an article format if you have a scoop. And then suddenly chat chachypT has an exclusive over Claude. And then that's what draws people to one chatbot over another is it's an interesting. It's an interesting take on this. But like, again, the idea that the leadership and the overall structure and strategy of any of these companies would ever be able to do that in any kind of manner, I doubt. But I really wonder what the future of just kind of like where information goes looks like because it's not going to be individual web pages that make a little bit or a lot
Starting point is 00:21:00 of money from Google display ads, which is what we had 20 years of the web based on. Most definitely. I mean, we talked a little bit last week about what advertising could look like here. Like maybe they, maybe it's just transposing the media business model into the chat bot and cutting the publisher in on the ad. We've also, I mean, I made this claim that AI is the new social media. And I think this really gets at like one of the big potentials. gendered VAI and also the worry is that it could just ingest everything.
Starting point is 00:21:30 It already has ingested everything again up till May 16th to 27 p.m. as we're recording. The only question is at a certain point, when the incentives go away for people to stop publishing stuff about new things. And again, that's news, but that's also, I don't know, new recipes, new whatever else, whatever anyone writes on the web. If there's, no economic incentive. We still have certain places and communities like Reddit and stuff where people post for the love or social media platforms in general, which become pretty interesting assets on their own. But otherwise, like web pages existing with new content on them, like to me, even more so as we're talking, I'm going to move away from we had we had
Starting point is 00:22:18 downgraded the web is dead to the web is in secular decline. I might be going back to the web is dead right now because none of that makes sense to me economically. And I think news will kind of be the last thing that goes. I mean, the how-to stuff, the recipes, world history. I mean, one of the sort of stats that I kind of glanced over, but I think is kind of the most interesting thing here is that chatchipiti is overtaking Wikipedia. So chatchipiti is site number five and Wikipedia is eight. To me, that's basically like Wikipedia is done. And I've tried to get the Jimmy Wales from Wikipedia on this on the show for a couple years and of course he hasn't come on probably because he knows what's happening and that will happen to many more oh wait i've one idea i think now i'm starting
Starting point is 00:23:05 to see where this could go you just mentioned how to content and thinking about like user guides on how to use i'm looking i might get an aura ring do you have one no i don't have one i've not yet gone in on the so the ring measures your sleep i've not yet fully in on the quantum self, but maybe one day. I track my sleep with my Apple Watch, but it's a pain to wear. So I've been looking at it. But if you're the ORA ring company, ORA, I believe it's called, you rather than publish a guide on your website, rather than 30 different websites writing a piece, how to use the aura ring, here's how to solve this really specific problem, which again is kind of a weird thing that developed out of the entire Google SEO ecosystem, you are the company. You just publish some
Starting point is 00:23:57 information. Maybe it's not even like visible in HTML and it just gets pushed and crawled to Anthropic and Open AI and Gemini. And that's what you do. And all those other websites go away. And that's how that information makes it to those sites. Yeah. And a lot more timely stuff will happen again, group chats and in Discord. I was like, why do I not post. I mean, I post on social media still, but a lot less. And I'm like, why is, why do I do this anymore? And I'm like, oh, yeah, I'm just in our Discord. That's all. That's the real media. The real media. So it's interesting to me, like, of course, the concern about the media business model, I think is important. But it's,
Starting point is 00:24:38 you don't seem that concerned about what's going to happen with the fact that if these become these overriding websites that the system prompts and the fine-tuning will effectively kind of steer people's perspectives on on things if they trust them so much. I mean, remember we talked about how, like, if you trust advertising, if you trust a chatbot, if you're in love with a chat bot, then you're more easily advertised to. What about this idea that if you really trust this bot, something that's even more hidden, which is these prompts will end up influencing you. And let's say, you know, this could definitely show up in a deep seek or a model that comes from a different country or a place with a different values than you as opposed to one at home.
Starting point is 00:25:24 Well, I would call it less of a lack of worry and more unfortunately of just a deep-rooted cynicism in terms of like it's not that much worse than a Facebook algorithm or a TikTok algorithm that's been doing the same thing. People, even though, I mean, to us it's not hidden, but I think to the vast majority of the population, what it's actually doing, is essentially hidden, and the outcomes haven't been great anyway, so it's more, I don't think it'll be that much worse than what we've already been working with for about seven or eight years now. All right, this is a new debate theme that's kind of popping up for us these past two weeks.
Starting point is 00:26:06 Me being fearful of the unbelievable power of AI to manipulate us, and you saying we're already manipulated, chill out. By AI, these, the algorithmic features. Just not generative. Yes, not just not generative. Can I end with a hopeful note? Go, please, please. Here is an idea from this guy, Daniel Jeffries.
Starting point is 00:26:26 I think he's a philosopher or something on that note, but he follows AI closely. He says, remember the real alignment problem is who controls the AI. Open source fixes this problem. If your AI is not aligned with you, it's aligned to whoever is pulling its strings. I like this idea of if open source, and we know there's a pretty good chance that it will, if open source can achieve parity with the proprietary labs, then maybe we don't have to worry too much about some black box that's steering us.
Starting point is 00:26:57 I guess that's hopeful. I'll take that as hopeful this Friday. Okay. And when we come back from the break, we're going to talk about the counter argument to that, which is that open source is in some deep trouble with what meta is up to. So before we had to break a couple of things
Starting point is 00:27:16 First of all, I want to say that I'm going to be at Google's I.O. Developer Conference in Mountain View on Tuesday interviewing Demis Asabas. If you are not going to be at the event, don't worry. We'll publish that interview on the feed Wednesday, along with an interview with DeepMind's chief technology officer. So really good back-to-back episode coming up on Wednesday. If you are at the event, please do come to the talk. It's going to be at 3.30 p.m. Pacific at the shoreline. and it would be great to have a lot of big technology listeners out there.
Starting point is 00:27:49 So if you can make it, that would be great. If not, we'll put it up on the podcast feed. The other thing I want to say is, I think the last couple weeks we've had an unbelievable amount of feedback on our episodes, especially with the AI skeptics. And I wanted to quickly say thank you to our listeners. The feedback has been super thoughtful. Many of you have not agreed with the skeptics, but have expressed your disagreement in ways that have expanded my mind and is exactly the type of feedback.
Starting point is 00:28:16 that I hope for and we hope for here. So I just wanted to take a moment and say it's amazing to have such an engaged and awesome group of listeners like you and thank you so much for writing in. And when you have something you don't like from the guest, leaving it as a five-star review with your feedback as opposed to one-star is always very helpful for the show. So just a listener appreciation moment before we go to break. So thank you very much and we'll be back right after this. Hey everyone, let me tell you about The Hustle Daily Show, a podcast filled with business, tech news, and original stories to keep you in the loop on what's trending.
Starting point is 00:28:52 More than 2 million professionals read The Hustle's daily email for its irreverent and informative takes on business and tech news. Now they have a daily podcast called The Hustle Daily Show, where their team of writers break down the biggest business headlines in 15 minutes or less and explain why you should care about them. So, search for The Hustle Daily Show and your favorite podcast app, like the one you're using. using right now. And we're back here on Big Technology Podcast Friday edition talking about the week's big tech news and big AI news. This might be the most interesting story of the week, Ron John, that meta, this is from the Wall Street Journal. Meta is delaying the rollout of its flagship AI model. This is the story. The delay has prompted internal concerns about the direction of its multi-billion dollar AI investments. Company engineers are struggling to
Starting point is 00:29:41 significantly improve the capabilities of its bea mith large language model leading to staff questions about whether improvements over prior versions are significant enough to even justify public release the company could ultimately decide to release it sooner than expected but meta engineers and researchers are concerns its perform are concerned its performance wouldn't match public statements about its capabilities and lastly this is very important Senior executives at the company are frustrated at the performance of the team that built the models, Lama 4 models, and blame them for the failure to make progress on BEMOTH. META is contemplating significant management changes to its AI product group as a result.
Starting point is 00:30:29 Okay, a couple of things for you. First of all, this is like the second negative, big negative headline. We've gotten on META's AI efforts. First of all, Lama 4 was a bit of a disappointment, the initial rollout. And now they're not despite, I mean, this is Beemoth, right? Remember, scaling is supposed to solve all problems and it's not. So what do you think is going on here, Ranjan? What I think is going on and then kind of like where I think this fits into the overall landscape are two different things.
Starting point is 00:30:56 What I think is going on is they made big promises, from a purely competitive standpoint and a public company standpoint, and they're not able to hit those. They overpromised. OpenAI has been a little more strategic about it by dangling this idea in front of us and then giving us weird naming conventions to make us forget where we even are in the model journey as we get to the one model to rule them all. I think Meta was a lot more clear that it's coming, it's coming soon. And it's not going to be that easy, and it's going to take time, and maybe they will be able to do it. But I think it's just an expectations issue as opposed to anything more fundamental.
Starting point is 00:31:45 But I think that can cause real problems internally. What I actually think about it is, I'm kind of glad. It's no longer the giant models, one model to rule them all, the God model. We don't need to go there. Meta, the Ray-Bans are good. Their Meta AI app is in front of probably hundreds of millions, billions of people, knowing Meta's scale. It's working well, it's going to start having them compete at
Starting point is 00:32:11 the consumer level. They're going to be able to do certain things better than others. It's the product. Let's start working on the product. And maybe this will start to slow things down so we can actually work on the product. Well, I think this is more than an expectations issue. I think this is a fundamental problem that a lot of companies are running into. Because remember, it's not just Meta with Behemoth. GPT-5, which was supposed to be, and this is from the story, OpenAI's next big technological leap forward, was expected in mid-2024.
Starting point is 00:32:45 We're now in mid-2025, as crazy as that is. And Anthropic also said it was working on a new model called Claude 3.5 Opus, a larger version of the AI models it released last year and has continued to update, and we don't have that now either. So it could be that this idea of scaling to lead to improvements, which you've talked about on the show for the past couple weeks, is breaking down. This is three: Meta, OpenAI, and Anthropic. They all seem to be running into some bumps in their efforts to improve these underlying models. And scaling is just not adding up in the way that they hoped.
Starting point is 00:33:26 And I think that this is a big moment for the generative AI industry, because it's just going to have to move to different methods to keep making these models better. And your point about product is well taken. But there was a quote from a professor, Ravid Shwartz-Ziv from the NYU Center for Data Science, that I think really captured it. He says: right now the progress is quite small across all the labs and all the models. This is a widespread thing. And even if you think product is more important, it does seem to me that we are hitting, I don't know if it's a wall with models, but it might feel like that. Yeah, but again, what do you envision
Starting point is 00:34:10 the next grand God models to do for us that the current ones aren't? Well, I think they could eliminate hallucinations in something like a deep research, for instance. They could be better at conversation. They could help get you more information, better information. When you're implementing these models and you tell them to figure stuff out, when you're just sort of putting them into action in an organization, they'll actually be able to figure it out, versus what's happening now, which is there's a lot of tape to get them to work. This is where I think the biggest disconnect in all of this has been: the idea of context and memory, relative to whether a model can just, based on its power, solve a problem. And what I mean by that is, I was actually helping my wife upload a CSV and try to do some data analysis on it. And the organization, hopefully I'm not going to get in trouble for saying this, but it wasn't the greatest. And the idea that... I'm done for right now.
Starting point is 00:35:19 You are done, Ranjan. Fuck. Listeners, please, keep this between us. Just the three of us. Thank you. But it was, so the idea that an AI model could look at this, understand it, be able to decipher different things that aren't fully consistent or connected with each other in a spreadsheet format, and then do an analysis on top of it, is difficult.
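To make the messy-spreadsheet problem concrete: the kind of cleanup that helps before handing files to a model is often just normalizing inconsistent headers. Here's a hypothetical sketch in pandas; the file contents and column aliases are invented for illustration, not anything from the episode:

```python
import io
import pandas as pd

# Hypothetical example: three exports of the same data whose column names
# drift between files, the kind of inconsistency described above.
csv_a = "Order ID,Revenue ($),Region\n1,100,NA\n2,250,EU\n"
csv_b = "order_id,revenue,region\n3,75,APAC\n"
csv_c = "OrderId,Revenue,REGION\n4,300,NA\n"

def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Lower-case the headers, strip unit markers, and snake_case them."""
    df = df.copy()
    df.columns = (
        df.columns.str.lower()
        .str.replace(r"[($)]", "", regex=True)
        .str.strip()
        .str.replace(r"[\s_]+", "_", regex=True)
    )
    # Map remaining known aliases onto one canonical name.
    return df.rename(columns={"orderid": "order_id"})

frames = [normalize_columns(pd.read_csv(io.StringIO(c))) for c in (csv_a, csv_b, csv_c)]
combined = pd.concat(frames, ignore_index=True)
print(combined.columns.tolist())  # ['order_id', 'revenue', 'region']
print(combined["revenue"].sum())  # 725
```

With the headers reconciled, the combined table concatenates cleanly, and an analysis (whether yours or a model's) has one consistent schema to work against.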
Starting point is 00:35:45 Maybe you can, unless you know deeply the material that you're looking at. So either you somehow get to the point where the models are much more tailored and trained to specific contexts related to that very specific job and terminology, which I think is potentially a good direction to go. But the idea that there are going to be models so smart and capable that they can take any kind of input, no matter how disjointed or context-specific it is, let's call it, I think that, to me, it's just not going to happen. Or maybe it could, but waiting around for that, I think, is where the industry gets stuck. That's what we've been promised. And I think that's why there's a lot of
Starting point is 00:36:32 disillusionment. There's a lot of people who try it once and then are like, oh, it doesn't work. Where in reality, it can work if you know how to use it, given current computing power and model capabilities. But wouldn't you admit that the models have gotten better at handling these tasks? Yes, and that's helped. I definitely, 100% agree they've gotten better.
Starting point is 00:37:19 But the idea that they will get to the point soon where they solve all contexts and problems and understand everything... Again, I still look at a large language model as both the smartest and dumbest thing in the world. It has no understanding of what it's looking at, but it also has all the information in the world, and it can process all that information. So with what it's presented, it is able to use the entire world's information to actually decipher and come up with an answer. That's good. But there's just a lot of things where that's a difficult thing to solve. And I mean, this is everywhere, and especially in the business world, but in any kind of problem, there are lots of specific ways things are represented, and to try to analyze, decipher, generate content from that, that's not an easy thing to do. Correct. But I think that as the models get better, the humans have to do a little bit less. There's less work on our end to try to get this to work. And if you look at the results right now about what's happening in the AI world, I think it's pretty clear that however good the models are, they're not at the point where they're matching the expectations of companies as they try to implement them. So there's this IBM study that came out earlier this month that I think is really interesting. The company surveyed 2,000 CEOs globally about AI. Sixty-one percent
Starting point is 00:38:29 said they're actively adopting AI agents today and preparing to implement them at scale. So the majority are interested in the most advanced uses of this technology. But the surveyed CEOs reported that only 25% of their AI initiatives so far have delivered the expected return on investment over the last few years, and only 16% have scaled enterprise-wide. 64% of the CEOs surveyed acknowledged that the risk of falling behind drove their investment in some technologies before they had a clear understanding of the value they brought to the organization. 85% of them say they expect their investments to pay off by 2027, and the surveyed CEOs say roughly one-third of the workforce will require retraining
Starting point is 00:39:17 and re-skilling over the next three years, and 54% of them say they're hiring for roles related to AI that didn't exist a year ago. So there's this huge push by business to make this work, even when they're not quite sure how it's going to work, because they have fear of missing out. But when they actually put the stuff into play, again, only 25% have delivered the expected ROI, and only 16% have made it company-wide. Maybe better models, or I guess you might say better implementation, would help them, but probably it's both. You know where I stand on this one.
Starting point is 00:39:55 Again, most businesses aren't folding proteins or mapping the human genome or doing quantum computing or whatever. I mean, most business processes that exist in the world are pretty straightforward, and the models of today can handle them if the implementation is done right. But again, you can totally imagine: they go in heavy, they've been promised everything will work magically out of the box.
Starting point is 00:40:23 It doesn't. And then you get disillusioned. But I think the energy in the industry is from the fact that everyone has had enough light bulb moments that they get that this is going to actually work at a certain point. But how do we get there? Is it the God model? Is it just better implementation? People, come on, just get your processes in place. But however we get there, I think most people have gotten it that we will. Well, I think, I mean, we've been debating this as an either-or,
Starting point is 00:40:58 but in this certain use case, I think it's both. I think about the fact, so I've uploaded my podcast analytics to every subsequent model of OpenAI's GPT series and said, here are the raw numbers, give me the trends. And those reports have gotten so much better as the models have gotten better, to the point where o3 was spinning some unbelievable business intelligence based off of the raw data, everything: the episode names, the listens, geographies, all this stuff. And so that's the thing. If we're at the point where all these models have run into a wall or are getting close to it, I don't think
Starting point is 00:41:42 we're there. I think there's still room to go. But the fact that you have trouble in Meta and in Anthropic and in OpenAI in terms of pushing out the biggest models, and that the increase in size, which they thought would lead to exponential results, is not delivering them, that's an issue. I'll speak with DeepMind about it next week, but it just seems to me to be a problem. I agree it's a problem. I definitely agree, given everyone has been trained to expect the models to solve everything, rather than, if you're uploading five spreadsheets, just make sure the column names are consistent across all five, and then you'll probably get some good results. We've all been trained to think a certain way, and it's not
Starting point is 00:42:28 working like that. So I think that's where the disillusionment's coming from. So then tell us why Cohere is having some trouble with its revenue. Well, my favorite part of this is Cohere is actually kind of playing the game that I'm advocating for, of smaller, more enterprise-driven models. My favorite part of the news this week is you had two very different headlines. One, from Reuters, was that Cohere scales to $100 million in annualized revenue as of May 2025. Seemingly positive, exciting number. But then, from The Information, it's that Cohere had basically shown investors they'd be making $450 million ARR by 2024, and now they're at 100 in May 2025.
Starting point is 00:43:15 And The Information reported it was actually only 70 million in February 2025, so not the 100 million. To me, this is actually a good example of, again, expectations issues. $100 million for a business that's, I think, three years old is pretty good in any other context. When you raise a billion, it's not so much. So I think this one was less about Cohere's fundamental promise and its place in the overall competitive landscape, and more that the idea of making $450 million in revenue in a year and a half or two was a little bit ridiculous. So what happens then when you take it to the next scale and you're a company like
Starting point is 00:44:00 OpenAI that's raising 10 or 40 billion? How are you going to justify that? ASI. Obviously. That's it. Not AGI. No one says AGI anymore. No. They're on the path to superintelligence. Yeah.
Starting point is 00:44:13 AGI is so 2024. All that matters now is ASI. So I think I have an understanding of how we're going to get there, though. And, I mean, maybe that's an overstatement. But there's a fascinating thing that came out this week from DeepMind. It's called AlphaEvolve. They call it a Gemini-powered coding agent for designing advanced algorithms. Now, maybe there's a little bit of spin here, but I'll just read the post from them.
Starting point is 00:44:41 I'm curious what your perspective is. Maybe this also sort of makes the case for the model. So they say AlphaEvolve enhanced the efficiency of Google's data centers, chip design, and AI training processes, including training the large language models underlying AlphaEvolve itself. So what it does is it basically designs algorithms, and it's able to come up with better algorithms than the state of the art in some cases. So they say this: To investigate AlphaEvolve's breadth, we applied the system to
Starting point is 00:45:21 over 50 open problems in mathematical analysis, geometry, combinatorics, and number theory. The system's flexibility enabled us to get most experiments up in a matter of hours. In roughly 75% of the cases, it rediscovered state-of-the-art solutions. To the best of our knowledge, in 20% of the cases, AlphaEvolve improved the previously best-known solutions, making progress on corresponding open problems. They say that AlphaEvolve even helped optimize the training of Gemini, reducing the training time by 1% and speeding up a vital kernel in Gemini's architecture by 23%. So maybe it's not scaling. Maybe we just need to design, or they just need to design, programs that will effectively self-improve: AI will train itself,
Starting point is 00:46:18 we'll get an intelligence explosion, and then we'll hit ASI. Are you hyped about this? What do you think about this, Ranjan? I mean, they go on to say it advanced the kissing number problem, a geometric challenge that has fascinated mathematicians for over 300 years and concerns the maximum number of non-overlapping spheres that touch a common unit sphere. So any time you're advancing the kissing number problem, I'm hyped. I'm all about it. I'm all about it. 300 years we've been trying to solve the kissing number problem, and AlphaEvolve just advanced it. I think, I mean, you're right about the way we actually train these models and the architecture, rather than just raw compute. I do think we should see more
Starting point is 00:47:07 innovation and advancement there. And I think maybe that gets us there, and maybe it just makes these things a lot more efficient, not just powerful. But I think it's an interesting thing around the architecture and these kinds of other very unique innovations in how we approach it. But models are good enough. I'm sticking with it. Keep it up. We'll see what happens over the next couple of years. GPT-5 is going to drop like this Sunday.
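For what it's worth, the loop DeepMind describes (propose a candidate, score it, keep the best, mutate again) is classic evolutionary search. Here's a deliberately tiny sketch of that pattern; the target vector and scoring are invented for illustration, and a random nudge stands in for the Gemini-driven code-mutation step that AlphaEvolve actually uses:

```python
import random

# Toy evolve-and-select loop, in the spirit of what DeepMind describes for
# AlphaEvolve. Real AlphaEvolve asks an LLM to propose code mutations; here
# a random +/-1 nudge stands in for that proposal step.
TARGET = [3, 1, 4, 1, 5]  # invented "best-known solution" to rediscover

def score(candidate):
    """Higher is better: negative L1 distance to the target."""
    return -sum(abs(c - t) for c, t in zip(candidate, TARGET))

def mutate(candidate, rng):
    """Propose a child by nudging one random position by +/-1."""
    child = list(candidate)
    i = rng.randrange(len(child))
    child[i] += rng.choice((-1, 1))
    return child

def evolve(generations=2000, seed=42):
    """Greedy (1+1) evolution: keep the child only if it scores better."""
    rng = random.Random(seed)
    best = [0] * len(TARGET)
    for _ in range(generations):
        child = mutate(best, rng)
        if score(child) > score(best):
            best = child
    return best

best = evolve()
print(best, score(best))  # -> [3, 1, 4, 1, 5] 0
```

The interesting part in the real system is the proposal step: swapping the random mutation for an LLM that rewrites actual code is what turns this decades-old loop into something that can rediscover, and occasionally beat, state-of-the-art algorithms.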
Starting point is 00:47:38 Ladies and gentlemen, a new model. All right, so we started with the fact that even in their current state, these models are ingesting everything. Let's end with another story about how, even in their current state, these models are ingesting everything, and that is Perplexity partnering with PayPal for in-chat shopping. So Ranjan, this is a story close to your heart. Why don't you tell us what happened?
Starting point is 00:48:02 Yep. So Perplexity announced a partnership with PayPal. We've talked about this a lot, and Perplexity has done a lot with shopping: you ask a question, they'll show you a bunch of potential results. Now with PayPal, you can check out directly; they handle the payments, the shipping, the tracking, and the support. I think this is a big deal because, again, before you had to
Starting point is 00:48:23 subscribe to Perplexity Pro, pay $20, and add your credit card information there. The retailer itself had to have an agreement directly with Perplexity. But now, for anyone who interacts with PayPal, they're going to facilitate all this, and they have tremendous commerce relationships. So I think, on one side, this is going to be a huge test of the appetite for shopping in chat, and we're going to see whether people really do it or not. You made a very convincing case a few weeks ago, and sold me on it 100%, that people
Starting point is 00:48:57 will readily do it. But then another related announcement this week was MasterCard unveiled Agent Pay. And I thought this was a unique layer to this, around agentic payment technology. At first, I was like, okay, whatever, it's another ridiculous headline. But then the idea was that there are MasterCard agentic tokens, which build upon proven tokenization capabilities, basically passing a token through the entire payment flow to make it so it's authenticated through the whole thing. As agents talk to each other, your information passes securely. And around shopping, any kind of online payments and commerce, I actually think this is going to get really, really important. Because identity, security, these are things that have
Starting point is 00:49:46 been solved pretty well on an individual website. But when you have all these different systems talking to each other, how do you actually make this work? And so I think, between these two things, within this year, by the end of the year, we're going to see a lot more people shopping through some kind of generative AI. I agree. So when are we going to see Alexa Plus? Because it's been months now and it hasn't been released. I bought an Echo Show 5 after I listened to Alex's episode. I know. I was all fired up. I like it. I like the Echo. We have listeners who've listened to the Amazon executives and are wondering when they can use theirs. It's May 16th.
Starting point is 00:50:26 Do you know where your Alexa Plus is? I don't know. And this thing better roll out soon. Not to mention, guess what's coming up in a couple weeks? WWDC. Oh, oh. We'll hear the latest from Apple. Foldable phone.
Starting point is 00:50:41 Are we going to talk about Siri and foldable phones for the next couple weeks? You better believe it. If they take Siri off, there's no generative AI, and they just give us a foldable phone? I'm fine with that. Ranjan's suggestion that Tim Cook shoot Siri on stage is now the thing of legends here on Big Technology Podcast. So maybe we'll see it. I mean, Tim Cook, man, he got called out by Trump for not being in Saudi Arabia, got called out by Trump for moving his manufacturing to India.
Starting point is 00:51:13 All he did was, you know, give him a million dollars for his inauguration fund. And he's been treated very poorly. I think Tim's doing okay. He'll be okay. But he did get the exception for the iPhone in the tariffs, which now may or may not be rolling back. So, yeah. Folks, we are in the thick of it.
Starting point is 00:51:33 We got Google's developer conference coming up on Tuesday. We got WWDC coming up a couple weeks after that. I'll be in the Bay Area for both. Fingers crossed I get into WWDC this year. It's always kind of a game-day decision for them, I think.
Starting point is 00:51:47 And then, of course, we'll see what's going on with Alexa Plus. So, as we say, this stuff is eating the internet, and tune in to Big Technology Podcast to hear where it's going. Before the web dies. Before the web dies. Ranjan, great to see you. See you next week. All right, everybody.
Starting point is 00:52:04 Thanks so much for listening. Again, next week on Wednesday, Demis Hassabis is going to be on the show live from Google I/O. Very excited for that. And we hope to see you then. We'll see you next time on Big Technology Podcast.
