Everyday AI Podcast – An AI and ChatGPT Podcast - EP 434: Will OpenAI run away in the LLM race in 2025?

Starting point is 00:00:00 This is the Everyday AI Show, the everyday podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. Will OpenAI run away in the large language model race in 2025?

Starting point is 00:00:53 Or has Google already caught them? Or maybe Anthropic's Claude will wake up from its late 2024 nap of not too many updates and come back and retake all of the spotlight. We're going to be talking about that today and a lot more on Everyday AI. What's going on, y'all? My name's Jordan Wilson and welcome to Everyday AI. Before we get started, have to give a quick shout out to our partners at Microsoft. So why should you listen to the WorkLab podcast from Microsoft?

Starting point is 00:01:30 Because it's the place to find research-backed insights to guide your org's AI transformation. Tune in now to learn how shifting your mindset can help you grasp the full potential of AI. That's W-O-R-K-L-A-B, no spaces, available wherever you get your podcasts. Another place you can get your podcast, well, right here, but also our website. So if you're new here, thank you for tuning in. My name is Jordan Wilson. This is Everyday AI. We do this every day.

Starting point is 00:02:01 This is your daily live stream podcast and free daily newsletter, helping us all learn and leverage generative AI to grow your companies and your career. So you can spend, I don't know, hours a day trying to keep up, trying to see what it all means, or you could let us do that, tune in every day, subscribe on the podcast, and go to our website, youreverydayai.com. On there, it's like a free generative AI university, hundreds, hundreds of episodes. You can go back, watch them, listen to them, read all the important insights on our website, all for free.

Starting point is 00:02:32 So that's your new home away from home. So make sure you go check that out. All right, before we get started, now I'm excited today to talk about the large language model race in 2025. Not quite a prediction show. I'm going to have that coming for y'all in two weeks, been spending like dozens of hours on that one. All right, but before we get started,

Starting point is 00:02:51 let's talk about, as we do almost every day, the AI news. So Microsoft has announced a $3 billion investment to boost AI and cloud services in India. Microsoft's significant investment in India highlights the country's growing importance in the global tech landscape, particularly in AI. So Microsoft plans to invest $3 billion to expand its AI and cloud services in India. The company aims to train an additional 10 million people there in AI skills, which could enhance job prospects and career growth for many in the tech sector. CEO Satya Nadella emphasized the exciting diffusion rate of AI

Starting point is 00:03:32 in India, indicating a strong market potential for AI technologies. Microsoft operates three data center regions in India already and is preparing to launch a fourth, aiming to develop a scalable AI computing ecosystem for startups and researchers. All right. Next in AI news, Google is reportedly building a new AI team that aims to simulate the physical world with advanced models. Yeah, more world model updates. These are important.

Starting point is 00:04:01 So Google is making headlines with the formation of a new team at Google DeepMind focused on developing AI models that simulate the physical world. Here's why it's pretty noteworthy. Well, the person leading it is a former OpenAI employee. So Tim Brooks, formerly a co-lead on OpenAI's video generator, Sora, will lead the new team, which aims to tackle, quote unquote, critical new problems in AI modeling. The team will collaborate with existing projects such as Google's Gemini, Veo, and Genie, enhancing capabilities in image analysis, text generation, and video production. So Google's Gemini series is already recognized for its versatility in AI tasks,

Starting point is 00:04:49 while Veo focuses on video generation and Genie simulates games and 3D environments in real time. So this development of world models could revolutionize different sectors, including visual reasoning, simulation, and interactive entertainment, potentially impacting how video games and movies are created. All right. Last, but definitely not least. You're going to be hearing a lot more about CES this week, but CES is off to a big bang. So the biggest tech conference kicked off hours ago with NVIDIA CEO Jensen Huang delivering

Starting point is 00:05:25 the keynote address. All right. So like I said, we'll be covering this a lot more in today's newsletter and throughout the rest of the week. We'll probably have a dedicated episode, maybe on Thursday or Friday, recapping everything new. But here's what NVIDIA announced. And why did NVIDIA keynote one of the biggest tech shows in the world, right, the Consumer Electronics Show? Well, because everything that they announce impacts everything in technology, right? They are literally powering the generative AI movement with their GPU chips. So speaking of that,

Starting point is 00:06:00 some new GPU announcements. The RTX 50 series GPUs were announced with Blackwell architecture. They have four new models of that, which are ridiculously priced. The RTX 5070 starts at $549. How is that even possible? Yeah, if you're not a dork, an RTX 5070, the new model, for $549 is baffling. Also, there's the GB10, which is the Grace Blackwell superchip for desktop AI computing. They announced updates with the Cosmos platform for physical AI in robotics training. Probably one of the biggest updates we saw last night: NVIDIA getting into the desktop computer game with Project DIGITS, its first supercomputer. It's priced at $3,000, but it is ridiculously powerful, more powerful than literally any computer that you can buy right now.

Starting point is 00:07:00 Also, they announced a partnership with Toyota for autonomous driving systems, and also Uber and NVIDIA partnered to enhance AI technology in autonomous vehicles. So yeah, a lot more AI news in our newsletter. So please go to youreverydayai.com. Sign up for that free daily newsletter. All right. Yeah. Fred just said Project DIGITS.

Starting point is 00:07:21 Wow. Yeah. Juliet, the keynote, I believe you can go replay that. We'll leave that in the newsletter today as well. All right, I'm excited for this. So, will OpenAI run away in the large language model race this year? So this is actually your show. So in the newsletter yesterday, I said, hey, we got our Hot Take Tuesday coming up.

Starting point is 00:07:44 What do you want to hear? So if you are a long-time listener of the podcast, I literally started this thing for you guys, right? I noticed that when I was trying to learn more about generative AI like four or five years ago, I noticed it was only for highly technical people. So I wanted to create something that's for all of us. So sometimes in the newsletter, I'm like, yo, what do you guys want to hear tomorrow? I'll stay up all night putting together a show just for you guys. So this is technically a user request.

Starting point is 00:08:12 All right, but let's talk about it. So right now, I'm really just focusing on OpenAI, Google, and Anthropic for this, you know, who's-going-to-run-away-with-it episode. Here's why. Obviously, some of the biggest names in the game are playing different games, right? Microsoft in their Microsoft 365 Copilot right now is using OpenAI's GPT-4o to power their system. So a lot of people are like, oh, what about Microsoft? Well, even though they are developing their own models, their Phi models are fantastic small language models, and they are reportedly going to be offering new models or new choices in the future aside from OpenAI's GPT-4.

Starting point is 00:08:57 They're not a player in this game, not in the large language model race. Also, Meta. Meta is going open source-esque, not truly open source, but I think they're playing a different game as well. And there's all the Chinese companies as well that I think are going to be really thrusting themselves into the conversation for 2025. And like I said, we'll have our whole 2025 prediction shows. I think we're going to break it up this year in a couple of weeks. But this is just about the large language model race. Who is going to win it? Well, first, let me start by saying, why does this matter?

Starting point is 00:09:35 Well, you probably are using large language models every day if you're listening to this show. That's why it matters, right? Probably in every aspect of your business, you know, your company, your career, your personal life, you're probably using AI a ton. So this kind of frontier model race, right? It impacts us all, right? And even if you are not a huge, you know, a large language model user right now,

Starting point is 00:10:05 probably there's thousands of name brand pieces of software out there that you probably don't even know are leveraging these technologies, right? So that's the other thing. At least when we talk about the quote unquote big three here with Google, OpenAI, and Anthropic, their APIs or their backends are powering just about everything. There's very little enterprise software right now that is not using AI, that is not using large language models. And in most cases, in the overwhelming majority of the time,

Starting point is 00:10:42 they're using one of these three models. So even if you're not logging into the front end of these tools, you're probably benefiting from them, right? They're starting to seep into every aspect of our daily lives. That's why this race, this large language model race, is incredibly important. All right. Live stream audience, let me know who do you think is going to win this race? All right. So is it going to be A, OpenAI?

Starting point is 00:11:10 Is it going to be B, Google? Is it going to be C, Anthropic? Might it be D, Meta, or you can leave E, other, in the comments. I'm curious what everyone else thinks, right? Part of this, doing this Everyday AI thing together with you all, is learning from you, right? I comment from my perspective. I'm lucky enough to spend the majority, or maybe dorky enough to spend the majority, of my day playing with large language models, testing new features, helping large enterprise

Starting point is 00:11:43 companies, right? People are always like, oh, Jordan, how does this Everyday AI thing, you know, make money? Well, we're lucky enough to have great sponsors and partners like Microsoft, but enterprise companies hire us, right? And they're like, hey, hey, Jordan and team, we have, you know, 5,000 employees or 500 employees that need to learn ChatGPT or we need to learn Microsoft Copilot, and then we go help them. So, you know, I'm lucky enough to be able to interview people from these companies, but also help enterprise organizations and small and medium-sized businesses actually learn these, but I want to learn from you guys. So, yeah, a lot of people here so far saying, you know, Marie said A,

Starting point is 00:12:20 Kathleen said A, which is OpenAI in a landslide. Fred is team Google. Douglas says OpenAI and Microsoft. Jackie says it's a two-horse race here between OpenAI and Google. A lot of people, not a single vote for Anthropic. That's interesting. It's interesting, right? Depending on where you look, where you read, you would think Anthropic is the only large language model out there, right? I think people on Twitter, for whatever reason, are very, very bullish on Anthropic, right? Like, you would literally think

Starting point is 00:12:59 no other large language model exists. So that's why I like asking you all. But let's go ahead and talk about it. And I want to give you three reasons. All right. Yeah, this is Hot Take Tuesday. I'm going to accidentally go on a rant. I'm going to, you know, keep Fred on the treadmill longer than normal. Sorry, or if you're walking your dog, all right. But I'm going to give you three reasons why OpenAI might win this race and why they might not.

Starting point is 00:13:31 And at the end, I'm going to give you my honest take on who is going to win this. And again, let me remind you of the importance. Your company is probably making long-term, maybe six, seven, eight-figure financial decisions on which large language model you use, whether you're using it on the front end in a team or an enterprise account, or maybe you're building something on the back end with the API. There's a good chance your company is making major investments in large language models and how you use that to change knowledge work. Keep that in mind. I always like to reframe this and tell you why it's important. So let's start with reasons OpenAI might not win the large language model race. Yeah? You might be curious on this one because you're like, Jordan, you talk about OpenAI all the time. Well, yeah, it is, I mean, OpenAI is the company that technically started the generative AI wave.

Starting point is 00:14:32 Yes, the transformer technology in GPT originated with researchers at Google, but OpenAI in November 2022 technically started this whole generative AI race with its AI chatbot ChatGPT, even though there have been many different large language models available for developers prior to that. But reason number one, they might not win the race, even though they kind of started it, is because I think the race is going to make way for reasoning and agentic models. Or in other words, the way that we look at this race, right, like who's winning, we look at benchmarks, right? We look at things like MMLU or MMLU-Pro or HumanEval, right?

Starting point is 00:15:19 We look at all these dorky benchmarks. But then we also look at head-to-head scores, right? So probably the most popular one out there is the LM Arena or the Chatbot Arena, previously under the Hugging Face umbrella, but now it kind of has its own domain. So this is where millions of users have gone on and they put a single prompt in and then they get two outputs and then they judge which one is better. All right. And that gives us what's called an Elo score.
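For readers following along in text, the head-to-head voting described above can be sketched with the classic Elo update from chess. This is a minimal illustration only: the model names, the starting rating of 1000, and the K-factor of 32 are assumptions made for the example, and the arena's actual rating method may differ from this textbook formula.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the classic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, a_won, k=32.0):
    """Return the new (rating_a, rating_b) after one blind head-to-head vote.

    k is an assumed K-factor; it controls how far one vote moves a rating.
    """
    expected_a = expected_score(rating_a, rating_b)
    actual_a = 1.0 if a_won else 0.0
    delta = k * (actual_a - expected_a)
    # Whatever A gains, B loses, so the total number of points is conserved.
    return rating_a + delta, rating_b - delta


# Hypothetical models, both starting from the same assumed rating.
ratings = {"model_x": 1000.0, "model_y": 1000.0}

# Each vote: (first model, second model, did the first model win?)
votes = [
    ("model_x", "model_y", True),
    ("model_x", "model_y", True),
    ("model_x", "model_y", False),
]

for a, b, a_won in votes:
    ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], a_won)

print(ratings)
```

The design point is that beating an equally rated model moves the score by a fixed amount, while beating a much weaker model moves it very little, so a model has to keep winning against strong opponents to stay on top.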

Starting point is 00:15:50 So think of it like, you know, how they had the blind taste test for, you know, Pepsi and Coke. That's kind of what this is. There's dozens of frontier models that go head-to-head, blind scores, the top model, the model that wins the most, essentially, gets the most points. I think in 2025, these big companies are going to care less. I think in 2023, benchmarks and Elo scores largely drove the conversation. And it actually, I think, just these two metrics,

Starting point is 00:16:25 influenced decision makers on which model they should try. And I think part of it, rightfully so, right? Because when you looked at capabilities, again, up until the latter part of 2024, these two benchmarks, these two metrics alone, right? The Chatbot Arena and benchmarks, that told the whole story, right? And that's what companies raced toward. I don't think it's going to be like that as much in 2025. I think companies like OpenAI are going to start caring less.

Starting point is 00:17:03 And you can't say that they didn't care, right? Because what you would see anytime one of these companies, you know, released a model and then they, you know, were kind of crowned the top of the Chatbot Arena board, you know, literally a day later another company would release an update to their model that they had been sitting on, because they're like, oh, we got overtaken on the top of the Chatbot Arena board, right? So you can't say that this wasn't a driving factor for releases in 2023 and 2024. It was.

Starting point is 00:17:35 But I think in 2025, we're going to be talking much, much more about business value, right? I don't think it's going to matter as much about benchmarks and Elo scores because we're already topping out, right? If you're a dork and you follow MMLU like me, I mean, all the new models are going to be 88, 89, 90, 91, 92, right? They're going to be in the high 80s, low 90s, which is smarter than the smartest human, smartest single human, and it's not even close, right? So we've already gotten past the point

Starting point is 00:18:08 when large language models, you can't make, you know, as long as you know what you're doing, which a majority of the people talking about AI or sharing about it online, literally don't know what they're talking about. But as long as your company knows what they're doing, and they probably do because you're investing in the technology,

Starting point is 00:18:24 right, there's a certain point of diminishing returns. I think once you hit a certain point on these benchmarks, once you hit a certain MMLU, right, once you hit a certain Elo score or a head-to-head win rate against the other models, like I think there's a point of diminishing returns where it's like, yes, I think it turns into a simple yes or no. And I think the smart companies have already figured this out, right? You know, overfitting a model to go from, you know, 88.7,

Starting point is 00:18:54 to an 89 on the MMLU is not going to be a driving factor anymore. So what I'm trying to say is OpenAI may not be atop the benchmarks, atop the leaderboard for the majority of 2025. Like they've probably spent 80% of the time since these benchmarks became widely used, since the Chatbot Arena started becoming a main discussion piece. They probably spent 80% of their time at the top. I don't think it's going to matter anymore or as much. All right.

Starting point is 00:19:31 Fred said people do keep leaving OpenAI. That's the truth. All right. Let's keep going. Because OpenAI still has their old model, quote unquote, old, right? GPT-4o, which was announced in May. It is still batting at the top of these, like, Chatbot Arena boards, which is another reason why I think they might not technically win the race.

Starting point is 00:20:04 I think the race is just going to be redefined, right? In terms of how we measure the large language model race, I think it's going to be more about creating business value than it was about these other things. So reason number two, OpenAI might not win the race. ChatGPT search is seriously flawed. Seriously. Okay. So without going into too much of a side tangent, most large language models, aside from Claude, for whatever reason, are not connected to the Internet.

Starting point is 00:20:41 All right. So that is problematic. So Internet connectivity is a huge part, at least of using these models on the front end. Because in often, sorry, in many cases, the training data, right? So essentially, think of large language models. There's a training cutoff, right? So, you know, you have all these smart researchers. They, you know, they gobble up all the information on the internet, a lot of it copyrighted, right?

Starting point is 00:21:07 They have smart humans train these models and then release them to us all. And there's usually a knowledge cutoff. But that knowledge cutoff is generally somewhere from nine months to 18 months in the past, right? And most things you're working on, you need up-to-date information. So a large language model's ability to connect to the web is huge. All right. So previously, OpenAI used, they had a feature called Browse with Bing. All right.

Starting point is 00:21:38 And then I believe it was late, we'll just say October, November of 2024, OpenAI rolled out ChatGPT search. So from a UI/UX, right? So from a user interface, user experience perspective, there's great things, right? It's bringing aspects of Google Maps. It's bringing these rich snippets, right? ChatGPT search, they're still using, as far as we know, the Bing, the Microsoft Bing technology somewhat on the back end.

Starting point is 00:22:12 They haven't really described it a lot. But ChatGPT search is how OpenAI and how ChatGPT stays connected to real-time, up-to-date information past its knowledge cutoff. It's seriously flawed, though. Browse with Bing did not have these problems. And I think this is important to talk about. I always keep saying like, oh, maybe I'll have a dedicated show on this. But I know at any time, OpenAI could just fix this. They have to know it's a problem.

Starting point is 00:22:41 Here's what's seriously flawed right now. Yes, it's great. You can ask ChatGPT what's going on this weekend in Chicago, the city I live in, right? It'll give you this nice Google-esque like search results, right, with rich snippets, lists, you know, little photos, nice to use, right? Now we're seeing rollouts on mobile that give you essentially map results, right? Very nice and intuitive to use. However, iterative prompting is broken when you are using ChatGPT search.

Starting point is 00:23:14 Okay. So there's a little globe icon. Sometimes ChatGPT will use this feature or call this tool on its own, even if you don't call it. But so much of using a large language model is what happens after the first prompt, right? It is the iterative nature. It is going back and forth, right? It's going like, as an example, you know, our Prime, Prompt, Polish, you know, our PPP method.

Starting point is 00:23:39 It's not just putting in one giant prompt. It's having a conversation. It's going back and forth after your first response. For whatever reason, since it came out, ChatGPT search, it gets stuck in a loop, right? You can't really iterate or build upon a result. I don't know why. It's been broken for many months. And that's concerning when a feature that big, not 100% of the time, but a good chunk of

Starting point is 00:24:07 the time, it gets stuck in a loop. So let's say if you ask as an example, what's the biggest AI news, right? And then ChatGPT is going to use ChatGPT search because it knows it needs real up-to-date information for that. It might spit out some trends, right? And then you go back and you refine it and you say, no, please give me the top AI news for January 2025. Guess what?

Starting point is 00:24:33 It's going to, in most cases, spit back the exact same response. I don't know why OpenAI hasn't fixed this. It's a little concerning because there's hundreds of millions of people using ChatGPT and ChatGPT search. It's concerning that they haven't fixed this. I know I'm not the only one complaining about it, but I've been complaining about it pretty loudly. But that's a reason they might not win.

Starting point is 00:24:57 The fact that they haven't fixed this yet, and it has been a terrible, terrible user experience for multiple months. I get it. December, they shipped like two years' worth of features and updates, but it looks like ChatGPT search, it just got glossed over, right? The core functionality, because at its core, it is broken.

Starting point is 00:25:20 If you compare it to Browse with Bing, which after the Browse with Bing updates of the latter part of 2024, it was essentially a Perplexity lite already. All right. Reason number three, OpenAI might not win the large language model race. Well, they're burning cash, right? And they're facing an increase. This is all reportedly, right? Reportedly, OpenAI lost billions of dollars in 2025, or sorry, in 2024. So according to reports, OpenAI experienced a $5 billion loss in 2024.

Starting point is 00:26:10 So that's another reason why OpenAI might not win the large language model race. They're burning cash, right? Reportedly. And they need to turn a profit. They did just release their new and more expensive Pro plan. That's $200 a month. OpenAI CEO Sam Altman did go on Twitter and say, oh, we're actually losing money on this, right? So I'm sure investors weren't thrilled to see that tweet, which led to a lot of news and media coverage on, hey, OpenAI is losing more money. So one reason they might not win it is because the advancements that they might want to be working on from a large language model perspective might not be getting the resources that they need. They're losing key people, like we talked about at the beginning of this show.

Starting point is 00:27:03 You know, Google's new AI team, and now they have Tim Brooks, formerly a co-lead on their video generator, Sora. So they're burning cash, right? They're reportedly losing money. And that might, I mean, between that and their increased, at least from an external perspective, their increased focus on AGI, on ASI, so artificial general intelligence, artificial superintelligence, right? Their increased external focus on this could in theory keep them from their internal daily driver, which is improving their two classes of models, right?

Starting point is 00:27:44 So they have their GPT class of models. So we have GPT-4o. And then they have their reasoning class of models. The o1, you know, so o1, o1-mini, o1 pro. And then you have your o3, which may or may not get released in 2025. We'll see. But they could lose their focus on actual large language models,

Starting point is 00:28:08 chasing agentic AI, chasing AGI, chasing artificial superintelligence. So it could keep them from that day-to-day race. All right. Before we get into the three reasons, I think they might still win the large language model race. Let me tell you a little bit more about Microsoft WorkLab. So why should you listen to the WorkLab podcast from Microsoft?

Starting point is 00:28:33 Because it tackles your burning questions about AI at work. Like, how can I guide my org's AI transformation? How can AI help maximize value and create new products and business models? What mindset shift do we have to make if we want to tap into its full potential? Find the answers on WorkLab. That's W-O-R-K-L-A-B, no spaces, available wherever you get your podcasts. All right, straight into it.

Starting point is 00:29:00 Now, three reasons why OpenAI might win the large language model race of 2025. Number one, they got the users, baby. They got everyone. They got the users. And with the users comes the data. And with the data comes better models. Still, I don't think people realize how good of a deal that $20 a month is to use a pro plan of Anthropic's Claude, right?

Starting point is 00:29:27 Even though if you look at Claude the wrong way, you hit a rate limit, right? I saw someone in the comments here on our live stream, someone was tweeting on the live stream. I think it was Michael, just about like rate limits, right? Yeah, but even at these $20 a month plans, right, from Microsoft Copilot, from ChatGPT, from Gemini, from Anthropic's Claude, from all these other large language model makers, if you're not opting out of data, you are the product, right?

Starting point is 00:30:04 And so many people don't know any better. So many people don't know how to turn off their training data, right? And there's more, I would say, protection over your data as you are on higher plans. Right. So I have normal, you know, normal paid accounts. I have team accounts. I have enterprise accounts, right? Because we advise companies on how to use this at scale in their organizations.

Starting point is 00:30:32 Right. So I know the different kind of data controls that you have at different levels. So even at the base level or the free level, right, so many people are on free plans and they're just dumping in all their company info. That's why I think they're going to probably still win the large language model race. They have the data. They have the users. All right.

Starting point is 00:30:57 So let's look at this here on my screen for our live stream audience. This is just a Google Trends comparison. All right. And this isn't like overall searches. This is just interest over time, right? Comparatively. So comparing ChatGPT to Gemini to Perplexity to Claude, right, just as an example, talking about some of the popular AI systems there. And yes, Perplexity is more of an answers engine. So that's why I didn't really include them in this conversation either. Right. I'm talking about large language models. For the most part, Perplexity just uses one of these models, and then its technology is more of an answers engine. All right. But what this graph shows is the interest,

Starting point is 00:31:45 the search volume, and the users for ChatGPT are 3x, 5x greater than all other competitors combined. It is not even close. OpenAI is synonymous with AI, right, which is weird because artificial intelligence has been around for many decades, but you ask the average person on the street, hey, have you heard of AI? Not all of y'all, right? You guys, like me, have probably used dozens of large language models, right? But ask the average non-Everyday AI listener, right? Hey, do you know anything about AI?

Starting point is 00:32:24 They're going to say, oh, like ChatGPT, right? My mom uses ChatGPT. I didn't even tell her to do it, right? She probably did it from, I don't know, maybe listening to the show. So hi, mom. But, you know, most people don't know anything about AI, right? We live in a bubble here on this show, on social media, in our own echo chambers of artificial intelligence. Most people, when they hear AI, they just think ChatGPT, right?

Starting point is 00:32:51 Not the fact that, you know, traditional machine learning and neural networks have been widely used for many decades. It is synonymous. That is one of the benefits of OpenAI's go-to-market strategy. They made a huge splash. I think, you know, at the end of November 2022, it became especially heightened. You know, we were still, you know, kind of in this COVID phase, right? People were spending more time indoors using technology, right? More people were working at home.

Starting point is 00:33:24 And it just came at the perfect time and it blew up. but they have more users, more interest, more name brand recognition than everyone else combined and it is not even close. More users, more data means you're probably going to win. All right. Let's keep it going. I'm going to wrap this one up. Try to go quickly.

Starting point is 00:33:44 Reason number two, they might win. Yeah, good, good question here. Actually, Cecilia saying, doesn't Google inherently have the potential users, kind of, right? Yes. Google has hundreds of millions of users of their technology. What people don't know is you have to be on a paid plan. So it's an additional add-on right now, right, Gemini.

Starting point is 00:34:10 Or if you are on a Gmail plan, you know, you can use Gemini for free. But for the most part, it is extremely hard to use Gemini on the front end. It is hard for organizations to roll this out, right? You have to have literally like a degree sometimes in order to give your organization. a pro version of Gemini. So yes, I do believe that Google will catch them eventually. But right now, in terms of active users, right? ChatGPT is blowing everyone else away.

Starting point is 00:34:41 All right. Reason number two, OpenAI still might win the LLM race. They are the only front end with a reasoning model, projects, internet access, code rendering, and tools. All right. Google Gemini is catching up. Their front end was essentially, you know, the redheaded stepchild of AI until December of '24.

Starting point is 00:35:09 No offense against redheads or if you are a stepchild or if you are a redheaded stepchild. I'm just, you know, using analogies here. Sorry. But they were large, like the front end of Gemini, gemini.com, was largely ignored until December. Google tucked away all of its best technology inside developer platforms, inside of Google AI Studio, inside of Vertex, and people don't know that. I did a whole ranty episode on this a couple of weeks ago, so go listen to that if you want to. But Google, I think, ended up losing trillions of dollars in market value because they didn't

Starting point is 00:35:46 understand it is non-technical people making decisions for Fortune 500 companies. And what they're doing, what everyone's doing to test out AI, right, quote unquote, to test out large language models before implementing it in their organization, right? I've literally talked to dozens of Fortune 500 companies that do it this way. Nothing wrong with it, right? Usually an individual or a group of individuals or teams will start using a large language model on the front end, usually before their company has an official AI policy. Then they'll go to leadership and show them something.

Starting point is 00:36:24 They'll log on to chatgpt.com or Gemini.com or Claude.ai or copilot.microsoft.com, right? I'm like, oh, wow, look at this, right? What I'm saying is very, I won't say very rarely, but it's not commonplace that you have your technical people, your CTOs, your CISOs, your CMOs, doing these things on the back end. Front end is where decisions are made. And ChatGPT has a stranglehold on front end features.

Starting point is 00:36:53 It's not close. Google Gemini catching up finally. But until four weeks ago, Google Gemini, sorry, was trash on the front end. They didn't put their most recent models. Five months ago, it had a problem using Google, right? Yeah. Claude, great. I mean, Claude's great at some things.

Starting point is 00:37:14 It's not connected to the internet. They don't have a reasoning model yet. So OpenAI is the only model that has everything. They have everything you need on the front end. Are there improvements to be made? Absolutely. Do other models excel in other areas where OpenAI doesn't? Yes.

Starting point is 00:37:35 Google Gemini technically has the better model right now, by a thin margin, but they have a better model. Claude, their Artifacts feature, you can render code way better than OpenAI's Canvas, even though they're kind of two different things. So yes, these front ends for Google and Claude have advantages, but OpenAI has it all. And just like we saw, OpenAI came out with projects, right? Which tells you, yeah, anything good that a competitor has on their front end,

Starting point is 00:38:11 OpenAI is going to implement it or it's probably already been in the works. All right. And, I mean, we haven't even talked about what else might come to the front end of OpenAI, right? Hopefully we'll see an updated DALL-E, or maybe it'll just be Sora photo. Right now it's a different front end. Maybe we'll see Sora inside the ChatGPT interface. You know, maybe we'll see the new Operator,

Starting point is 00:38:35 which is the agentic system inside the ChatGPT interface. There's something that's been rumored called tasks where you can schedule, essentially, prompts to run, right? So the front end interface is only improving. And I think for whatever reason, their two biggest competitors have been too slow, too stagnant in bringing features consumers want to the front end.

Starting point is 00:39:03 All right. Business leaders are not making, at least across the board, they're not making decisions based on API, even though that's where they may ultimately be using the large language models. Where they're testing them out, where they're making their decisions, is on the front end. ChatGPT.com, Gemini.com, Claude.ai,

Starting point is 00:39:24 and OpenAI is running away with it. All right, reason number three, I saved the best for last, y'all. OpenAI is actually crushing one of the most important games that I don't think anyone's paying attention to, the small language model game. All right, let me share. And I'm on record saying this back in 2023.

Starting point is 00:39:51 I've said the future of large language models is small language models as hardware becomes more powerful. Okay? The AI chips are getting better, right? Your GPUs, your NPUs, right? Edge AI, using a large language model on your device, on your phone, on your computer is going to become more and more commonplace in 2025.

Starting point is 00:40:19 Why, you might ask? Well, it's faster, number one. It's more secure. But right now, when you use all these models on the front end, right, you are sending all this information to the cloud. It makes it more expensive. It's worse for the environment. And there's less safety. Are we going to be able to use, you know, OpenAI models or Google, well, Google already has some, or Claude models locally? I don't know.

Starting point is 00:40:45 But OpenAI is winning the game of smaller models that are more powerful. No one's paying attention to that. And that is, I think, one of their biggest wins they have going for them right now. Let me quickly explain here. All right. There was a Microsoft research paper. We covered this yesterday in our AI news that matters.

Starting point is 00:41:12 That's our weekly Monday wrap-up. Live stream audience, did you guys catch it, or is it new to you guys? Let me know. But we talked about this yesterday, a Microsoft research paper that essentially somehow uncovered model sizes for some of the most popular proprietary models. All right. So these proprietary models, for the most part, they're secret, right? No one really knows how big they are, how many parameters they are.

Starting point is 00:41:39 Think of parameters as a model size. So your open models, right, they say it, right? Because you can download them, you can fork them, you can build off of them, etc. Right? So with Meta, right, Meta, you have your Meta, what is it, 3.2 7B, 7 billion parameters, 11B, 11 billion parameters. Their 3.1 has a 405B, 405 billion parameters.

Starting point is 00:42:01 That's the size, how big the models are, right? The weights, the training, everything that makes that model special. For the most part, we don't really know. All we've really known is the GPT-4 model was 1.7 or 1.8 trillion parameters. Giant, right? So this new Microsoft research paper shed some light. So like I said, GPT-4 had 1.7 trillion parameters. And if we talk about its benchmarks, right?

Starting point is 00:42:32 Sorry, non-dorks, stick with me here for a second. 86.4 on the MMLU. GPT-4o, right? So the successor or the updated version of GPT-4, GPT-4o, so the Omni model, 200 billion parameters. What does that mean? One-tenth the size, performance went up. All right. But you're still like, all right, Jordan, well, you know, a 200 billion parameter model.

Starting point is 00:43:03 You can't really run that locally. Well, yes, you can. Right. Not GPT-4o, because you can't download it. But you might be saying, oh, that's a huge model. You can't run that locally. Well, look at what Nvidia just literally announced, right? You could run a 405 billion parameter.

Starting point is 00:43:22 you can literally chain two of these new Digits, Project Digits, chain two of them together, and you can run a 405 billion parameter model, Meta's 3.1 405B, you can run that locally, which is mind-blowing, right? If you don't follow this stuff, it's, I can't even explain it, right? But the fact that you can run a model that big locally, huge. So, GPT-4o, a tenth of the size, more powerful. Why does that matter? Well, look at their, quote-unquote, small model. Their small language model, GPT-4o mini,

Starting point is 00:44:06 8 billion parameters. That is tiny, all right? The next iteration of smartphones, in one year, smartphones would be able to hold an 8 billion parameter large language model, right? Edge AI. Right now, usually most edge AI smartphone kind of models are between 1 billion and 3 billion parameters. No one's doing this math since this study came out.
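
To put the "can you run it locally" point in rough numbers, weight memory is approximately the parameter count times the bytes stored per parameter. The sketch below is a back-of-the-envelope estimate only. The helper names, the precision table, and the 128 GB per-device memory figure are assumptions made for illustration, not specs quoted in the episode, and the estimate ignores KV cache and activation overhead.

```python
# Back-of-the-envelope weight-memory estimate: parameters x bytes per parameter.
# Assumption: weights dominate memory; KV cache and activations are ignored.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # 1 billion parameters at 1 byte each is 1 GB, so the units cancel neatly.
    return params_billions * BYTES_PER_PARAM[precision]


def fits(params_billions: float, precision: str,
         device_gb: float, n_devices: int = 1) -> bool:
    """True if the weights fit in the combined memory of n_devices."""
    return weight_memory_gb(params_billions, precision) <= device_gb * n_devices


if __name__ == "__main__":
    DEVICE_GB = 128  # hypothetical per-device memory, an assumption
    for label, size_b in [("405B", 405), ("200B", 200), ("8B", 8)]:
        for precision in BYTES_PER_PARAM:
            gb = weight_memory_gb(size_b, precision)
            one = fits(size_b, precision, DEVICE_GB, 1)
            two = fits(size_b, precision, DEVICE_GB, 2)
            print(f"{label} @ {precision}: {gb:.1f} GB, "
                  f"fits on one device: {one}, fits on two: {two}")
```

Under these assumptions, a 405 billion parameter model only fits across two such devices at 4-bit precision, while an 8 billion parameter model needs about 16 GB at 16-bit and about 4 GB at 4-bit, which is why that size is plausible for phones.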

Starting point is 00:44:35 Like, that was the first thing I saw. Maybe it's because I'm a dork, but I'm like, wait, GPT-4o mini is only 8 billion parameters. And it's still highly capable with an 82 on the MMLU, right? Think of an MMLU as, you know, I know more people are looking at MMLU-Pro or other benchmarks. I like MMLU. It's a nice standard that's been around for a long time. And you might think of it like, okay, that's a big drop off, right?

Starting point is 00:44:58 To go from, you know, an 88.7 in GPT-4o to an 82 with GPT-4o mini. That is an 8 billion parameter model. That is tiny. Let's look at some of the other models that are in that same size, at least that we know the parameters and we have an MMLU score for. Llama, the 3.2 11B. So a bigger model, technically, 73 MMLU. All right, Microsoft, Phi-3.

Starting point is 00:45:32 It's a 7 billion parameter model. 65 MMLU. All right. If you don't know anything about MMLU scores, right, they fight for a 0.1 percentage, right? Like when you're getting into the 88s, the 89s, like a 0.1, 0.2, 0.3 improvement is huge. OpenAI is silently crushing the small language model game. Why does that matter?
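
The comparison above can be laid out side by side with a few lines of code. This is a minimal sketch that uses only the figures as quoted in the episode. The proprietary model sizes are estimates attributed to the research paper, not confirmed specs, and the dictionary and helper names are hypothetical.

```python
# Figures as quoted in the episode: (parameters in billions, MMLU score).
# Proprietary sizes are estimates attributed to a research paper, not confirmed.
QUOTED = {
    "GPT-4": (1700, 86.4),
    "GPT-4o": (200, 88.7),
    "GPT-4o mini": (8, 82.0),
    "Llama 3.2 11B": (11, 73.0),
    "Phi-3 7B": (7, 65.0),
}


def size_ratio(smaller: str, larger: str) -> float:
    """Parameter count of `smaller` as a fraction of `larger`."""
    return QUOTED[smaller][0] / QUOTED[larger][0]


def mmlu_gap(a: str, b: str) -> float:
    """MMLU score of model `a` minus MMLU score of model `b`."""
    return QUOTED[a][1] - QUOTED[b][1]


def small_models_ranked(max_params_b: float = 15):
    """Models at or below max_params_b billion parameters, best MMLU first."""
    small = [(name, p, s) for name, (p, s) in QUOTED.items() if p <= max_params_b]
    return sorted(small, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    print(f"GPT-4o vs GPT-4 size: {size_ratio('GPT-4o', 'GPT-4'):.3f}")
    print(f"GPT-4o mini vs GPT-4o size: {size_ratio('GPT-4o mini', 'GPT-4o'):.3f}")
    for name, params_b, mmlu in small_models_ranked():
        print(f"{name}: {params_b}B parameters, MMLU {mmlu}")
```

On the quoted numbers, the GPT-4o to GPT-4 ratio works out to a bit under one-eighth, so "one-tenth" is a round figure, and the quoted gap between GPT-4o mini and the 11B Llama model is 9 MMLU points.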

Starting point is 00:46:06 One, I told you, edge AI, right? In theory, you would be able to run something like that locally. Who knows? Maybe OpenAI will allow that one day. Maybe that will lead to us actually having a state-of-the-art model on our devices, right? Who knows? Maybe the iPhone 18 might have GPT-5 mini on it running locally, which in terms of what that means for humans, what that means for society, what that means for work is crazy because then there's really zero reason, right, for anyone in the world to be like, nah, our organization's not going to do this AI thing. They're silently crushing the small language model game. No one is paying attention. Why else does that matter aside from edge AI?

Starting point is 00:46:51 Well, I believe in the future, we're going to be using thousands of small language models. I think your o1s, your o3s, these reasoning models, they're going to take, let's just say a GPT-5o mini. Let's just say, let's just say there's a GPT-5o mini. And let's say we have an o3. I believe the o3 is going to start taking the place of that. Reinforcement learning with human feedback is going to become reinforcement learning with reasoning feedback.

Starting point is 00:47:23 You are going to have these reasoning models fine tuning, right? And this is when we step into that line between AGI and ASI, right? But AI is going to be creating thousands of versions of these smaller models. I think we're going to have what's called a mixture of models. Make sure you tune into our 2025 prediction show in two weeks. I'm going to be talking about that. We had this thing called mixture of experts, and we have for a couple of years.

Starting point is 00:47:48 I think we're going to have something called mixture of models. I think we're actually going to be using thousands of small language models. And all the large language model, the front model, is going to do is congregate information and orchestrate the large language model to go out and do things. All right. I got a little dorky there at the end, y'all. But let me go ahead and end this this way, right? Because this was a lot.

Starting point is 00:48:17 Long episode, sorry. Sorry if you're still on the treadmill. I gave you three reasons why OpenAI might not win the large language model race. I gave you three reasons they might. I asked the audience, who do you think is going to win? So let me wrap it up by saying this. Yes. OpenAI is going to win the large language model race in 2025.

Starting point is 00:48:40 However, they actually have competition. Because if you look at the 24 months from November 2022, when ChatGPT was released, until November '24, right, 24, 25 months. It was a one-person race. It wasn't even close. Google, I think, had the best month in AI ever in December '24, not just from a large language model perspective, but generative AI, tons of features that I think are going to be useful and actually used.

Starting point is 00:49:21 For the first two years, OpenAI was running by itself. Yet they still innovated and they still, I'd say, dominate. There's a reason Microsoft, number one, number two, Apple, there's a reason they chose OpenAI to power the future of their devices, of their technology, of their software. Right. Apple and Microsoft are smart. They build their own models. Yet they said, ah, we're going to use OpenAI for a big part of our future.

Starting point is 00:49:58 OpenAI did that with, they knew. If I'm Sam Altman, if I'm in leadership at OpenAI, I knew it was a one-pony race for two years. It's not anymore. So, yes, we're going to see OpenAI go in different directions. Yes, they may get sidetracked going after AGI, ASI, their Operator agents, all these other things. But they know now. Google is on their toes. Poor Claude.

Starting point is 00:50:23 Poor Claude. I think Claude could be one of those sad stories in 10 years where everyone's like, oh, remember Claude. And I don't know. Maybe they get acquired by Amazon or acqui-hired by Amazon or maybe they just fade into oblivion. I think, you know, at least going back to our original three, I don't think Claude is in the race.

Starting point is 00:50:43 I don't think they are. But I think OpenAI is going to win the race, but it's going to be a lot closer than it was for the first two years. I hope this was helpful, y'all. If so, please go to youreverydayai.com. Sign up for our free daily newsletter. Also on our website, there's a ton of information. Like I said, hundreds of episodes, no matter what you care about. Do you care about HR?

Starting point is 00:51:07 We have a category for that. Go learn from HR leaders. Do you care about marketing? We have a category for that. Do you care about enterprise technology? We've talked to the experts. Literally, our website, youreverydayai.com, is your new best friend. If it's one of your goals in 2025 to better learn AI, there is no better unbiased,

Starting point is 00:51:30 no BS resource. It's all there. Make sure you sign up for our newsletter while you're there. Thank you for tuning in, y'all. I hope this was helpful. If so, please, if you're listening on the podcast, subscribe to the channel. leave us a rating, all that good stuff. If you're listening here online, click that repost button, share it with someone who needs to know it. Thank you for tuning in. We'll see you back tomorrow

Starting point is 00:51:50 and every day for more everyday AI. Thanks y'all. And that's a wrap for today's edition of Everyday AI.

Starting point is 00:52:40 Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
