This Week in Startups - AI Model Showdown: Grok 4.1 vs. Gemini 3 | E2211

Starting point is 00:00:00 I heard Gemini 3 is out. I heard Grok 4.1 is out. Everything is happening so quickly in our industry. Look at what xAI did. Grok 4.1 is really good across a number of metrics. Compared to Grok 4 Fast, it's a dramatic improvement. When 4.1 was released onto the charts, it went straight to the top. And I think it just shows that there are several different AI labs here in the U.S.

Starting point is 00:00:23 that are able to successively create state-of-the-art models. And I think it also pushes back, Jason, against some of the doomerism that's been kind of building in the last three months. One of the great things, I guess, is that now that we have these leaderboards, it's motivated these companies and gotten these teams super excited about claiming the top spot. That's where I'm just getting a little concerned, you know, like, are we actually making progress at solving the real-world problems people have? Or are we getting good at acing the SATs?

Starting point is 00:00:54 And that is a question for me. This Week in Startups is brought to you by Zite. Zite is the fastest way to build business software with AI. Build apps, forms, websites, and portals that connect to the tools you already use. Go to zite.com/twist and get 50% off your first project. Every. Running a startup is hard enough. Every takes care of incorporation, banking, payroll, benefits, accounting, taxes, and more so that you focus on building, not back-office admin.

Starting point is 00:01:25 Visit every.io. And Goldbelly. Goldbelly ships America's most delicious, iconic foods nationwide. Get 20% off your first order by using the promo code TWIST at checkout. All right, everybody, welcome back to This Week in Startups. There's tons going on in the news. I heard Gemini 3 is out. I heard Grok 4.1 is out.

Starting point is 00:01:48 Everything is happening so quickly in our industry. I'm having a hard time keeping up. No, it's been really, really busy. I want to start, though, Jason, with the Cloudflare outage. I don't know if you were awake or asleep. I know you've been going through some jet lag. The internet broke this morning. It actually made getting the docket ready kind of hard.

Starting point is 00:02:06 It's funny when a news story really impacts us because services like ChatGPT and X and other kind of major planks, you might say, of the podcast economy, broke. And for several hours, Cloudflare had these really embarrassing downtime outages. And the good news is that now that we're here, it has been resolved. They've come out and apologized for it. If you're curious, everyone, what drove this? They said that, quote, a latent bug in a service underpinning our bot mitigation capability started to crash after

Starting point is 00:02:33 a routine configuration change we made. And Jason, you'll know this, that cascaded into a broad degradation of our network and other services. Sure, whatever that means. It was bad, though. Yeah, you know, Cloudflare is this great CDN, I guess, content delivery network. And if you have a website and you get these DoS attacks, these denial-of-service attacks, DDoS, where, you know, people send tons of pings to your server, try to slow things down, and cause a disruption in service,

Starting point is 00:03:04 they figured out ways to, when all those attacks come in, to just drop them. And because it's a network, you know, if some hacking group has, you know, 10,000 servers or maybe even individual machines on the Internet that they've hijacked to send in these denial-of-service attacks, just to think about it simply, they'll hijack 10,000 Windows machines, send a bunch of attacks to a website, and then all of a sudden you can't read the Drudge Report or New York Times or whoever their target is, they know those IP addresses, they know what it feels like when those kind of attacks happen, they know the signature, and they just drop those requests. It's pretty sophisticated.
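To make the "know the IP addresses, know the signature, drop the requests" idea concrete, here is a minimal sketch in Python. It is not how Cloudflare actually works; the `RequestFilter` class, its parameters, and the example addresses are all hypothetical, and the only "signature" it recognizes is too many requests from one IP inside a short time window.

```python
from collections import defaultdict, deque


class RequestFilter:
    """Toy edge filter: drops known-bad IPs and IPs that flood the server.

    Hypothetical illustration only; real DDoS mitigation uses far richer
    signals than a per-IP request count.
    """

    def __init__(self, blocklist, max_requests, window_seconds):
        self.blocklist = set(blocklist)        # IPs already known to be hostile
        self.max_requests = max_requests       # allowed requests per window
        self.window_seconds = window_seconds   # length of the sliding window
        self._recent = defaultdict(deque)      # ip -> timestamps of recent requests

    def allow(self, ip, now):
        """Return True if the request should be served, False if dropped."""
        if ip in self.blocklist:
            return False
        timestamps = self._recent[ip]
        # Forget requests that have fallen out of the sliding window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            # Flood pattern seen: remember this IP so later requests are dropped.
            self.blocklist.add(ip)
            return False
        timestamps.append(now)
        return True


# Usage: one hijacked machine floods the site, one normal visitor browses slowly.
edge = RequestFilter(blocklist=["203.0.113.9"], max_requests=3, window_seconds=1.0)
flood = [edge.allow("198.51.100.1", t) for t in (0.0, 0.1, 0.2, 0.3)]
visitor = [edge.allow("192.0.2.7", t) for t in (0.0, 1.0, 2.0, 3.0)]
```

The point of doing this at a shared network rather than on your own server is that an address flagged while attacking one site can be dropped for every other site behind the same provider.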

Starting point is 00:03:39 It's pretty basic, but Cloudflare has become kind of a standard for this CDN. I remember when we were running Weblogs, back in the day, this is 20-year-old knowledge, but we started using these different CDNs, different caching servers. We put up our own caching servers. We were handling these issues ourselves. Now everybody abstracts it to providers. What's the value of being able to abstract it to a provider like this? Well, you can focus on other things in your business.

Starting point is 00:04:05 What's the downside? If everybody trusts, everybody's got their eggs in one basket, whether it's Amazon Web Services or Google or Azure or Cloudflare, everybody gets affected at the same time. That's kind of what I want to talk about, because normally this is a little bit outside of our wheelhouse, Jason, but there was a Cloudflare outage also in June. There was an AWS outage in October.

Starting point is 00:04:29 There was a Google Cloud outage in June. And there was an Azure outage in October. So you're a startup founder and you are running a software business, as most of them are. You are using cloud services as most of them are. Communication strategies for downtime mitigation. I'm curious what your advice is to founders as they go through this apparently recurring issue. It's a recurring and non-consequential issue at this point. The downtime happens for an hour or two, and frankly, we're all permanently online, so it's actually good.

Starting point is 00:04:59 Everybody gets a little break. I would be concerned if your service is down for more than a couple of hours. If it gets into like, okay, it was down this morning, but it's still down this afternoon. That's when people start to go, wait a second, maybe I should look at another option. And you just don't want to get into the, okay, it's time for me to look for another option. If you're a startup founder and your service is down in the morning for an hour, no big deal. If it goes into the afternoon, now you've got a big deal. People start looking for

Starting point is 00:05:27 other solutions. When AWS and their US-East-1 region went down, everyone kind of lost service. If you lose service with everyone else, Jason, is the penalty just not that high for your individual company? Because that's good news and better than I expected. If it was AWS, everybody gives you a mulligan. No big deal. No harm, no foul. Now, you might, if you're a financial services company, you know, if you're a Robinhood or you're Stripe, and you go down and it costs people money, that's a whole different ball of wax. You might need to give a credit, you know, and things can get, yeah, a little bit ugly. People lose money. If you are selling pay-per-view and it's Netflix and it's that boxing night, yeah, that is a big problem. So I think it, you know, it's service

Starting point is 00:06:11 dependent. The financial companies have a different order of duty. But, you know, if you look at the video game industry, they'll be like, hey, we're having downtime Saturday afternoon. You can't play your game for three hours, expect downtime, and they communicate it to you ahead of time. I don't mind that, but like, you know, Tuesday morning, I mean, I think Cloudflare likes to say they power or protect about 20% of the websites online. That's a pretty big chunk of the internet to bring down yourself. Going back, though, to what you said about their footprint and how they help protect websites from DDoS attacks and they do content delivery around the world,

Starting point is 00:06:45 Cloudflare just bought a company called Replicate, Jason. And this is a startup that I hadn't spent enough time looking at, raised a couple of rounds, including a Series A from Andreessen and a Series B from Andreessen. And what's interesting is they're kind of offering serverless AI access. So if you want to just basically not deal with an individual provider of AI services, you could go to Replicate and just kind of via one API hook access a bunch of different AI models. And they handle all the technical backend.

Starting point is 00:07:14 And so it kind of makes sense to plug that into Cloudflare's global infra footprint because then you can kind of have AI at the place of your consumers with limited lag and so forth. But like, is it kind of crazy that I didn't really know anything about Replicate before this and it raised several rounds? How many companies are there doing this sort of thing? This is a company I'm not aware of either, but being able to use an API to just send something to a model and get a result back is kind of how people are running these systems now. Instead, you can run your whole application on AWS or Azure or Google Cloud, Oracle Cloud,

Starting point is 00:07:54 and then just be calling APIs from Grok, Claude, you know, Anthropic, or ChatGPT. That does make it a lot easier, doesn't it? You don't have to, like, stand up a bunch of hardware. You don't have to stand up like a bunch of instances. So it has become quite modular. I do think at a certain point in time, people are going to want to stand up their own language models with their own data to just not educate other LLMs and to keep their data private. So I think the future is going to be, you know, a lot of people running their own DeepSeek instance, you know, over at a Google Cloud or Azure, etc. Yeah, I do think, I'm just seeing it with startups.
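The "one API hook, many models" pattern, combined with the wish to keep private data on a model you run yourself, can be sketched roughly as follows. This is an illustration under stated assumptions, not Replicate's or any vendor's real API: `ModelRouter`, `fake_hosted_model`, and `fake_self_hosted_model` are made-up names, and the backends are stand-in functions rather than real network calls.

```python
from typing import Callable, Dict, Tuple

# A backend is anything that takes a prompt and returns text.
Backend = Callable[[str], str]


class ModelRouter:
    """Hypothetical single entry point in front of several model backends."""

    def __init__(self) -> None:
        # model name -> (backend function, whether we host it ourselves)
        self._backends: Dict[str, Tuple[Backend, bool]] = {}

    def register(self, name: str, backend: Backend, self_hosted: bool) -> None:
        self._backends[name] = (backend, self_hosted)

    def complete(self, model: str, prompt: str, private: bool = False) -> str:
        """Send the prompt to the named model.

        If the prompt is marked private, refuse to send it to a backend
        that is not self-hosted, so a third party never sees it.
        """
        if model not in self._backends:
            raise KeyError(f"unknown model: {model}")
        backend, self_hosted = self._backends[model]
        if private and not self_hosted:
            raise PermissionError(f"{model} is third-party; private prompt not sent")
        return backend(prompt)


# Stand-in backends (assumptions for the sketch; no real provider is called).
def fake_hosted_model(prompt: str) -> str:
    return f"[hosted] {prompt.upper()}"


def fake_self_hosted_model(prompt: str) -> str:
    return f"[self-hosted] {prompt.lower()}"


router = ModelRouter()
router.register("hosted-model", fake_hosted_model, self_hosted=False)
router.register("own-model", fake_self_hosted_model, self_hosted=True)

public_answer = router.complete("hosted-model", "Summarize this press release")
private_answer = router.complete("own-model", "Fire Code Section 12", private=True)
```

The application only ever calls `complete`, so swapping a third-party model for a self-hosted open-source one becomes a registration change rather than a rewrite.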

Starting point is 00:08:39 Let's say you have a startup that does some obscure thing like, I don't know, building codes, right? And you're going to fine-tune this model to do building codes. And it's going to be for architects and builders, developers, to understand all these different codes on a global basis. You know, you're Starbucks, you're opening stores everywhere. You need access to all these codes. And you're taking all the time to go collect that data and to normalize it and train the model. Do you want to train ChatGPT's model or Anthropic's model, have them get the benefit of it, or do you want to just create an island, put all your data into it, and not have them see the prompts,

Starting point is 00:09:22 not have them learn from your prompts, not have them learn from what are the queries you're sending from your customers to them? I think people are going to become very, very cautious about the large language models becoming competitors. I've said it before on the show. Everybody listening to me knows it's going to be a big end of the year, November and December. You're trying to get a lot of things done at work. But Thanksgiving is coming. Christmas is coming. And you're going to need some help. Gifting and, hey, getting ready for your Thanksgiving table. And we've got the perfect partner. My favorite, Goldbelly. Yum. Yum. Get yourself some amazing gifts for your partners. Get some amazing food for your table like

Starting point is 00:10:05 Martha Stewart's salted caramel chocolate cake. How about this? John's of Bleecker Street, New York-style pizza, amazing. Lombardi's pizza. These are some of the best pizza makers in the world. It's a great way to reward your team members. I do it. I order great stuff from around the country.

Starting point is 00:10:22 Just surprise my team with it when we have team meetings. So if you're looking to reward your team for doing a great job in 2025, maybe you want to surprise your loved ones with a special holiday treat, here's how you do it. You go to goldbelly.com, use the promo code TWIST, and you get 20% off your first order. That's goldbelly.com and use the offer code TWIST. Some of them are positioning themselves as like neutral third parties.

Starting point is 00:10:48 They're not going to be in your business. Other ones that have to spend a trillion dollars. Quite possible they need to make $2 trillion. How are they going to make that money? I think they're going to compete with you. So for the startup that was working hypothetically in building codes. And they took the time to go find all these building codes. Let's say they hired 100 researchers to, you know, full time for a year.

Starting point is 00:11:12 And they spent $10 million on these hundred researchers to go collect all this information, normalize it, whatever. And then they feed it into somebody else's language model. And now they learn this information. Not good. So in that case, Jason, the corollary, the follow-up question is just, are the startups that you're seeing that are kind of rolling their own models? Are they using, I presume, all models from China,

Starting point is 00:11:35 then the open source stuff from your Moonshots, your DeepSeeks, et cetera? I think this is the year this is happening, so TBD. Okay. But we'll, you know, let me ask a couple of them if they're willing to talk. So producer Marcus, like maybe we should see if there's somebody in our portfolio who's doing this. We can just ask in Founder University, et cetera. Anybody working on open source AI models and training themselves

Starting point is 00:12:00 and running themselves and maybe they can come on the show and talk a little bit about it. But it is definitely a trend that's going on. And I saw Andreessen Horowitz was publicly saying, hey, a lot of our startups are using DeepSeek. That's what I was going to say. Martin Casado, I think, said that it's like 80% of startups they see are using open source models from China. They're probably using other models as well, by the way. It's just not exclusively. But I mean, like, you know. I think that's how people took that quote, though. People took it as like, oh, they're picking the open source Chinese model over OpenAI's model. I think it's in addition.

Starting point is 00:12:31 or maybe they're experimenting with them. So in other words, they're aware of them. They've got an instance running and they're sending some jobs to it. And why wouldn't you? I mean, I think ultimately what's going to happen is we're going to have a large language model on our Mac Mini M17.

Starting point is 00:12:49 If it's an M4, I think I have an M4 right now and I guess someday I'll have an M5. You know, somewhere around M10. You know, I'm going to have four M10s or an M10 will have four microprocessors in it and some amount of bandwidth and RAM that will give it the ability to run my language model locally. Then when I do that search and I say, hey, show me all the photos of us skiing and, you know,

Starting point is 00:13:14 a bulldog in the background because I'm looking for those pictures of the bulldogs playing on the mountains while we're skiing. It's just going to do that locally, not in the cloud. It's going to index my photos locally. It's going to have all my Gmail local. It's going to, you know. So do I then have like a server in my basement, Jason, that has my family collection of GPUs, or am I just running a single Mac Mini Pro 2015 edition

Starting point is 00:13:38 or 2030 edition, whatever? I think that the ultimate manifestation of all this is going to be Apple will have local language models and they'll be pitching us on they're encrypted, they're private, your data is safe, just like they do for iCloud. We can't get your iCloud stuff. If the government comes and asks us to crack it open, we can't get there. Just like, you know, I guess Sam Altman gave a warning, like, hey, if somebody sends us a letter, we're sending them everything you've talked about.

Starting point is 00:14:09 And when this kid tragically committed suicide recently, they had all the logs, right? And like, yeah, they're going to dump the log. So if you're in a relationship with ChatGPT, you know, they have all that. Whatever you said to your ChatGPT, they've got it stored. But the Apple point's great, though, Jason, because if you wanted to have that model succeed, you would need to have a smartphone, a desktop computing, a slate computing experience, a headset computing experience, a future glasses computing experience, and you need to be in cars, perhaps do something like CarPlay. They literally have all the surfaces

Starting point is 00:14:41 already that they would need. Whoever has the operating system, I think is going to have a real operating system plus privacy is going to be a really interesting combination. I've talked ad nauseam about my love of the Comet browser and watching my browser go to United Airlines and find a flight for me and put it into, you know, the shopping cart was like, hmm, that's interesting. Or I used it the other day. I said, go to Bluesky because I don't have, I'm not following it in Bluesky. And I just said, I think I said follow anybody on the Explore page who's like featured. Just follow like 200 people.

Starting point is 00:15:18 And it went and it followed like 20 people. It's like, ah, 20 is the limit. And I was like, well, if 20's the limit, okay, do it five times. It went and did it five times. The idea that Apple is watching what you do on your phone, and there's this give Apple Intelligence. I don't know if you've seen this setting.

Starting point is 00:15:35 You can pull it up. We've talked about this, yeah. Give Apple Intelligence access to what you're doing on this app. So if you go to the settings for any app on your phone, there's a thing like give Apple Intelligence access to it. What does that mean? What that means is Apple Intelligence is watching how you use your phone and they're watching how you use, I don't know, the Contacts app, your Phone app, your Superhuman

Starting point is 00:15:59 app. And then over time, you're going to be able to say, go do this in the StubHub app. Go find Knicks tickets. But it will have recorded and studied it over and over again. People asked online quite a lot how to turn this off. But to me, if you're going to use any sort of intelligence on your iOS device, you're going to want it to connect into other applications, Jason. So just... So Apple Intelligence, here, yeah, learn from this app. Next on the docket, we have Grok 4.1. We'll talk about Gemini 3 in a second, but I think it's worth stopping and saying, look at what xAI did, because this is a company, Jason, that started later than OpenAI and has really done, I'm just going to say,

Starting point is 00:16:40 impressive things. I know a lot of my friend group aren't the biggest Elon fans, but I do think it's important to give credit where it's due. And Grok 4.1 is really good across a number of metrics. One, they did a lot of work on reducing hallucinations, which I think is a very important step. Compared to Grok 4 Fast, it's a dramatic improvement. And I think most importantly, based on the usual kind of panoply of things that LMArena tracks in its battleground for different AI models, when 4.1 was released onto the charts, it went straight to the top. Grok 4.1 and Grok 4.1 Thinking took the absolute pole positions.

Starting point is 00:17:17 But this is a real accomplishment. I think it just shows that there's several different AI labs here in the U.S. that are able to successively create state-of-the-art models. And I think that's very encouraging for startups. I think it's encouraging for the AI economy writ large. And I think it also pushes back, Jason, against some of the, the doomerism that's been kind of building in the last three months. It feels like AI needs a win.

Starting point is 00:17:38 And I think this is an example of such a win. Less hallucinations, obviously better. And one of the great things is, I guess, that now that we have these leaderboards, it's motivated all these companies and gotten these teams super excited about claiming the top spot. So kind of. Kind of cool that these things, you know, somebody came up with this, this concept of the arena and all these different tests. I do wonder if building for the tests at a certain point is kind of like kids mastering the SATs. And like at a certain point, maybe that's not what's important.

Starting point is 00:18:18 Maybe like there are other things. So I wonder if these tests are changing. I don't have enough information on that, of like how are these tests changing? I know early on some of these, one of the tricks that was occurring is when you give people an incentive like a leaderboard like this, people then try to game the leaderboard. What's a way to game the leaderboard doing unnatural acts?

Starting point is 00:18:41 Like if you know it's going to ask certain types of questions, then studying for those types of questions or feeding those exact questions or those, you know, multiple versions of those questions into your training set. In other words, kind of gaming the system. Precisely. And I think that that's where I'm just getting a little concerned of,

Starting point is 00:19:04 you know, like, are we actually making progress at solving the real world problems people have, or are we getting good at acing the SATs? And that is a question for me. People are worried about, I'm trying to remember the exact phrase of this, but it's like diffusion or, when the model gets like so suffused

Starting point is 00:19:22 with people knowing how it asks questions that it kind of loses meaning because you can tune. And if you think back to Meta's launch of Llama 4, Jason, if you recall, they actually made a bunch of different versions of Llama 4 and then ran them through the same test to get the best results. And they got mocked for this pretty mercilessly because it was clear they were trying to cook the numbers to appear impressive. The reason why I shared LMArena information versus just kind of a raw rundown of individual benchmarks is that this is their users in a head-to-head battle choosing which result is better with the model names masked. And so it shows

Starting point is 00:19:54 what people that are actually using these models think. And I'm hoping that provides a better perspective on actual progress here. I'm a big enough deal now that I can afford to hire my own admin team. Look at me. They handle all the details of running my company. But if you're a startup, you need to spend your time obsessing about your product, not filling out paperwork and doing all these books. That's why I want to tell you about every.

Starting point is 00:20:16 They've worked with over 1,000 startups from first-time founders to VC-funded teams and more. And they've got the experience to help your company navigate. whatever's coming around the bend. Maybe it's time to incorporate your startup so investors take you seriously. Every's going to take care of those filings for free without any unnecessary delays or legal fees. Hey, maybe you just got some new funding and it's time for you to scale up. With every, you'll get 3% cash back on every dollar you spend on your company's card. Hey, maybe it's time to say goodbye to a member of your team.

Starting point is 00:20:45 It happens. Well, they've got an employee offboarding checklist. And you know I love a checklist. It's going to help you ensure a smooth transition while protecting your business, both legally and financially. Plus, you'll get a $2,000 bonus when you move 250K or more into your Every account. So for your incorporation, banking, payroll, benefits, accounting, taxes, and other back-office administrative needs, visit every.io. That's E-V-E-R-Y. But anyways, let's move on to this morning's big news, similar idea, a new AI model,

Starting point is 00:21:20 this time from Google. Gemini 3 Pro is out in limited early access. It's rolling out as we speak. Earlier, Jason, you mentioned how the benchmarks are a little bit dodgy. So I don't want to overindex on this particular data set, but I think it is worth sharing with people what the numbers show. So here's the Gemini 3 Pro benchmark results. I just wanted to talk about the difference in how things have changed. If you look at their Humanity's Last Exam results, Claude Sonnet 4.5 is about 14% right versus about 38% for Gemini 3 Pro. Big improvements in math and screen understanding. And these numbers really are materially better than other models that are kind of currently

Starting point is 00:22:02 state of the art. And I just really think this pushes back against the idea that AI improvement has slowed dramatically. So I read this all as very, very bullish. And there's a little bit of a teaser, though. One last thing here. If you take a look at this chart right here, there's an upcoming variation of Gemini 3 Pro called Gemini 3 Deep Think, which

Starting point is 00:22:23 does even better and is crushing the ARC-AGI benchmark, which I think is very important. So I believe the kids say that Google cooked here and that a lot of folks are really going to dive into this pretty much right away because this is going to be. Looks incrementally better. You know, to be honest, we need to have an expert on here to tell us if it's actually better or not or like the actual extent. Because this Humanity's Last Exam, I do know having talked to some folks, like they are some pretty hard questions. I would like to have an expert on here who created that. I wonder who

Starting point is 00:22:57 created Humanity's Last Exam. Somebody created this. Somebody creates the new questions on it. I want to know who that person is. In a more real-world application then, Jason, to put it into more kind of functional context, Browser Use, a 200 company, gave it very strong marks and said it was much better than previous models from the company. Box's Aaron Levie compared it to Gemini 2.5 Pro and found that in the Box context, it was much better across a number of metrics. And Stripe's Patrick Collison gave it a pretty strong review. You might say he gave it a pretty hard task and was very impressed with what it put out, when it put together a compendium of recent research into genetics.

Starting point is 00:23:32 So some people in the field are actually reporting. That's good to know. Got rave reviews from some leaders who actually use it at their major company. So Aaron Levie likes it. Right boys like it. That's a pretty good endorsement that they're seeing a difference. It's at least as good as people hoped, unlike GPT-5, which was not as good as people hoped. So at a minimum, the vibes are persisting.

Starting point is 00:23:54 So that's pretty good. And then just for fun, Jason, I thought we'd talk about the Polymarket perspective here. There's a funny little wiggle here in this Polymarket. If you're on the audio version, we're looking at which company has the best AI model by the end of 2025. The end date, of course, is December 31. And there was this really interesting wobble, Jason, zoomed into a one-day context here. But look, Grok 4.1 came out and people got really excited about it. Maybe xAI is going to win.

Starting point is 00:24:17 And then Gemini 3 came out and crushed it again. So the market is once again betting that Google is going to run the AI game through the end of 2025. How does this one resolve? I always bring the same thing up, which is we need to understand how the bet resolves. So how do they define which one has the best AI model again? The market will resolve to "Yes" if any model owned by Google has the highest arena score based off the Chatbot Arena LLM leaderboard. So it's the chatbot subset of the LMArena leaderboard that we discussed earlier. Again, head-to-head, no models listed. People just pick which one they think is best.

Starting point is 00:24:52 So it's kind of a blind taste test, and Google has to have the highest score in that area. And it didn't when Grok 4.1 came out. And now it does again, now that Gemini 3 has come out and surpassed it. That means Google is going to have the best model, period, full stop. Like if the sharps are saying it, there's somebody on the inside who's making this bet, I bet.
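As a rough illustration of how a "blind taste test" turns into a leaderboard score, and how a market like this could then resolve, here is a small Elo-style sketch in Python. The transcript does not describe the leaderboard's actual scoring method, so the Elo update, the K-factor of 32, the starting rating of 1000, and the model and company names are all assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def record_vote(ratings, winner, loser, k=32.0):
    """Update ratings in place after one blind head-to-head vote."""
    gain = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += gain
    ratings[loser] -= gain


def leaderboard(votes, models, start=1000.0):
    """Replay (winner, loser) votes and return models sorted best first."""
    ratings = {model: start for model in models}
    for winner, loser in votes:
        record_vote(ratings, winner, loser)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


def resolves_yes(board, owner_of, company):
    """The market resolves Yes if the top-rated model belongs to `company`."""
    top_model, _ = board[0]
    return owner_of[top_model] == company


# Hypothetical votes: voters never see names, they only pick the better answer.
votes = [("model_a", "model_b")] * 3 + [("model_b", "model_a")]
board = leaderboard(votes, ["model_a", "model_b"])
owner_of = {"model_a": "CompanyA", "model_b": "CompanyB"}
```

Because the winner's gain equals the loser's loss, the total rating across models stays constant; an upset win over a higher-rated model moves the scores more than an expected win does.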

Starting point is 00:25:13 I bet there's people inside the, there's some group of developers, this is my hypothesis. There's some group of developers who run LMArena and there's people inside of Google who are betting on themselves, like a basketball, like Michael Jordan betting on himself to win a game or cover the spread. It's like it's in his control. So when I see like every, that sort of level of consensus, I kind of think it's the developers themselves working on the language model or the person and or the people who actually run the leaderboard who are watching the leaderboard activity and they have some inside information.

Starting point is 00:25:56 This is why these, you know, Polymarket is like really changing the world because there's an opportunity to make money from your knowledge. People have some proprietary knowledge, some inside line, or they're in control of it. But this is what makes these things so interesting. Do you want to see a funny little addition then, Jason, while we're talking about this? If you look at this, this is the same chart,

Starting point is 00:26:20 but zoomed out to the entire year for this kind of end-of-year bet. And you can quite literally see people's enthusiasm for GPT-5. And then it came out and just collapsed. It's a percent chance of being correct. So not everyone trading on here has inside information, because some people are still reacting to news as it comes out to the public. But you can make a lot of money if you know what's going to happen before everyone else. You can see right there.

Starting point is 00:26:42 I think I'm with them. It just seems like Google's going to run away with it for this year. And they also have this massive advantage with their profit machine and the number of searches growing. They can build out their CAPEX off of profits. And so that also gives them a massive advantage in all of this. And it makes sense. They bought DeepMind back in the day. And they can just keep using all the data they have from

Starting point is 00:27:12 browsers, Gmail, YouTube. I mean, you think about their data advantage. Gosh, what is, and have they come out with an agentic Chrome browser yet? Has there been any rumor about them releasing an agentic Chrome browser? I think they're taking the same approach to adding AI to Chrome as Microsoft is with Windows, which is trying to slap it on the top so they don't disrupt what currently works. And I think that's why they're ripe for disruption. I know there's plugins and extensions.

Starting point is 00:27:37 I don't think there's a hardcore agentic Chrome version these days, or at least I don't think there is. Gemini is in Chrome, like in the top there is like a Gemini button, yeah, that will like, I don't know, summarize the page you're on and do basic stuff like that. I'm now playing with this. Okay, this is better. I never actually clicked that button until now, so sorry if I'm behind, but like I don't think I, I know, ask, okay, you see, you can chat to Gemini inside of Chrome about your current page and you can change the tab, but this seems kind of bolted on to me and not as deep of an integration as we've seen with Comet, Atlas, and other browsers. Yeah, I think they'll do like some basic things, like create a calendar

Starting point is 00:28:21 event for you. But I don't think you can have it go do a task for you like Comet does, where you could say like, hey, go find me 10 camping, put 10 camping supplies I should have for my camping trip this weekend into my, you know, shopping cart and give me two versions of each flashlight, two versions of each sleeping bag, and I'll delete the one I don't want. Give me the highest rated and average price. And I think those are the kind of features people are going to be looking for. But integrating with like, you know, summarizing your YouTube video or making a calendar event, I suppose those are nice.

Starting point is 00:29:04 But when they, if they are studying everything you do inside your browser. We were talking about Apple Intelligence earlier, studying how you use an app. Now, I think about that with your Chrome browser, plus your Gmail. They're going to know everything you shop on Amazon. So then they could, every time your, this would be super aggressive and probably trigger some antitrust stuff. But imagine if they were studying your purchasing and then just taking that into account with an AI agent for like, hey, I can build your shopping cart for you for your groceries

Starting point is 00:29:36 next week based on, you know, what I've seen you do in the past, right? I think you're going to be out of milk. I think you're probably needing more eggs. It'll be very interesting. Your startup needs custom software, but building your own secure, production ready apps is hard, right? And not every founder has the expertise or time to get started.

Starting point is 00:29:55 Well, now you don't have to. Just use Zite, Z-I-T-E. Make whatever you need from idea to a working app in just minutes. Easily, all on your own. Plus, you can fill it in with forms, apps, databases, and other automations that you need as part of your product. So many of the no code and vibe coding tools are just flashy demos,

Starting point is 00:30:12 or they work on legacy systems that take so much time and effort to learn, let alone master. But Zite is both easy to use and powerful enough to build out the apps you want. And it quickly connects all the other tools you're already using. For example, earlier this month, they added the ability to bulk create and update records. So it's easier and faster than ever to design data-heavy apps that are constantly syncing between systems. That's the kind of attention to detail that sets Zite apart.

Starting point is 00:30:38 Start using the number one AI powered business software generator on the market. I'm going to give you 50% off your first project. Go to zite.com slash twist to get started. That's zite.com slash twist for 50% off your first project. We just did confirm, Jason, that the current instantiation of Gemini in Chrome is non-agentic. I bet you if you count to like 35, they're going to update that because even Edge, Microsoft's current browser, has the same kind of right-rail agentic interface. That's very similar to, again, Atlas and Comet from Perplexity.

Starting point is 00:31:16 Speaking about data and which companies have it, Microsoft has a lot of data because they run, of course, the Office suite and have for a thousand years. And they are going to invest in Anthropic. This is the other enormous AI news story from today. Both Nvidia and Microsoft are going to invest into Anthropic, up to $10 billion from Nvidia, up to $5 billion from Microsoft. Wow. Also, yeah. Also, Microsoft is going to become a compute provider for Anthropic. So you're going to be able to use Anthropic's Claude models inside of the Azure ecosystem. And they're going to buy compute from Azure as well.

Starting point is 00:31:52 So it's kind of one of these classic AI multi-part deals in which Microsoft's going to work with Anthropic and invest in it. Anthropic is going to work with Nvidia to ensure that its models can run efficiently on Nvidia GPUs. And everyone's very excited about this. This is going to be fun. Now, no matter what cloud you're on, Google Cloud, Azure or AWS, you can get access to Claude models. I think it's the first AI company to be available on all three major clouds, which I think speaks to their business focus. They very much don't want to compete on an application level with their customers. They want to be a neutral third party, although they do have a coding product.

Starting point is 00:32:30 I think they're not going to do what OpenAI is going to do, which is try to invest in every single space. But Microsoft investing $5 billion after just settling their OpenAI deal, I think that's probably why this is happening now. Yes. The OpenAI deal closed and they seem to have worked that out, that they're going to go public and they have certain ownership. And now this drops. Microsoft investing $5 billion, Anthropic will use $30 billion worth of Azure compute potentially. So that's pretty amazing. You know, these contracts are now starting to say, we'll invest up to. So I think people want to understand the contours of these deals a little bit more, Wall Street, that is. And so now

Starting point is 00:33:21 you're starting to see that those qualifiers come in, up to $10 billion, up to $5 billion. Although here in the notes, it says they're going to purchase $30 billion worth of Azure. I wonder if that is actually locked in or that's based upon, you know, certain targets being hit. But that's kind of big news. Anthropic has committed to purchase 30 billion of Azure compute capacity and to contract additional compute capacity up to one gigawatt. That's pretty sturdy language. To my point, like, I think you're starting to see more precise language in these deals

Starting point is 00:33:51 because people are wondering, like, is this going to actually show up or not? So when they put up to and they put committed, like, there might be, there could still be an out in that, by the way. They committed to this amount. And there could be caveats if, you know, they can cancel this. They could push out this spend. So, you know, it could occur over 10 years. It could occur over 20 years.

Starting point is 00:34:14 There's, you know, all kinds of contours and qualifiers being put on these things. These companies want each other to succeed. So I don't think they're going to be so hard-nosed about individual contractual things that they're going to cause a ruckus. Like, Microsoft doesn't want Anthropic to look weak, nor does Nvidia want Microsoft to look like it made a bad deal. They want everyone to look brilliant and profitable. And then that's just revenues raining from the sky.

Starting point is 00:34:37 So I kind of presume there's some goodwill built into these, Jason. Maybe I'd be naive, but... You know, if they are making commitments to buy a certain number of NVIDIA's chips, well, Nvidia needs to actually make those. So there could be outs. They're time-based, obviously. So the nature of these is rising tide lifts all boats, and we need more power, more GPUs.

Starting point is 00:35:05 But Dario was also on 60 Minutes this weekend, and there was an interesting clip of him talking about white collar jobs, and it was exactly in line with some of the reporting I've had. He basically is saying for the early stage, and I'll summarize it, but he's concerned about rote work in the next couple of years. In the short term, it's going to be really cataclysmic, and he just kind of comes out and says it, which is refreshing. Let's hear the clip. You've said AI could wipe out half of all entry-level white-collar jobs

Starting point is 00:35:42 and spike unemployment to 10 to 20 percent in the next one to five years. Yes. That's shocking. That is the future we could see if we don't become aware of this problem now. Half of all entry-level white-collar jobs? Well, if we look at entry-level consultants, lawyers, financial professionals, you know, many of kind of the white collar service industries, a lot of what they do, you know, AI models are already quite good at, and without intervention, it's hard to imagine that there won't be some significant job impact there. And my worry is that it'll be broad and it'll be faster than what we've seen with previous technology. This is interesting because he's pretty concise here. Again, back to contours, qualifiers. He thinks this could be 10% unemployment. Now, we've had technology.

Starting point is 00:36:32 10% unemployment quite recently amongst young people, it's 8% right now coming out of college. So saying 10 to 20% is basically not that 4% number. You'll hear me quote all the time. That's like the sort of national. But amongst young people, unemployment is like 8, 9% already. So going to 10 to 20% is not that big of a jump. This is the unemployment rate amongst 16 to 24-year-olds. And you can see what's that number?

Starting point is 00:36:58 It's climbed from the bottom of 6.6 in April of 2023 to a high of 10.5% as of August. Recall government shutdown data delays. That's the most recent data point out. Yeah, so it's already at 10% amongst this 16 to 24 group. The higher group, which is like I think 20 to 27, I think it's 8% or so. That group is an important one to look at because that includes like the college grads who've been in the market for a couple of years. The point is what he's saying is not far-fetched and it's not doomerism. It's actually really well thought out. And you will hear people dismiss it as doomerism. And the people you hear maybe dismiss it as doomerism, maybe they have horses in the race. Maybe they have political agendas, et cetera.

Starting point is 00:37:45 And they don't want people to be as matter of fact about this. But he is exactly right. If you were to look at an entry level PR person or an HR person or an entry level researcher, entry level, apprenticeship level, accounting person or a legal person, a lot of what they do is a drag on the senior people. And the senior people are now able to do that young person's grunt work with AI. So then you have to ask yourself, if you're a senior level person, do you want to mentor these two or three annoying kids who are just learning how to like show up at work and they can't even get in on time or they're goofy and they fool around or they do some stupid stuff at work. Everybody's had this experience. Like mentoring people is hard. It takes time out of your day. Some

Starting point is 00:38:34 people find it rewarding. Most people find it annoying. Sorry. It's just the reality of it. And then, you know, you could just have run a query with your LLM or use your co-pilot from a legal provider, a tax provider, a coding provider. Or, you know, if it's an HR person and they're like, I'm going to put people on recruiting for these new positions and writing the job descriptions. Then I've got to go edit the job descriptions. It's just easier for me to do it myself. You take a senior person like yourself, Alex, or somebody like Lon, you know, you got to prepare the docket. You have a young person prepare the docket. You edit them. It's like, uh, I could have just done this myself with a large language model and it would have been done already. I think we're going to have

Starting point is 00:39:15 to make a civilizational decision here because everything you're saying, Jason, is right. Often mentoring people slows you down. It burns time. And it makes the senior person who's more expensive, less productive net because they have to go back and do a lot of busy work. But if we want people to have kids, they're going to need employment. And stable employment is better. Yeah, so you're thinking on the societal basis, but that's not how people work. That's not how business leaders work day to day. But maybe that's what I'm saying.

Starting point is 00:39:41 We need to change our mindset. Because if we do automate these jobs away and we take a lot of careers that should start right after college and begin to accrete value and income and therefore wealth and therefore the ability to buy a house and have children, if we short circuit that process at the beginning, we're going to have a lot of people that are just 45 with two roommates and a shared dog. Yeah. It's already happening. Yeah. So, I mean, these conversations that people don't want us to have because they don't want us to, you know, tell people the truth about what's coming. I mean, who's offering? Is anyone even trying to find a solution to this other than just

Starting point is 00:40:14 throwing their hands up? Because I agree with you. And it scares me. It doesn't need to be scary. It should be concerning that smart people building in the space, Dario, Elon, myself, not that I'm on their levels, but, and Bernie Sanders and other folks are saying, hey, that is a little bit concerning and we should be thinking about it. Individual businesses are not going to think collectively. The government is like, they're never going to be able to, like, get in here and solve these problems. So what I'll say is you're on your own, young people, the message to young people, you're on your own. Nobody's coming to help you. And you've got to figure it out for yourself. That's my best advice because businesses are just going to do what

Starting point is 00:40:59 businesses do, which is lower costs and be more efficient. They're going to adopt the technology. They are adopting the technology. Oh, quickly. Yeah. And so they are not going to think about the overall society issues. Our government has never been able to manage any of this stuff. They never will. They're not going to do it. There'll be like 10 years in the rearview mirror by the time they get involved. I mean, they're just thinking about breaking up Google's search monopoly in year 30, so they're out to lunch. Government moves slow. But what about making conscious decisions, like putting together a consortium of leaders of

Starting point is 00:41:34 businesses and say, hey, we are not going to stop hiring young people. We are going to keep investing in the future generations because we would like to not only have a society in 50 years, we'd like to have senior people in 50 years. Like this is the thing that businesses could get together and choose to do if they wanted to. I'm trying to figure out if there's any precedent for what you're saying. It's idealistic, but these are competitive businesses. So it would be like during the, I don't know, the banking era saying like,

Starting point is 00:42:02 oh, we're going to put in, you know, teller machines. Let's all the banks get together and we'll coordinate, you know, having more tellers to do more things, which eventually exactly happened. You go inside the building and they had more services they offered other than just counting your money and giving you a deposit. They moved those people to work on other offerings, I think, in some cases. Often have, but it's different this time. The thing is, I don't think this is just another introduction of a specific technology that impacts one slice of one part of the labor force.

Starting point is 00:42:33 Like ATMs are automated teller machines. They automated tellers in a machine. Tellers were not 10 to 20 percent of the workforce, right? So when Dario says this, and you say he's right and he's being upfront, we are implying that we are going into a great depression level of unemployment because we're going to automate away tens of millions of jobs. I think it will be low millions of jobs. I don't think it will be tens of millions, but I think it will be low millions per year will be automated away.

Starting point is 00:43:00 So then the question is like, do we catch up with new work? Just start a company, go to Founder University. I think nobody's solving these problems. I think it's going to be a tragedy of the commons kind of situation. Nobody's, you know, going to be looking out for the interns and the apprentices. They're just going to adopt the technology, go faster, lower their costs. Amazon's just going to be a machine. They're not going to like think collectively about anything other than faster deliveries at a cheaper price. You better get accustomed

Starting point is 00:43:33 to Mamdani winning in New York City because what you're saying here is that there's no way for businesses to look out for their own interests. And that's surprising to me. Because you're saying essentially it has to be such short-term thinking. They can't think long-term. And that's an indictment of the capital formation process. They can't think collectively. They can think midterm, long term, but it's not like Amazon's going to go, you know what, we need to keep adding staff to do this instead of robots so that our staff can buy stuff on Amazon or can go to Starbucks on their way to work.

Starting point is 00:44:06 That's just not how businesses think, but they're just going to be radically pursuing efficiency. Now, the question is, do people want to create more companies and products and services using these new tools? If you know these tools, you're going to be infinitely employable. And that's the best advice I can give any young person who is going to be faced with, you know, 15% unemployment amongst their peer group. I think it would be 15% which I think when I graduated college in 93, I think I remember 14 or 15% unemployment. Amongst that 20 to 27 year old, if you look at that. Okay. Here's the 20 to 24 year old unemployment rate historically.

Starting point is 00:44:48 Okay, that's interesting. That's like really graduates. If you go to '93, it was like 12%. It was just starting to decline from about 10 and then it fell consistently down to 6. Right when I was graduating, it was over 10%. Now it's at under 10%. It could be faster, as he's saying, than people anticipate. If it's not faster than he's anticipating, then we're spending too much on AI. And if it's going to go as fast as he says it is, we're going to end up with socialists running every single major city in this country. So something for business people to think about as they approach productivity. So we're going to host our dim sum demo day in San Francisco. We'll have our latest accelerator class. Maybe I'll do a fireside chat with one of my friends or besties.

Starting point is 00:45:30 And we'll eat some dim sum. And it's just going to be great networking for investors. I'm going to invite 150 investors. Everybody from my high net worth LP base, the syndicate members who like to write 10 to 50k checks into startups, as well as all my seed fund and venture friends in Silicon Valley. You can apply to come if you're an active investor. You have to be an active investor. This is only going to be like 150 seats, I think, or even 100. There's no room for friends to just hang out or other founders to hang out. You have to be an investor. You have to be able to prove you're an investor. You can apply. Launch.com

Starting point is 00:46:07 slash dim sum, D-I-M-S-U-M. Launch.co slash D-I-M-S-U-M. So jealous. That's going to be so much fun. I love dim sum. Yeah. I love dim sum in San Francisco and I love me some nerds. So that's like my three favorite things. It's going to be in San Francisco area. I like to make it like, you know, these demo days, I think 70% of the value is to get to catch up with other investors candidly. And then 30% is to get to see the companies and get to meet them. So there's the reason to come. If you want to fly in to San Francisco, you got a reason to do it December 5th, and I think December 6 is the all-in holiday party if you happen to be coming in for that. Awesome. And that'll do it for us today.
