This Week in Startups - Why data is the biggest AI bottleneck (feat. Arthur Mensch of Mistral AI) | E2212

Starting point is 00:00:00 What year can an AI model drive in any city, drive anywhere in Europe, from Madrid to Moscow, and do it safely 100 out of 100 times, 1,000 out of 1,000 times? Pick a year. Theoretically, I think it would actually already work. 10,000 out of 10,000 times perfectly? No, no, probably not. Some folks have moved to hiring experts for 50 bucks an hour, asking them questions and then just having proprietary streams of knowledge. We have a company, Micro1.

Starting point is 00:00:27 How do you think about hiring experts to build out the knowledge base? What we found is that onboarding and having full-time employees that actually have the expertise to judge whether we're actually making progress is super important. There's still an amount of information

Starting point is 00:00:42 that you're not going to get to, whatever you pay. If you're not partnering with a company who actually has the knowledge, you need to source people that are experts in their fields, usually PhDs, and have an interest in computer science. If you find these two things, suddenly you have someone who is interested

Starting point is 00:00:57 in driving the competence of an AI model forward in their field. This Week in Startups is brought to you by Nexos.ai. Stop shadow AI in its tracks with the unified platform for secure AI adoption and productivity. Try it with a free 14-day trial at nexos.ai slash twist. LinkedIn Ads. Start converting your B2B audience into high-quality leads today. Launch your first campaign and get $250 free when you spend at least $250. Go to LinkedIn.com slash this week in startups to claim your credit.

Starting point is 00:01:33 And Squarespace. Turn your idea into a beautiful website. Go to squarespace.com slash twist for a free trial. When you're ready to launch, use offer code twist to save 10% off your first purchase of a website or domain. Hey, everybody. Welcome back to This Week in Startups. We have a great show for you today. On the back half of the show, we will have one of the co-founders of Mistral, you know,

Starting point is 00:01:57 the language model from France, from Europe, their champion. We have a great interview that Alex and I did. But first up, some news. What's in the news, Alex? The biggest thing that I saw today that got me the most excited, Jason, is that in the wake of the Gemini 3 launch, Alphabet shares shot up by 5%. Now, why do we care about a 5% move for a company? Well, the company's worth 3.5 trillion, which means that 5%, Jason, is worth about $175 billion, which means that I think the market just repaid Alphabet for all of its AI work ever in a single day. I think you've kind of nailed it.

Starting point is 00:02:31 If they increased their market cap by $100 billion, and they're going to spend $100 billion this year on their buildout, if they had done a secondary offering for their shares and raised $100 billion, yeah, they could have put that to work. So in some ways, the market is rewarding them for their progress. Now, I don't know that they've got $100 billion in market cap. So let's think about that for a second. If they were to get a 10x multiple on revenue,

Starting point is 00:03:05 they would need to have $10 billion in AI revenue. Do they have $10 billion in AI revenue? Or will this drive $10 billion? I don't actually know how to do that because they do charge for their product, right? Yes. We have no insight into how many paid customers they have. And they do have an API and they do have their cloud computing. So we do know how much money their cloud is making,

Starting point is 00:03:26 and we know its growth rate, which has been very strong. Strong and profitable. Yeah. So that's actually, I think, where it's just an interesting note. They've got over 10 million developers using their models. They've always been very popular with developers. They have a global market. And their AI is helping their ad network, helping their search results.
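The back-of-the-envelope numbers in this exchange are easy to check. The sketch below only restates the hosts' own figures (a $3.5 trillion market cap, a 5% move, a hypothetical 10x revenue multiple on $100 billion of added value); the function names are illustrative, and nothing here is a claim about Alphabet's actual AI revenue.

```python
# Back-of-the-envelope check of the figures quoted in the conversation.
# Integer dollars are used so the results are exact.

BILLION = 10**9


def market_cap_gain(market_cap: int, move_percent: int) -> int:
    """Dollar value of a percentage move in market capitalization."""
    return market_cap * move_percent // 100


def revenue_needed(added_market_cap: int, revenue_multiple: int) -> int:
    """Revenue that would justify the added value at a given revenue multiple."""
    return added_market_cap // revenue_multiple


# Figures as stated by the hosts: $3.5T market cap, 5% move.
gain = market_cap_gain(3_500 * BILLION, 5)

# Hosts' hypothetical: $100B of added market cap at a 10x revenue multiple.
revenue = revenue_needed(100 * BILLION, 10)

print(f"5% of $3.5T = ${gain // BILLION}B")
print(f"Revenue to justify $100B at 10x = ${revenue // BILLION}B")
```

Run as written, this reproduces the $175 billion and $10 billion figures mentioned on the show.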

Starting point is 00:03:46 And this is why I think a lot of people got it wrong. Everybody thought ChatGPT would just run over Google because people would, their behavior would change. They'd stop going to Google. They would start going to this new product. Well, network effects are strong, and strong companies realize when they have competition. There's a similar thing going on with DoorDash and Uber and Grab and DiDi. Everyone's saying, oh, wait, there are new entrants into the market who are doing self-driving.

Starting point is 00:04:15 Therefore, they're going to run over the incumbents with network effects. Well, it's not easy to run over DoorDash because they have all those relationships with the restaurants. So if you are a Zipline and you want to compete with DoorDash, which I believe is their business model, to compete with DoorDash, well, they have to go and they have to replicate the network of restaurants. DoorDash has spent over a decade building those relationships and building that fabric. It's not impossible. Like, I think Zipline could be attaching themselves. They're building essentially the drive-through window for drones. So if you think about a standalone Starbucks, not in the city, but when you

Starting point is 00:04:54 live out in the suburbs, a Starbucks is like a standalone island and you drive around it, right? Because there's a drive-through. And the drive-through becomes a lot of the business. People order while they're driving, maybe, hopefully, in a self-driving car, from their app. They order their Starbucks and then boom, or they order it from the little box there. They use a language model probably, as we saw. I showed it on a previous episode. Long story short, what you're having happen is the incumbents are now going to have to deal with, you know, Zipline. So here's Zipline.

Starting point is 00:05:26 And with that Zipline, people have to walk over to that little area. So that's like the one we're seeing on their home page. The delivery is the same. They drop down a little box on a tether and, you know, they fill it in this little area so that the humans are far away from the drones. They don't get clipped by them, which would not be a pleasant experience. No. However, they have a new version that's on the side of a Starbucks.

Starting point is 00:05:51 So imagine that same setup, but the person's inside a Starbucks. See if you can find that. I think it might be on YouTube. They've shown it before. But imagine you've got a Starbucks. It's one of those island Starbucks. Or a McDonald's.

Starting point is 00:06:04 And instead of handing food out the drive-through window to the person in their car, imagine there's a second window. And that drone comes down to the roof. And they just put into that second window, into the drone, the Big Mac, the Frappuccino. and then it leaves from there. Well, they're doing that at the same time

Starting point is 00:06:26 that DoorDash is sending their proprietary robot. There it is on the side of a building and it drops down into the chute there and then it's available at the bottom for easy pickup either inside or outside of a building. Yeah, so is this from Zipline or is this from... This is from Zipline. Yeah.

Starting point is 00:06:40 This is from four days ago. Look at how interesting this is. I got told about this privately but you see that Zipline being tethered, the little thing going down, the carrier. Yeah, so it's right here. You can see if you pause it, you'll see it come down.

Starting point is 00:06:58 And then inside, see there it is. So the Zipline drone is above it. They bolt this onto the side of the McDonald's. So great. So they don't have to build a new depot. They just figured out a clever way to do this. So long story short, competitors are coming.

Starting point is 00:07:15 Just like ChatGPT was coming for Google. Google realized that. Sergey's like, you know what, I need to go back to work five days a week. Sergey starts going back to work, starts motivating the troops, and now they're competitive. Every time the new leaderboard comes out, they're in the top one, two, or three slot. Congratulations. Grok, Gemini, ChatGPT, maybe not as much these days. They're all just starting to aggressively go after that.

Starting point is 00:07:38 But this idea that Google search would go away and would just get run over and Google would just sit there and take it, or Uber or DoorDash will sit there and take it when a competitor comes. Au contraire, mon frère. Show the DoorDash robot because they built their own purpose-built one that drives not on the sidewalk like the slowbots. They built one that drives in the street. Yes. That can go 35 miles an hour, I believe, 25, 35 miles an hour. This is the future.

Starting point is 00:08:02 So DoorDash isn't going to take it sitting down and they've already got the network effect. But here is the DoorDash robot. So instead of DoorDash partnering with folks, they made this, it can drive, as you can see, in the street. Look at it. Boom, it can park in a parking spot. So it comes to your, you know, door,

Starting point is 00:08:20 It comes to a DoorDash restaurant, comes to the McDonald's, and the person steps out, drops it in the front of the bay. And this thing zips, I believe, 25 or 35 miles per hour on the street. And obviously, when it's on the sidewalk, it goes five. So we can do both. This is happening fast and furious. Wow, man, there's another upgrade to this. That's basically the size of a stroller.

Starting point is 00:08:39 Imagine if you could put your baby in there, right? Yes. And then send it to daycare without having to go yourself. Right. Just put a little window on there. That's going to be safer than dad doing it on their phone playing chess or, you know, putting in their PrizePicks for the Knicks game tonight. And when I say dad, I'm referring to me. I was going to say, which finger are you

Starting point is 00:09:00 pointing here? Me? I mean, come on. Dad's going to do dad things. I know. I know. Also, you know what's really not safe is having everyone drive up in their 16,000 pound SUV and then crawl ahead right next to a school where there are tiny children. Yeah. Absolutely. So these things are going to be safer. I'll be totally honest. Sending your kids to school in that thing, it reminds me of Grogu from The Mandalorian. Remember, he's in that protective little floating case. That's where they got the inspiration for that. It's the clamshell from The Mandalorian,

Starting point is 00:09:33 where Grogu is in a little thing. That's where they got the inspiration from. It's from science fiction, pop culture. Is this what you're referring to? I found a small frog sitting in a mixing bowl. Is this? Is that it? Let's see. Yeah, that's not a frog.

Starting point is 00:09:48 That's Grogu from The Mandalorian. It's a frog in a mixing bowl, but we can call it Grogu if we want. Baby Yoda. Squarespace has been a friend and partner of This Week in Startups for over a decade, and it's because they are one of my favorite startups of all time. You know Squarespace, the all-in-one platform that's going to help you make a beautiful website to help establish your brand online and scale your business with you as you grow. It doesn't matter if you're selling content or a course.

Starting point is 00:10:13 But what if you want something super customized? Well, you can use their AI-powered Blueprint feature that works right alongside you as a creative partner. You tell it what vibes you're going for, what the aesthetic is, and it makes something totally beautiful and personalized to you and your business. And new features keep coming from our friends at Squarespace. You remember they did SEO, you know, they've been doing that for a decade. Now they're doing it for AI. Yes, AI search. People are starting with AI. And then you see all those citations. They lead to Squarespace websites because they do AI search optimization. Go to

Starting point is 00:10:44 Squarespace.com slash twist to get 10% off your first website or domain purchase. Anyway, important story there. You will face competitors, and that's a great way to rally the troops to compete. Uber has done 15 investments in, and partnerships with, every self-driving company other than Zoox and Tesla. So it's 15 people on one side versus those two, Uber plus those 15 versus those two players. DoorDash, they're not taking Zipline or the street robots for granted. They're making their own. And here, obviously, Google's not taking it sitting down.

Starting point is 00:11:27 All right, next up on the show, we are talking to a company called Mistral AI. They are out of Europe. They are Europe's foundation model champion. We have Arthur Mensch here on the show. Arthur, I'm so glad that you're here. We talked about your company ad nauseam, but it's good to hear directly from the source. Welcome to Twist. Thank you for hosting us.

Starting point is 00:11:43 So thank you for hosting me. I guess the first question is, what's it like? You've got so much competition. And this is every week here on the show. We have a new model racing up the LMArena, solving humanity's, you know, most difficult questions. What's the pace like at the company when you're up against Gemini, Grok, OpenAI? It just seems like a bit of a marathon filled with sprints.

Starting point is 00:12:11 What's it like running a large language model in 2025? Well, it's been quite an intense race ever since we started, around two and a half years ago. I think what has been very interesting is that we had simultaneously to build a science team that was able to compete with very large competitors that were actually better funded than we were.

Starting point is 00:12:31 So we had to be more efficient. We had to be more careful about the way we spend the money, but we actually managed to create models and to lead on the open source front. What has been a challenge and very interesting as well has been to build a business on top. So building the science and the business on top able to generate the revenue,

Starting point is 00:12:46 I think that has maintained quite a lot of pressure on Mistral, but we learn fast. So we're now in quite a good position. Tell me about the business. Obviously, there is a consumer business that's emerging where people pay 20 bucks a month, maybe prosumer. They'll pay 30, 40, 50 bucks a month. They have these corporate accounts. I know we pay for, I think, Claude and OpenAI at the current time. People on the team will go AWOL and just start subscribing to AI products on their corporate cards because, you know, hey, listen, they're so affordable. And then, of course, there's the API space. What's Mistral's business today? And then who are you competing with?

Starting point is 00:13:32 Our business is really to serve the enterprise. So we have self-serve versions of our products, but really what is driving the business forward and the value proposition that we have is to partner with enterprises and to go for their most complicated problems. So we see two things in the enterprises. They either want to automate things to drive efficiency or they want to create models that are driving growth and making their product better.

Starting point is 00:13:56 So on the physical space or on the digital space. And so what we have is, so we got known for releasing open source models, and that's the mission of the company is really to bring AI into everyone's hands. But the business we've built on top is an enterprise business where we orchestrate the models, We connect the models to the enterprise data.

Starting point is 00:14:15 We deploy the platform on secured spaces, if need be. We can go on private cloud. We can go on prem. We can address sovereign needs. And based on that, the value proposition we have is that we actually go all the way to the delivery of the use cases. So we have a forward deployment engineering team and a forward deployment scientist team that are using our orchestration platform to build applications and automation, that are using our training platform to build models that are connected to proprietary

Starting point is 00:14:42 data, to proprietary images, to proprietary signals. And so that's how we drive value. So we sign large contracts, generally multi-year. We bring the platform to train, the platform to orchestrate, the platform to observe. And we bring the experts,

Starting point is 00:14:59 because what we realized in the last two years is that early enterprises are testing out a few things, they're buying a few things on the AI side. But they are not getting to value. And that's really their pain, is that everybody is telling them that AI is magic, that it is going to change entirely the way enterprises are run. But then down the line, they've done a few prototypes and nothing is working.

Starting point is 00:15:20 And so what we do is really to correct that and to bring them to the value. Yeah, that seems like a really great jumping off point for this debate that's going on. Are enterprises actually deploying AI, getting value from it, and able to actually capture that efficiency, or this MIT study that's just constantly being cited, oh, they're deploying but not getting value. What's the reality as you're seeing it? Because you're embedded inside these enterprises and really trying to help them on-prem, it seems like,

Starting point is 00:15:58 close that gap between this is a pilot, a project, a skunkworks, you know, versus, hey, this is fully deployed and providing value that the CFO and the CEO can quantify and, you know, sort of make into the future roadmap. I think the reality is mixed. There's been a lot of pressure on the C-suite of large enterprises to deliver value. And the way they went about it is to do a bunch of prototypes. And then they realized that it was actually hard to scale.

Starting point is 00:16:29 Because the problem of building with AI is that you're still building software. You still need to iterate. You need to create a first version, a second version. But to create a new version, you need to do data science. Because to improve your AI agents, you need to get new data. You need to get feedback on the behavior of the agent, on the edge cases it hasn't handled, and you need to fix it. So that iterative mindset of going from a prototype to production by reducing the errors and improving the accuracy through acquisition of data is something that is kind of missing in enterprises because it turns out to be the mindset that you would find in AI scientists. And those are pretty rare to find because they just started

Starting point is 00:17:10 to work five years ago. And so that's really the problem that we solve. So we come, we either go on iconic use cases, or we try to work backward from the outcomes. We work with enterprises, tell them that they should start from a problem, not from a solution. So they should start from a problem, look at how the process is being run, how do you map that onto an AI agent that is going to execute it for you? And if they do so, they should realize that it's going to take some time. So first version, going to work 80% of the time, you will have edge cases. Then you go forward and you make it better and better by getting more data, more feedback

Starting point is 00:17:49 from users. You need to manage everything. And that's what we bring. That's the technology we bring. But if you take that iterative mindset and you combine it with the ability to change the organization in the company so that you actually save costs or you actually make your researchers work faster, then you can get to the value. And so that's what we bring.
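The loop described here (ship a first version that works about 80% of the time, collect the edge cases it misses, fix them, and measure again) can be sketched in a few lines. This is a toy illustration only: `ToyAgent`, `iterate_to_production`, and the `batch` parameter are hypothetical names invented for the example, the "agent" simply memorizes cases, and none of it represents Mistral's actual platform.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an AI agent: it only handles cases it has seen."""
    known_cases: set = field(default_factory=set)

    def handles(self, case: str) -> bool:
        return case in self.known_cases

    def learn_from(self, failures: list) -> None:
        # Stand-in for collecting feedback and new data on failed cases.
        self.known_cases.update(failures)


def iterate_to_production(agent, cases, target=0.99, batch=1, max_rounds=10):
    """Evaluate, collect failures, fix a batch of them, repeat until the target."""
    history = []
    for _ in range(max_rounds):
        failures = [c for c in cases if not agent.handles(c)]
        accuracy = (len(cases) - len(failures)) / len(cases)
        history.append(accuracy)
        if accuracy >= target:
            break
        agent.learn_from(failures[:batch])  # fix the first few edge cases
    return history


cases = [f"case-{i}" for i in range(10)]
agent = ToyAgent(known_cases=set(cases[:8]))  # first version: 8 of 10 cases
history = iterate_to_production(agent, cases)
print(history)  # [0.8, 0.9, 1.0]
```

Each round adds only what the previous round failed on, which is the point being made: the second and third versions come from data about the first version's errors, not from a rewrite.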

Starting point is 00:18:08 So in general, our ICP oftentimes tried a few things on its own and failed. And when they start working with us, they realize what they need to do and they get to... On going to enterprises, helping them get to 80%, use their data to improve the model, get better accuracy and more value out of it. Is that a system in which eventually they don't need to have Mistral staff on site to help them get this stuff right? Or is there a long-term need for Mistral or a similar foundation model company to kind of stay on site

Starting point is 00:18:42 to help ensure that these systems stay functional and keep delivering value. I think the end goal is for us to disappear and for the AI products and the AI platforms to be fully usable by the business users. Today, this is not the case. Today, this is still too complex. And I think what is really missing

Starting point is 00:19:00 is the mindset of iteration and the methods to follow to use the right tools are still a little bit missing. So we come and we engage our teams on the long term. So like for two, for three years. I think down the line, the products are getting more abstract. The models are getting better. The tools are getting better. And so it's all going to get easier.

Starting point is 00:19:21 But we should realize that it's going to take 10 years for enterprises to transform themselves with AI. And to your point, whether it's with people or not, processes and automation that you deploy with AI systems are, inherently they need to grow with time. When you run a company, your processes are never stable. You change them over time. You reorganize yourself. You change the people. You adapt the software.

Starting point is 00:19:47 You change the IT stack, etc. The same goes. And the people in the company actually can deal with that because they are humans. They can deal with change. What's interesting is that the same goes for AI agents. They can deal with changes. They can deal with legacy. They can deal with badly connected systems.

Starting point is 00:20:02 They can deal with errors. And they can be adapted. So we should really see, and that's what we tell to our customers, that it's always going to be iterative. It's iterative to build and to improve. But even when it's running, you should always monitor what's happening because the input can change, the situation can change, and you need to then adapt the system. Founders, listen, if you've got a B2B company and you're marketing your product or service to other businesses, you need to use LinkedIn ads. Otherwise, you're wasting your time and you're wasting your money. LinkedIn isn't some random social network where everyone's using a fake name.

Starting point is 00:20:35 and posting nonsense. It's a network for professionals. LinkedIn knows who the big players are and who the key decision makers are in your vertical based on the data they're already collecting. Everyone's on LinkedIn. Me, you, everyone. The platform has more than a billion members, and that includes those 10 million C-suite executives. And that's the kind of reach you need to grow your business. LinkedIn's advertising tools aren't just throwing your messages around for anyone who's clicking. No, B2B marketers report 2.5 times higher returns on ad spend on LinkedIn compared to other social platforms. That's why my firm, LAUNCH, advertises our own projects on LinkedIn. We want to reach people who are actually doing

Starting point is 00:21:16 exciting work at real companies, not randos, not bots, not trolls, real executives. So here's your call to action. We've got a special offer just for Twist listeners. Launch your first campaign and get $250 free when you spend at least $250. That's one for one. Go to LinkedIn.com slash this week in startups to claim your credit. LinkedIn.com slash this week in startups. I want to talk about the models you guys make. We keep pretty close tabs here on, you know, market share and who's leading the leaderboards.

Starting point is 00:21:48 But one interesting divide out there, Arthur, is just open source versus closed source. And from where we sit here in the U.S., it feels like American companies are doing really well at closed source models. And over in China, there is a bevy of high-quality, open-weight models at a minimum. Mistral has long had an open source, open weights core to it.

Starting point is 00:22:08 Is that more of a philosophical approach, or is that something that enterprise customers are actually demanding from the company? I think it's both. The reason why we started the company was to disrupt the market with open source models. I was at Google, which wasn't following that strategy. And Guillaume and Timothée were at Meta, which was at the time not following that strategy. And so we started and we showed the way to actually build great models. And actually we kind of triggered, I think, the Chinese companies to do similar things. They've been competing with one another there to innovate and they go actually quite fast.

Starting point is 00:22:41 Yeah, yeah. But for us, because it's philosophical and because we have the funding and the business to do it, we'll be continuing to largely contribute to that. We feel in particular that Europe and the US are really missing a leader in that field. And that because of all of the progress we've made, we're able to take the lead. So we'll have very interesting things to announce. pretty soon. To your second point of whether this is actually interesting for businesses, it is.

Starting point is 00:23:10 It is for two reasons. One of them is sovereignty, which means that when you use an open weight model, you are able to deploy it wherever you want. And when you do so, you can actually get rid of the data dependency and the processes that you need to involve if you use closed APIs. Sure. It's really a problem when you're running critical workloads or you're running public sector services

Starting point is 00:23:31 or you're running defense systems. You can't afford to rely on a closed source API. So that's the first thing. Second thing is that because we've all made similar progress in compressing the knowledge available onto our models, the next frontier is today to actually use the data that is within enterprises

Starting point is 00:23:51 to make models better. So you fine-tune them, you do reinforcement learning on them, you use human feedback on them. The only way you can do that is actually to get access to the weights and to change them. If you don't, you don't have enough handles and you can't go deep enough to make your own models. So our customers are actually coming to us

Starting point is 00:24:12 and they want to own the weights of the models. They want to start from an open source model. They don't know how to train it because it's actually pretty hard. You have multiple steps. You have 10 steps to actually train a model these days. And we actually help them prove these steps so that they get something

Starting point is 00:24:25 that is actually distilling their internal knowledge onto the model itself. Should people trust OpenAI with their data? There's a big debate going on in the U.S. now for startups, let's say, who are starting to embrace open source models. But there's a concern. Hey, OpenAI is going into a lot of different businesses. And if you're, to your point, taking your data, giving it to another closed system, you're kind of training that system. They may eventually become your competitor. So is part of your positioning in the market, hey, we're not going to compete with you as a startup company? You should use our systems as opposed to, say, OpenAI, which is competing on all fronts with

Starting point is 00:25:09 all companies. Because we have a cost structure that is fairly different from some of our competitors, we don't need to do everything to actually sustain the business in the long term. So what that means is that there are parts in which we are not going. So like we're not doing consumer seriously. We're exposing our technology to showcase what we can do with enterprises. but we'd rather work with a consumer company than to actually build to compete with them.

Starting point is 00:25:34 And then I guess the questions that startups need to ask themselves is whether in the long term can you depend on a company that needs to steal your business to become competitive and to remain in business itself because it's spending a lot of money. So I think that definitely drives people towards us.

Starting point is 00:25:51 That aspect around strategic autonomy, it's driving defense companies toward us, but it's also driving AI companies towards us. I would say for similar reasons. And so, yeah, I guess it's not for me to tell, but of course, I mean, depending on a supplier that will compete with you, it's something you can do in the short term, but at some point you need to create leverage. And open source models happen to be the best place in which you can create leverage because you can just take it and then provided you have the right platform and we can help with that, you can customize it, you can deploy it where you want, and then suddenly you're no longer dependent on someone who can compete with you. Yeah, that seems to be an important question for startups to think about if you watch OpenAI, you know, they're launching a copilot for coding, they're going to launch,

Starting point is 00:26:33 you know, Sora and have a social network and an image. They need to make a trillion dollars to pay for $1.4 trillion in infrastructure they're going to build out, which kind of is my next question: what's your take on this colossal, no pun intended, amount of infrastructure being built? Is it necessary? Are we going to overbuild and are there problems we can only solve if we have massive amounts of compute? Do you think we're kind of getting to the point where it's not about the compute as much as it

Starting point is 00:27:04 is the tuning of the models? I think it has always been the case that you have two bottlenecks in machine learning. One of them is compute. The other one is data. And sometimes we didn't have enough data. Sometimes we didn't have enough compute. I would say we didn't have enough compute from 2020 to 2024. Now, we are actually running out of data. That's the problem. We're kind of done compressing

Starting point is 00:27:31 the world's knowledge into pre-trained models. We can create, like, reinforcement learning systems that are creating a new kind of signal on our own. Then there's a lot of reinforcement learning signal and knowledge that sits within people's brains and employees' brains. And a lot of the knowledge that remains to be used is actually the IP of companies. So I think the race today is really to get the right reinforcement learning environment and to partner with companies that can bring their specific knowledge onto those models. And so for us, what that means is a decentralized system is actually a much better way of doing it. So you bring the model to the company and you help them build something that is really custom and defensible for them. The question around compute,

Starting point is 00:28:12 I mean, you can always go faster with more compute. The question is at some point you need to pay for it. And at some point, you need to create a long-term value for enterprises. And that's going to take time. It actually takes some people because the expertise is not there. So we need to, that's why we have really built that full stack offering of going from model to services and then also to the infrastructure when we build things in Europe. And so I think we need to create this long-term value in order to pay for the infrastructure that we're building. So it's useful. I think in particular the data centers and the concentration of power is going to be useful. The question is, can we actually justify that kind of investment over such a,

Starting point is 00:28:51 short period of time, given the viscosity that we experience in the enterprise? Our point of view is that we'd rather be on the downstream and on the creation of value than being on the infrastructure itself and only on the infrastructure itself. How do you think about hiring experts to build out the knowledge base? If all of the information on the web is being captured now, you have the open crawl, Reddit's licensing its data, Quora's licensing its data, you know, there's a finite amount of data out there. Some folks have moved to hiring experts for 50 bucks an hour, asking them questions, and then just having proprietary streams of knowledge.

Starting point is 00:29:26 We have a company, Micro One, in our portfolio that we seed invested in, you know, that does this and helps LLMs sort of build up their knowledge base and fill it in. So are we getting to that point where, now that we've collected all the information on the open web that's legally able to be, you know, indexed and/or legally able to be licensed, we now have to say, you know what, we just need to hire 100 lawyers, 100 accountants, 100 experts on construction and ask them very fine-tuned questions and show them queries that actual users are doing and make sure that this knowledge gets put in in a more deliberate way than just, you know, scrape it all off the web?

Starting point is 00:30:08 There are so many amazing AI tools out there that can make your team faster and more effective. Any founder would be a fool not to embrace these tools. But a large team making use of a diverse array of cutting-edge AI apps can also pose a privacy and security nightmare for your organization. But now there's nexos.ai, the workspace that will help your team up their game with greater efficiency, without sacrificing your data or your peace of mind. nexos.ai is going to give your teams a simple browser-based environment to work in, integrating all the latest and greatest AI models

Starting point is 00:30:45 and replacing the various potentially problematic tools and subscriptions your staff can't even keep track of anyway. You're probably paying for two or three things over and over again redundantly. Finally, your admin team gets full visibility and oversight into what's happening, so your team can enjoy all the productivity gains of AI without the data leaks, and your platform remains compliant while your data stays safe and secure. But don't take my word for it. Try it yourself. Go to nexos.ai slash twist for a 14-day trial.

Starting point is 00:31:15 That's N-E-X-O-S dot A-I slash T-W-I-S-T. It is indeed very important, and there's really two ways in which you can do it. So either you do it horizontally and on your own, and that's what we do. As the models grow in capacity, the frontier of their capacities becomes bigger. So you can make the model much better in physics, you can make it much better in mathematics. For that, you can come up with synthetic environments. Those are useful, but at some point you also need to get the right experts. You don't need to hire them necessarily.

Starting point is 00:31:49 You can actually subcontract them to some of the startups you mentioned. What we found is that onboarding and having full-time employees that actually have the expertise to judge whether we're actually making progress is super important. So that's why we're internalizing that aspect around evaluating whether we're making progress and setting up the right evaluations, looking at what kind of data sets could be actually useful to move us forward. So that's indeed quite a big focus of the company today.

Starting point is 00:32:16 And I'd say the second thing which is important is that there's still an amount of information that you're not going to get to, whatever you pay for, if you're not partnering with a company who actually has the knowledge. So when we work with companies like ASML, ASML has unique knowledge in how to build machines that can do lithography at 8 nanometer precision. And their physicists and their researchers actually have this kind of knowledge that is nowhere else to be found, which is sufficiently precise that they will never get rid of it. They will never license it. They will want to keep it. It's their defensibility.

Starting point is 00:32:51 And so by exporting our training platform to such customers, we actually enable them to do it on their own. So part of our approach is to annotate ourselves, but we also export the annotation tool to our customers. So there's an annotation tool. You actually hire experts. What is the title for that person? And where do you source them from? Were they previously just experts at universities, or were they journalists or analysts at banks? Where do you source an expert on one of these topics to do that horizontal

Starting point is 00:33:22 training and do that annotation tool? You can call them AI trainers, and you need to source people that are experts in their fields, usually PhDs, and have an interest for computer science. If you find these two things, suddenly you have someone who is interested in driving the competence of an AI model forward in their field. So that's how we look for people. Do you keep them long term, though, or do you need your physics team for like two years and then you gently let them go? I mean, we're a two-year-old company, so I don't really have the hindsight to answer. But no, I mean, the evaluators, the people that know and that can judge whether we're making progress, those need to be experts in the field and those you keep, of course.

Starting point is 00:34:04 Then at some point, you have surges in annotation campaigns and then you look for people that can do it faster than what you could do internally. But this is still a part of the company that is important, where the people are driving and defining the evaluations and verifying whether we're making our models better at conversation, at mathematics, at medical aspects. This is critical, actually. Does that show up in the benchmarks? Jason and I were talking yesterday about the benchmarks for some new AI models and we're trying to figure out how much stock to put in them, Arthur, how much to trust them. Because it sounds like you're doing quite a lot of work to make your model

Starting point is 00:34:38 smarter for individual companies in a way that might not show up on LM Arena or, you know, just a normal kind of like benchmark testing set. So LM Arena shows progress on the conversational side and UX, which is one of the areas where we are investing, and that's why we like it. Our last release was in August, and it actually ranked second at the time, and we have new releases coming. It's a hard question. Evaluations are very critical.

Starting point is 00:35:07 The problem with evaluations that are widely used is that they tend to be benchmaxed a bit, because you always look better if you optimize for it, either consciously or unconsciously, but at some point, the benchmarks stop making sense. So they are useful if they remain a proxy of the performance, but they always tend to fall. At some point, they cease to be a good proxy for performance. And what we find is that the actual good proxy for performance is usually how enterprises think about it when they run their evaluations themselves. So that's why I wouldn't put too much money on the benchmarks.

Starting point is 00:35:43 It's still useful. Certain of them are. The farther away you are from 100%, or the maximum, the better the proxy may be. So are people gaming those rankings, do you think? Like do engineers and developers look at them and say, oh, I know how to hack this? So we go up the rankings, you know, which would be

Starting point is 00:36:08 performative, get you a little bit of press, maybe a social media cycle or two, but don't actually help move them forward. Because I have heard that a lot of the questions on them get recycled into the LLMs, and then, of course, they perform better on them, but they're not performing any better on a unique job given to the LLM. And so maybe it's an incentive that needs to be de-incentivized. I mean, the way to look at it is that if you too explicitly set a benchmark as a goal to a team of AI scientists, they will optimize for it. And they will put the right guardrails to not overfit. But down the line, if they look at it too much, they will overfit a little bit. Because they will find data sets that are correlated or they will just hill climb the thing in whatever way possible.

Starting point is 00:37:00 They know they can't train on training sets, but they can look at ways of doing selection. We've seen some companies actually do selection on LM Arena, which is, if you do it with two models, then that's fine. If you do it with 30 models, then you're definitely overfitting. So there's a very thin line in between. People are not cheating, but people

Starting point is 00:37:20 are optimizing a little bit for benchmarks, either explicitly or implicitly. So that's why the best way to think about it is that you train your model, you come up with your own internal proxies, you train for your right proxies, and at the end of it, you look at the LM Arena score.

Starting point is 00:37:35 And then if you know that you're getting high there and that you haven't been optimizing for it, then that means you've actually made progress. If you have been optimizing for it, then maybe you don't know whether you've made progress or not or whether you've been benchmaxing a little bit, even unconsciously. Benchmaxing. That's the internal term. I like it. Arthur, I want to ask about Europe.

Starting point is 00:37:58 There is discussion about changes to the AI Act, the 28th regime, possibly making it easier for startups to be formed around the EU. Lots of debates about the place of government in technology today. What's the Mistral perspective on the European regulatory regime and how to ensure that your company and others that follow it can grow into world size? Well, yeah, I mean, Europe is our anchor market. We do a little more than half of our activity in Europe. The rest is in the U.S. and in Asia.

Starting point is 00:38:28 We've just announced actually a partnership with SAP, to really address the enterprise needs and the public sector needs in Europe, yesterday with the French president and the German Chancellor. Europe is going in the right direction. I think they made some mistakes on the AI Act back when we were very early, so 2023. They started to regulate the technology. There was actually quite a lot of lobbying from US companies to make it happen, because then the strategy is you put the right terms in the EU law,

Starting point is 00:39:03 and then you import it to the Californian law, and then suddenly you're starting to do regulatory capture. So the consequence of that is that there's some amount of regulation on the technology itself in the AI Act, which doesn't make sense. We've worked with the enforcer to make it workable, and we can actually ensure that our customers are compliant with the AI Act, etc. We're big enough to deal with it.

Starting point is 00:39:26 The problem that we have is that this is not sending the right signal to the investors and to the founders, when it comes to creating a company in Europe. And that's exactly what we need. We need people to create companies in Europe. We need investors and global capital to be invested in Europe. So I think people have started to understand that. Now, the problem is that Europe tends to get a little bit in its own way

Starting point is 00:39:50 because it's a lot of bureaucracy. But you can see with the recent announcements on the 28th regime, which is going to simplify things, there's a lot of good decisions that are being taken. It's going too slow. The AI Act was poorly designed. We can deal with it. I'd say we have tailwinds in Europe.

Starting point is 00:40:14 Because I think there's a growing realization that strategic autonomy on digital services is important. And that has really grown our business quite significantly. Now, the regulatory framework could be simpler. We need to make it simpler. We'll continue to say it and I think we'll get there. These language models coming down to the device level. There's been some discussion of some of these open source models,

Starting point is 00:40:39 the smaller ones, last generation, being able to be run on your MacBook Pro, your M4 Mac Mini, maybe even your phone. Do you have a take on where we will be in a couple of years in terms of on-device, language models, AI running locally? Obviously, you brought up this incredible example of ASML wanting to not have all their important, you know, information and innovations on somebody else's LLM, you know, that's going to eventually happen.

Starting point is 00:41:16 I think consumers will realize maybe I shouldn't have all my photos educating Sora and then having, you know, my kids' pictures informing somebody making photos on some public social network, whether it's Meta's or Sora. So how do you think about that? And specifically, Apple seems like a really interesting player that you could potentially partner with or others to kind of bring an LLM to the device level. I think we're the only frontier company

Starting point is 00:41:45 which is actually working on the edge today. And you mentioned language models, but really the one thing which is interesting on small models that can be deployed on device are not really the language models, but more the audio models and the vision models, because everything that sits on the edge typically entails that you don't have a keyboard available,

Starting point is 00:42:03 so you want to use different ways of interacting. So that can be audio or that can be video. Now, so those are the two modalities where we have invested in that domain. We've released actually the strongest transcription model on audio, called Voxtral, in July. It's open source, you can test it, you can deploy it on device. You mentioned MacBooks,

Starting point is 00:42:27 and smartphones as a natural place to deploy them. You can do that, but it's not that interesting. Because in general, when you have a MacBook, you're connected to the web. What is much more interesting is when you start deploying it on drones, when you start deploying it on embodied AI, when you start deploying it on devices that may go out of network and that can actually take actions. It's important for safety, so for firefighting drones, for instance.

Starting point is 00:42:54 It's important for defense. So we work with Helsing, which is a defense AI company, on drones that are able to detect mines. So quadruped drones detecting mines. And we actually bring the vision model to them that enables them to control the drone in a fully automatic way. So to me, the opportunity of edge AI is multiple, but the biggest one is with robotics. The other one is around portability. So when we work with ASML, they have their own customers,

Starting point is 00:43:27 well, they need to take their software to their own customers, and the data is actually not flowing back to them. So they need a portable solution to analyze the data of their customers themselves. So the portability is super important when you're doing B2B2B. So when you're serving business-to-business companies, they have the same problem as you do. They need to handle the IT settings of their customers. And so the portability means that you can actually

Starting point is 00:43:54 make them portable and they can address more needs. So portability, which doesn't necessarily mean going small, because you can actually have servers that get deployed on-prem, is important. I think device deployment is useful in certain domains. Audio for smartphones, definitely useful. For privacy, it's very useful and it's getting better. So the models that can run on device are getting better. So we'll get to a point where some of the workloads do not need to sit on the cloud.

Starting point is 00:44:23 But really the opportunity is in robotics, because you can just do so many more things with edge models there. Knowing what you know about those sort of edge models in robotics, what's your take on when a humanoid robot will be able to meaningfully help somebody as a housekeeper, just to pick one, you know, really sought-after use case? And obviously, if you can do the dishes, make dinner and, you know, fold laundry, feed the dogs, take them for a walk, you're going to be able to do a lot of other things in the real world. These are kind of fine motor tasks, right, like folding laundry, etc. When do you think those will, because obviously they're only going to cost 10 or 20K? So putting costs aside, when do you think they'll actually be ready to make an impact and be as good as the average housekeeper? Well, first disclaimer, the thing we do at Mistral is to build the software for robotics. So like the image-to-action models, but I'm not an expert in everything haptic-related and fine-motor-control-related. This is actually a complex problem on the hardware side that requires building a hand that is working very well. These are problems that are actually not solved yet. My take is that there is a lot of value creation to be done in robotics with

Starting point is 00:45:40 businesses before doing it with consumers. Deploying a housekeeper with consumers is going to be extremely painful on the regulatory side for, I think, obvious reasons of safety. And it's an extremely complex problem on the hardware side. On the other hand, the models we build are actually already able to operate drones that are operating in a scenario where you don't want to send a human. So it's actually safer to send a drone than to send a human. So suddenly you don't have the safety problems that you run into. It's quite the opposite, actually.

Starting point is 00:46:14 Sending a drone into a fire situation is better than a human. Sending a robot into a home is dangerous, right? So it's interesting. Exactly. Suddenly regulation is a tailwind and not a headwind. And the other thing is that you don't need to have fine motor control to do many things in the factory, to actually bring a drone to a fire.

Starting point is 00:46:36 You have different kinds of problems, but not that kind of fine motor control problem. And so we're much more bullish on the B2B opportunity of robotics than on the B2C opportunity. Down the line, it will work. And the kind of models we build are actually good for that, I think. And they will be useful. But it will take a fair amount of time.

Starting point is 00:47:00 So just look at self-driving cars. They're still very far away from general availability. I think it's going to stay the case, and housekeeping robotics will be even more the case. What year, you know, can an AI model drive any city, drive anywhere in Europe, you know, from Madrid to Moscow? A car can drive from Madrid to Moscow and just nail it safely, a hundred out of a hundred times, a thousand out of a thousand times.

Starting point is 00:47:37 Pick a year and then Alex and I will gamble on it. You pick the year. We'll do the over-under. We need action here. Theoretically, I think it would actually already work. 10,000 out of 10,000 times perfectly? No, no, probably not. I think it's the same when we deploy, like, AI agents in enterprises.

Starting point is 00:47:53 The problem, and what takes time and what prevents from going into production, are the edge cases. Yeah. Need to iron out the edge cases. So I'm not the expert. I think it's getting there. There are some good companies out there, like Wayve, which is actually doing this. Yeah. And I think we'll get there on robotics as

Starting point is 00:48:12 well. If you ask me, like, as the one who knows about whether the model can do it and whether they can process images and take the right decisions, the answer is yes. If you have the right data, then yeah, you can actually drive the accuracy right. There's a lot of other problems, I think. You got to pick a year, though, man. You can't just give that really polite answer or not answer the question. Come on. Give me a year. 2028? 2029? Test driver would say, like, yeah, four years from now. Four years from now, so 2029, a car can go from Madrid to Moscow 10,000 times out of 10,000 times. Yeah.

Starting point is 00:48:45 Wow. What would you take, Alex? Oh, the under over. Under. Under, yeah. I think we're both going to take the under on that one. Oh, dang. I think we both have a lot of faith in what you're working on.

Starting point is 00:48:56 I've also driven in Russia and the roads at times leave something to be desired. So it may be that the last stint to Moscow might take some more time, now that I think about it. Oh, and then some of those streets in Spain are pretty narrow. Pretty narrow. Listen, continued success and appreciate you coming on the program

Starting point is 00:49:15 and educating us on all the great stuff going on over there. Thank you very much. Appreciate your time. Thank you. One of the major large language models coming on the pod and giving us all the details. I always love talking to founders that I've never met before. And I also love to figure out what a company is doing

Starting point is 00:49:30 past what you and I can kind of see from the surface. Like we knew they were making models and open stuff and so forth. But I didn't know how deep their forward-deployed engineer framework was, how deep they were going to enterprises, the on-prem stuff. So this is why we talk to founders, folks. This is why we do it. Always good to learn.

Starting point is 00:49:46 Next up on the docket, Suno raises $250 million. This values the AI music generation startup at $2.45 billion, Jason. And I was kind of surprised by this for a couple of reasons. One, we have seen its competitor Udio solve one of its lawsuits with major rights holders. We have not seen Suno yet reach any settlements. or any kind of agreements. Is Suno still in lawsuits? Yeah. I believe they're still being sued by all the big three. So of the two companies and six available lawsuits, one has been kind of sorted out. Now, my read is, and tell me if I'm wrong here, Jason,

Starting point is 00:50:20 but if the investors, in this case, Menlo and a couple of other venture firms, are going to bet $250 million on this company, they must have a line of sight to resolution of those lawsuits. It really depends on if Suno did the wrong thing or not, right? So the penalty, if they actually pirated music, is $250,000 per infraction. It is giant. As I've said on previous episodes, you don't F with the music industry as a startup. You get their explicit permission before you do anything.

Starting point is 00:50:52 Yes. And you use royalty-free, generic stuff with a signed contract if you're going to do anything so that you don't get into problems. The lawsuit claims Suno pirated music by downloading tracks from YouTube using stream-ripping methods, violating the DMCA. They're contesting it, arguing AI-generated music outputs are new independent creations and that use of copyrighted music for AI training constitutes fair use. It does not.

Starting point is 00:51:19 They're going to lose 100%. I'm happy to place a bet with anybody who wants to place a bet. They will settle a major settlement with the music industry. It will wind up costing them a percentage of their company plus hundreds of millions of dollars, is my guess. So I think they're going to have to give 10%. of their company to the music industry, minimum, maybe 20% to settle this, or they're going to have to give them a quarter billion dollars, something gigantic. So this whole amount, if it's true,

Starting point is 00:51:46 that they trained it. And this is so dumb, because when we talk to Mistral on today's episode, they talk about hiring AI training experts. Here's what you could do. You could hire musicians, if you're Suno, with a $2.5 billion valuation. Musicians are not expensive. Go to tonebase.com, and you'll see the greatest classical musicians performing, and you can hire them to be your tutor for a very modest amount of money. Tonebase, one of our investments, and they do educational training. Really great, awesome, if you're trying to teach your kids or yourself how to do classical. But they have, like, the world's greatest flute player, bass player, guitar player, you get the idea, and they're available for work. Let's just leave it at that. Musicians are

Starting point is 00:52:31 available for work. So you could have a guitar player come in. you could have an entire band. You just go here to 6th Street, hire a band and say, we want you to play these classic songs that are in the public domain. We want you to play

Starting point is 00:52:44 all of the different chords. We want you to play the chords in this tempo. And they would just sit there all day for 50 bucks an hour and do it till the cows come home. And you would have a proprietary data set. You could videotape them doing it.

Starting point is 00:52:57 You would have musicians lined up. You could have five different guitarists do every, not blues songs, but the classic blues riffs that are not owned by anybody. You can literally have five do it, then have the AI pick which one of the five to use or to make an amalgamation of the five, drop the worst, do the best, whatever.

Starting point is 00:53:17 You can have every possibility. Why would you take the Rolling Stones or Bob Dylan from their labels and train on it? Why? Because people were lazy at the beginning of this. And technologists, they all want to break rules. Please listen to me, folks. Don't break the rules when it comes to copyright.

Starting point is 00:53:37 You will get pinched, especially when it comes to the music industry, the book industry, and now the news magazine industry. You used to be able to steal from the news industry a little bit because they were dopes and the magazine industry were dopes and they would just let it happen and they let Google run amok with their one box

Starting point is 00:53:56 because they were getting traffic. They were kind of dopey. New York Times was dopey in the past. They let people steal from them. They let people index them. They fell for it. Not anymore. They're going to the mat with OpenAI.

Starting point is 00:54:06 And they're going to get it. I predict they'll get at least a billion dollars. I'm making two predictions right now. They're going to settle these suits, this collection of suits, for a minimum of 250 million dollars, which means they could have hired every musician in Austin and Nashville for 10 years. Literally could have hired every single one of them for 10 years. Put that aside. Guaranteed billion-dollar settlement with the New York Times and OpenAI, minimum one billion.

Starting point is 00:54:31 All right. Next up in the docket, Kraken, the crypto exchange. Oh, yeah. Two-part announcement here. On the 18th, they announced they raised $800 million, including $200 million from Citadel, Jason, that valued them at $20 billion. Then they one-upped themselves by today

Starting point is 00:54:45 announcing they filed privately to go public. Now, we've seen a couple of crypto companies list this year, Circle being the best known one, the stable coin company and giant that we love on the show. But my question for you, Jason, is very simple. Why would you raise a mezzanine round and then the next day, file privately to go public.

Starting point is 00:55:04 I always thought that you raised a chunk of money with the banks and other people you want to have on your cap table when you list maybe like 12 months before. So I was just a little surprised by the timing. You're just de-risking. It's a straight-up de-risking move. It would be like, why would I,

Starting point is 00:55:20 I don't know, take an incremental speaking gig if I didn't need to? It's like, yeah, you got the money. You make hay when the sun shines. So what happens in a situation like this is there's somebody who has a lot of faith in the company, could be a private equity firm, could be a mezzanine financing company, and they know it's going

Starting point is 00:55:36 to do good post-IPO. They want to secure a big position now. So let's say that company is worth. Did they mention in any way the valuation of what that company was worth or a previous valuation? So 20 billion now. 20 billion. Okay. So 2 billion would be 10%, 1 billion would be 5%. So they're buying about 4% here. 4% ownership now. Let's say they're even paying the prospective IPO price. What they're doing is they're saying, I have so much faith in this company, I want to buy my position now. We think it could pop in the IPO. It's just easier for us to lock it in now. So whoever that person is locks it in now.

Starting point is 00:56:14 And on the other side, there might be early investors in Kraken who don't want to wait for the six-month lockup. So they're saying, hey, I know this thing could go down and I'm already way in the money. I need to get my LPs their money back. I'm an individual investor. I could be a founder. I could be the founder's parents. We had the founder of Kraken on. He's a unique cat.

Starting point is 00:56:36 I like him. Jesse Powell, he was on episode 1487 back in June of 2022. I like Jesse because he's like a non-consensus thinker, kind of a fun guy. He's kind of Alex Karp territory, like, you know, non-consensus. Interesting cat. I like him because Kraken sponsors the Williams F1 team.

Starting point is 00:56:54 And anyone who's supporting Williams F1, good egg in my book. I'm going to F1 this week. I'm leaving right after the show to go to Vegas. Shout out to the Venetian, which gave me a ridiculous suite. It's good to be podcast famous, microcelebrity. But the F1 team reached out from McLaren, and they offered me to do a dry run. I don't know if this is in the Porsche dry run

Starting point is 00:57:15 or in the actual F1, if they can fit two people in it. No, you can't. They're all single seats. But I think they might have a modified one where there's two seats where VIPs get to do a dry lap. So anyway, I've been offered a dry lap. It's either in a Porsche or something because they have like a whole set or it's in like one of these converted F1 cars. And I know this because they just asked me to sign a waiver.

Starting point is 00:57:37 I don't know if I'm going to do it or not. But should I do a dry run? If you don't do it, I'm going to scream. I will come down there and give you a noogie. I get motion sickness. Maybe I'll vomit in it. You moved to Texas. You bought a weighted vest.

Starting point is 00:57:50 You shoot guns. Get in the car. Go around a couple of times. FaceTime me. FaceTime me while you're doing it. And then put it in the front facing camera so I can go with you. I think they do have you crying like a baby on camera. So I guess I'll do it.

Starting point is 00:58:06 You know, I have kids and stuff like that. So I got to just make sure, has anybody ever died in a dry lap? That's what I need to know. Has anybody died or been maimed in a dry run? Please get me the statistics. I want a risk analysis. Wow, I'm not a very jealous person, but I'm actually for the first time in a while burning. Life of J-Cal is a little bit crazy these days.

Starting point is 00:58:28 Let's keep going. Let's do a really tiny note on some regulatory stuff. First of all, Meta beat the FTC this week. We're not going to go through the whole thing here. But the gist is that Meta is not going to have to divest WhatsApp or its other acquisitions, Jason. Was this done under Lina Khan or was it before? It might have been before Lina Khan. It started during Trump 1. It continued through Biden. And then the FTC under Trump 2 pursued the case and kept at it. So this was a multi, multi-platform bipartisan, everyone taking part in this. But they wanted to go backwards and unwind 10-plus-year-old

Starting point is 00:59:01 acquisitions, saying that they should not have approved them? That was the... I remember that was the case, yeah? That was the case. And they lost. And the judge said, effectively, you didn't prove your case.

Starting point is 00:59:12 You didn't prove that they have a monopoly in personal social networking. That was the defined market in question. The FTC said we are deeply disappointed in this decision. The deck was always stacked against us with Judge Boasberg, who is currently facing articles of impeachment.

Starting point is 00:59:25 We're reviewing all our options. Now, why do we care about this for startups? Oh, I can tell you why this is yum-yum time. Oh, this is a yum-yum time. Speaking of racing, this is the five lights. This is the start of an amazing M&A race. It's all on the table, folks. Everybody, it's time to activate your M&A teams.

Starting point is 00:59:44 Google, Apple, Meta, Microsoft, xAI, ChatGPT. It's on like Donkey Kong. Lina Khan is busy dismantling the greatest city in the world. She has no job. She's completely ineffective. She's going to make groceries. She's going to be delivering groceries for free in socialist New York. Guess what?

Starting point is 01:00:07 It's capitalism all day, every day for the next three years. Buy everything in sight. Go buy DoorDash, make an offer for Uber, Robinhood, hostile takeovers, buy 10% of the stock on the public market. Let's have at it, folks. Merge these companies. Who cares? Antitrust, all of these shenanigans were wrong. Meta doesn't have a monopoly. They have 60%. That's not a monopoly. You got TikTok. You got X. Is Meta the lead player? Yes. But the bogus FTC claims are gone.

Starting point is 01:00:44 And it's on like Donkey Kong. The thing I'm curious about is, if it is on to that extent, Jason, does this save the unicorns that have been sitting and dying on the private markets? Sure. Sure. Those are all, now those are all worth twice as much. Any of those that were like, you know, 400 million in revenue, growing 20% a year, everybody gave up and their peak valuation was $8 billion and then they went down to $3 billion. Now they're back up to $7 billion. And there's going to be a massive amount of M&A. Because these were the high profile cases. The Google case wound up being like a whimper. The Meta case has been tossed. I mean, are there any other major antitrust cases now? Apple lost their case with Epic, right? And they have to like open

Starting point is 01:01:30 the app store. The Google advertising case. So there was the Google search monopoly case. Yeah. And then the Google search advertisement monopoly case. So that one's still going on. It's all going to be settled, folks. It's, right? There's nothing left. What about companies like Crunchbase that raised? There's no secondary market for those shares. I would think somebody wouldn't want them. But tens of millions of ARR, but it's not, it's like fine. And I'm curious if the companies that are like, because I see VCs talk about, Oh, that should just get bought by Carta. That should just be bought by AngelList or AngelList plus Crunchbase merge, et cetera. The hilarious thing is Crunchbase is worth much more than TechCrunch.

Starting point is 01:02:06 Oh, that's been true for a long time. Yeah. Just think about that. Like, what a crazy turn of events. But, you know, a database of, you know, any of these databases of startups and corporate information, they're worth tens of millions to low hundreds of millions because they're not the hardest things in the world to replicate. And the data from the historical data is not actually super valuable.

Starting point is 01:02:31 And data is not copyrightable. So the data from any chart, like somebody produces data, and you see this all the time. Like there was this eMarketer website, was that the name of it, that just took everybody else's data and made their own beautiful charts. And then they sold a subscription to it. I know we would use their charts. And I was like, wait a second. How is this possible?

Starting point is 01:02:50 And then I looked it up. And there were all kinds of lawsuits around this. but you can produce whatever, any data that's on the internet, you can just take it and make a chart, put your own logo on it, and then just put source of the data and then put your own interpretation of the data.

Starting point is 01:03:05 You can't copyright data. You can have a terms of service that says you can't go to Crunchbase and scrape it. You could have that terms of service. Yeah. There are other ways to protect it, but a person in Israel or India or Manila could buy a Crunchbase account,

Starting point is 01:03:22 I believe. And in those markets where, you know, it's even harder to sue somebody for this kind of stuff, they could just hire people for a dollar an hour to make their own databases and essentially copy the information. And there's very little recourse. And LinkedIn lost their case when they did the LinkedIn case. Yeah. There was somebody scraping it.

Starting point is 01:03:45 I think Mark and Facebook lost their case. There was somebody in Israel, I believe, doing data scraping and made a social database of like profiles from Instagram and sold it. But the middle market is kind of what I was more trying to get at than just the Crunchbases. I probably shouldn't have named the company. But like just it's an interesting diversion. But the mid market is going to be back big time.

Starting point is 01:04:03 Great news for fricking venture liquidity for those 2021 funds, Jason, who have been. It should be absolutely great. And I see in the notes here that Tomasz Tunguz, GP at Theory Ventures, he says the same thing. It's going to eliminate the gymnastics that major acquirers are going through and should really open up the door. He's 100% right. I'm with him. The thing that I, not to be negative, but the FTC pursued this and kept going with the case, Jason. So do you think that the FTC has, has changed temperature? Or do you think that this loss in court is just going to kind of scare them away from being active in the next couple years? Because I hope you're right. What it comes down to

Starting point is 01:04:42 is these lawsuits were on the margins political and they were 100% of the time in the rearview mirror. So you put those two things together. They weren't the strongest cases. The strongest cases would be against Apple for the app store, which they have like, you know, a certain percentage of the revenue versus Google's and a certain percent of the market share and that duopoly. And I guess the next best one would be the Google search one. And those are going to go out with a whimper. So business wins. We'll see you next time on this week in startups. Bye bye.
