Big Technology Podcast - Who Wins if AI Models Commoditize? — With Mistral CEO Arthur Mensch

Episode Date: January 14, 2026

Arthur Mensch is the CEO and co-founder of Mistral. Arthur Mensch joins the Big Technology Podcast to discuss what the AI business looks like if all leading models perform the same. Tune in to hear how the commoditization of foundational models is changing the balance of power in the industry, what business models will be profitable, and why the focus is shifting from building better models to building applications. We also cover the open source movement versus closed source models, the geopolitics of AI, and practical industrial applications of the technology.

Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount for Big Technology on Substack + Discord? Here's 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b

Questions? Feedback? Write to: bigtechnologypodcast@gmail.com

EXCLUSIVE NordVPN Deal ➼ https://nordvpn.com/bigtech Try it risk-free now with a 30-day money-back guarantee!

Learn more about your ad choices. Visit megaphone.fm/adchoices

Transcript
Starting point is 00:00:00 What does the AI business look like if all the leading models perform the same, which they kind of do? We'll find out with the CEO of Mistral right after this. Can AI's most valuable use be in the industrial setting? I've been thinking about this question more and more after visiting IFS's Industrial X Unleashed event in New York City and getting a chance to speak with IFS CEO Mark Moffat. To give a clear example, Moffat told me that IFS is sending Boston Dynamics Spot robots out for inspection, bringing that data back to the IFS nerve center, which then, with the assistance of large language models,
Starting point is 00:00:35 can assign the right technician to examine areas that need attending. It's a fascinating frontier of the technology, and I'm thankful to my partners at IFS for opening my eyes to it. To learn more, go to IFS.com. That's IFS.com. Hi, this is Alex Kantrowitz. I'm the host of Big Technology Podcast, a longtime reporter, and an on-air contributor to CNBC.
Starting point is 00:00:57 And if you're like me, you're trying to figure out how artificial intelligence is changing the business world and our lives. So each week on Big Technology, I host key actors from the companies building AI tech and outsiders trying to influence it, asking where this is all going, places like Nvidia, Microsoft, Amazon. So if you want to be smart with your wallet, your career choices, and at dinner parties, listen to Big Technology Podcast in your podcast app of choice. Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation about the
Starting point is 00:01:25 tech world and beyond. We have a great show for you today. We're going to talk all about what's happening to the AI business and technology race as some of the leading foundational models start to look the same and how that changes the balance of power in the industry. We're joined by the perfect guest to do it. Arthur Mensch is here with us. He is the CEO and co-founder of Mistral.
Starting point is 00:01:46 Arthur, welcome. I'm happy to be here and thank you for hosting us. No, it's great to have you. So Mistral is a name that those who are deep in the AI world know very well, but might be new to some of our listeners and viewers. So for folks who are new to Mistral, let me give you a couple of stats. Mistral is an AI model builder that does some other things, which we're going to get to. It's based in France. The company is valued at $14 billion after starting in April 2023, so a little under three years, or two and a half years, to make a $14 billion business. Not bad.
Starting point is 00:02:22 There are 500 people at the company. And Arthur, you are leading it after spending some time in academia and two and a half years at DeepMind. Exactly. We're headquartered in Paris, but we have around a fourth of our workforce, which is actually in the US, and a lot of our activity is actually here. So that's why I'm spending a lot of time here as well,
Starting point is 00:02:43 and that's why we are here in New York. All right, well, great to have you in studio. Let's just go right to what I think is the most pressing issue for AI today. There's been so much talk about how Google, at the end of 2025, started to equal OpenAI's models and how OpenAI's models were somewhat on par with others. And to me, it seems like we're just hitting commoditization of the foundational model much faster than I thought we would. I thought that there was going to be a race where some companies would leap out further ahead and it would take others some time to catch up. But it
Starting point is 00:03:21 looks like right now you have lots of model builders with their frontier models exhibiting performance that's so similar it's difficult to tell which is the best. So what do you make of that? I would say that inherently, this is a technology that is going to get commoditized. The reason for that is that it's actually not hard to build. You have around 10 labs in the world that know how to build that technology, that get access to similar data, that follow the same recipes and algorithms, which are very short, actually. The knowledge you need to actually train a model is fairly short.
Starting point is 00:03:58 So because it's short, it actually circulates. So there's no IP differentiation gap that you can create. So it's very hard to actually leapfrog and to be way ahead of the competition because there's some diffusion of knowledge that is just making everybody do the same things. And so the question there is therefore, where is the value accruing? And what kind of business model should you pursue to actually make sure that in the end, you're turning profitable? And then the challenge that we see with some of our competitors is that they're investing billions or hundreds of billions into creating assets that are depreciating very fast because those are commodities.
Starting point is 00:04:35 And so for us, it has always been a question, one of the biggest questions of the industry: you need to invest enough to actually bring value to enterprises, but you also need to invest reasonably so that you can build unit economics that make sense in a world where the creation of models, which is capital intensive, is actually just bringing you assets that are in a commodity competition. So let's talk a little bit then about this race to build the best possible model. I mean, like you mentioned, it's very expensive.
Starting point is 00:05:10 OpenAI is going to put $1.4 trillion into building infrastructure for its models, or at least it says so. If the models are effectively at par, companies are going to say, hey, wait a second, maybe it doesn't make sense for us to invest all this money into building the next evolution of a better model because people can catch up. I mean, strategically, I think there's definitely a cursor to be set. How much do you invest in creating assets that are valuable enough for one technology company to bring value to an enterprise or to bring value to a consumer?
Starting point is 00:05:52 And at the end of the day, all of these investments will need to be funded by the free cash flow and value creation that is being made downstream. And so the focus that we have as a company, but that I think is a reasonable focus is to be more on the downstream applications and to figure out what is the friction that enterprises are running into and try to lift these frictions. Because at the end of the day, I think one of the major challenge that the industry is facing today is that AI brought a lot of promises like, four years ago. But if you ask an enterprise, did you actually make money out of it? They will in general say no. And the reason for that is that they are not
Starting point is 00:06:31 customizing things enough. And they are not thinking backward from the problem they want to solve. So they think about the solution, but they don't think about the problem. And so trying and help them to actually go for the right use cases and actually do the right amount of customization
Starting point is 00:06:47 so that where a team of 20 people was actually operating some supply chain workflow, suddenly you can actually operate that with two people. And there's a lot of examples like this. But the challenge that the industry will face is that we need to get enterprises to value fast enough to justify all of the investments that are collectively being made. That is very interesting because for a long time you would hear these companies focused on model, model, model, right? The next model, what GPT-5 would be, was, let's say, when you think about OpenAI,
Starting point is 00:07:18 the biggest news. Now they're starting to talk more about how you take the intelligence that you have and build the applications that work. Just one bit of reporting that I can share from a couple weeks ago. You know, I had this story, basically from inside a lunch with Sam Altman and a bunch of news leaders in New York City. And Altman told them, you know, one of the company's biggest priorities was building applications for enterprise. Basically, it's going to be a major priority in 2026.
Starting point is 00:07:50 And it's a little bit of a shift in rhetoric, from we want to build AGI to we want to build applications for business. So talk about why that is happening. Is that an offshoot of this commoditization issue? Well, I think the issue is, well, first of all, AGI is a very simple concept. So probably too simple for enterprises. There's no such thing as like one system that is going to be solving all of the problems
Starting point is 00:08:14 of the world. And so at the end of the day... Not yet or you just don't believe in that concept at all. It's never going to exist. I mean, there's, you have a wealth of problems, just like you don't have any human that is able to solve every task on the world. You, of course, need to have some amount of specialization to actually solve problems. And so we are back from magical thinking to system thinking.
Starting point is 00:08:33 We need to figure out what is the data that is going to be used to make the model better at a specific task. What is the flywheel that we need to set up so that we accrue more signal from humans interacting with the system, so that eventually the application becomes better and better. And so in real life, enterprises are just complex systems. And you can't solve that with like a single abstraction, which is AGI. And so AGI to a large extent is what we were not able to achieve,
Starting point is 00:09:00 and which is basically the North Star of, I'm just going to make the system better all the time. But because, as you said, it's hard to explain now to investors that the technology you're building is never going to be matched by your competitors, then there's, of course, a shift in the narrative. So these companies are not building like a North Star single system that is going to be solving all problems, but they'll need to go into the weeds of enterprises
Starting point is 00:09:25 and solve their actual problems. And I think at Mistral, we've been ahead of time in thinking about this. That's kind of been our story. Our story has been to assume that eventually AI will be more decentralized, that more customization would be needed, because we were running into the limits of the amount of data we could accrue and the limits of scaling laws.
Starting point is 00:09:44 And because of that, we created the company on that premise, on the fact that we'll bring more customization ability to enterprises. Yeah, and we'll get to the Mistral story in a little bit, but one more question about this. It seemed to me, and I wonder if you think this has been a shift, you know, you were ahead of this for sure, but it seems to me like there's been a shift in the AI industry where the idea was effectively make the models smarter and they'll be able to figure out these problems on their own. Like for instance, I'll just make it concrete, make the model smarter and it will be able to do a lower level associate's job or maybe do data
Starting point is 00:10:24 entry for multiple systems and be able to file reports. And now it seems like there's been a shift from do that to actually build out the infrastructure, that the models are just one component, that the infrastructure is super important. And things like orchestration and, you know, working through the applications that are built on top of the models is going to be where the value is found. It's interesting. Yes, I think if you look at it from a system perspective, you have two components and we'll always have these two components. The first components are like static definitions of how what the workflow should be and how a system should behave. And those static definitions are set by humans that are defining how the system should behave.
Starting point is 00:11:07 And so this corresponds to the manual information that you're using to define the system. And then there's a dynamic component where you're connecting a model to tools and you're giving instructions to the model, and the model can go and call the tools itself. And so it can decide on the graph of execution that it's going to follow. And so that part is dynamic, and there's a static part where you're setting up guardrails or you're deciding you have a tree of decisions sometimes. And I think it's a bit utopian and unrealistic to think that you can solve everything with a dynamic system without guidance from humans.
Starting point is 00:11:43 And what has happened in the industry the last three years is that effectively the dynamic part has grown because models can think for longer because they can call multiple tools because they can code. But the static part remains extremely important. And even if the dynamic part grows, then the static part allows you to create systems
Starting point is 00:12:00 that are even better and more interesting and you can solve problems that you were not able to solve before. So the combination of these static systems, which you can call orchestration if you want, And the dynamic systems that you can call agents is going to stay super important because the two things are moving up together so that we can tackle problems that are more and more complex.
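To make that static/dynamic split concrete, here is a minimal illustrative sketch in Python. It is an assumption-laden toy, not Mistral's orchestration framework: the call_model stub stands in for any LLM API, and the tool names are made up. The static part is the human-written workflow, allowed-tool list, and step limit; the dynamic part is the loop in which the model decides which tool to call next.

# Minimal sketch of the static/dynamic split described above.
# Illustrative only: call_model is a stand-in for a real LLM call,
# and the tools are hypothetical.

def call_model(prompt: str) -> str:
    # Stand-in for an LLM deciding the next action; a real system would
    # send the prompt to a model and parse its reply.
    if "inventory" in prompt and "units" not in prompt:
        return "lookup_inventory"
    return "finish"

# Static part: the tools, guardrails, and workflow are fixed by humans.
TOOLS = {
    "lookup_inventory": lambda: "42 units in stock",
    "create_ticket": lambda: "ticket created",
}
MAX_STEPS = 5  # guardrail: bound how long the agent may run

def run_agent(task: str) -> list[str]:
    trace: list[str] = []
    for _ in range(MAX_STEPS):  # static loop bound
        decision = call_model(f"Task: {task}. Trace: {trace}. Pick a tool or 'finish'.")
        if decision == "finish" or decision not in TOOLS:
            break  # static guardrail on allowed actions
        trace.append(TOOLS[decision]())  # dynamic part: the model chose this tool call
    return trace

print(run_agent("check inventory for part 7731"))

In a real deployment the static layer would also carry the decision trees and validation rules Mensch mentions, while the dynamic layer would call external systems rather than in-memory lambdas.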
Starting point is 00:12:20 Okay. And so now with that established, I'm thinking through what the business is. Let's say the model has commoditized. So what are the businesses going to be in AI? It will be, I imagine, some form of consumer products like chatbots where you could put OpenAI in that bucket. There will be a business where you could make your existing products better, like for instance, maybe chatting with Microsoft Excel. That could be
Starting point is 00:12:47 one way that current companies can make their products better. But then there is this other big bucket, which we've talked about a little bit already, which is the enterprise side of things. So how would you rank the business opportunity in those three buckets? Well, yes, I think on the consumer side, because AI is starting to be, well, is becoming the way you access information, you basically have an ads business to be built, and that's pretty clearly going to be built. It's not the focus of our company. And then if you look at the enterprise side,
Starting point is 00:13:17 we're basically replatforming all enterprise software. So enterprise is about having the right... In enterprises, you have people, you have data, and then you have processes. Historically, there was a fragmentation of the tools to run multiple processes, multiple data systems, multiple system of records.
Starting point is 00:13:35 And there was a fragmentation in teams that were not able to access all information at the same time. And essentially what AI allows you to do in an enterprise is to start with a unified data, or even you can start with fragmented data sources because the AI is able to navigate them. Then you put an AI on top that is building the right amount of intelligence, understanding what's going on in the enterprise.
Starting point is 00:13:58 And then the AI system is able to somewhat generate the interfaces that are useful for every human to actually work. And so that part, the replatforming of the entire enterprise software stack, is the one thing where a lot of value can be created in the enterprise. Owning the context engine, the system that is constantly running, looking at what's happening and creating documentation for what's happening. Owning the front ends as well, which are more and more getting generated on demand. So let's say I'm a lawyer.
Starting point is 00:14:33 I want to fix one of my problems and have a very specific review to make. I just bring my documents and then the system actually evolves into showing me the right widgets and the right information I need. So generative interfaces on top of a context engine that is constantly updating its representation of what's happening in the enterprise, on top of systems of record that are essentially going to be just pure databases. You don't need everything that was sitting on top before. This is where this is going. And that replatforming is going to be, I think it's going to take a decade because it takes a while to get enterprises to adopt these things.
Starting point is 00:15:12 But there's just immense value to be created because suddenly you can reorganize your company around the fact that for many of the processes where you had a lot of people, you can actually run those very much faster. That's one side, efficiency. So I'd say that's one of the business models in the enterprise. The second one in the enterprise is about working with enterprises to help them take their really proprietary data, the assets being produced
Starting point is 00:15:40 by their machines, if it's in the manufacturing industry, for instance, and turning that into intelligence that nobody else can reproduce. And so making models specifically good at a certain kind of physics when we're working with a company doing planes, for instance, or when we're working with ASML, making models that are specifically good at operating their machines, that's huge value because suddenly you're not building efficiency within the company, but you're effectively unlocking technological progress that was locked by the absence of AI. So that unlock that the new systems are providing, that's immense amount of growth. It's actually harder to measure because the first one is shorter term.
Starting point is 00:16:25 You can look at what a company will look like in five years because you've reduced certain parts of the company. you've reoriented other people to be creating growth. You can create models of that. On the technological side, I think it's a little harder because we know there are things like nuclear fusion or sharper engraving of semiconductors, for instance. These are things where we're starting to run into physical constraints,
Starting point is 00:16:51 and artificial intelligence can actually help to lift those physical constraints. And so the acceleration of technological progress is, I think, where most of the value creation will be. It will take a little bit of time, and it will be less measurable, less predictable than the efficiency gains that AI is going to produce. But the two things are equally important. Okay, so let me see if I can sort of game this out here a little bit. So if that is going to be the key driver of value in the AI world, there's two ways to do it. One is to build a model that's better than everybody else and sell it for a premium.
Starting point is 00:17:23 But we've already talked about the fact that that doesn't seem like it's going to be a moat forever. And the other way is, you know, the model is actually not the value. It's the know-how and the implementation side of things. So you can make the model open-source, but then provide a service to businesses to be able to figure out how to take that model and put it into action and actually get results. Are those the two choices? Yeah, that's kind of the fork that we see in the industry.
Starting point is 00:17:49 And our view there has been to bet on the second one, to really... The open source and implementation side. Which brings customization, but it also brings decentralization, in that if you assume that the entire economy is going to run on AI systems, well, enterprises will just want to make sure that nobody can turn off their systems. The same way, if you have a factory, you connect it to the grid, you want to make sure that nobody is going to turn off the grid because they don't like you.
Starting point is 00:18:19 If AI effectively becomes a commodity, which is what's happening, and if you treat intelligence as electricity, then you just want to make sure that your access to intelligence cannot be throttled. And so that's also one of the things that open source technology can bring. If you're using open source, you don't have to worry about going astray of, I'm just saying, like, Anthropic's,
Starting point is 00:18:40 you know, user terms, and them then pausing your ability to do what you do. If you use open source, you can basically run it on your own terms. Yeah, you run it on your own terms. You create the redundancy you need. You can serve with higher quality of service. You can make sure that whatever
Starting point is 00:18:56 like the geopolitical situation may be, you can still run the systems if you want. And then, so that's really on the IT side. So if I'm a CIO, I really look at open source as a way to create leverage and independence. But on the scientific side, it's also the only way in which you can create systems that are effectively using the folklore knowledge of your employees. That's the knowledge that you've recruited for decades. The only way to turn it into an asset that nobody else gets access to is to create your own models based on those open source models.
Starting point is 00:19:29 And so that's, but it's hard. It's hard to actually build those. Right. And so that's where you need the right tools. You need the right expertise. And that's like the complement business model to building open source models. But even the close source model providers, companies like Anthropic, will say they'll be able to customize their models with your data. You don't believe that?
Starting point is 00:19:47 They will say that. But then they will put some guardrails on top of it. So you're basically trusting that their engineers are going to give you enough access to the depth of the system. And can you trust that for eternity? I'm not sure. So the issue there is as much a question of control as a question of customization. A vendor is going to try to lock you in.
Starting point is 00:20:12 So if you get access and if you build on top of open source models, ours or anyone's, you're basically less locked into the vendor. And this is a technology which is so important that you don't want to be locked into a single vendor. So that's also the opportunity we'll bring. You know what's stunning to me? We're three years past ChatGPT, which basically brought this into a lot of people's consciousness,
Starting point is 00:20:37 although I think Big Technology listeners would have known about it a little bit beforehand, especially since we were interviewing the people that thought this stuff was sentient before ChatGPT came out, but that's a conversation for another time. But what we're basically saying today, I'm going to sum up two of the main points that you've made. One is that today's AI models can't do it all themselves. They need orchestration. And the second big point that you made is that to do that sort of orchestration or implementation with the current intelligence, you need a service, like a managed service.
Starting point is 00:21:08 So it is interesting to me that we've gone from this perspective of, you know, maybe working towards a God model that could do it all to the fact that, you know, this may be the most powerful technology that we've seen come through in our lifetimes. However, when you actually want to use it, it kind of becomes a managed service, in a way. Yeah, that's interesting. This is true. I don't think it's the first time that we've observed it in history.
Starting point is 00:21:31 It's a new technology. It's a new platform. And so the knowledge on how to use it is actually still pretty scarce. So there aren't that many people that can build systems that are performing at scale, that can run at scale reliably, that can actually solve an actual issue. And so when working with enterprises, you always need to have some services on top because of the complexity of implementation. even with like fairly well-understood technology like databases.
Starting point is 00:22:01 But for artificial intelligence, it's even more necessary in that it requires transforming businesses. So you need to also help in thinking about how the team should perform around the system itself. And it does require customizing things. So you need data scientists that know how to leverage data and turn it into intelligence. And today this is still a pretty scarce resource. I would say I do expect the share of software in those deployments to increase. So the way customization occurs today
Starting point is 00:22:34 with fine-tuning, reinforcement learning, this kind of things, this is going to be abstracted away from the enterprise buyer because it's too complex and they actually should just worry about having adaptive systems that are learning from experience and from deployment with people instead of thinking about should I use fine-tuning or should I use reinforcement learning
Starting point is 00:22:53 to actually put that knowledge into my models. And the work that we are doing is to try and abstract away from lower level routines that data scientists understand to higher level systems that business owners can actually use. And so it's going to occur and we're working on it. But the service part is still going to be quite important. And today, the combination of the two things is the fastest way to value if you're an enterprise. So we've been combining the two. You know, I started our conversation by calling you a model builder. And I kind of paused on it and said there were some other things that we're going to get into later.
Starting point is 00:23:32 And here we are. Basically, what I'm hearing from you is at Mistral, obviously, proud model builder. But it seems like without the services, without being able to sit with the business and show them how to use it, it would just be an incomplete puzzle. So do you consider yourself, like, is the most important thing you do building the models, or is the most important thing you do the service? Are you primarily a model builder or primarily a service provider? I mean, we're there to help our customers get to value.
Starting point is 00:24:01 So service. We're here to, but to get to value, they need to have great models. And to get to value, they need to have the right tools to train the models. And so the best way to create those tools is effectively to train the best models. So the two things are extremely linked together. We create models that are very easy to customize. We create models with tools that we then export to our customers so that they can use them.
Starting point is 00:24:26 And we help our customers train their own models. So you can't go and sell to an enterprise that you're going to help them create very custom systems if you can't show the world that you're effectively the leader in open source technology. And so the two parts are equally important. The first is enabling the other. And there's effectively a flywheel there, because we make our choices when it comes to the model design
Starting point is 00:24:49 in a way that is enabling the various customers we have. One example is that we've put a lot of emphasis on having models that are great at physics because we work with manufacturing companies that runs into physical problems. So that's the flywheel that we have set up by having the science team and the business team actually sit together. Okay.
Starting point is 00:25:12 We're here with Arthur Mensch. He is the CEO of Mistral, also co-founder. When we come back after the break, we are going to talk about open source, the open source movement versus closed source. Remember, DeepSeek, and how open source was supposed to surpass closed source? Well, has it? We'll also talk about the geopolitics and regulation and whether that's going to give this company a leg up, and then maybe get into some more practical examples, because we should talk about how the technology is being used on the ground. We'll be back right after this. Here's the problem. Your data is exposed everywhere.
Starting point is 00:25:45 Personal data is scattered across hundreds of websites, often without your consent. This means data brokers buy and sell your information, address, phone number, email, social security number, political views, and that exposure leads to real risks, including identity theft, scams, stalking, harassment, discrimination, and higher insurance rates. Incogni tracks down and removes your personal data from data brokers, directories, people search sites, and commercial databases. Here's how it works. You create your account and share minimal information needed to locate your profiles. You then authorize Incogni to contact data brokers on your behalf, and then Incogni removes your data, both automatically from hundreds of brokers and via
Starting point is 00:26:27 custom removal. There's also a 30-day money-back guarantee. Take back your personal data with Incogni. Go to incogni.com slash Big TechPod and use code Big TechPod at checkout. Our code will get you 60% off annual plans. Go check it out. And we're back here on Big Technology Podcast with Arthur Mensch. He's the CEO of Mistral. Arthur, I want to ask you about, you know, the progression of
Starting point is 00:26:53 open source over the past year. I remember reading about DeepSeek, doing reporting on DeepSeek in January, and the overriding theme was it was such a leap forward for open source that soon the closed models, models like OpenAI's GPT and Anthropic's Claude, and maybe Google Gemini, would be surpassed by open source, because the open source community was working together and building on each other's innovations where the closed source community was kind of going at it on their own. We just had this moment we talked about in the beginning of the show, about how maybe Gemini commoditized GPT models, but that conversation was not being had about, like, open source, you know, living up to that expectation from the beginning
Starting point is 00:27:46 of the year. So am I missing something, or am I reading it wrong, or what do you think? If something has held back open source, what has it been? Well, if you look at the trends in 2024, I'd say there might have been like a six-month gap. If you look at the trend in 2025, I think the gap is more around three months. So I guess it's up to anyone to guess what the gap is going to be next year. But effectively, this gap has been shrinking, has been shrinking quite significantly. The reason for that is that basically you have a saturation effect when you pre-train models around 10 to the power of 26 FLOPs. The reason for that is that there's only so much data you can find to compress when you pre-train models. And so effectively, labs that
Starting point is 00:28:38 maybe started a little behind created enough compute capacity to train models at this kind of scale. And efficiency has also increased. And so what it means is that today, everybody has access to 10-to-the-26-FLOPs facilities over the course of a few months. And that's a measure of compute? That's a measure of compute, of compute throughput times time.
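As a rough back-of-the-envelope illustration of that 10-to-the-26 FLOPs figure (the hardware numbers below are assumptions, not figures from the conversation), a small Python calculation shows why a budget of that size works out to a few months on a large cluster:

# Illustrative arithmetic only; accelerator throughput, utilization, and
# cluster size are assumed, not taken from the conversation.
total_flops = 1e26          # pre-training compute budget discussed above
per_gpu_flops = 1e15        # ~1 PFLOP/s peak for a current-generation accelerator (assumed)
utilization = 0.4           # assumed effective utilization
gpus = 30_000               # assumed cluster size
seconds = total_flops / (per_gpu_flops * utilization * gpus)
print(f"~{seconds / 86_400:.0f} days of training")  # roughly three months

Change the assumed cluster size or utilization and the answer moves, but it stays on the order of months, which is consistent with the point that any well-funded lab can now reach this scale.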
Starting point is 00:29:02 So you need to, yeah, 10 to the 26 FLOPs is something that any lab today can achieve in a couple of months. And because of that, the saturation effect means that open source models have caught up, because closed source models that were started ahead kind of ran into that wall of pre-training. So what that means
Starting point is 00:29:24 is that this is only going to continue shrinking. And if we look at the latest open release we did, which is Devstral 2, which is a coding model, well, it's performing, I think, around the performance of Anthropic around two or three months ago. So, yeah, I think
Starting point is 00:29:41 the gap is shrinking. And again, I think the question is probably not posed in the right way, because the two are also offering very different, distinct value propositions. Because on one side, this is well managed and you will depend on the provider itself. On the other side, well, it takes a little more effort, because you will need to own it more.
Starting point is 00:30:05 You will need to learn about how to customize it. You will need to use the right tools for doing so. You will need to maintain its deployment if you choose to deploy it on your own facilities. But at the end, this is creating the leverage you need against closed source providers. So the two categories are effectively different, but if you look at the pure performance side, they are definitely converging.
Starting point is 00:30:27 You mentioned that there's a saturation effect. So without getting too technical, are the models sort of done with getting better? Let me put it this way. Are AI models going to continue to get better given the fact that they all seem to be hitting saturation? They will get better in more and more specific domains. In that, I think we've really collectively made them very clever and able to reason about long context and able to call multiple tools, etc.
Starting point is 00:30:58 But if you go and want to effectively put them into production in a bank or in a manufacturing company, well, the models need to learn about all of the knowledge that is contained into the companies themselves. And so what it effectively means is that for very, precise directions, let's say I want to make my model extremely good at discovering materials or extremely good at designing planes, I will need to go and sweat it a little bit and get the right reward signal and get the right experts and ask them to make my model specifically good in that very precise
Starting point is 00:31:35 direction. And so we are definitely not done doing that, because what we are all racing for is the right environment and the right signal provider for specific capabilities. The broad horizontal reasoning capabilities are still going to improve, but nobody is going to improve them in a way that is creating a strong gap versus their competitors. So the strong gap is actually in working with vertical experts that know exactly how they design a plane and that actually explain to the model how to do it. And you have like a wealth of directions that you can take because you can do it in physics,
Starting point is 00:32:16 you can do it in chemistry, pharmaceutical, in biology. And so to me, the most exciting part of what's going to happen in the next two years is that explosion of very precise directions in which the model are going to get better. So, and for us, the opportunity is to have the right platform for enabling those kind of verticalization, whether with enterprises or you have like AI startups, actually, that are working on very verticalized capabilities and we're happy to help them as well. So that's my view of where the field is going to go. We have been about horizontal intelligence, growing and things getting clever, more and more
Starting point is 00:32:55 clever. And the next two years are going to be about taking models and making them extremely good at a certain skill set. And that's actually more exciting because we're getting to a point where if you pick a domain, you can just make it superhuman. But we are not going to make it superhuman in every domain at the same time. Okay, but then on that note, earlier in our conversation, you said that you're not going to have a model that can do everything.
Starting point is 00:33:20 But if that training gets done in certain verticals, why not? Well, we are also getting to a point where the verticals that you choose do not really transfer to the others. So there's no point in making a model that is good at very precise biology and very precise physics, because the transfer in between those things is actually pretty unclear. The problem is that if you actually want your model
Starting point is 00:33:42 to be able to solve every problem at the same time, you're making it very big, very expensive, and very costly to serve. So specialized models is really, you're going to specialize one for bio, one for chemistry, one for like this particular physics problem. Well, it actually makes more sense because if you want to run it at scale,
Starting point is 00:33:58 if you want it to run in the background, if you want it to run day and night thinking about specific problems, well, you want it to be as small as possible, because the cost of a model is actually proportional to its size. And if you inflate the size by making the model great at multiple modes, well, you're actually not very efficient if you want to deploy it and use it as much as possible.
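To put rough numbers on "the cost of a model is proportional to its size": a common approximation for dense transformers is about two FLOPs per parameter per generated token. The sketch below uses made-up model sizes to show why a small specialist can be far cheaper to run at scale than a large generalist; it is an illustration, not a claim about any specific Mistral model.

# Back-of-the-envelope comparison with assumed, hypothetical model sizes.
def flops_per_token(params: float) -> float:
    # ~2 FLOPs per parameter per generated token for a dense transformer
    return 2 * params

specialist = flops_per_token(8e9)    # assumed 8B-parameter specialized model
generalist = flops_per_token(400e9)  # assumed 400B-parameter general model
print(f"The generalist costs ~{generalist / specialist:.0f}x more compute per token.")

That factor compounds when a model runs day and night in the background, which is the economic argument for specialization.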
Starting point is 00:34:20 So if you look at the economics of it, it does make sense to make specialized models in certain directions. Let me ask you a little bit about the Mistral competitive area. I think that, well, we're here in the US, so I'll just tell you what people in the US say and let you address it, because it's worth talking about. I think there is a feeling among some, not all, but some, that, you know, Mistral has been set up in Europe
Starting point is 00:34:43 to effectively take advantage of regulatory capture because U.S. companies have a hard time competing in Europe and therefore Mistral will be there to, like, pick up all the AI business. What do you think about that argument? Well, you know, we've built our technology so that we could serve, companies and states that wanted to have enough control. Artificial intelligence is not a technology that you want to fully delegate to a vendor, especially if it's a vendor that is from a foreign entity.
Starting point is 00:35:19 And that is, that was true before, it was true for data. It's going to be all the more true for artificial intelligence for multiple reasons. But one of them is the fact that if you're depending on a foreign vendor, your trade imbalance is effectively increasing and you're importing services. And that becomes a problem long term if you're importing too many digital services, for instance. So that's one thing. And then sovereignty and this kind of topic is also very important for defense.
Starting point is 00:35:48 If you're an independent country, you want to have independent defense systems. And if you want to have independent defense systems, you will need your own independent artificial intelligence, because this is making its way into the defense systems. So it's really working for you, this pitch being like, we are not an American company, we're based in Europe, we'll be able to help you build, whether it's something with, like, important data protection or national security like defense?
Starting point is 00:36:14 Well, it's a technological differentiation we've built. So because we can build on the edge, because we can deploy wherever our customers want us to deploy, we effectively can die and the system is going to still be up, which actually matters for many, many industries. And the more critical it gets, the more it matters. And so what that also means is that we can serve US customers. We can serve US customers that want to depend less on certain providers. We can serve banks that want to have more customization, more control, that are more regulated.
Starting point is 00:36:45 It also means we can of course serve the European industry where historically that's where we are based. You sell next door when you start your company. And that's what we did. But we also serve Asian countries. And Asian countries, they have similar problems. They want to have a technology that they can rely on, even if we were to die. They want to have a technology that they can customize to their own cultural needs. And so that has been driving our business for sure.
Starting point is 00:37:13 That aspect, that technological differentiation that we've built around control, open source, like a technology built on open source models, around customization. And do you have, like, European governments coming to you and being like, we just don't trust Google or Anthropic, and we prefer not to build on them? Well, we have European governments actually coming to us because they want to build the technology and they want to serve their citizens. They want to increase the efficiency of their public sector. And we happen to have a good proposition for them, which is deployable on their premises,
Starting point is 00:37:49 where we can go send forward deployment people to help them get to value. And it turns out we're European as well. It's actually pretty good for European countries to invest in European technology because the investment they're making, the revenue that they are creating for us, is a revenue that we reinvest in Europe and we're effectively creating an ecosystem around us. So that investment of the flow of revenue from European countries to European technology provider is something that is very beneficial. And to be honest, in the US that has been working for the last 80 years.
Starting point is 00:38:25 I think in Europe we haven't been doing it enough for sure. Speaking of open source companies, there are efforts that have some links to geography. What do you think about China's open source effort? Because obviously they've made a lot of noise. It seems like things are going quite well there. Yeah, I mean, China is very strong on artificial intelligence. We were the first, actually, to release open source models, and they realized it was a good strategy. and they've proved to be very strong, actually.
Starting point is 00:38:58 and they've proved to be very strong, actually. And so we're not sure if we're competing, because the good thing about open source is that it's not really competition. You build on top of one another. Right, you see everything they have out there and you learn what works well. Yeah, and the same is true. The reverse is true.
Starting point is 00:39:11 Like we released the first sparse mixture of experts back at the beginning of 2024. And they built on top, and they released DeepSeek V3. DeepSeek was built on top of that. Well, it was, it's the same architecture. And we released everything that was needed to rebuild this kind of architecture.
Starting point is 00:39:28 And the same is true. I mean, everything that companies that are investing in open source are releasing are things that all other open source companies are reusing. And actually, that's kind of the purpose. R&D is just much more efficient if you share your findings across different labs.
Starting point is 00:39:44 And so it's been very effective in China. They share knowledge across the different labs. It's been pretty inefficient here in the US because US-incorporated companies are not investing in open source. And we've taken the lead on just being the West's open source provider. And I think it's going to be very much needed to have a Western open source provider. What do you think China's strategy is?
Starting point is 00:40:07 And do you think that, like in the US, there's often this kind of very large conversation about the need to stay ahead of China? Do you think there's a risk that China runs away with this? Well, I think China is very strong: vertically integrated, they have strong engineers, they have compute, they have energy, everything they need to compete. Europe also has everything it needs to compete. I don't think we'll be in a setting where anyone is going to have one artificial intelligence ahead of the others. And if you look at like the world in its entirety, every large enough sovereign entity, which is a big economy, is going to want some form of autonomy in its usage of AI and its deployment of
Starting point is 00:40:52 AI. So that does justify the emergence of multiple centers of excellence, I would say, one of them, which is in Europe, which is led by us, another, which is more in Hangzhou in China, and then you have a bunch of companies here on the West Coast. Why do you think it's in China's strategic interest to develop these open source models? I mean, they don't have a similar business as you do, right? They're not really, like, going out globally and becoming implementers. They have a big business in China, for sure. The companies that are building open source models in China are actually cloud providers in general. You have a bunch of startups, but you also have Alibaba, which is the cloud provider.
Starting point is 00:41:33 And so they have this vertical integration that allows them to create value there internally. So in China, but also in the markets where they are operating and growing. So in Asia, for instance, which for us is a place where we tend to compete with them, not in China itself, but in the rest of Asia. So it does make sense for them to compete internally. And then their best way of accessing the U.S. markets is by just giving the things away for free. And so it does make sense. It's a very natural thing to do, to build a business in China, which is protected, then to export the thing for zero. I would do the same if I were in their shoes.
Starting point is 00:42:09 Right. All right. I want to talk a little bit before we leave about the practical applications of this technology that you're building. You know, it's interesting. You were talking a little bit about AI being used for physics, AI being used. used in other research applications, AI being used for defense. None of this sounds like a chatbot. So talk a little bit about the applications that you are working on and whether we're going to see AI move beyond the chatbot. I mean, the chatbot is oftentimes the interface,
Starting point is 00:42:39 in other research applications, AI being used for defense. None of this sounds like a chatbot. So talk a little bit about the applications that you are working on and whether we're going to see AI move beyond the chatbot. I mean, the chatbot is oftentimes the interface, because generative AI allows you to interact with machines in a human way. So a chatbot is a human-machine interface, but it's only that. Now, if you look at the actual applications that are strongly exciting for us, you have two things. The first are really the end-to-end workflow automations that effectively change the way a business is fully run. So examples are like cargo dispatching when we work with CMA-CGM, which is a shipping company. And we help them dispatch all of their containers when the cargo, the ship, comes into the port and they need to dispatch everything.
Starting point is 00:43:18 they need to contact like hundreds of people, they need to contact the harbor, they need to contact the regulators, they need to operate 20 different pieces of software. And so that takes, I mean, I think a few hundred people to do it. And by working together around how to automate those things, suddenly you can save 80%.
Starting point is 00:43:34 So the LLM is making those communications. And also deciding, not just making the call, but deciding who gets what? It decides and it wires the things, and you measure whether it's doing the right thing. And if it doesn't, then you improve the system. How's it doing?
Starting point is 00:43:49 So it's working. It's live, actually, in certain agencies. So that's very, like, to me, it's very exciting because it has a physical footprint. It takes decisions in a safe way. And it's effectively bringing a very large efficiency gain to a company. Now, another example, which is more on the growth side, are things that we do with ASML. We are working with them on vision systems.
Starting point is 00:44:13 And talk a little bit about what ASML is for those that don't know. So ASML is a company that is doing computational lithography and scanning. And their role is to build those big machines that are effectively engraving the wafers that are then used as the chips in Nvidia, for instance. Right. So they're like a key industrial component of the semiconductor manufacturing process. They provide the machines for semiconductor fabs.
Starting point is 00:44:36 Right. And something so specialized, you would think, how is generative AI going to help them? Well, generative AI models are generally predictive AI models, and one good thing they have is that they can see and reason about what they see. And so one of the things that ASML needs to reason about are the images coming out of their scanners that are verifying whether there are errors in the engraving of the chips. And it's actually fairly complex because there's some logical thinking to be done.
Starting point is 00:45:07 And the combination of images and logical thinking is what enables us to actually automate those things much faster, which means that the throughput down the line in the fab
Starting point is 00:45:26 is going to increase. And so in that setting, customization is key, because the kind of input that is coming in is nowhere to be found elsewhere. ASML is the only one who has access to these images. And so we find a physical problem that is effectively a bottleneck in a manufacturing process, and we go and we train models that are effectively solving it. And this is going to occur in many, many different places. And generative AI is needed there because you need a model that can reason about images.
Starting point is 00:45:50 And so the reasoning capabilities are critical. But customizing those reasoning models for a specific problem with a specific kind of input is the one thing that is the unlock there. Yeah, the industrial applications of generative AI to me have been super surprising and interesting. Like there has been technology, for instance, computer vision technology, that can take a look at a piece of machinery or an output and be like, that's not good, or actually, that's what we need, right?
Starting point is 00:46:16 But there hasn't been this nerve center that that information can be channeled to and then sort of have a decision made about it and then communicated to somebody in the field. And that's what this stuff is enabling, is that that full line of technical work is starting to be able to be done by this technology. Basically, what you need is models that can perceive multiple kinds of information, and oftentimes in manufacturing, information is visual. So having very strong visual models is super useful. And then based on those vision models, on these inputs, you can make choices and you can rely on the LLMs themselves
Starting point is 00:46:59 to orchestrate calling an agent or going into the next step of the workflow or actually calling a tool or writing something in the database. And that's having dynamic agents that are able to see what's happening in a factory, that are able to see what's happening in the process, and that can take the next step, whether it's actually an automatic step or a call an agent step so that they validate a decision,
Starting point is 00:47:22 is where a lot of the value can be created. And that's going to reorganize manufacturing. You know, manufacturing had to reorganize itself multiple times when we invented the steam engine. We had to rebuild the entire factories around like a central steam machine because that was the energy provider. And so what's going to happen, I think, in the next 10 years,
Starting point is 00:47:43 is that all of the manufacturing process will be rebuilt around LLM orchestrators. And it's super interesting because you have physical problems to solve, the system has a physical footprint, so there are some safety issues that you need to solve. Just the complexity of the system itself is huge, and so that's a fascinating problem for engineers like us. Let me see if I'm getting this right.
Starting point is 00:48:08 Okay, so I think what we're starting to see is the seeds of this stuff starting to be able to really have an impact in business. We just did an episode with a reporter who was reporting on how some lawyers are really able to use this to sift through documents better. Is it perfect? No, we heard it in the comments, not perfect. But it has shown potential.
Starting point is 00:48:29 Same thing in industry, and maybe also in other areas that you touch on. But it still feels nascent. So what's going to get it from where it is today to something that's, you know, effective in a way where we really see the impact in the economy? Is it just time and patience on customization, or is it improvement of the models, or?
Starting point is 00:48:53 I think models are getting better, which helps. Whenever you have a stronger model, you can trust that it's going to reason for a longer period of time and that it's going to fail less. But then the thing that needs to be embraced is iteration. You're never going to be able to build systems that work out of the box in a single shot. And the one thing that we try to convey to our customers is that they need to build a prototype.
Starting point is 00:49:20 It's going to work 80% of the time. But then how do they get from 80% to 99%? Well, they can move the thing into production. And the way to get it is to actually get feedback from users. If the system is not working, if the AI software you've built is not working, it means that you need more data and signal. And that's something that is quite different from the way we used to build software. because when the software was not working before, you basically went back to coding and you would fix the problem.
Starting point is 00:49:48 But because we are building organic systems, so systems that imitate humans, the way to make them better is to give them feedback and then to retrain the system. So that's what will take the seeds that you mentioned and make them actual valuable things that work. And you mentioned lawyers. I think it's one of the areas where it's very knowledge intensive,
Starting point is 00:50:12 you have very little physical footprint. And it's a lot of text. It's a lot of text. And so it's the easiest one. It's the easiest thing to do. It's not easy at all. It's not done yet. There's still a lot of subtleties to fix
Starting point is 00:50:25 to make models great at lawyering. But if you go into the physical world, then it gets even more complex. So we'll see applications in the knowledge world go faster into production than the ones in the physical world. But arguably, the ones in the physical world would be more transformative. That brings us to robotics.
Starting point is 00:50:44 So let's, let's end here. People have been talking about how we could see an explosion in robotics because of LLMs or the advancements in world models. But it still seems far off. I mean, they had this demo, what was it, the Neo, the Neo humanoid robot, where there's like a person controlling it, teleoperating it. Kind of weird. They might be in your house. So we haven't seen progress in robotics, you know, start to move as fast as we've seen it on the software side, on the large language model side. So where does that, when does that come, if it ever does?
Starting point is 00:51:24 I think in robotics, you have the combination of two things that need to work. Hardware platforms, where you need to have the right actuators with the right haptic signals, that need to be built at scale with good economics. And this is starting to be true, and we're not the ones working on it, but the industry has made a lot of progress in that domain. Then the other thing is that you need to be able to have control systems that are sufficiently intelligent to be deployed on those robots. And so that's where actually we come in,
Starting point is 00:51:58 in that, again, you need to have custom models. Because the problem is the model needs to be customized to the platform, whether it's a humanoid robot, or whether it's something on wheels, or whether it's a flying drone. And it needs to be customized to the mission, because the mission is going to bring different kinds of images, the kinds of actions that can be taken are going to vary across the mission,
Starting point is 00:52:20 maybe the guardrails are different. And so that adaptation to the world and to the wealth of data that the hardware platform that is being deployed is bringing does require the right platform and the right training platform. And so our bet in robotics and what we've been doing with multiple
Starting point is 00:52:37 companies in defense in particular, is to build that platform that allows you to train models fit to purpose that can then be deployed on the edge, potentially. Because strategically in robotics, I believe we'll see deployment of such systems first in areas where you don't want to send humans. So firefighting, I think, is a very good example, where the risk of deploying the system is way under the benefit of deploying the system.
Starting point is 00:53:19 It's going to be the case in manufacturing as well, because there are places where you just want the factory to be dark. And I think that's where a lot of the value will be created, I would say midterm. And then maybe long term, you have things that are sitting in your house. But, you know, it's a bit dangerous to have some pretty strong thing out there. And so the same way we've been waiting for self-driving cars for the last 15 years, we'll probably be waiting for humanoid robotics in the house for a meaningful time. And before that, what we'll see is at-scale deployment in manufacturing. And that will take the right software platform. And that's the software platform that we're building. Okay.
Starting point is 00:53:49 All right. Really, the last one. We've talked a lot about AI and business. Some businesses have gotten a lot out of it. Some have not. Clearly there's potential, but also just like a shit ton of investment. So, yeah, what do you think about the bubble question? Are we in a bubble right now? Well, we're in a setting where we need a lot of infrastructure, so we need to invest,
Starting point is 00:54:11 and that's what we do in Europe, for instance. But then the viscosity of adoption in enterprise is high, in that it takes time to understand how to build the software. It takes some building. You can't buy off-the-shelf solutions and then trust that you're going to make immense progress in your productivity. That has been the disappointment that a lot of enterprises went through in the last two years. So there's some building to be done.
Starting point is 00:54:36 You need to maybe buy the primitives, buy a certain number of factorized functions, but then you need to bring your own knowledge onto it. So it takes some time. You need to learn how to build. And then you need to learn how to reorganize. And that takes even longer because the teams are going to change. You need less management because you need less infrastructure to circulate information
Starting point is 00:55:00 because AI allows information to circulate faster. Certain functions are going to disappear, certain functions are going to grow, so there's just a lot of work to be done on reorganizing things, and it will take years. And so the question is, the infrastructure investments that are being made today, are they going to create long-term value in two years,
Starting point is 00:55:23 in five years, or in 10 years? And that does define whether some people are losing money or making money. That's the problem, and we don't really know. So maybe people are over-investing, maybe people are under-investing. Some people will certainly lose money,
Starting point is 00:55:39 some people will certainly miss opportunities as well. But today, I would say my view is that we're maybe over-investing a little bit and over-committing a little bit, not Mistral but some of the others, because we see how complex it is to actually create value in enterprises.
Starting point is 00:55:59 But eventually we'll get there. Eventually, the entire economy is going to run on AI systems. That's for sure. But it might take 20 years because it's actually fairly complex. All right. The website is mistral.ai. Our guest has been Arthur Mensch, the CEO of Mistral. Arthur, thank you so much for coming in.
Starting point is 00:56:17 Really appreciate being here. Thank you for hosting me. You bet. All right, everybody. Thank you for listening and watching. And we will see you next time on Big Technology Podcast.
