Big Technology Podcast - Who Wins if AI Models Commoditize? — With Mistral CEO Arthur Mensch
Episode Date: January 14, 2026. Arthur Mensch is the CEO and co-founder of Mistral. Arthur Mensch joins the Big Technology Podcast to discuss what the AI business looks like if all leading models perform the same. Tune in to hear how the commoditization of foundational models is changing the balance of power in the industry, what business models will be profitable, and why the focus is shifting from building better models to building applications. We also cover the open source movement versus closed source models, the geopolitics of AI, and practical industrial applications of the technology. --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount for Big Technology on Substack + Discord? Here’s 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b Questions? Feedback? Write to: bigtechnologypodcast@gmail.com EXCLUSIVE NordVPN Deal ➼ https://nordvpn.com/bigtech Try it risk-free now with a 30-day money-back guarantee! Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
What does the AI business look like if all the leading models perform the same, which they kind of are?
We'll find out with the CEO of Mistral right after this.
Can AI's most valuable use be in the industrial setting?
I've been thinking about this question more and more after visiting IFS's Industrial X Unleashed event in New York City
and getting a chance to speak with IFS CEO Mark Moffat.
To give a clear example, Moffat told me that IFS is sending Boston Dynamics Spot robots out for inspection,
bringing that data back to the IFS nerve center,
which then with the assistance of large language models
can assign the right technician to examine areas that need attending.
It's a fascinating frontier of the technology,
and I'm thankful to my partners at IFS for opening my eyes to it.
To learn more, go to IFS.com.
That's IFS.com.
Hi, this is Alex Kantrowitz.
I'm the host of Big Technology podcast, a longtime reporter,
and an on-air contributor to CNBC.
And if you're like me, you're trying to figure out
how artificial intelligence is changing the business world and our lives.
So each week on Big Technology, I host key actors from the companies building AI tech and
outsiders trying to influence it, asking where this is all going, places like Nvidia,
Microsoft, Amazon.
So if you want to be smart with your wallet, your career choices, and at dinner parties,
listen to Big Technology Podcast and your podcast app of choice.
Welcome to Big Technology Podcast, a show for cool-headed and nuanced conversation of the
tech world and beyond.
We have a great show for you today.
We're going to talk all about what's happening to the AI business and technology race
as some of the leading foundational models start to look the same and how that changes the balance
of power in the industry.
We're joined by the perfect guest to do it.
Arthur Mensch is here with us.
He is the CEO and co-founder of Mistral.
Arthur, welcome.
I'm happy to be here and thank you for hosting us.
No, it's great to have you.
So Mistral is a name that those who are deep in the AI world know very well, but might be new to some of our
listeners and viewers. So for folks who are new to Mistral, let me give you a couple of stats.
It is, Mistral is an AI model builder, and it does some other things, which we're going to get to.
It's based in France. The company is valued at $14 billion after starting in April
2023, so a little under three years, two and a half years, to make a $14 billion business. Not bad.
There's 500 people at the company. And Arthur, you are leading it after spending some time
in academia and two and a half years at DeepMind.
Exactly.
We're headquartered in Paris,
but we have around a fourth of our workforce,
which is actually in the US,
and a lot of our activities actually here.
So that's why I'm spending a lot of time as well,
and that's why we are here in New York.
All right, well, great to have you in studio.
Let's just go right to what I think is the most pressing issue for AI today.
There's been so much talk about how Google,
at the end of 2025, started to equal OpenAI's models and how OpenAI's models were somewhat on
par with others. And to me, it seems like we're just hitting commoditization of the foundational
model much faster than I thought it would be. I thought that there was going to be a race where
some companies would leap out further ahead and would take others some time to catch up. But it
looks like right now you have lots of model builders with their frontier
models exhibiting performance that's so similar it's difficult to tell which is the best.
So what do you make of that?
I would say that inherently, this is a technology that is going to get commoditized.
The reason for that is that it's actually not hard to build.
You have around 10 labs in the world that know how to build that technology, that get access
to similar data, that follow the same recipes and algorithms, which are very short,
actually; the knowledge you need to actually train a model is fairly short.
So because it's short, it actually circulates.
So there's no IP differentiation gap that you can create.
So it's very hard to actually leapfrog and to be way ahead of the competition
because there's some diffusion of knowledge that is just making everybody do the same things.
And so the question there is therefore, where is the value accruing?
And what kind of business model should you pursue to actually make sure that in the end,
you're turning profitable?
And then the challenge that we see with some of our competitors is that they're investing billions or hundreds of billions into creating assets that are depreciating very fast because those are commodities.
And so for us, it has always been one of the biggest questions of the industry: you need to invest enough to actually bring value to enterprises, but you also need to invest reasonably so that you can build unit economics
that make sense in a world where the creation of models,
which is capital intensive,
is actually just bringing you assets that are
in a commodity competition.
So let's talk a little bit then about this race
to build the best possible model.
I mean, like you mentioned, it's very expensive.
OpenAI is going to put $1.4 trillion into building infrastructure
for its models, or at least it says so.
If the models are effectively at par,
companies are going to say, hey, wait a second, maybe it doesn't make sense for us to invest all this money
into building the next evolution of a better model because people can catch up.
I mean, strategically, I think there's definitely some cursor to be set. How much do you invest
in creating assets that are valuable enough for one company to bring, for one technology company
to bring value to an enterprise or to bring value to a consumer?
And at the end of the day, all of these investments will need to be funded by the free cash flow and value creation that is being made downstream.
And so the focus that we have as a company, but that I think is a reasonable focus is to be more on the downstream applications and to figure out what is the friction that enterprises are running into and try to lift these frictions.
Because at the end of the day, I think one of the major challenges that the industry is facing today is that AI brought a lot of promises, like,
four years ago.
But if you ask an enterprise,
did you actually make money out of it?
They will in general say no.
And the reason for that is that they are not
customizing things enough.
And they are not thinking backward
from the problem they want to solve.
So they think about the solution,
but they don't think about the problem.
And so we try and help them
to actually go for the right use cases
and actually do the right amount of customization,
so that where it was a team of 20 people
actually operating some supply
chain workflow, suddenly you can actually operate that with two people.
And there's a lot of examples like this.
But the challenge that the industry will face is that we need to get enterprises to value
fast enough to justify all of the investment that is collectively being made.
That is very interesting, because for a long time you would hear these companies focused on model,
model, model, right? The next model, what GPT-5 was going to be, let's say, when you think about OpenAI, was
the biggest news. Now they're starting to talk more about how you
take the intelligence that you have and build the applications that work.
Just one bit of reporting that I can share from a couple weeks ago.
You know, I had this story, basically inside a lunch with Sam Altman and a bunch of
news leaders in New York City.
And Altman told them that one of the company's biggest priorities was building
applications for enterprise.
Basically, it's going to be a major priority in 2026.
And it's a little bit of a shift in rhetoric,
from we want to build AGI to we want to build applications for business.
So talk about why is that happening?
Is that an offshoot of this commoditization issue?
Well, I think the issue is, well, first of all,
AGI is a very simple concept.
So probably too simple for enterprises.
There's no such thing as like one system that is going to be solving all of the problems
of the world.
And so at the end of the day...
Not yet, or you just don't believe in that concept at all?
It's never going to exist.
I mean, you have a wealth of problems,
just like you don't have any human that is able to solve every task in the world.
You, of course, need to have some amount of specialization to actually solve problems.
And so we are back from magical thinking to system thinking.
We need to figure out what is the data that is going to be used
to make the model better at a specific task.
What is the flywheel that we need to set up
so that we accrue more signal from humans interacting with the system,
so that eventually the application becomes better and better.
And so in real life, enterprises are just complex systems.
And you can't solve that with like a single abstraction, which is AGI.
And so AGI to a large extent is what we were not able to achieve,
and which is basically the North Star of,
I'm just going to make the system better all the time.
But because, as you said, it's hard now to explain to investors
that the technology you're building is never going to be matched by your competitors,
there is, of course, a shift in the narrative.
That is, companies are not building a North Star single system
that is going to be solving all problems,
but they'll need to go into the weeds of enterprises
and solve their actual problems.
And I think at Mistral, we've been ahead of our time
in thinking about this.
That's kind of what set our story.
Our story has been to assume that eventually AI will be more decentralized,
that more customization would be needed
because we were running into the limits of the amount of data
we could accrue and the limits of scaling laws.
And because of that, we created the company on that premise,
on the fact that we'll bring more customization ability
to enterprises. Yeah, and we'll get to the Mistral story in a little bit, but one more question
about this. It seemed to me, and I wonder if you think this has been a shift, you know, you were
ahead of this for sure, but it seems to me like there's been a shift in the AI industry where
the idea was effectively: make the models smarter and they'll be
able to figure out these problems on their own. Like for instance, I'll just make it concrete,
make the model smarter and it will be able to do a lower level associate's job or maybe do data
entry for multiple systems and be able to file reports. And now it seems like there's been a shift
from do that to actually build out the infrastructure, that the models are just one component,
that the infrastructure is super important. And things like orchestration and, you know,
working through the applications that are built on top of the models is going to be where the
value is found. It's interesting.
Yes, I think if you look at it from a system perspective, you have two components and we'll always have these two components.
The first component is the static definitions of what the workflow should be and how a system should behave.
And those static definitions are set by humans that are defining how the system should behave.
And so this corresponds to the manual information that you're using to define the system.
And then there's a dynamic component where you're connecting a model to tools
and you're giving instruction to the model and the model can go and call the tools itself.
And so it can decide on the graph of execution that it's going to follow.
And so that part is dynamic and there's a static part where you're setting up guardrails
or you're deciding you have a tree of decision sometimes.
And I think it's a bit utopian and unrealistic to think that you can solve everything with a dynamic system
without guidance from humans.
And what has happened in the industry
the last three years is that effectively the dynamic part has grown
because models can think for longer
because they can call multiple tools
because they can code.
But the static part remains extremely important.
And even if the dynamic part grows,
then the static part allows you to create systems
that are even better and more interesting
and you can solve problems that you were not able to solve before.
So the combination of these static systems,
which you can call orchestration if you want,
and the dynamic systems that you can call agents,
is going to stay super important
because the two things are moving up together
so that we can tackle problems that are more and more complex.
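To make that split concrete, here is a minimal sketch in Python of a dynamic tool-calling loop wrapped in static guardrails. Everything in it, the tool names, the whitelist, the stand-in model call, is hypothetical rather than any vendor's actual API; it only illustrates the static-versus-dynamic division described above.

```python
# Illustrative sketch only: the model call and tools are stand-ins, not a real API.

ALLOWED_TOOLS = {"lookup_inventory", "create_ticket"}   # static guardrail: fixed tool whitelist
MAX_STEPS = 5                                           # static guardrail: bounded execution

def lookup_inventory(part_id):
    """Stand-in tool: pretend to check a stock system."""
    return {"part_id": part_id, "in_stock": 3}

def create_ticket(summary):
    """Stand-in tool: pretend to open a maintenance ticket."""
    return {"ticket_id": "T-123", "summary": summary}

TOOLS = {"lookup_inventory": lookup_inventory, "create_ticket": create_ticket}

def call_model(history):
    """Stand-in for an LLM call that returns either a tool request or a final answer."""
    if len(history) == 1:
        return {"tool": "lookup_inventory", "args": {"part_id": "P-42"}}
    return {"final": "Part P-42 is in stock; no ticket needed."}

def run_agent(task):
    history = [{"role": "user", "content": task}]
    for _ in range(MAX_STEPS):                     # static part: humans bound the loop
        decision = call_model(history)             # dynamic part: the model picks the next step
        if "final" in decision:
            return decision["final"]
        if decision["tool"] not in ALLOWED_TOOLS:  # static part: guardrail on what it may call
            raise ValueError("tool not allowed")
        result = TOOLS[decision["tool"]](**decision["args"])
        history.append({"role": "tool", "content": str(result)})
    return "stopped: step budget exhausted"

print(run_agent("Do we need to reorder part P-42?"))
```

The model decides the next step at each turn, but humans decide up front which tools exist and how many steps it may take; that is the combination of orchestration and agents in miniature.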
Okay. And so now with that established,
I'm thinking through what the business is.
Let's say the model has commoditized.
So what are the businesses going to be in AI?
It will be, I imagine, some form of consumer products
like chatbots, where you could put OpenAI in that bucket.
There will be a business where you could make
your existing products better, like for instance, maybe chatting with Microsoft Excel. That could be
one way that current companies can make their products better. But then there is this other big bucket,
which we've talked about a little bit already, which is the enterprise side of things. So how would
you rank the business opportunity in those three buckets? Well, yes, I think on the consumer side,
because AI is starting to be, well, is becoming the way you access information,
you basically have an ads business to be built,
and that's pretty clearly going to be built.
It's not the focus of our company.
And then if you look at the enterprise side,
we're basically replatforming all enterprise software.
So enterprise is about having the right...
In enterprises, you have people, you have data,
and then you have processes.
Historically, there was a fragmentation of the tools
to run multiple processes,
multiple data systems,
multiple system of records.
And there was a fragmentation in teams
that were not able to access all information at the same time.
And essentially what AI allows you to do in an enterprise
is to start with a unified data,
or even you can start with fragmented data sources
because the AI is able to navigate them.
Then you put an AI on top that is building the right amount of intelligence,
understanding what's going on in the enterprise.
And then the AI system is able to somewhat generate the interfaces
that are useful for every human to actually work.
And so that part, which represents the
replatforming of the entire enterprise software stack, is the one thing where a lot of value can be created in the enterprise.
Owning the context engine.
So the system that is constantly running that is looking at what's happening and figuring out creating documentation for what's happening.
Owning the front end as well that are more and more getting generated on demand.
So let's say I'm a lawyer.
I want to fix one of my problems and have a very specific review to make.
I just bring my documents, and then the system actually evolves to show me the right widgets and the right information I need.
So the generative interfaces on top of a context engine that is constantly updating its representation of what's happening in the enterprise
on top of systems of records that are essentially going to be just pure databases.
You don't need everything that was sitting on top before.
This is where this is going.
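One way to picture that stack, purely as a sketch and not a description of any real product: systems of record reduced to plain data stores, a context engine that keeps a running summary of them, and a layer that asks a model which interface to render for a given request. All names and data below are invented.

```python
# Sketch of the three layers described above; every name and record here is hypothetical.

SYSTEM_OF_RECORD = {                       # layer 1: plain databases
    "contracts": [{"id": "C-7", "status": "under review", "counterparty": "Acme"}],
}

class ContextEngine:                       # layer 2: keeps an up-to-date picture of the business
    def __init__(self, records):
        self.records = records

    def summarize(self, topic):
        items = self.records.get(topic, [])
        return f"{len(items)} {topic} tracked; latest: {items[-1] if items else 'none'}"

def generate_interface(user_request, context_summary):
    # layer 3: in a real system an LLM would choose the widgets; here we fake its output
    return {
        "widgets": ["document_viewer", "clause_checklist"],
        "context": context_summary,
        "request": user_request,
    }

engine = ContextEngine(SYSTEM_OF_RECORD)
ui = generate_interface("Review the Acme contract for liability clauses",
                        engine.summarize("contracts"))
print(ui)
```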
And that replatforming is going to be, I think it's going to take a decade
because it takes a while to get enterprises to adopt these things.
But there's just immense value to be created because suddenly you can reorganize your company
around the fact that for many of the processes where you had a lot of people,
you can actually run those very much faster.
That's on one side, efficiency.
And the other thing, which is, so that's, I'd say that's one of the business
models in the enterprise.
The second one in the enterprise is about
working with enterprises to help them take their really proprietary data, the assets being produced
by their machines, if it's in the manufacturing industry, for instance, and turning that into
intelligence that nobody else can reproduce. And so making models specifically good at a certain
kind of physics when we're working with a company doing planes, for instance, or when we're
working with ASML, making models that are specifically good at operating their machines, that's
huge value because suddenly you're not building efficiency within the company, but you're
effectively unlocking technological progress that was locked by the absence of AI.
So that unlock that the new systems are providing, that's immense amount of growth.
It's actually harder to measure because the first one is shorter term.
You can look at what a company will look like in five years because you've reduced certain
parts of the company,
and you've reoriented other people to be creating growth.
You can create models of that.
On the technological side, I think it's a little harder
because we know there are things like nuclear fusion
or sharper engraving of semiconductors, for instance.
These are things where we're starting to run into physical constraints,
and artificial intelligence can actually help to lift those physical constraints.
And so the acceleration of technological progress
is, I think, where most of the value creation will be.
It will take a little bit of time, and it will be less measurable, less predictable than the efficiency gains that AI is going to produce.
But the two things are equally important.
Okay, so let me see if I can sort of game this out here a little bit.
So if that is going to be the key driver of value in the AI world, there's two ways to do it.
One is to build a model that's better than everybody else and sell it for a premium.
But we've already talked about the fact that that doesn't seem like it's going to be a moat forever.
And the other way is, you know, the model is actually not the value.
It's the know-how and the implementation side of things.
So you can make the model open-source,
but then provide a service to businesses to be able to figure out
how to take that model and put it into action and actually get results.
Are those the two choices?
Yeah, that's kind of the fork that we see in the industry.
And our view there has been to bet on the second one,
to really-
The open-source and implementation side.
Which brings customization, but it also brings decentralization.
in that if you assume that the entire economy is going to run on AI systems,
well, enterprises will just want to make sure that nobody can turn off their systems.
So the same way, if you have a factory, you connect it to the grid,
you want to make sure that nobody is going to turn off the grid because they don't like you.
If AI effectively becomes a commodity, which is what's happening,
and if you treat intelligence as electricity,
then you just want to make sure that your access to intelligence cannot be throttled.
And so that's also one of the things
that open source technology can bring.
If you're using open source,
you don't have to worry about going astray
of, I'm just saying, like, Anthropic's,
you know, user terms,
and them then pausing your ability to do what you do.
If you use open source,
you can basically run it on your own terms.
Yeah, you run it on your own terms.
You create the redundancy you need.
You can serve with higher quality of service.
You can make sure that whatever
like the geopolitical situation may be,
you can still run the systems if you want.
And then, so that's really on the IT side.
So if I'm a CIO, I really look at open source as a way to create leverage and independence.
But on the scientific side, it's also the only way in which you can create systems
that are effectively using the folklore knowledge of your employees.
That's the knowledge that you've recruited for decades.
The only way to turn it into an asset that nobody else gets access to is to create your own models based on those open source models.
And so that's, but it's hard.
It's hard to actually build those.
Right.
And so that's where you need the right tools.
You need the right expertise.
And that's like the complement business model to building open source models.
But even the close source model providers, companies like Anthropic, will say they'll be able to customize their models with your data.
You don't believe that?
They will say that.
But then they will put some guardrails on top of it.
So you're basically trusting that their engineers are going to give you enough access to the depth of the system.
And can you trust that for eternity?
I'm not sure.
So the issue there is as much a question of control
as a question of customization.
Like a vendor is going to try to lock you in.
So if you get access and if you build on top of open source models,
like ours or anyone else's,
you're basically less locked into the vendor.
And this is a technology which is so important
that you don't want to be locked into a single vendor.
So that's also the opportunity we'll bring.
You know what's stunning to me?
We're three years past ChatGPT, which basically brought this into a lot of people's consciousness,
although I think Big Technology listeners would have known about it a little bit beforehand,
especially since we were interviewing the people that thought this stuff was sentient before ChatGPT came out,
but that's a conversation for another time.
But what we're basically saying today, I'm going to sum up two of the main points that you've made.
One is that today's AI models can't do it all themselves;
they need orchestration.
And the second big point that you made is to do that sort of orchestration or implementation
with the current intelligence, you need a service, like a managed service.
So it is interesting to me that like we've gone from like this perspective of, you know,
maybe working towards a God model that could do it all to the fact that, you know,
this may be the most powerful technology that we've seen come through in our lifetimes.
However, when you actually want to use it, you kind of need, it becomes a managed service,
in a way.
Yeah, that's interesting.
This is true.
I don't think it's the first time that we observe it in history.
It's a new technology.
It's a new platform.
And so the knowledge on how to use it is actually still pretty scarce.
So there aren't that many people that can build systems that are performing at scale,
that can run at scale reliably, that can actually solve an actual issue.
And so when working with enterprises, you always need to have some services on top
because of the complexity of implementation,
even with like fairly well-understood technology like databases.
But for artificial intelligence, it's even more necessary in that it requires transforming businesses.
So you need to also help in thinking how the team should perform around the system itself.
And it does require customizing things.
So you need data scientists that know how to leverage data and turn it into intelligence.
And today this is still a pretty scarce resource.
I would say I do expect the share of software
in those deployments to increase.
So the way customization occurs today,
with fine-tuning, reinforcement learning,
this kind of thing,
is going to be abstracted away from the enterprise buyer
because it's too complex
and they actually should just worry about having adaptive systems
that are learning from experience and from deployment with people
instead of thinking about should I use fine-tuning
or should I use reinforcement learning
to actually put that knowledge into my models.
And the work that we are doing is to try and abstract away from lower level routines that data scientists understand to higher level systems that business owners can actually use.
And so it's going to occur and we're working on it.
But the service part is still going to be quite important.
And today, the combination of the two things is the fastest way to value if you're an enterprise.
So we've been combining the two.
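Purely as an illustration of what abstracting the method away could look like, and not Mistral's actual API, here is a hypothetical interface where the enterprise only says "adapt from this feedback" and the choice between supervised fine-tuning and reinforcement learning stays hidden behind it.

```python
# Hypothetical sketch of hiding the customization method behind one interface.

def adapt_from_feedback(model_id, feedback_records):
    """The enterprise-facing call: 'make the system learn from this deployment signal'."""
    if any(r.get("reward") is not None for r in feedback_records):
        return _reinforcement_learning(model_id, feedback_records)   # graded signal available
    return _supervised_fine_tune(model_id, feedback_records)         # otherwise plain corrections

def _supervised_fine_tune(model_id, records):
    # stand-in for launching a fine-tuning job
    return {"model_id": model_id + "-sft", "examples": len(records)}

def _reinforcement_learning(model_id, records):
    # stand-in for launching a reinforcement learning job
    return {"model_id": model_id + "-rl", "examples": len(records)}

feedback = [
    {"prompt": "Classify this invoice", "correction": "category: logistics"},
    {"prompt": "Draft the dispatch email", "reward": 0.9},
]
print(adapt_from_feedback("base-model", feedback))
```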
You know, I started our conversation by calling you a model builder.
And I kind of paused on it and I said in some other things that we're going to get into it later.
And here we are.
Basically, what I'm hearing from you is at Mistral, obviously proud model builder.
But it seems like without the services, without being able to sit with the business and show them how to use it,
it would just be an incomplete puzzle.
So do you consider yourself, like, is the most important thing you do building the models?
or is the most important thing you do the service?
Or are you primarily a model builder or primarily service provider?
I mean, we're there to help our customers get to value.
So service.
We're here to, but to get to value, they need to have great models.
And to get to value, they need to have the right tools to train the models.
And so the best way to create those tools is effectively to train the best models.
So the two things are extremely linked together.
We create models that are very easy to customize.
We create models with tools that we then export to our customers
so that they can use them.
And we help our customers train their own models.
So you can't go and sell to an enterprise
that you're going to help them create very custom systems
if you can't show the world that you're effectively the leader in open source technology.
And so the two parts are equally important.
The first is enabling the other.
And there's effectively a flywheel there.
Because we make our choices when it comes to the model design
in a way that is enabling the various customers we have.
One example is that we've put a lot of emphasis
on having models that are great at physics
because we work with manufacturing companies
that run into physical problems.
So that's the flywheel that we have set up
by having the science team and the business team actually sit together.
Okay.
We're here with Arthur Mensch.
He is the CEO of Mistral, also co-founder.
When we come back after the brief,
break, we are going to talk about open source, the open source movement versus closed source. Remember,
DeepSeek, and open source was supposed to surpass closed source. Well, has it? We'll also talk about
the geopolitics and regulation and whether that's going to give this company a leg up and then maybe
get into some more practical examples because we should talk about how the technology is being
used on the ground. We'll be back right after this. Here's the problem. Your data is exposed everywhere.
Personal data is scattered across hundreds of websites, often without your consent.
This means data brokers buy and sell your information, address, phone number, email, social security number, political views,
and that exposure leads to real risks, including identity theft, scams, stalking, harassment, discrimination, and higher insurance rates.
Incogni tracks down and removes your personal data from data brokers, directories, people search sites, and commercial databases.
Here's how it works.
You create your account and share minimal information needed to locate your profiles.
You then authorize Incogni to contact data brokers on your behalf,
and then Incogni removes your data, both automatically from hundreds of brokers and via
custom removal.
There's also a 30-day money-back guarantee.
Take back your personal data with Incogni.
Go to incogni.com slash Big TechPod and use code Big TechPod at checkout.
Our code will get you 60% off annual plans.
Go check it out.
And we're back here on Big Technology Podcast with Arthur Mensch.
He's the CEO of Mistral. Arthur, I want to ask you about, you know, the progression of
open source over the past year. I remember reading about DeepSeek, doing reporting on DeepSeek
in January, and the overriding theme was it was such a leap forward for open source that soon
the closed models, models like OpenAI's GPT and Anthropic's Claude, and maybe Google
Gemini would be surpassed by open source because open source, the open source community was
working together and building on each other's innovations where the closed source community was
kind of going at it on their own. We just had this moment we talked in the beginning of the show about
how maybe Gemini commoditized GPT models, but that conversation was not being had about
like open source being, you know, living up to that expectation from the beginning
of the year. So am I missing something or am I reading it wrong or what do you think? If something
has held back open source, what has it been? Well, if you look at the trends in 2024,
I'd say there might have been like a six months gap. If you look at the trend in 2025,
I think the gap is more around three months. So I guess it's up to anyone to guess
what the gap is going to be next year. But effectively, this gap has been shrinking, has been
shrinking quite significantly. The reason for that is that basically you have a saturation effect
when you pre-train models around 10 to the power of 26 FLOPs. The reason for that is that there's only
so much data you can find to compress when you pre-train models. And so effectively, labs that
maybe started a little behind created enough compute capacity to train models at this kind of scale.
And efficiency has also increased.
And so what it means is that today,
everybody has access to 10 to the power of 26 FLOPs facilities
over the course of a few months.
And that's a measure of compute.
That's a measure of compute.
And that's a measure of compute times time.
So you need, yeah, 10 to the power of 26 FLOPs
is something that any lab today can achieve in a couple of months.
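As a rough sanity check of that order of magnitude, with hardware numbers that are assumptions rather than figures from the conversation (roughly 1e15 FLOP/s peak for an H100-class GPU, 40% utilization, a 30,000-GPU cluster):

```python
# Back-of-the-envelope check; the hardware numbers are assumptions, not from the interview.

peak_flops_per_gpu = 1e15        # ~1 PFLOP/s BF16 for an H100-class accelerator (approx.)
utilization = 0.4                # assumed fraction of peak actually sustained in training
gpus = 30_000                    # assumed cluster size
seconds_per_month = 30 * 24 * 3600

sustained = peak_flops_per_gpu * utilization * gpus          # cluster-wide FLOP/s
months_for_1e26 = 1e26 / (sustained * seconds_per_month)
print(f"~{months_for_1e26:.1f} months to reach 1e26 FLOPs")  # roughly a few months
```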
And because of that, the saturation effect means that
open source models have caught up because
closed source models that started ahead
kind of ran into that wall
of pre-training.
So what that means
is that this is only going to continue
shrinking. And if we look at
the latest open release we did, which is
Devstral 2, which is a coding model,
well, it's performing, I think,
around the performance of
Anthropic's models from two or three months ago.
So, yeah, I think
the gap is shrinking. And again,
I think the question is
probably not posed in the right way,
because these are also two very distinct value
propositions.
Because on one side, this is well managed and you will depend on the provider itself.
On the other side, well, it takes a little more effort,
because you will need to own it more.
You will need to learn about how to customize it.
You will need to use the right tools for doing so.
You will need to maintain its deployment if you choose to deploy it
on your own facilities.
But at the end, this is creating
leverage you need against close source providers.
So the two categories are effectively different, but if you look at the pure performance side,
they are definitely converging.
You mentioned that there's a saturation effect.
So without getting too technical, are the models sort of done with getting better?
Like are, let me put it this way.
Are AI models going to continue to get better given the fact that they all seem to be hitting saturation?
They will get better in more and more specific domains.
In that, I think we've really collectively made them very clever
and able to reason about long context
and able to call multiple tools, etc.
But if you go and want to effectively put them into production
in a bank or in a manufacturing company,
well, the models need to learn about all of the knowledge
that is contained into the companies themselves.
And so what it effectively means is that for very
precise directions, let's say I want to make my model extremely good at discovering materials or extremely
good at designing planes, I will need to go and sweat it a little bit and get the right reward
signal and get the right experts and ask them to make my model specifically good in that very precise
direction. And so we are definitely not done doing that because what we are all racing for is the right
environment and the right signal provider for specific capabilities.
But the broad horizontal reasoning capabilities are still going to improve,
but nobody is going to improve them in a way that is creating
a strong gap versus their competitors.
So the strong gap is actually in working with vertical experts that know
exactly how to design a plane and that actually explain to the model how to do it.
And you have like a wealth of directions that you can take because you can do it in physics,
you can do it in chemistry, pharmaceutical, in biology.
And so to me, the most exciting part of what's going to happen in the next two years is that
explosion of very precise directions in which the model are going to get better.
So, and for us, the opportunity is to have the right platform for enabling that kind of verticalization,
whether with enterprises or with AI startups, actually, that are working on very
verticalized capabilities, and we're happy to help them as well.
So that's my view of where the field is going to go.
We have been about horizontal intelligence, growing and things getting clever, more and more
clever.
And the next two years are going to be about taking models and making them extremely good at
a certain skill set.
And that's actually more exciting because we're getting to a point where if you pick a domain, you
can just make it superhuman.
But we are not going to make it superhuman in every domain at the same time.
Okay, but then on that note earlier in our conversation,
you said that you're not going to have a model that can do everything.
But if that training gets done in certain verticals, why not?
Well, we are also getting to a point where the verticals that you choose
do not really transfer to the others.
So there's no point in making a model that is good at very precise biology
and very precise physics,
because the transfer between those things
is actually pretty unclear.
The problem is that if you actually want your model
to be able to solve every problem at the same time,
you're making it very big, very expensive,
and very costly to serve.
So specialized models is really,
you're going to specialize one for bio,
one for chemistry, one for like this particular physics problem.
Well, it actually makes more sense
because if you want to run it at scale,
if you want it to run on the background,
if you want it to run day and night
thinking about specific problems,
well, you want it to be as small as possible
because the cost of a model is actually proportional to its size.
And if you inflate the size by making the model great at multiple modes,
well, you're actually not very efficient if you want to deploy it
and use it as much as possible.
So if you look at the economics of it,
it does make sense to make specialized models in certain directions.
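A toy calculation makes the economics he is pointing at visible. The parameter counts and token volume below are made-up assumptions, and the cost model (cost proportional to parameters times tokens processed) is a simplification.

```python
# Toy comparison; the sizes and workload are invented for illustration.

def relative_cost(params_billion, tokens_billion):
    # assume serving cost scales roughly with parameters * tokens processed
    return params_billion * tokens_billion

tokens_per_month = 100                               # billions of tokens of background inference (assumed)
specialist = relative_cost(20, tokens_per_month)     # hypothetical 20B-parameter specialized model
generalist = relative_cost(600, tokens_per_month)    # hypothetical 600B-parameter general model

print(f"the generalist costs ~{generalist / specialist:.0f}x the specialist for the same workload")
```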
Let me ask you a little bit about the Mistral competitive area.
I think that, since we're here in the US,
I'll just tell you what people in the US say.
and let you address it because it's worth talking about.
I think there is a feeling among some, not all, but some,
that, you know, Mistral has been set up in Europe
to effectively take advantage of regulatory capture
because U.S. companies have a hard time competing in Europe
and therefore Mistral will be there to, like, pick up all the AI business.
What do you think about that argument?
Well, you know, we've built our technology so that we could serve
companies and states that wanted to have enough control.
Artificial intelligence is not a technology that you want to fully delegate to a vendor,
especially if it's a vendor that is from a foreign entity.
And that is, that was true before, it was true for data.
It's going to be all the more true for artificial intelligence for multiple reasons.
But one of them is the fact that, if you're depending on a foreign vendor, your trade
imbalance is effectively increasing and you're importing services.
And that becomes a problem long term if you're importing too much digital services,
for instance.
So that's one thing.
And then sovereignty and this kind of topic is also very important for defense.
If you're an independent country, you want to have independent defense systems.
And if you want to have independent defense systems, you will need
your own independent artificial intelligence, because this is making it into the defense
systems.
So it's really
working for you, this pitch, being like, we are not an American company, we're based in Europe,
we'll be able to help you build, whether it's something with like important data protection
or national security like defense.
Well, it's a technological differentiation we've built.
So because we can build on the edge, because we can deploy wherever our customers want us to deploy,
we could effectively die and the system is going to still be up, which actually matters
for many, many industries.
And the more critical it gets, the more it matters.
And so what that also means is that we can serve the US customers.
We can serve US customers that want to depend less on certain providers.
We can serve banks that want to have more customization, more control, that are more regulated.
It also means we can of course serve the European industry; historically, that's where we are based.
You sell next door when you start your company.
And that's what we did.
But we also serve Asian countries.
And Asian countries, they have similar problems.
They want to have a technology that they can rely on, even if we were to die.
They want to have a technology that they can customize to their own cultural needs.
And so that has been driving our business for sure.
That aspect, that technological differentiation that we've built around control,
open source, like a technology built on open source models, around customization.
And do you have, like, European governments coming to you and being like,
we just don't trust Google or Anthropic, and we prefer not to build on them?
Well, we have European governments actually coming to us because they want to build the technology
and they want to serve their citizens.
They want to increase the efficiency of their public sector.
And we happen to have a good proposition for them, which is deployable on their premises,
where we can go send forward deployment people to help them get to value.
And it turns out we're European as well.
It's actually pretty good for European countries to invest in European technology
because the investment they're making, the revenue that they are creating for us,
is a revenue that we reinvest in Europe and we're effectively creating an ecosystem around us.
So that investment of the flow of revenue from European countries to European technology provider
is something that is very beneficial.
And to be honest, in the US that has been working for the last 80 years.
I think in Europe we haven't been doing it enough for sure.
Speaking of open source companies, there are efforts that have some links to geography.
What do you think about China's open source effort?
Because obviously they've made a lot of noise.
It seems like things are going quite well there.
Yeah, I mean, China is very strong on artificial intelligence.
We were the first, actually, to release open source models, and they realized it was a good strategy.
and they've proved to be very strong, actually.
And so we've been, well, I'm not sure if we're competing,
because the good thing about open source
is that it's not really competition.
You build on top of one other.
Right, you see everything they have out there
and you learn what works well.
Yeah, and the same is true.
The reverse is true.
Like we released the first sparse mixture of experts
back at the beginning of 2024.
And they built on top and they released DeepSeek V3.
DeepSeek was built on top of that.
Well, it was,
it's the same architecture.
And we released everything
that was needed to rebuild this kind of architecture.
And the same is true.
I mean, everything that companies
that are investing on open source are releasing
are things that all the other open source companies
are reusing.
And actually, it's kind of the purpose.
R&D is just much more efficient
if you share your findings across different labs.
And so it's been very effective in China.
They share knowledge across the different labs.
It's been pretty inefficient here in the US
because, actually,
US-incorporated companies are not investing in open source.
And we've taken the lead on just being the West's open source provider.
And I think it's going to be very much needed to have a Western open source provider.
What do you think China's strategy is?
And, you know, in the US there's often this kind of very large conversation about the need to stay ahead of China.
Do you think there's a risk that China runs away with this?
Well, I think China is very strong,
vertically integrated, they have strong engineers, they have compute, they have energy,
everything they need to compete. Europe also has everything it needs to compete. I don't think we'll
be in a setting where anyone is going to have one artificial intelligence ahead of the others.
And if you look at like the world in its entirety, every large enough sovereign entity, which is a
big economy, is going to want some form of autonomy in its usage of AI and its deployment of
AI. So that does justify the emergence of multiple centers of excellence, I would say, one of them,
which is in Europe, which is led by us, another, which is more in Hangzhou in China, and then
you have a bunch of companies here on the West Coast. Why do you think it's in China's strategic
interest to develop these open source models? I mean, they don't have a similar business
as you do, right? They're not really, they're not like going out globally and becoming implementers.
They have a big business in China, for sure.
The companies that are building open source models in China are actually cloud providers in general.
You have a bunch of startups, but you also have Alibaba, which is the cloud provider.
And so they have this vertical integration that allows them to create value there internally.
So in China, but also in the markets where they are operating and growing.
So in Asia, for instance, which for us is a place where we tend to compete with them,
not in China itself, but in the rest of Asia.
So it does make sense for them to compete
internally. And then their best way of accessing the U.S. market is by just giving the thing away for free.
And so it does make sense. It's a very natural thing to do to build a business in China,
which is protected, then to export the thing for zero. I would do the same if I were in their shoes.
Right. All right. I want to talk a little bit before we leave about the practical applications
of this technology that you're building. You know, it's interesting. You were talking a little bit
about AI being used for physics, AI being
used in other research applications, AI being used for defense.
None of this sounds like a chatbot.
So talk a little bit about the applications that you are working on
and whether we're going to see AI move beyond the chatbot.
I mean, the chatbot is oftentimes the interface,
because generative AI allows you to
interact with machines in a human way.
So the chatbot is a human-machine interface, but it's not the whole thing,
it's only the interface.
Now, if you look at the actual applications that are strongly exciting for us, you have two things.
The first are things that are really about end-to-end workflow automation that effectively change the way a business is fully run.
So examples are like cargo dispatching when we work with CMA-CGM, which is a shipping company.
And we help them dispatch all of their containers when the cargo ship comes into the port and they need to dispatch everything.
they need to contact like hundreds of people,
they need to contact the harbor,
they need to contact the regulators,
they need to operate 20 different pieces of software.
And so that takes like, I mean,
I think a few hundred people to do it.
And by working together around how to automate those things,
suddenly you can save 80%.
So the LLM is making those communications.
And also deciding, not just making the call,
but deciding who gets what?
It decides and it wires the things
and you measure whether it's doing the right thing.
And if it doesn't,
then you improve the system.
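To give a feel for the shape of such a workflow, here is a toy dispatcher sketch. The container data, the routing rules, and the stand-in model logic are invented for illustration and are not CMA CGM's actual system; the point is only the pattern of deciding, acting, and logging decisions so humans can measure them.

```python
# Toy dispatcher; the data, rules, and "model" decision are invented for illustration.

containers = [
    {"id": "CONT-001", "contents": "refrigerated food", "customs_cleared": False},
    {"id": "CONT-002", "contents": "machine parts", "customs_cleared": True},
]

def decide_recipients(container):
    """Stand-in for the model's decision about who needs to be contacted."""
    recipients = ["harbor_ops"]
    if not container["customs_cleared"]:
        recipients.append("customs_broker")
    if "refrigerated" in container["contents"]:
        recipients.append("cold_chain_team")
    return recipients

audit_log = []                                   # kept so humans can measure the decisions
for c in containers:
    decision = {"container": c["id"], "notify": decide_recipients(c)}
    audit_log.append(decision)                   # reviewed later; bad decisions feed retraining
    print(decision)
```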
How's it doing?
So it's working.
It's live, actually, in certain agencies.
So that's very, like, to me, it's very exciting because it has a physical footprint.
It takes decisions in a safe way.
And it's effectively bringing a very large efficiency gain to a company.
Now, another example, which is more on the growth side, are things that we do with
ASML.
We are working with them on vision systems.
And talk a little bit about what ASML is for those that don't know.
So ASML is a company that is doing computational
lithography and scanning.
And their role is to build those big machines that are effectively engraving the
wafers that are then used as the chips in Nvidia, for instance.
Right.
So they're like key industrial component of the semiconductor manufacturing process.
They provide the machines for semiconductor fabs.
Right.
And something so specialized, you would think, how is generative AI going to help them?
Well, generative AI models are generally predictive AI models,
and one good thing they have is that they can see and reason about what they see.
And so one of the things that ASML needs to reason about
are the images coming out of their scanners
that are verifying whether there are errors in the engraving of the chips.
And it's actually fairly complex because there's some logical thinking to be done.
And the combination of images and logical thinking
is what enables us to actually automate those things much faster,
which means that the throughput down the line in the fab
is going to increase.
And so in that setting,
customization is key
because the kind of input that is coming in
is nowhere to be found elsewhere.
ASML is the only one who has access to these images.
And so we find like a physical problem
that is effectively a bottleneck in like a manufacturing process
and we go and we train models that are effectively solving it.
So that's, and this is going to occur
in like many, many different places.
And generative AI is needed there
because you need a model that can reason about images.
And so the reasoning capabilities are critical.
But customizing those reasoning models for a specific problem
with a specific kind of input is the one thing that is the unlock there.
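A highly simplified sketch of that pattern, not ASML's actual pipeline: a stand-in vision model reports what it sees on an inspection image, and an explicit logical check sits on top of that report. The defect types and thresholds are invented.

```python
# Simplified defect-triage sketch; the inspection data and model output are invented.

def vision_model(image_id):
    """Stand-in for a multimodal model describing what it sees on a wafer scan."""
    return {"image": image_id, "anomalies": [{"type": "bridge", "layer": 3, "size_nm": 12}]}

def triage(report, critical_layers=(3, 4), size_threshold_nm=10):
    # the "logical thinking" layer: combine what the model saw with explicit rules
    for anomaly in report["anomalies"]:
        if anomaly["layer"] in critical_layers and anomaly["size_nm"] >= size_threshold_nm:
            return "hold wafer for review"
    return "pass"

print(triage(vision_model("scan_0042.png")))     # -> hold wafer for review
```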
Yeah, the industrial applications of generative AI to me
have been super surprising and interesting.
Like there has been technology, for instance,
computer vision technology that can take a look at a piece of machinery or an output
and be like, that's not good or actually that's what we need, right?
But there hasn't been this nerve center that that information can be channeled to
and then sort of have a decision made about it and then communicate it to somebody in the field.
And that's what this stuff is enabling, is that that full line of technical work is starting to be able to be done by this technology.
Basically, what you need is models that can perceive multiple kinds of information,
and oftentimes in manufacturing, information is visual.
So having very strong visual models is super useful.
And then based on those vision models, you can, on these inputs,
you can make choices and you can rely on the LLMs themselves
to orchestrate calling an agent or going into the next step of the workflow
or actually calling a tool or writing something in the database.
And that's having dynamic agents that are able to see what's happening in a factory,
that are able to see what's happening in the process,
and that can take the next step,
whether it's actually an automatic step
or a call an agent step
so that they validate a decision,
is where a lot of the value can be created.
And that's going to reorganize manufacturing.
You know, manufacturing had to reorganize itself multiple times
when we invented the steam engine.
We had to rebuild the entire factories
around like a central steam machine
because that was the energy provider.
And so what's going to happen, I think, in the next 10 years,
is that all of the manufacturing process,
will be rebuilt around LLM orchestrators.
And it's super interesting because you have physical problems
to solve, the system has a physical footprint,
so there are some safety issues that you need to solve.
Just the complexity of the system itself is huge,
and so that's a fascinating problem for engineers like us.
Let me see if I'm getting this right.
Okay, so I think what we're starting to see
is the seeds of this stuff starting to be able to really have an impact
in business.
We just did an episode with a reporter who was reporting on how some lawyers are really able
to use this to sift through documents better.
Is it perfect?
No, we heard it in the comments, not perfect.
But it is showing potential.
Same thing in industry,
and maybe also in other areas that you touch on.
But still feels nascent.
So what's going to get it from like where it is today to something that's like, you know,
effective in a way that we
really see the impact in the economy?
Is it just, like, time and patience on customization,
or is it improvement of models, or?
I think models are getting better, which helps.
Whenever you have a stronger model,
you can trust that it's going to reason for a longer period of time
and that it's going to fail less.
But then the thing that needs to be embraced is iterations.
You're never going to be able to build systems
that work out of the box in a single shot.
And the one thing that we try to convey to our customers is that they need to build a prototype.
It's going to work 80% of the time.
But then how do they get from 80% to 99%?
Well, they can move the thing into production.
And the way to get it is to actually get feedback from users.
If the system is not working, if the AI software you've built is not working, it means that you need more data and signal.
And that's something that is quite different from the way we used to build software.
Because when the software was not working before,
you basically went back to coding and you would fix the problem.
But because we are building organic systems,
so systems that imitate humans,
the way to make them better is to give them feedback
and then to retrain the system.
So that will take the seeds that you mentioned
and make them actual valuable things that are going to work.
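In code form, that loop looks less like go back and fix the bug and more like collect signal, fold it back in, re-evaluate. The sketch below uses invented numbers and stand-in functions purely to show the iteration.

```python
# Sketch of the iterate-from-feedback loop described above; the numbers are illustrative.

def evaluate(model_version, test_cases):
    """Stand-in: fraction of cases the system handles correctly at this version."""
    return min(0.80 + 0.05 * model_version, 0.99)

def retrain(model_version, feedback_batch):
    """Stand-in: fold production feedback back into the system."""
    return model_version + 1

model_version, target = 0, 0.99
while evaluate(model_version, test_cases=None) < target:
    feedback_batch = f"corrections collected from deployment round {model_version}"
    model_version = retrain(model_version, feedback_batch)

print(f"reached {evaluate(model_version, None):.0%} after {model_version} iterations")
```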
And you mentioned lawyers.
I think it's one of the areas where it's very knowledge intensive
and you have very little physical footprint.
And it's a lot of text.
It's a lot of text.
And so it's the easiest one.
It's the easiest thing to do.
It's not easy at all.
It's not done yet.
There's still a lot of subtleties to fix
to make models great at lawyering.
But if you go into the physical world,
then it gets even more complex.
So we'll see applications in the knowledge world
go faster into production than the ones in the physical world.
But arguably, the ones in the physical world
would be more transformative.
That brings us to robotics.
So let's, let's end here.
People have been talking about how we could see an explosion in robotics because of LLMs or the advancements in world models.
But it still seems far off.
I mean, they had this demo, what was it, the Neo, the Neo humanoid robot, where there's like a person controlling it, teleoperating it.
Kind of weird.
They might be in your house.
So we haven't seen progress in robotics, you know, start to move as fast as we've seen it in the software side, in the large language model side.
So where does that, when does that come if it ever does?
I think in robotics, you have the combination of two things that needs to work.
Hardware platforms: you need to have the right actuators with the right haptic signals, and they need to be built at scale with good economics.
And this is starting to be true.
and we're not the one working on it,
but the industry has made a lot of progress on that domain.
Then the other thing is that you need to be able to have control systems
that are sufficiently intelligent to be deployed on those robots.
And so that's where actually we come in,
in that, again, you need to have custom models.
Because the problem is the model needs to be customized to the platform,
whether it's a humanoid robot,
or whether it's something on wheels or whether it's a flying drone.
And it needs to be customized to the mission
because the mission is going to bring different kind of images,
the kind of actions that can be taken
are going to vary across the mission,
maybe the guardrails are different.
And so that adaptation to the world
and to the wealth of data that the hardware platform
that is being deployed is bringing
does require the right platform
and the right training platform.
And so our bet in robotics,
and what we've been doing with multiple
companies, in defense in particular, is to build that platform that allows you to train models fit to purpose
that can then be deployed on the edge, potentially. Because strategically, in robotics,
I believe we'll see deployment of such systems first in areas where you don't want to send humans.
So firefighting, I think, is a very good example, when the risk of
deploying the system is way under the benefit of deploying the system.
It's going to be the case in manufacturing as well, because there are places where you just want the factory to be dark.
And I think that's where most of the, a lot of the value will be created.
I would say midterm.
And then maybe long term, you have things that are sitting in your house.
But, you know, it's a bit dangerous to have like some pretty strong thing out there.
And so the same way we've been waiting for self-driving cars for the last 15 years, we'll probably be waiting for humanoid
robotics in-house for a meaningful time.
And before that, what we'll see is at scale deployment in manufacturing.
And that will take the right software platform.
And that's the software platform that we're building.
Okay.
All right.
Really, the last one.
We've talked a lot about AI and business.
Some businesses have gotten a lot out of it.
Some have not. Clearly there's potential, but also just, like, a shit ton of investment.
So, yeah, what do you think about the bubble question?
Are we in a bubble right now?
Well, we're in a setting where we need a lot of infrastructure, so we need to invest,
and that's what we do in Europe, for instance.
But then the viscosity of adoption in enterprise is high, so adoption is slow.
In that it takes time to understand how to build the software.
It takes some building.
You can't buy off-the-shelf solutions and then trust that you're going to make immense progress
in your productivity.
That has been the disappointment that a lot of enterprises went through in the last two years.
So there's some building to be done.
You need to maybe buy the primitives,
buy a certain number of factorized functions,
but then you need to bring your own knowledge onto it.
So it takes some time.
You need to learn how to build.
And then you need to learn how to reorganize.
And that takes even longer because the teams are going to change.
You need less management because you need less infrastructure to circulate information
because AI allows information to circulate faster.
Certain functions are going to disappear,
certain functions are going to grow,
so there's just a lot of work to be done on reorganizing things,
and it will take years.
And so the question is:
the infrastructure investments that are being made today,
are they going to create long-term value in two years,
in five years, or in 10 years?
And that does define whether some people are losing money
or making money. That's the,
that's the problem.
And we don't really know.
So maybe people are over-investing,
maybe people are under-investing.
Some people will certainly lose money,
some people will certainly,
like, miss opportunities as well.
But today, I would say,
my view is that we're maybe over-investing a little bit
and over-committing a little bit,
not Mistral, but some of the others,
because we see how
complex it is to actually create value in enterprises.
But eventually we'll get there.
Eventually, the entire economy is going to run on AI systems.
That's for sure.
But it might take 20 years because it's actually fairly complex.
All right.
The website is mistral.ai.
Our guest has been Arthur Mensch, the CEO of Mistral.
Arthur, thank you so much for coming in.
Really appreciate being here.
Thank you for hosting me.
You bet.
All right, everybody.
Thank you for listening and watching.
And we will see you next time on Big Technology Podcast.
