The Pragmatic Engineer - From Swift to Mojo and high-performance AI Engineering with Chris Lattner
Episode Date: November 5, 2025

Brought to You By:
• Statsig – The unified platform for flags, analytics, experiments, and more. Companies like Graphite, Notion, and Brex rely on Statsig to measure the impact of the pa...ce they ship. Get a 30-day enterprise trial here.
• Linear – The system for modern product development. Linear is a heavy user of Swift: they just redesigned their native iOS app using their own take on Apple’s Liquid Glass design language. The new app is about speed and performance – just like Linear is. Check it out.

Chris Lattner is one of the most influential engineers of the past two decades. He created the LLVM compiler infrastructure and the Swift programming language – and Swift opened iOS development to a broader group of engineers. With Mojo, he’s now aiming to do the same for AI, by lowering the barrier to programming AI applications.

I sat down with Chris in San Francisco, to talk language design, lessons on designing Swift and Mojo, and – of course! – compilers. It’s hard to find someone who is as enthusiastic and knowledgeable about compilers as Chris is!

We also discussed why experts often resist change even when current tools slow them down, what he learned about AI and hardware from his time across both large and small engineering teams, and why compiler engineering remains one of the best ways to understand how software really works.

Timestamps
(00:00) Intro
(02:35) Compilers in the early 2000s
(04:48) Why Chris built LLVM
(08:24) GCC vs. LLVM
(09:47) LLVM at Apple
(19:25) How Chris got support to go open source at Apple
(20:28) The story of Swift
(24:32) The process for designing a language
(31:00) Learnings from launching Swift
(35:48) Swift Playgrounds: making coding accessible
(40:23) What Swift solved and the technical debt it created
(47:28) AI learnings from Google and Tesla
(51:23) SiFive: learning about hardware engineering
(52:24) Mojo’s origin story
(57:15) Modular’s bet on a two-level stack
(1:01:49) Compiler shortcomings
(1:09:11) Getting started with Mojo
(1:15:44) How big is Modular, as a company?
(1:19:00) AI coding tools the Modular team uses
(1:22:59) What kind of software engineers Modular hires
(1:25:22) A programming language for LLMs? No thanks
(1:29:06) Why you should study and understand compilers

The Pragmatic Engineer deepdives relevant for this episode:
• AI Engineering in the real world
• The AI Engineering stack
• Uber's crazy YOLO app rewrite, from the front seat
• Python, Go, Rust, TypeScript and AI with Armin Ronacher
• Microsoft’s developer tools roots

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@pragmaticengineer.com.

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
Transcript
So I just started, again, nights and weekends, didn't ask permission, just started fiddling around, seeing what could be done that would be better.
The first year and a half, it was literally just me working on it.
Nights and weekends, and I had a day job, and I was managing a big team and a lot of stuff going on.
Hold on. So at this point, yeah, you were already second level manager or something and had a team of 40 people.
Wow.
Running a lot of very interesting and very fun stuff.
You're good at juggling stuff.
This is a passion project.
I'm also a fairly good programmer, so that helps.
How would you go from this blank canvas to like, okay, here's what I think the language will be?
I start from a lot of blank canvases across my career.
So LLVM, blank canvas.
MLIR, later, blank canvas. Mojo, Swift, many of the systems I build are blank canvas projects.
Do you think there is any logic in having a new language that is designed to make it easy for LLMs to code with?
So I often get asked, given that AI is writing all the code, why are you building a programming language?
Which is another way of asking the same question, maybe a little bit more aggro.
And so, to me, I don't think that optimizing for the LLM is the right thing to do at all.
Because the important thing is...
Chris Lattner created some of the most influential programming language and compiler technology
of the past 20 years.
He's the creator of LLVM, used by languages like Swift, Rust, and C++,
created the Swift programming language, worked on TensorFlow, and now works on the Mojo programming
language.
In this conversation, we cover the origin story of LLVM and how Chris managed to convince Apple
to move all major Apple dev tools over to support this new technology, how Chris created
Swift at Apple, including how he worked on this new language in secrecy for a year and a half,
why Mojo is a language Chris expects to help build efficient AI
programs easier and faster, how Chris uses AI tools and what productivity improvements he
sees as a very experienced programmer, and much more.
If you'd like to understand how a truly standout software engineer like Chris thinks
and gets things done and how he's designing a language that could be a very important part
of AI engineering, then this episode is for you.
This podcast episode is presented by Statsig, the unified platform for flags, analytics,
experiments, and more.
Check out the show notes to learn more about them and our other season sponsor.
So Chris, welcome to the podcast.
Well, thank you for having me.
It is so nice to meet you in person because I feel like the work that you did has had an impact personally on me at Uber. We migrated to Swift. And of course, a lot of software we use runs on things that you've built. So can we rewind time all the way back to the early 2000s? For those of us who weren't really around then, can you describe what the industry was like in terms of compilers and languages, and what the status
quo was back then?
Yeah, so, I mean, I was involved in the kind of early days of Linux.
And so I started maybe not the really early days, but I started working with Linux in the
mid-90s.
And this is when Java was coming on the scene and a lot of things were changing in the industry.
As that played out, GCC was the compiler that standardized a lot of software and a lot of
people were using it to build Linux-based software.
And this is how I got to know it.
GCC is a great thing.
Before GCC, and a lot of people don't know this,
every hardware vendor was making their own compiler,
and it was a gigantic mess for even just C code,
because none of these compilers were compatible with each other.
And the reason they did that just so I understand is, like, you have C code.
It needed to translate to assembly for a specific hardware,
and then the hardware vendor, you know,
they did the mapping and figuring out like what custom instructions
or something like that.
Was that a reason?
Yeah, so basically back in, particularly, the late 80s and then the early 90s,
there was a lot more diversity in terms of instruction sets,
and chip makers were innovating,
and you had HP building things,
and Intel was making a lot of different systems back then,
and there was all kinds of different stuff going on.
And so you had RISC computers that were being invented.
And so the challenge at the time was that everybody had to build their own compiler.
And so because everybody had to build their own compiler,
they all wanted C, and some C++ was emerging,
and C was a standard, and so there was a spec that said this is what the code is supposed to look like.
But as we know, many standards and specifications are never complete.
And so what ended up happening is each of those compilers was a gigantic mess because they had
different bugs, they had different misfeatures, they lacked certain capabilities.
And so software of the day was built with systems like autoconf, which was this really weird macro
processing thingy that tried to work around some of the limitations.
And it was a gigantic nightmare.
So GCC came on the scene, kind of in the 90s, really, and cleaned up that mess.
It became the standardized thing that people could plug into,
and a lot of chipmakers embraced it,
and it was free software.
And so that led to the rise of Linux. Linux adopted it,
and that really kind of brought the world forward.
Now, GCC was a good thing.
It really did help standardize the world,
and I think a lot of free software and open source
owes a debt of gratitude to GCC.
But also, it was a really old thing.
And so I made fun of it at the time.
It was over 20 years old,
and its architecture was very, very, very
old school in many different ways. It was also not built to be a modular design. It was designed
to do one thing: take C code in, and then put out assembly code for a given chip.
And so there were a lot of things that you couldn't do, like JIT compilation. And you couldn't, at the time,
even optimize across files.
And JIT is just-in-time compilation, right?
Just-in-time compilation, exactly. And so there were a lot of things that people wanted to do that you couldn't do with GCC.
And so in around 2000, that's when I was in university and I was studying compilers. And I said,
oh, okay. Well, wouldn't it be interesting to build something new in the space?
And so I started working on LLVM.
LLVM started out as a code generation system so you could target multiple different architectures.
But really, for me, it was a learning process.
It was about saying, okay, compilers are cool.
Don't let anybody tell you otherwise.
Compilers are cool.
Even today, right?
Even today.
But I didn't really understand anything about it.
And so I wanted to learn by building.
And so across my university project, I built this thing up more and more and more.
We then open sourced it.
It got a little bit of a community.
Later, I went to Apple, which really helped foster and fund a lot of the development.
And that early starting point was coming from, you know, GCC and open source technologies are really good.
But there's a lot of things we can't do.
And so that's where I kind of fell down this rabbit hole.
And when you open sourced it, what did you open source?
What could it do?
Yeah.
And then, you know, what was the reaction to this early version of LLVM?
Yeah.
So LLVM back in the day, and this is super funny because many people look back on
successful systems, and they assume that everything was obvious,
when actually every step along the way is challenging. You have to earn any success you get.
Very rarely do you, like, luck into it.
And so when I first started working on LLVM, again, it was a research project out of university.
There's a lot of research projects at a university that don't go anywhere.
I'd argue most of them probably won't go anywhere.
Exactly.
And so the default assumption was that LLVM also would not go anywhere.
That's a safe assumption for a university project.
And so we used it internally.
And so my advisor, Vikram Adve, encouraged us to continue building it.
And then we taught it to a couple of classes.
And so we had a few people using it internally.
And we got some use case with it.
But at that point, it was really just optimization and code generation.
And so it plugged into GCC as the parser to actually parse C code, for example.
And so it was just about code generation.
And it was useful for compiler people
to learn things, but it wasn't very useful for general application developers, really.
When we open sourced it, then we got, I don't know, two or three people that were interested in it.
And it was mostly other compiler nerds that, you know, are delightful.
But there was no major community.
There was no major reason for people to contribute.
And what I did was I said, okay, well, I'll actually just treat the world like the rest of the research group and just have open development,
encourage people if they come by and want to help and do something awesome.
If they have questions, I'll answer them. And I just treated people with respect.
And what happened over time is slowly the community grew.
You got individual people that were interested in different things.
Some people were interested in esoteric programming languages, and they were interested in compiler
technology for that reason.
Other people were interested in performance, and different people had different interests.
And so very, very, very, very slowly it kind of grew.
And what was the big difference between GCC and LLVM?
Like, of course, modularity is one thing, but you also mentioned capabilities
that you wanted that GCC either couldn't do or that were really difficult to do.
What were those capabilities?
So originally it was just in time compilation, so runtime code generation.
That was the thing with GCC.
GCC couldn't do it.
It could only do compile time.
It could only do a kind of batch, up-front, traditional Unix-style code generation.
At the time, GCC could not optimize across files in your project.
And so if you had a function in one C file, it could not inline it into another C file.
So things like that.
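To make "runtime code generation" concrete, here is a minimal sketch in Python. It is only an illustration of the idea and not how LLVM's JIT works: the helper name `make_power` is hypothetical, and the generated code is Python bytecode rather than machine code. The point is that the specialized function does not exist until the program is already running, which a batch, ahead-of-time compiler cannot do.

```python
def make_power(exponent):
    """Generate, at run time, a function specialized for one exponent.

    Hypothetical helper for illustration only: it emits Python source,
    compiles it to bytecode, and returns the resulting function.
    """
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    # Build an expression with the loop fully unrolled, e.g. "x * x * x".
    body = " * ".join(["x"] * exponent) if exponent > 0 else "1"
    source = f"def power(x):\n    return {body}\n"
    namespace = {}
    # compile() turns the source text into a code object at run time;
    # exec() runs that code object, defining `power` inside `namespace`.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["power"]


# The exponent is only known once the program is running.
cube = make_power(3)
print(cube(4))  # 4 * 4 * 4 = 64
```

A real JIT makes the same trade at a lower level: it spends some time generating code while the program runs, in exchange for code specialized to values that were not known ahead of time.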
LLVM is also written in C++,
which was highly controversial at the time because GCC was all C.
And Richard Stallman was very opposed to C++.
And so actually there's a parallel universe, because in 2005, when I had joined Apple,
we decided, okay, well, let's see if maybe LLVM and GCC should merge.
And I actually proposed to the GCC community,
hey, we've got this interesting technology, this seems very complementary,
it could lift the architecture of GCC.
Maybe we should do this.
And it didn't go anywhere, primarily because it
was written in C++, but it was also not invented here.
There was a bunch of other, you know, very serious, very experienced people.
And they're like, okay, kid, why would we listen to you?
And so it did not go anywhere.
And thankfully so. It meant that LLVM had to grow up on its own.
And then did LLVM really start to take off when you got to move to Apple to now work on
it full time?
I assume there must have been at least a small team, like, investing in this.
Is that how it went or something different?
So the story of getting to Apple: when I was graduating, I was looking for jobs in compilers, and ideally jobs that would let me work on LLVM and continue the work, because it was still an advanced research project.
It was five years in.
The community had grown quite a bit.
I was doing regular releases every six months, and so I was really putting a lot of energy into advocating for the work and showing momentum and velocity.
I caught the eye of a VP at Apple.
And there were a couple of people that were working on adding a PowerPC backend to LLVM because that's what Apple was doing.
And so I got to know them and they said, hey, come.
Like, we are frustrated with GCC.
That is the foundation of all of our compiler technology.
Because GCC compiled Objective-C, for example, right?
That's right.
Exactly.
And so, and it was, again, an older architecture.
It was very difficult to work with.
They weren't getting the performance they wanted.
It was very difficult to add features.
The community was also kind of annoyed with Apple for a variety of reasons.
And so basically management said, come, like, work on LLVM.
I said, yes, sign me up.
And so when I started, they did give me one other engineer to work with, but really, they didn't give me a whole lot of guidance.
And so I was like, okay.
Do you pretty much?
Cool.
I will go implement PowerPC support and integrate with the Mac OS of back in the day and things like this.
And then there became a time when they said, okay, cool, you're working on this.
And my manager said, it's great that you're working on this cool technology.
But at some point, we need to make sure that it's actually relevant to Apple.
Was it not relevant?
Well, it wasn't used in any products yet.
Oh, I see. You added support, but it wasn't used just yet.
Just yet, yeah, exactly.
And it wasn't as good as GCC in a number of different ways.
And so I basically got the vibe that it's like, okay, well, after a year, if Apple's not using
it in some product, then we'll ask you to start doing something else.
Because we're not just going to pay you to work on an open source project.
That's how it works.
You actually have to have impact on our company.
And so when I got that memo, suddenly I, and my manager helped me a lot, he was amazing,
went around shopping for, okay, well, what is the near-term impact that we could have on the business?
And so from there, the first use case was actually for graphics.
And so OpenGL was doing just-in-time compilation to do some graphics stuff.
And so we were able to do something very small that actually had value.
And suddenly, aha, this is not just a science project.
This is actually interesting.
Now, it's still missing tons of features, but that enabled the development of the next set of features,
the next set of features, the next set of features.
And it was now in a product that Apple actually used and shipped.
And it was generating business value, if we want to put it like that.
A small amount, but at least non-zero.
Exactly. And so what I did over the course of many years at Apple is then say, okay, cool, we'll keep investing in the technology, keep building an open technology and an open team and community, but make sure to deliver more and more and more value to Apple. And as that happened, what ended up happening is I ended up replacing all of the developer tools, compiler, code generation, and debugger technologies within Apple.
One bit at a time.
Exactly. And so it took about five years to get to the point where we built a new C++
parser, Clang, and an Objective-C front end and all this kind of stuff.
We got to about 2010.
That's when Apple was releasing the first 64-bit iPhone.
And suddenly, this was made possible by LVM, by all the technology we had built and all this
kind of stuff.
And it was an epic moment in the industry because everybody was convinced that 64-bit
phones were stupid.
That was not a thing.
And 64 bits doesn't make sense.
Why are you going to have more than 4 gigabytes of RAM in a phone, right?
Hilarious.
Wow, back in the day.
But that's how people thought, right?
That's how they thought.
And they're like, it's inefficient, it can never work.
And so actually, the first 64-bit iPhone, it was the iPhone 5S, shipped in production before Arm got their chips back to test.
And so we had built the entire software stack.
We had enabled the entire operating system, the entire tool chain, all this kind of stuff internally.
Then we were collaborating with Arm, but they had no idea how far ahead we were.
And then suddenly we're shipping it in production and the whole world's heads explode because of how good the performance was and all the capabilities we enabled.
And then it was quite fun.
So I'm having just trouble putting two things together,
and you can help me out with what I'm missing on.
On one end, you know, you made it seem like, okay,
like, you know, initially when you got to Apple for a year,
it was an experimental project.
Then you just went slowly, with one team after the other.
But next thing we know, fast forward four or five years, and
it's actually powering the core of Apple.
How did it go from, you know,
the small projects to actually getting into the core of the business?
Did it help that you started
with developer tools and more developers were convinced?
Like, how did you get through to, you know, probably the highest-level architects
and decision makers who in the end, I assume, had to take a risk on technology that was
somewhat new compared to GCC, which had been around for like 25 years at this point?
Well, so it's a combination of having top-down support,
so people that wanted me to be successful.
And that's because they were frustrated with GCC.
And so they wanted a new technology.
So that was very helpful.
And a combination of bottom-up success.
And it wasn't one thing.
It was like every six months, we'd have a new thing that
worked well. And so I became known as somebody who would get things done. And so I got more
scope and responsibility. And so I went from being an engineer to a manager to a second line
manager to eventually a senior director over the course of a number of years. And so it wasn't one
thing. It was just a lot of hard work. And it was a lot of fun because we were able to, you know,
a great thing about Apple back in that time was that you could have a lot of impact because the
team was growing. But also, if you could get stuff done,
you had an amazing platform on which to work because the iPhone came into existence and a whole
bunch of other technologies were made possible. And Objective-C was very Apple specific, but we could
move it. That was huge. We added a whole bunch of features to Objective-C. And so as we built into
this, there was a lot of opportunity, but it was hard-won, right? In any given moment,
it wasn't guaranteed. But as my team at Apple was making progress, we were doing all this in the
open. Then suddenly other people started seeing value. And, like, Cray
built a supercomputer and used LLVM for it,
and Google eventually adopted it in 2010
and started using it within Google for some
small things. And then a parallel
project within Google, from another hero
within Google, built a team
and had a lot of impact at Google,
therefore justifying more people to work on LLVM
within Google. And different tooling, different
projects and different things all happened.
Eventually, you know, Intel or
Arm or these companies ended up canceling
their internal compilers and switching to LLVM.
And so then suddenly the vast
majority of their engineering teams were all working
on the same thing. And then what you get is value that compounds, and today LLVM has a
community of thousands of people that come together. And so it's amazing to see that.
It's incredible. It feels a little bit like when you're starting to roll snowballs and you're
starting to roll this small thing. And then, you know, you do it for long enough and most people
don't do it. But then you could have a snowman and you're like, oh, where did you get all that snow
from? It's incredible. Well, so, I mean, it's funny because people always imagine success.
And so you start from a starting point and you say, ah, I want to be the
president of the United States or something. But, but, and some people get very motivated by
that and it's very powerful. What I really love is doing all the work, taking each of the steps,
doing the thankless thing, figuring out that really obscure thing that, you know, if you get the
architecture right, it makes the whole thing compound and scale and compose the right way and it allows
you to solve problems that nobody else can solve. But what that means is most of the time I'm
working on things people don't understand. And so I'm fine with that. I've normalized to this. This is
what I expect at this point. I don't expect people to understand me. But what happens is that
these projects over time, they grow and they get to the point where suddenly it clicks and
suddenly people start to understand. And suddenly you can explain it because it maps into their
value system in a way that they can measure, basically.
Chris just mentioned how things click a lot
more for people if it maps into their value system in a way that they can measure. In my experience,
measuring things is helpful all around, not just when trying to convince people, but also in figuring
out if a feature you built works as you'd expect. Here's a challenge most
teams face. They ship features, but cannot easily measure if they're working. Did that new
checkout flow improve conversion? Is the feature helping retention or hurting it? Without measurement,
you're just hoping things will work. That's where Statsig comes in. Statsig is a presenting
partner for the season, and they've challenged the status quo of how product teams work,
much like how Chris challenged how compilers and programming languages are supposed to work. Most
teams accept fragmented tools: feature flags in one system, analytics in another,
experiments somewhere else. You're stitching together point
solutions, running ETL jobs to sync user segments, hoping timestamps align across different
systems. That's the status quo, but it doesn't have to be. Statsig built a unified
platform that gives you everything in one place: feature flags, experimentation, analytics, and
session replay, all using the same user assignments and event tracking. So you ship a feature
to 10% of users, and the other 90% automatically become your control group. You can immediately see
if conversion changed, drill down to where users drop off in your funnel, then watch session recordings
to understand what went wrong, all with the same data.
This is how you measure impact at the pace you ship.
Companies like Graphite, Notion, and Brex rely on this approach to make data-driven decisions
without waiting for data teams or manually correlating information across tools.
Statsig has a generous free tier to get started, and Pro pricing for teams starts at $150 per month.
To learn more and get a 30-day enterprise trial,
go to statsig.com/pragmatic.
With this, let's get back to the conversation with Chris.
One thing I feel is probably pretty applicable for anyone is the fact that you were inside a company that was pretty big and successful at that time, Apple, and you still managed to get so many things done, just as you said, by solving hard technical problems and continuously figuring out how to help the business. And there was this interesting part where Clang got open sourced at Apple. Back then, LLVM was already open source, so that one I kind of understand.
Yeah, that was grandfathered in, yeah.
But I want to ask you how you managed to convince a company that back then was pretty closed,
and even today is pretty closed outside of a few things, to be okay open sourcing a new project.
And I assume your reputation, or hard work, or getting things done might have been there.
But what else did you use to get leadership to green-light this?
Because I'm sure top down they needed to say yes, right?
Yeah.
Well, so Clang was actually the easy one because we didn't really ask people
for permission. We just kind of did it. And so that was the easy one. The hard one was
Swift. I think we'll get there, but that was the actual hard one.
And with that, let's get to Swift. The story of Swift, as you've shared many times, is how
you started to work on it in secret. Can you tell me what led you to start experimenting with
a new language, and what you saw the problems being with Objective-C, especially now that
the compiler problem was solved? Like, you know, the compilation was probably as good as it could have gotten.
And how did you pull this off? What does secret mean, right?
Yeah. Well, so the backstory on
Swift is that I joined Apple in 2005 and I built and basically replumbed their entire C++
and C and Objective-C toolchain. And by 2010 at WWDC, which is their summer conference,
we had launched a full stack replacement and it was working. And building a full stack,
fully integrated C++
compiler is no small feat.
C++ even then was a very complicated language,
and it was a very big technical challenge,
but it was also very demotivating,
at least to me,
because C++ is just a beast of a language.
And so you couldn't come out of that
and not wonder, could we do something better?
And as you said, we'd built a lot of this fundamental technology and
could build pretty much anything we wanted at that point.
And so I just started, again, nights and weekends,
didn't ask permission, just started fiddling around,
seeing what could be done that would be better.
And to me, what I did was actually said,
okay, let's go look at a lot of existing languages.
So let's go look at both the new ones.
Let's look at Java or let's look at C++ things.
But let's also look at functional languages like OCaml
and Haskell and things like this.
Let's also look at esoteric languages,
like the really weird other things.
Let's look at TypeScript and Dart
and other things that were kind of on the scene and happening,
and Go and languages like that.
And so it was really about saying, like, well, what do I like and how do I see success?
And what I defined as success for Swift was making it easy to use and able to scale.
And so I saw a lot of languages.
JavaScript is probably the best example of this where, you know, you start with something really simple.
You know, we can hack with JavaScript.
And then it gets adoption.
And then you get developers.
And then developers bring their tools to other domains.
And so, you know, JavaScript was designed for onclick
handlers, but now people are writing web servers in it.
Yep. Well, and so...
We see it all too much.
Exactly. And so my observation was that success meant that you would be naturally
brought into other domains. And so think beyond just the first win, but think towards more
generality and scale. And what I realized is if you take something like Objective-C,
which I was deeply familiar with, for example, you know, there's this dichotomy between objects and
C. So Objective-C had this Smalltalk-inspired object model,
and then it had C for performance.
It was actually a really beautiful combination
because you can get all the way down to the metal with C,
but you had high-level APIs.
And what I realized is that they didn't have to be two languages.
So you could take the good parts
and pull it together into one system that could actually scale,
and it would be way easier to teach,
way more memory-safe, way more modern,
good type inference,
fix some of the syntax problems
because Objective-C was a little bit of a mash-up
between two different worlds.
And the funny thing about that is that
you fast forward all the way to today, that's exactly what Python is.
Python has a beautiful language for objects built on top of C and C++ and Rust, and it's got
this two-world problem going on within it. And so it's very interesting to me, just, again,
how history rhymes with itself, and there's so many different things going on. But
the initial idea with Swift was really about saying, how do we build a scalable language and how do we
pull together the best ideas, shamelessly, you know, borrowing ideas from different places.
And by scalable you meant that it can be used for your specific task, which may
be, let's say, iOS, but then it naturally lends itself to other domains.
Yeah, and so specifically scaling all the way down to embedded systems development,
all the way up to something that kind of approaches scripting, right? So super easy to use for very
high-level programming.
Can you tell me how you design a language? In the sense that,
as developers, we use a lot of languages, right? Like, I think all of us have
tried and built with many
different ones, especially with AI tools, with which it's so easy to try things out. You know, for myself,
like TypeScript, Rust, Go, try out Haskell, this and that, Prolog.
And I'm used to using them, right? And I kind of discover it, figure out what is there.
But when you have a blank canvas, how do you go about it?
Do you, I don't know, kind of meta-program, like write down what you'd love to see?
You also, of course, build compilers.
Do you start to immediately think of how that will work?
Like, can you give us a little bit of a view:
how would you go from this blank canvas to like,
okay, here's what I think the language will be?
Well, so that's a good question for me in general,
because I start from a lot of blank canvases across my career.
Right, so LLVM, blank canvas.
Yep.
Right, MLIR, later, blank canvas, Mojo, Swift.
Like many of the systems I build are blank canvas projects.
And so for me, it's a couple of things.
First of all, doing that kind of analysis,
the reflection, deciding what success ultimately looks like.
Then do the survey.
Go look at what's out there.
What's good?
What's bad?
One of the things I really strongly believe is that you can take
a system that you don't like, but buried within that system, even if you don't like it,
there are often good ideas.
And so you can take the ideas that are good and ignore all the rest.
So, for example, Java, I didn't like garbage collection and finalizers.
And like, there's all the stuff that came with Java, but being able to be JIT-compiled
was very powerful.
Right.
And so there's a lot of things that you can look at and you can say, and Java was a huge step forward
for the technology space for a variety of reasons.
But you can look at that and you can say, okay, well, I like that one aspect of it.
all the rest of these things are separable
and maybe I don't like how they work
and do that survey. And then
when I start building a new system,
I expect to design and redesign
and iterate and learn and grow. And what
I try to do is I try to go through
proof points and validate specific assumptions
and say, okay, I'm not going to
solve all the world's problems. I'll keep this part
and this part and this part super simple. It's like
the two circles that make your owl.
Just leave them the circles.
And then invest in proving
this piece. And if I can get this
done and I can understand that it works and I can gain validation and confidence in that.
I can know the interfaces going in and coming out of it.
Well, now I can go out into the next system and build that out and build this thing out so
that we can understand it and learn as we go.
But every step along the way, being unafraid to go massively change and flip over the
table and try again.
And so you started to experiment with Swift, like on the weekends and on the side, in 2010-ish.
How did it take until 2014 for a first release?
Like what really goes into building out a language?
Because again, I think we can all, I can kind of empathize, like, this is cool.
Like you have these ideas.
Of course, you have a lot of experience.
You try out things.
But then, like, as a software engineer, I shouldn't ask this.
But I'm going to ask: what took so long?
Oh, well, so it turns out building a programming language is not easy.
So it takes about four years, I guess.
Can you give us a bit of like empathy of like, you know, like what makes it so hard?
Yeah.
Well, so let me just frame up how it went, just so you know, because for the first year and
a half, it was literally just me working on it.
Oh.
Okay.
So nights and weekends, and I had a day job, and I was managing a big team and a lot of stuff
going on.
Hold on.
So at this point, yeah, you were already pretty much a second-level manager or something and
had a team of 40 people.
Oh, wow.
Running, yeah, a lot of very interesting and very fun stuff.
Wow, okay.
You're good at juggling stuff.
This is a passion project.
Yeah.
I'm also a fairly good programmer, so that helps.
But after a year and a half, I got to the point where I told management about it.
I said, hey, I'm working on this thing.
What do you think?
They're like, uh, yeah, uh, no, why would we want a new language?
Objective C is what made the iPhone successful.
Right, right?
And so like, no, what?
And I'm like, well, how about I just have like one or two of the people in my team, like,
work on this part time.
They're like, uh, I guess you can tell them about it,
but we can't commit resources really.
You know, and so, but, you know, again, I was respected and doing good things.
And so I was given a little bit more rope than some other people
would be. And it got to the point of saying, like, okay, let's build up some demos. And then it was
about three years in, and most of the feedback I was given was, okay, well, if you don't like
Objective-C, go make Objective-C better. You own Objective-C also. It's a natural reaction. I mean,
as much as, as an engineer, I don't want to hear it, if I put a little bit of a business hat on,
it's what you would say. Makes total sense. That's what the entire community is using. It's very
low risk. It makes your existing investment even bigger. And so what I did was I said, okay,
well, that four-year journey was not just building the thing full out with a team of people that knew what they were doing.
It was the discovery, the research, and also the socialization process with executive leadership.
And so what I realized is, I said, okay, well, I want to have memory safety.
Objective-C does not have memory safety.
That was one of the biggest things.
Retain and release were completely manual.
It was just kind of a nightmare.
And so we invented ARC, so that reference counting is done automatically.
That made Objective-C way more teachable and safer.
It also pulled it much closer to Swift.
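[Editor's illustration] The difference between manual retain/release and automatic reference counting can be sketched with a toy model. This is only a sketch under stated assumptions: `Counted` and `strong` are hypothetical names invented here, and the context manager merely stands in for the retain/release pair that a compiler inserts under ARC. Python has no such compiler pass, and this is not how Objective-C or Swift implement it.

```python
from contextlib import contextmanager


class Counted:
    """Toy object with an explicit retain count (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.retain_count = 1      # the creator owns one reference
        self.deallocated = False

    def retain(self):
        assert not self.deallocated, "use after free"
        self.retain_count += 1
        return self

    def release(self):
        assert not self.deallocated, "over-release"
        self.retain_count -= 1
        if self.retain_count == 0:
            self.deallocated = True  # stand-in for freeing the memory


@contextmanager
def strong(obj):
    """Stand-in for the retain/release pair a compiler would insert."""
    obj.retain()
    try:
        yield obj
    finally:
        obj.release()


# Manual style: every retain needs a hand-written, matching release.
manual = Counted("manual")
manual.retain()
manual.release()
manual.release()                   # balances the creator's reference

# Automated style: the pairing is generated, so it cannot be forgotten.
auto = Counted("auto")
with strong(auto) as ref:
    inside = ref.retain_count      # extra reference is live inside the block
auto.release()                     # balances the creator's reference
```

In the manual half, forgetting one `release` leaks the object and adding one too many trips the over-release assertion, which is the class of bug the passage describes.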
We then did modules. We then did literals
in Objective-C. We added all these features,
and what that was doing was pulling the lived
experience of doing Objective-C programming
closer to what Swift would be so that
ultimately when Swift came out, it would
be less of an abrupt leap.
It was still a pretty big leap, but
we were doing that.
But meanwhile, you know, management's like, okay,
cool, why are you doing this? And they're having their own issues.
And we got to about three years in and said,
okay, well, we think we should do this
in a serious way. At that point, I had a
team of 250 people-ish, something like that.
I was running Xcode and the whole developer tools team.
And still on the weekend, you're still doing this.
Well, yeah, you can look at my GitHub history.
Like, you can go back to 2014 and see what I was doing, yes.
But at that time, right, there was this big executive review and a whole bunch
of internal discussion.
And I said, basically, look, I want this to be the focus of the department, to bring
it to market.
Yeah.
And so at that point, we didn't really have it talking to iOS.
Right.
It was still missing a lot of the object
features and things like this. We had no apps that were built with it. And so that last year was a
massive amount of work in terms of finishing the language, but also integrating with the debugger, into
Xcode, into code formatting, and into tons of different features. Building more integration
with the iOS SDK, that was a big deal, and we had not done that. And so a lot of that made it
possible to finally launch it four years later. And I would say when we launched it, a mistake we made
is we called it 1.0. I was about to come to this because that's where I had my first
hands-on experience, with 1.2. But yes, 1.0 came out and, you know, the feedback, I can just talk
from, you know, like what I remember from there. There's two things. People were super excited
that Apple was releasing a new language. And iPhone was huge. 2014 was the time where like you
knew that you could build a big business on iPhone: Uber, I think Snap. All of these
companies were just going up. It was the status symbol already. And not
everyone had an iPhone just yet. However,
like when you tried it out,
it was just not good. Xcode didn't have
refactoring support. Debugging was a nightmare.
We should have launched it as 0.5.
So that's, so
how did it go?
Why did you decide that at the time? Like, you know,
just I'm interested in your thought process at the time
on how you went about it and, you know, like what were some
learnings now that you look back? Yeah.
So, um, so it had to be
1.0 to launch it because Apple doesn't launch
0.5s, basically.
So if we wanted to launch, it had to be called 1.0.
So that was one aspect of it.
But the other aspect is we wanted to get it out there.
And at the point that we launched it,
only about 250 people at Apple knew about it at all.
Most of those people are in my team,
and then some executives in marketing
and other people like this.
And so we knew we couldn't build a good programming language
in a vacuum with an NDA that you had to sign to get access to it, right?
You have to actually get it out there.
You have to get usage experience.
And so the learning for me is don't call something 1.0
until it's actually
ready and it's validated and you have a lot of usage experience and things like this.
We bring a lot of these lessons to Mojo, by the way.
Chris Lattner and the team launched Swift back in 2014, and it has since become the de facto way
of building native iOS apps, which brings us nicely to our season sponsor, Linear.
Linear's iOS app is also built on Swift.
It's a fully native app.
It's not built with React Native or using a web wrapper.
And there's good reason for this.
Linear's entire philosophy is about speed and performance.
The desktop app is famously fast,
and that same obsession about speed carries through to mobile.
Linear have just redesigned their mobile app with something pretty interesting.
Their own take on Apple's Liquid Glass design language.
But instead of just using Apple's API under the hood, they rebuilt the UI from scratch.
The Linear team wanted more control over the navigation and customization than Apple's system
would allow.
Classic engineering mindset.
If the existing solution doesn't give you what you need, build your own.
Here's my favorite detail about the app though.
Linear has this feature called Pulse.
It's basically a feed that gives you an update.
on all your development work.
Project updates, what's on track, what's blocked, etc.
On mobile, you can actually listen to it as an audio briefing.
The Linear team told me how they get rave feedback from engineering leaders
who listen to their Pulse update on their commute.
It's like a personal podcast about the team's work.
One VP shared how he gets through his entire stand-up prep
while driving to the office using Pulse.
This is why I love partnering with engineering teams like Linear's.
They're not just porting desktop features to mobile.
They're actually thinking about how people use their phones differently.
If you want to see what they're building, check out linear.app/pragmatic.
You can use the mobile app to manage issue tracking on the go,
and that Liquid Glass implementation is genuinely beautiful engineering work.
And now, let's get back to Swift 1.0
and how the language was pretty painful to work with on launch back in 2014.
But also, it's about making sure that we communicate clearly with the community.
And so one of the things I think we did well is we said, look, internally to Apple, we said,
look, we need to launch this, we know it won't be perfect.
And so we're going to tell the community that you can adopt it,
you can build apps and submit them to the store, but we will break your source code.
And this is because we know that we have to change the language when we get usage experience.
And so we told people, like, we will help you, but expect code breakage.
And I'm glad we did that.
I remember that was really clear.
People were upset about it, but they knew.
Like when we had a discussion of should we move over to Swift or not, and at Uber, the rewrite happened in 2016, when Swift was at 1.2,
and the platform team made the decision after evaluation that they would take a risk, and it was a big risk, and there are now stories about it.
I have a blog post.
There are some other people with their experiences.
But they did it because they knew that Apple was committed.
I think the open source part also helped build a lot of trust.
And I almost feel that that was one of the last times that I saw the Apple community really unite, with iOS developers specifically just going like, you know
what, we'll be in the pain together; we know it's going to be painful, but like all the mobile
engineers I know knew this and told their management and all that. They said, this is
the right thing to do. And of course, I think it really helped that Apple had like this trajectory.
And it was like, okay, well, they didn't want to be left behind. They didn't want to risk not having
access maybe in the future one day to APIs. Not at that point. One thing I was surprised by when
Swift was launched, I don't know if this was right on launch day, but Xcode Playgrounds and the
REPL, the read-eval-print loop, were also part of launch day.
Yeah.
And for those that don't know Xcode, it was like, you could just like try inside Xcode.
You could just try Swift.
And with the REPL, you could also just really easily play.
Why did you decide that that needed to be there on launch?
Well, so part of that came from the year-long process of like getting it to launch.
And so part of it was, okay, if we're going to do a new language, what are we going to get out of it?
What is the developer experience?
everybody wanted to get something, right? And so, for example, the documentation team, who I love,
said, this is our chance to write a book, which was awesome, because Swift had a book on launch,
and it was fantastic. The Xcode team said, okay, well, and management and marketing people said,
okay, well, can we get interactive programming? This is something we've never had, and so we want that,
and so that's where playgrounds came from. And so the REPL came from me saying, okay, well, modern languages
should have a REPL. This is a thing that builds into the debugger, and there's a bunch of
different capabilities we could use, and it would make a lot of sense to make it much more
modern feeling, and if we're going to do this, let's do it right. And so there's a lot of that,
okay, if you're going to do it, let's get the advantages from it. Now, the flip side of it is we
were massively overextended. A lot of the vision was right, but particularly at 1.0 and
even 1.2, many of these things were still unbaked, and it took a long time, and I think that we
underestimated how hard some of these things were, and waiting for 2.0 or 3.0, you know,
maybe it would have been better for some of this.
But anyways, it's easy to be negative and judgmental in hindsight.
At the time, we were all working really hard and trying to make something amazing.
And we were learning as we went.
And I was really thankful for the community.
But one of the other things I'll say is that while you may have been on the, oh, wow, awesome thing and let's adopt and let's get ahead of the curve, there are just as many people that were the, what are you talking about?
Objective C is great.
If it ain't broke, don't fix it.
And they're like shaking their fist and saying, like, I will never use a new thing.
blah, blah, blah, blah, blah, and there's all these bad things, which they were right about with Swift.
And so it took quite a long time for the community center of gravity to move.
Well, this was a fun fact.
In 2016, I remember middle of 2016, or maybe a little bit earlier, Uber was one of the big mobile-first
companies: built on iOS, grew up on iOS, lived and breathed iOS and of course Android.
And the mobile platform team ran a survey sometime in early 2016, two years after Swift was out,
this is Swift 1.2, saying, we are planning to rewrite the app. The rewrite was a thing,
just because they needed to redo every single screen, every single interaction,
which kind of meant a rewrite of all the business logic. So it was going to be a rewrite no matter what.
And they ran a survey: would you prefer, you know, as an iOS engineer, Objective-C or Swift?
And the results were pretty much 50-50.
Yep. So, and these were, again, very experienced engineers that they knew. And the two camps were
violently disagreeing with one another. And
I still don't know why Uber went with Swift.
Someone was a tiebreaker in the end.
But back then, this is how neck and neck it was.
And of course, there was grumbling.
There were problems.
So the Objective-C people said, like, told you so.
We could have just done.
There was at some point thinking of like, maybe we should go back.
But as you say, it was not trivial.
But I guess this is what you don't like see in hindsight.
Well, so, but I've learned a lot.
And again, you reflect forward because this impacts exactly what I'm working on now.
Same thing.
Yep, with Mojo.
Experts don't like change.
Right.
What I got.
This is counterintuitive.
Well, I mean, if you think about it, like, if you're an expert Objective-C programmer, I don't know if you were, but if you're an expert Objective-C programmer, you've been doing it for five or ten years.
You've learned all the bag of tricks, all the weird cases, you know, all the arcane things.
You are the person that people turn to for help.
Swift comes out.
Now, your expertise is invalidated.
You're on day one, just like everybody else.
You are not the expert.
You don't really want a new thing.
you want to be the king of the hill of the old thing.
And you don't want the new thing to be successful.
And so now some people are intellectually flexible and will switch over,
and there are early adopters and things like this
or people that like new things.
And so it's not true of everybody.
But it is actually sensible human behavior
to protect your prior investment.
And for a lot of people,
you have to have a really good reason to switch over.
And so in the case of Swift, you know, SwiftUI was like a reason
that moved some of the long-tail people over.
Years later.
Years later, right?
So there's different inflection points.
And so what you look at is it's almost technology diffusion, right?
And so there are early adopters, and they'll get on board quickly just because it's a shiny new thing.
Yeah.
And then there are some people where you have to drag Objective-C out of their cold, dead hands.
Right?
And so, like, everybody else gets mapped somewhere along this curve, and it kind of looks like an S curve.
But this is interesting because this will, you know, we're going to get to like some of the, like, AI and LLMs, but I feel this playing out.
That's right.
That's exactly the same thing.
And it's part of human behavior.
It's not about Objective C.
Now, let me tell you the thing that I find
that I'm most proud of in this journey.
So when I started in 2010,
and it was just a toy project,
like playing around in my spare time,
not expecting it to go anywhere,
I was kind of implicitly coming from the assumption
that we could do something that was better
than C++ and Objective C
and that was easier to use and things like this,
but didn't really know how it would go.
After we launched,
I would have people that stopped me in the streets
and said, thank you, Chris.
Because of Swift, and because of not just me, but the whole team working on this, I was able to become an app developer.
Before Swift, I tried building apps with Objective C, but I could never figure out the pointers and the square brackets and it would crash.
And I couldn't, and like, I'm not smart enough to build an app.
But now with Swift, it became easy enough that even I could do it.
And now I am an app developer instead of a web developer, and I've become the expert in my company.
And now I have learned and grown in my career.
And so now, thanks to this new technology,
like I progressed.
And if not for this enabling technology,
it never would have happened.
And that's the exact same thing
happening with GPUs right now.
We have the exact same thing
with CUDA and C++ and all these tools
that are gatekeeping
so many developers away from this emerging
kind of compute and kind of technology.
And it's the same exact thing.
And so this is what really makes me passionate
about working on this kind of technology
because I believe in the power of programmers.
I believe in the human potential of people that want to create things.
And that's fundamentally why I love software is that you can create anything that you can imagine.
And to me, that's what's so powerful about working in this space.
I really feel it and see it.
Just looking back on Swift, because you created it.
It's your baby.
It started as a project.
You were a guardian of it or a shepherd or however you want to call it for a while.
And of course, now you've stepped away.
It has its own life.
Looking back at it, how do you feel on how the language has evolved?
what are parts that you're really kind of happy on how it lived up?
What are parts where you're thinking that maybe if you would have stayed longer,
maybe you would have tried to have it in different ways if there's any.
So it's very hard for me to say if I'd stayed at Apple and if I had kept my fists on it
and controlled it, whether it would have been better or worse.
But I can talk to the mistakes that I made, right?
Because there are learnings that I can bring forward,
and that way it's my fault.
And it's not me saying other people weren't able to do it, right?
I like that.
But early Swift, it was, you know, I'll just say it bluntly, I didn't know what I was doing, right?
So I'd never done it before.
I'd built a C++ compiler, which had a spec.
Well, it was the first language you ever built.
It was the first language I'd ever built, right?
And so a lot of the ideas going into it were new.
A lot of the type checking algorithms were exciting, but I didn't really know what I was doing.
I'm also not a math guy, and so if I actually understood math, then maybe it would have been obvious.
But things in Swift like
"expression too complex to type check
in a reasonable amount of time," that's my fault.
And there's specific features
that cause that to happen, and it's very
difficult to fix. And
so that's one mistake. Another mistake is
that we were very ambitious
in terms of adding features that we wanted to see in the
language, and we added them faster than the
architecture could keep up. And so particularly
if you talk about Swift 1, but even
today, what ended up happening is
the internals of the compiler and the design didn't
really line up the right way.
and so you'd get a lot of things that work kind of 80% or 90%
and they didn't quite fit together
and then as you start adding more and more and more stuff,
it doesn't really line up right
and it just gets more and more complicated.
C++ has the same thing going by the way,
and Swift's not unique this way.
But as a consequence of that,
you get an emergent complexity
that then developers can feel.
And Swift today, people make fun of too many keywords
and things like this.
Well, it's because in the origin story
we really, in my opinion,
didn't put enough energy
into like really refactoring, really redesigning,
really making sure the core was great.
Instead, it was, okay, let's get to app development
as quickly as possible.
And so as a consequence of that, what happened is,
I and many other good people put more stuff on,
and more stuff on, and more stuff on,
and the language got more complicated.
And so, you know, partially that's maybe in any individual decision,
but also it's because some of those original things
didn't line up the right way.
And so the complexity kind of compounded.
Wow.
I'm just like hearing
we have a lot of tech debt inside a language
and it builds up.
Yeah.
Well, I mean, this is just one way
of putting it, of course,
you put it differently,
but it just feels this all repeats itself.
Like it's whatever project you do.
Absolutely.
Well, let me talk about like a simple example
in C++.
In C++, int is hard-coded into the language.
Yep.
Float is hard-coded into the language.
Complex numbers are not.
They're templates.
Why is that?
Why don't you just get rid of these concepts,
have one fewer thing,
and just make int and float part of the library.
And then everything's more self-similar
and everything will work the same way, et cetera.
And so Swift fixed that problem, right?
So int and float are now part of the library,
and all the stuff is much simpler in that way, right?
But C++, you can't go back and fix it
because it comes from C.
And so they added templates after C,
and so all the C stuff had to be hard-coded in.
They couldn't fix that.
And that's the origin story of why C++ works
the way it does.
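[Editor's illustration] The "self-similar" point, that numeric types defined in a library can behave like the ones the language ships with, can be sketched in a few lines. This is a loose analogy in Python rather than Swift or C++, and `Int8` and `total` are hypothetical names made up for this sketch. It only shows that generic code written against `+` does not need to know which types are built in.

```python
from fractions import Fraction


class Int8:
    """Hypothetical library-defined 8-bit signed integer with wraparound."""

    def __init__(self, value):
        # Wrap any integer into the range [-128, 127].
        self.value = (value + 128) % 256 - 128

    def __add__(self, other):
        return Int8(self.value + other.value)

    def __eq__(self, other):
        return isinstance(other, Int8) and self.value == other.value

    def __repr__(self):
        return f"Int8({self.value})"


def total(items, zero):
    """Generic sum that relies only on `+`, not on any particular type."""
    result = zero
    for item in items:
        result = result + item
    return result


# The same function handles a built-in type, a standard-library type,
# and a user-defined type without special cases.
ints = total([1, 2, 3], 0)
fracs = total([Fraction(1, 2), Fraction(1, 3)], Fraction(0))
wrapped = total([Int8(100), Int8(100)], Int8(0))
```

Here `wrapped` overflows the 8-bit range and wraps around, behavior defined entirely in library code rather than in the language.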
When I was talking with Steve McConnell on the podcast, he told me that when he worked at Microsoft for a while, the best developer he met there did this thing where he wrote everything three times. First, he wrote a first version in like a week and a half, then a second version in three days and a third version in a day. And when I think back to the history of Python, which I just heard about, there's a documentary where we can listen to it, it was also the second or third iteration of a language. And what I'm also
hearing now is, you know, like in Swift, you saw some mistakes in C++, but you couldn't
forecast some of the things. But, but now we're going to get to Mojo as well. And that's always
my goal, right? The goal is, so you're always going to make mistakes in life. So there's things
you can do. You can decide to make new mistakes. So learn from the past and make new mistakes,
not old mistakes. And then you can make the decision to fix mistakes wherever possible. And then
you try not to paint yourself into a corner. And so, again, a thing that I think Swift did well is
it said, okay, well, the initial release was 1.0.
That was predestined.
But we're going to break the language.
And so Swift 1, Swift 2, ultimately to Swift 3,
broke the language to try and make it better.
And then only at Swift 3 did we say,
okay, now it's stable, right?
And I think that was really good
because can you imagine if it were still Swift 1?
We would be left with all these mistakes?
All the mistakes, right?
And so it was painful,
but I think that was a good set of decisions we made
and that led to a much better result.
Yeah, there's a lot of tradeoffs.
So as we get to what you're doing today,
after Apple,
you worked at a few interesting places,
Tesla, Google Brain,
SiFive.
Can you talk about what you've learned
at each of these places and kind of how it's,
because I think it'll be interesting
to look back on how it just all is leading
to what you're doing now.
Yeah, totally.
So the way I frame it up is that,
you know,
one dot over my career was at Apple
and building developer tools
and understanding CPUs and I built OpenCL,
which is a GPU thing also,
and compilers for the GPU
and like all this kind of stuff.
And so learning a lot about that hardware software boundary and developers and developer tools and Xcode and like that whole thing.
2016, I fell in love with AI.
And the reason was that in about 2016, the Photos app had the ability to tell you, is this a cat or is this a dog?
And it was a very, like it was Inception v1 or some really ancient AI model was in there.
I'm like, I own developer tools.
I know a lot about programming.
I have no idea how to write an algorithm to detect a cat in a picture.
How the heck does that work?
So I fell down this rabbit hole in 2016, and this is what leads me to the 2.0 of my journey,
which is very AI-focused, like, go figure all this stuff out.
And so Tesla was doing applied AI with self-driving cars.
That's where I learned Caffe and TensorFlow, and I realized TensorFlow was a good thing, but it could be a way better thing.
And so that led me to joining Google and helping scale TPUs.
There I ended up owning the bottom part of TensorFlow, all the CPU, GPU, TPU, plus a bunch of other weird internal ASICs within TensorFlow.
but notably I scaled the software platform for TPUs.
And across the years, I learned both, hey, it's actually really hard to make AI software without CUDA.
TPUs don't have CUDA.
CUDA, which is NVIDIA's way to program kernels, right?
Yep.
So basically the entire AI ecosystem back then, but also today, ends up being very heavily biased by
NVIDIA and by their API and software and language toolkit called CUDA.
Which is like 20 years old or so?
It's about 20 years old.
And so it's awesome, and it's very mature, and there's a lot of good things about it, but it's also 20 years old.
And so there's time and space for new things.
But when you're building an ASIC, so TPUs are a very large-scale data center training and inference accelerator, very fancy, very frontier, particularly back in 2017.
You don't have software.
And so you have to create everything from scratch, and then you have to get TensorFlow and PyTorch to talk to it.
And nobody really understood how that worked.
And so across the years, I learned so much.
And I'm so thankful for my experience at Google
because I learned about the algorithms,
I learned about AI, the frontier applications.
Like the "Attention Is All You Need" paper
was invented in that timeframe on TPUs.
And so we said, wow.
Yes. And so a tremendous amount of things
going on: large-scale embedding systems for ads and recommenders
and stuff like this was going on.
It was a brilliant time.
And so across that, kind of an early version of what I'm doing now
said, okay, well, we need to reinvest in the software technology
and the platform for technologies.
And so I built components of what is now, I think, pretty widespread AI software.
So there's this MLIR framework, new runtime technologies, a whole bunch of different things.
So each of these things ended up being kind of 2.0 of the first leg of my journey.
And so MLIR is a compiler framework.
It's kind of like LLVM 2.0.
And so it's a rough analogy, and we probably won't go into MLIR in full detail.
But it's for ML, right?
Well, it's for domain-specific chips.
Domain-specific chips, yep.
Yep.
And so, but I could not have built that without the experience of building LLVM, right?
And so having gone through that experience and being willing to do a 2.0, a clean slate, you know, looking into the abyss, you know, you're standing on a cliff and it's just blackness around you and you could do anything.
But you have to decide where to start, right?
That really helped having gone through that and having made a bunch of mistakes with LLVM and being able to then fix them.
And so we're able to build something that has now been widely adopted by roughly every AI company or,
every AI chip company that exists, which is very exciting.
Wow.
And through that journey, at SiFive, for example, then said,
and SiFive built CPUs, right?
So, one of the best in the world, as I understand.
So SiFive builds RISC-V hardware.
RISC-V hardware.
Yeah.
And so I decided, okay, I've always been on just the software side of the hardware
software boundary.
How about we go play on the other side?
That sounds fun.
And so I learned a tremendous amount about physical design and chip design
and the business model of hardware.
and IPs and like all the different technology for verification and things like this and
tremendous amount of fun.
And one of the things we decided to do is let's go build an AI chip or an AI IP.
And so you get to that and then suddenly you get to where's the software?
And so across like each of these legs in my journey, I'm like, okay, where is the software?
Where's the software?
And in the case of AI, there's basically nothing there.
There's nothing that you could take off the shelf that provides an end-to-end AI solution.
that you can then just go implement your chip.
And so, I mean, the origin story of Modular really came down to my co-founder and I,
who were buddies.
We got to know each other previously at Google, but really saying, like, you know, we need
something like LLVM, but for AI.
For AI.
Like, we need the thing that people can implement support for their chip into it and then
get an AI software stack because, you know, if you're building CPUs 20 years ago,
you didn't want to have to build a whole C++ compiler and C compiler and code generator
and kernel and web browser.
You just want to do a little bit of work and then get software.
And that doesn't exist for AI right now.
And then for those of us who are software engineers,
but we don't build AI software, we just, you know,
we use the models, we might invoke the API or we might invoke the trained one.
Can you explain to us if, you know, we moved into the world of actually building AI applications,
running them directly on GPUs, may that be Nvidia or now there's AMD or Google TPUs?
What software would we use today?
Like, what is the status quo, or what was at least the
status quo when you started Modular? And then what is your vision, which I sense there's a
bunch of similarities with like GCC and LLVM here. Yep, absolutely. So before GCC,
every chipmaker had to build its own software stack. That's literally the shape of AI software today.
Because there is no thing to plug into, besides Modular, everybody has to build their own
vertical software stack. And so when I was at Google, we built a thing called XLA. That's the thingy
that we built, kind of like CUDA, but for Google TPUs. For Google TPUs.
Google TPUs.
If you look at Apple, they have Metal and MLX and their whole stack.
If you go look at AMD, they have ROCm, their whole stack.
Everybody has to have their own verticalized stack, and they share very little code.
Oh, so this is why. I was just looking at this: I talked with some folks at Anthropic and looked at their job site, and I saw that they're hiring both CUDA engineers, but also Google TPU kernel engineers.
And Trainium, which is the AWS chip.
And Trainium.
And so they're writing the same models three times over and over again,
depending on the hardware they use.
And then I guess when it's a new version,
okay, I get it.
So this is where we are today,
or where we were at least.
Well, that, and so, like you mentioned Anthropic,
they have an amazing engineering blog post
from a few weeks ago.
Remind me and I can dig up the link.
Yeah, we'll put it in the show notes.
It talks about the problem
of re-implementing the same models three times over.
Turns out they don't work right.
You have bugs.
You have quality problems
that then users can feel in your product.
You have outages because
implementing things three times
in parallel with three different teams,
of course they don't line up.
So where Modular came from was saying,
okay, well, I've lived this experience with Google,
with TPUs, and a number of other chips, by the way,
but TPUs are the public one,
and have seen this at SiFive, building into the hardware ecosystem.
At that point, this was about four years ago at this point.
The AI hardware ecosystem wasn't mature,
but suddenly I had built up this wealth of knowledge
of how to do this stuff.
And when I looked around at all these projects,
I looked at PyTorch, I looked at,
And there's ONNX Runtime and like there's a million of these things out there.
I don't think anybody's on the right path.
Nobody was willing to do the work because what I saw is I saw the problems I faced at Google.
I loved my experience at Google.
I learned a tremendous amount. The people are amazingly brilliant.
But the problem that I had and that we had as a team was that every year there'd be a new chip.
And every year you're scrambling to get the next chip to work.
At the same time, all the product teams are using the current chip.
And so all your best people are getting assigned to put out the burning fire.
for search or ads or YouTube,
whatever the workload is.
And so you could never actually get the mindset or the bandwidth
to be able to do something fundamentally new,
particularly if it took more than a six or 12-month performance review cycle.
Oh, yeah.
Google was infamous for that.
And so what I saw is I saw a lot of these very short-term projects
and a lot of improvements and a lot of micro-optimizations
and a lot of great work, a lot of massive innovation in models and the architectures.
But the progress, the research progress, was happening
because Google had infinitely smart engineers
that had built every level of the stack
and could hack the daylights out of it
just to make simple things work.
It wasn't scalable, it wasn't beautiful,
it wasn't built to compose,
and it certainly wasn't built to bring up new hardware quickly.
And so, as you know,
I've been working on compilers and dev tools and languages
and stuff for quite a long time.
And so I said, okay, well,
I think it's finally time to solve this problem.
But to solve it,
you can't solve it in a weekend with, like, three people.
You need years of time and a deep, very high-tech team to be able to invest in this technology
because CUDA, as you mentioned, is 20 years old, and it's had thousands of people work on it.
Huge investment.
Any one of these chips and these algorithms is extremely complicated.
The AI research landscape moves really fast.
You need a huge amount of effort.
And the only way to tackle this was with a VC-funded startup.
Because that's the only way we could pull people out of Apple and Nvidia and Meta and Google
and all these companies to be able to go build this.
And you said something interesting in a previous podcast or interview.
You said between kernel engineers, compiler engineers and AI engineers, they just kind of think
differently and there's not much overlap.
Can you tell me a little bit about that, because you have been all three.
You, of course, know so many of them.
But like, what is the difference?
In the end, you know, they all need to work together at a place like this.
But you said that somehow they don't, like, understand each other.
Yeah, so I'll tell you the aha moment that
led to our technology approach for Modular.
Again, if you go back four years ago, you're talking 2021.
Funny how that works.
Okay, so late pandemic, a lot of amazing stuff had been done.
And a lot of people have built, first of all, TensorFlow and PyTorch.
I consider them to be some of the very early production frameworks.
TensorFlow and PyTorch both come from 2015, by the way.
So they're nearly 10 years old, which is ancient in AI time.
Right.
And so they were built under this idea of: we will take
CUDA kernels and we'll take Intel
MKL kernels and
then we'll build Python APIs on top of these.
And so one of the challenges with that generation
of systems is that it was
so easy to write kernels that people
made thousands and thousands of kernels
but then they couldn't be ported and
you had to rewrite thousands and thousands of kernels
to be able to bring up a new chip.
When we worked on TPUs, we said,
aha, you know what's cool? Compilers.
And so instead of writing thousands and thousands
of kernels, what we can do is we can use compilers
to generate the code and automatically
synthesize kernels on the fly.
This was a huge thing, and we said,
hey, look, we can get performance, we can get
things like auto fusion, where you merge
two kernels together.
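
As a rough illustration of what fusing two kernels means, here is a minimal Python sketch. It assumes plain Python lists standing in for device buffers, and the function names are hypothetical; none of this comes from XLA or any real compiler. The unfused version makes two passes and materializes an intermediate buffer, while the fused version does the same math in one pass.

```python
# Illustrative only: plain Python lists stand in for GPU buffers, and the
# function names are hypothetical, not taken from XLA or any real compiler.

def scale_kernel(xs, factor):
    """First 'kernel': one full pass that writes an intermediate buffer."""
    return [x * factor for x in xs]


def add_bias_kernel(xs, bias):
    """Second 'kernel': another full pass over the intermediate buffer."""
    return [x + bias for x in xs]


def fused_scale_add_kernel(xs, factor, bias):
    """Fused 'kernel': same math in a single pass, no intermediate buffer."""
    return [x * factor + bias for x in xs]


data = [1.0, 2.0, 3.0, 4.0]
unfused = add_bias_kernel(scale_kernel(data, 2.0), 0.5)
fused = fused_scale_add_kernel(data, 2.0, 0.5)
print(unfused == fused)
```

On real hardware the payoff of fusion would mostly come from skipping the round trip through memory for the intermediate result; this sketch only shows that the two forms compute the same values.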
And compilers are a lot of fun, by the way.
I encourage people to learn them. And so that was
great fun. Now, the problem with this is,
while you can't
scale by writing thousands
of kernels every time you bring up a new chip,
it turns out AI moves really fast.
Turns out compiler engineers are lovely.
I love many of them.
But there's very few of them. There's very few
of them. And so what you ended up with is you ended up with a bunch of these technologies
where it was cool because you could support, you know, some compiler for some chip, but suddenly
you couldn't do FlashAttention, or some new algorithm comes on the scene. It's a massive
breakthrough in research, but the compiler people couldn't be in the loop to implement the new
sparsity or the new float4 format or the new this or that. And so having compiler
engineers in the loop was a huge problem. And it held back, for example, Google TPUs. A lot of what
happened, particularly when GenAI came on the scene, really invalidated that technology
approach. So I believed going into this, and this is also part of the learning from working
on TPUs, that we needed programmability. We need full power over the hardware. We didn't need
the watered-down, simple solution that made it so that you got most of the performance.
What I believe in is getting all of the performance. And so... Otherwise engineers would just like
write kernels as they do. Exactly. Exactly. And there's so many technologies you see this across
software where it's like, okay, it works up until a point, and then you have to switch to a
different system or a different stack, and I don't like those things. Go back to Swift. I wanted
to be able to scale from embedded all the way up to application level. Like, actually
unify this. Don't make it so that you have like fragmentation in different systems.
And so what we bet on was a two-level stack. We bet on Mojo.
What programming language? A brand new programming language. And as we all know,
making a new programming language is doomed and like never works. And you can never get
adopted, like all those things.
Well, you have a good track record, though.
Yeah, it's actually just a hard problem, but you have to be smart about it, and we can talk
about that.
And then on top of that, saying, like, let's take this programming language and design it so
it can solve these problems and scale across hardware, etc.
But then take these compiler algorithms and put them into the language so that you as a
Mojo programmer get the power of compilers without having to be a compiler engineer.
And so a lot of what makes Mojo really special is it takes power out of the compiler and gives it to you as a developer
because it turns out a lot of people can write code.
They can't necessarily write compilers.
And so what this does is breaks open the talent problem.
And so now we can have a lot more people building these very highly adaptive, highly scalable, highly composable systems in ways that they couldn't do before.
So can you tell me about both Mojo and also how this language can actually break the compiler problem?
That part, you know, the new language, I think we can all understand, but how you can get through to compilers and get the performance out of it: how does the language do that, or how does the stuff around it do it?
Well, so again, this is my life's work, my journey, and so I've made a lot of mistakes along the way.
And so a lot of it, as we talked about before, when you don't know what you're doing, go learn from other people and go decide what they've done right and wrong and what you like.
For me, a lot of it was learning from what I had done wrong
and making sure to do it better the second time.
And so one of the things about classical compiler engineers,
and myself back in 2005 or something,
which is now 20 years ago somehow,
is that compiler engineers want to show
how awesome their compiler can be.
Let me show you.
I can implement this optimization.
I can detect this pattern in this case.
I can make this code go two times faster,
or five times faster, 10 times faster.
The challenge with that approach, and actually, this is really important for benchmarks,
is that if you're a compiler engineer, you generally can't change the source code.
Like, you know, if you're optimizing for the SPEC benchmark suite or something, the source code's fixed.
And so all you can do is implement crazy stuff within the compiler.
Within the compiler.
And it's pattern matching and transformations and maybe changing the memory layout of something, using some heuristic.
But there's a problem with that as a user.
And so say you're a programmer and you're using one of these compilers, and you're working with a loop vectorizer.
For example, loop vectorization can make your code go four times faster by using SIMD to take advantage of the hardware, right?
But it requires a lot of pattern matching on your code.
And it has analysis of how you use pointers and all kinds of other stuff that all has to line up for the optimization to work.
So now you can have this wonderful moment where you pick up one of these magic compilers and then you get amazing performance.
You're like, wow, this is cool.
It goes four times faster.
I love this thing.
This is great.
But then you go and fix a bug.
If you fix a bug and you break the hack in the compiler,
well, suddenly you get a four-times slowdown.
And you're not sure why,
and you cannot really debug it unless you're a compiler engineer.
Exactly.
So now you file a bug, and again, there's not enough
compiler engineers in the world,
and so they may or may not look at it,
or they may tell you you're holding it wrong, it's too bad,
like it has to be this way, right?
But what that experience is,
is that's caused by compilers
that are trying to be sufficiently smart.
And the sufficiently smart compiler never works.
What it ends up being is it ends up being an abstraction that's leaky.
Leaky abstractions exist in lots of software and lots of different domains.
But when they exist in a language, it means that you have an unpredictable programming model.
So fast forward to today and AI.
Today in AI, basically people want peak performance because GPUs are really expensive.
As you said, if you build a system that doesn't deliver performance, they'll switch to a different system because they have to get that performance, right?
particularly for the really large-scale frontier labs and things like this.
They just squeeze everything out of it.
They need to. It's required. The cost is just astronomical otherwise.
Well, you can't have a system that is like unpredictable or flaky or weird.
You want something that has control.
And so what Mojo does, unlike Swift and many LLVM languages,
Rust, all these guys build on top of the same kind of technologies,
is we go for a much simpler compiler.
It's way more modern. It's way fancier than what Swift or Rust
or languages like that are built on, for a variety of reasons,
but it's way more predictable.
And so it puts control in your hands.
And so, for example, with vectorization,
instead of hoping the compiler magically does it for you, and maybe it works out,
maybe it doesn't, it gives you library features where you can say,
hey, vectorize this for me.
And I saw on Mojo, of course, there's a tutorial and getting started.
I saw there's these special characters slash keywords where if you want to,
I mean, you can just like write code that looks like Python, or a subset of it.
And you're like, whatever.
But then you can kind of take control.
So like, when you're coding, you decide you want a certain compiler feature that it supports.
Can you give examples of like what I can do with it?
Yeah.
So, for example, take SIMD, so vectors.
I don't think this is even really a standardized feature today in most programming languages, but it's essential to get performance out of a CPU.
Right.
Most of the floating-point operations in a CPU are in SIMD, because it allows parallelism.
So Mojo makes that very native.
Mojo also makes things like bfloat16 and compressed floating-point formats and stuff like that very native.
And so it exposes that to you as a programmer.
But then you have this problem of you want productivity.
Like you don't want to have to write assembly code effectively or very low-level code to be able to do basic things.
And so you want very powerful abstractions that you can build in the library.
And so what Mojo does is it has very powerful metaprogramming capabilities.
And so, and by the way, it didn't invent these.
It roughly stole them from Zig and made them better.
So thank you to the Zig people.
They have amazing ideas.
But the idea is to say, let's pull this compile time metaprogramming system up,
make it much more powerful, embed it in the language.
And so now you can build compile time transformations of your code.
And so you can say, hey, I want a vectorized function.
Well, what does it do?
It takes a higher order function, which is, you know, all the operations you want to vectorize,
and then it stamps it out across your vector lanes,
and it handles that transformation for you,
giving you a very productive way of expressing
these very high-performance algorithms
that give you extreme predictability.
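
The idea of a higher-order function that stamps an operation out across vector lanes can be sketched in a few lines of Python. This is only a rough model under stated assumptions: `vectorize` and `add_chunk` below are hypothetical names written for this illustration, not Mojo's actual library API, and a "vector" is modeled as a list slice of `width` elements, with one call standing in for one SIMD operation.

```python
# Sketch only: `vectorize` is a hypothetical Python helper, not Mojo's
# library function. A "vector" is modeled as a list slice of `width` lanes.

def vectorize(body, size, width):
    """Call body(offset, lanes) over full-width chunks, then a scalar tail."""
    offset = 0
    while offset + width <= size:
        body(offset, width)  # one call stands in for one SIMD operation
        offset += width
    while offset < size:
        body(offset, 1)  # leftover elements, one lane at a time
        offset += 1


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [10] * 10
out = [0] * 10
calls = []  # records the chunk width used by each call


def add_chunk(offset, lanes):
    """The operation to vectorize: element-wise add over one chunk."""
    calls.append(lanes)
    end = offset + lanes
    out[offset:end] = [x + y for x, y in zip(a[offset:end], b[offset:end])]


vectorize(add_chunk, size=len(a), width=4)
print(out)    # element-wise sums
print(calls)  # chunk widths used
```

The point of the sketch is that the chunking strategy is ordinary, visible code rather than a compiler heuristic: with ten elements and a width of four, it always makes two full-width calls and two scalar tail calls, and a bug fix elsewhere in the program cannot silently change that.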
And so this design point is very different.
It's very important for accelerators,
because with accelerators, you can't have the memory allocation
that magically happens.
Like, you need control,
but these features are a very different design point
than what most languages are coming from.
The other advantage, as you say,
is it makes the language way simpler.
And so if you take a language like C++, C++ has metaprogramming, has templates, and constexpr, and like this whole weird system that is evolving.
Not the simplest tools.
I mean, you learn it.
You learn it.
And I've written way too much C++.
And so like, you know, even I can handle it.
But what C++ does is it has the program and then the metaprogram.
And they're effectively different languages.
And so this is really complicated.
And writing something using templates, it's almost impossible to debug.
You can't even pass a string to a template, really.
Like, there are hacks, but, like, it's just really difficult to do things.
You certainly can't, like, make a binary tree or something and pass it into a template.
That's not a thing.
And so in Mojo and Zig, what they do is they unify the program and the metaprogram.
And so you can just write code and then run that at compile time.
And so now if you want to debug it, cool.
Just use the same code at runtime and step through it in a debugger.
It makes it super simple.
It means you have one thing to learn, and it works in both places.
It means you can have real types.
It's in general types.
You can do, in Mojo, it's very fancy because you can use like a list or a dictionary kind of data structure that does heap allocations.
And you can build it all at compile time and then block the output of that into your program.
And so it will synthesize the mallocs for you for a bunch of things that you evaluated at compile time.
And it all composes the right way.
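
A loose analogy for "one language for the program and the metaprogram" can be written in Python. Python has no compile-time evaluation, so this sketch assumes that work done once at module load stands in for compile time; the names `build_popcount_table`, `POPCOUNT_8`, and `popcount32` are hypothetical and chosen only for illustration.

```python
# Sketch only: Python has no compile-time evaluation, so "compile time" is
# modeled as work done once at module load. All names are hypothetical.

def build_popcount_table(bits):
    """Ordinary function: builds a heap-allocated list of bit counts."""
    return [bin(i).count("1") for i in range(1 << bits)]


# "Compile time": evaluated once, with the result kept as a constant table.
POPCOUNT_8 = build_popcount_table(8)


def popcount32(value):
    """Runtime code that only does lookups into the precomputed table."""
    return sum(POPCOUNT_8[(value >> shift) & 0xFF] for shift in (0, 8, 16, 24))


# The very same builder can also be called at "runtime", for example to
# step through it in a debugger when the table looks wrong.
small_table = build_popcount_table(2)
print(small_table)
print(popcount32(0xFF00FF01))
```

The builder is a normal function using a normal list, so there is only one thing to learn and debug; in a language with real compile-time evaluation, the table would be computed by the compiler and embedded in the program instead of being built at load time.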
And so what this does is it gives you tremendous amount of power when you care about performance, when you care about building these higher level algorithms,
which end up being like mini compilers.
People generally don't think about it that way,
but they give you the ability to specialize
based on dimensions and numbers
and magic constants in your algorithms
and allows you to build these very high-power abstractions
very naturally.
And if I want to get started with Mojo beyond
just the basics,
so like using some of these more advanced features,
is the best way to actually like rent,
may that be a GPU or something like that, which these days
is pretty easy, and go and try to implement an algorithm, a program that actually needs high performance
and start to play and understand these additional ways that I can control a compiler?
Or what would your suggestion be for someone who believes that this is the future
and this is like cool to experiment with, but on their job
they don't currently have a problem like this?
How would you go about it on the weekends, right?
Yeah.
Yeah, so I can talk about what makes Mojo cool.
Yeah, let's do that.
Well, but see, that ends up in the rabbit hole of really weird technical things that only Chris understands.
Wait, let's not talk about linear types.
Let's not talk.
But instead, let me tell you about why you might care.
Okay, so Mojo is a member of the Python family.
So it's super familiar, easy to learn.
And by the way, with AI coding, it's the easiest time ever to learn a programming language, just like you mentioned.
One of the simplest things to do with Mojo is to extend an existing Python module.
So have you ever been coding in Python, building and building and building, and it gets bigger and bigger and suddenly gets slow?
Well, historically, your option is to go rewrite a big chunk of it in Rust or C++, and then you have to use bindings, and it's really kind of a pain in the butt to be able to do this kind of work.
Mojo is the most beautiful way to extend Python you can imagine.
No bindings, just like Swift and Objective-C, which natively spoke to each other.
No, no, because Rust is a popular choice to extend Python, but with a binding.
Yeah, but you need bindings, right?
And so it's very mechanical, and you can get it wrong.
And so Mojo does that without the bindings, but also integrated with pip packages, and the build
system's all integrated.
And it's just super beautiful and awesome.
And you can stay on CPU.
You don't need anything.
It's all free.
Like, go nuts and go do this.
We have very simple tutorials for this.
Because I guess one thing with Mojo that I don't think we've mentioned, but I know,
is that it's not just for GPUs or TPUs, but also CPUs.
CPUs, yes. And today's CPUs are extremely fancy and complicated machines, and getting
performance out of them is also really interesting. And so what I've seen people do, and come back
to the empowering stories that I love, we had a person come to one of our community meetings,
and they rocked up, and it's like, oh, okay, tell us what you're working on. And he said,
well, I don't know anything about this fancy, you know, AI stuff. Right. I am merely a geneticist,
and I'm like,
like, merely, right?
It's like, I work with DNA sequencing and stuff,
and I'm a Python programmer,
and so I saw that Mojo makes Python go fast, awesome.
And so I said, okay, well,
maybe I would try using it for some of my DNA sequencing stuff.
And so I pulled over some code,
and wow, it was like 100 times faster
because I pulled it over to Mojo,
and I could learn how to use threads,
and then I could vectorize,
and it was very cool because I could do this.
And then you can start to use the CPU's resources
if you're running on a CPU.
Which, as you said, have a lot of capabilities,
and you can play around with it if you learn how to control.
Exactly.
And so Mojo makes it really easy to do that when you are on a CPU.
And then the thing I loved about it was he said,
and then I read that Mojo was really cool for GPUs.
And I'd never done that before.
I don't know anything about GPUs,
but I decided to give it a shot.
And in an afternoon, I got my thing running on a GPU.
Now it's a million times faster.
And now I'm a GPU programmer, right?
And so that, to me, is super empowering.
And this is where, when you look at technologies like CUDA and things like this, again, they're amazing systems.
And there's a lot of good energy and good work put into it.
But it's 20 years old.
It's C++.
It wasn't designed for modern architectures.
GPUs today have tensor cores.
And CUDA can't even acknowledge that that exists, really.
And so this is a huge, huge opportunity for us to get more people into the space and for people to learn about this for the first time.
I mean, you know, just as an engineer who is like, I don't program GPUs.
But of course, like everything I write runs on CPUs.
or cloud or wherever.
This reminds me a little bit of the story
you previously shared of the Swift person.
Because like,
just speaking for myself,
I can't really imagine myself like just saying,
all right,
I'm going to like write CUDA.
I don't really have a use case,
but I can absolutely imagine myself
just like adding a bit of Mojo,
optimizing it and then maybe just deploying it on a GPU
to see how it is.
If I see stuff that can be parallelized,
play around with it,
is it faster?
And then just like with this geneticist.
Build and learn as you go.
Yeah.
And then start to understand
these other kinds of hardware, which of course will increasingly probably be useful,
if we know that.
And also they're now available.
We can rent them on the clouds.
You can soon, maybe not today, but I think, I mean today, you can rent them by the hour.
If not, it's going to happen for sure.
Yeah, you can totally do that.
Well, and so we even go further.
It turns out that lots of things have GPUs.
Today with Mojo, you can run on AMD, Nvidia, and Apple GPUs.
And so if you do want to learn about GPUs, you can go to our website,
search for GPU puzzles, and we'll teach you how to do this.
And actually, it's an amazing time.
I love it.
Because what we're doing, and we're still early.
We're still building into this.
We'd love help.
Please come join our open source community and help us build in this.
But what we really truly believe is that GPUs and AI accelerators and like all these chips,
whether they're for AI or for DNA sequencing or chemistry or bioinformatics or oil and gas exploration.
Like there's all these HPC, there's like all these applications that can use these accelerators.
But all the software around them is really weird.
Right. And if we can break that down, if we can make it easy to learn, if we can teach people, if we can make them consistent across the hardware vendors, because these hardware people don't always get along, by the way. But if we can make it more uniform, then we can get a bigger community. In that community, sure, some of the people, some of the old dogs from the existing community will convert over. And we love that. And, you know, Mojo's way nicer than C++. And so it's way better from a user experience perspective and compile times. And there's so many things that are better about it than older technologies like CUDA.
But also, what I believe is that there's an entire new generation of people that should be programming these chips.
And if we can get more people into this ecosystem, they can upskill, they can get better jobs.
These are high-paying jobs to do this kind of work.
And so it's a better time than ever to learn how to do this.
And so it's, you know, partially my life's work here is to enable this and enable people to get into this ecosystem and break down the gatekeeping.
And it really feels, you know, like you have just always been passionate about compiler technology and going towards here.
Can you tell me where Modular is as a team?
You started about three years ago.
Can you tell me how much progress you made,
how large the team is right now,
and like where Mojo and the ecosystem is
and where you're focusing next?
Yeah, yeah.
So Modular will be four years old in January.
It's hard to believe.
Turns out things do take four years sometimes.
Even with full focus.
Yeah, and so we are an unusual startup
and we went into this knowing this,
but we are a very large team and very expensive
because we believe, and our investors believe,
that this problem that we're tackling is very important.
And so at this point, we're just over 140 people, something like that.
And we've enabled seven different architectures from three different vendors.
So that's Ampere, Hopper, Blackwell from Nvidia,
the MI300, MI325, MI355 from AMD, plus all their consumer stuff.
And now Apple GPUs are in beta and coming online, which is very
exciting. We're probably not done yet. We'll keep going, but can't share stuff before it's time to
share, but it's still very early days. And so what we're doing is we're building into this from a
technology perspective. Mojo, for example, I want 1.0 to be meaningful. And so I think that we'll
have 1.0 early summer next year. So that'll be pretty cool and it will be way better than
Swift 1.0. Yeah, but I was about to say this 1.0 will be. So the team's scoping that out. We'll come up
with detailed dates soon, and we'll share that when it's the right time. And so as we're doing
that, what we're doing is we're being really thoughtful and learning from the previous journey
and from other people on the team's journeys, to make sure that we do this well. And so we want
to provide stability for the ecosystem, which Swift 1 didn't do, by the way, and build into this.
And each time we do pass through one of these milestones, you know, we can expect more growth
and more use cases and more investment from different kinds of people. That's not what I've seen.
So commercially, we have customers and we're scaling into that.
We have a cloud platform, which is the way we monetize most of this stuff. Mojo is not a product
that we're trying to make tons of money off, by the way. It's a thing we had to build to be able
to scale across lots of hardware. And so we see a dual world where we can actually catalyze
and grow and teach people how to build and use all these accelerators, and then we can help
enterprises build and manage AI into their ecosystem. And a lot of what these folks are looking
for is they want something that's easy to use that actually works, it's reliable, that they can
scale. Yeah, a lot of them want optionality on hardware. They want to be
able to buy the best chip for their workload. They want to be able to
multi-source from different vendors. They don't want to have to rewrite it three times.
Yeah. I mean, this all only makes sense when you're, especially that there's this
dependency on like one or two vendors, usually one. Yep. And so, and what I see and what I predict in
2026 is there's going to be a lot of silicon from a lot of different vendors and we want to
make sure that people can scale across that effectively. And so, you know, for people that are
interested, we have lots of job postings on our website. And so we're growing very rapidly. And we
have a very big mission. It will take, you know, it's not another six months and then we're done.
We're just still at the beginning. And every year of our journey is just another epoch of new
things to build and learn and grow and develop into this. And of course, you're building like,
it sounds like, the infrastructure under AI, or hopefully a lot of it could be. How is your team,
the engineering team specifically, using AI tools? May that be, you know, IDEs, agents.
You mentioned that Mojo is a
familiar language to
write with AI, but I'm
interested in underneath the hood. Are
you seeing the impact of
these tools? Is it helping engineers
work faster, or not
currently? Because there's always this question, like if you're
writing compilers, will these things
help at all or no? You need to handcraft it
and I'd love to ask someone who's
actually seen it. Yeah, well, so I can tell you,
I can't tell you all the things, but I can share
my experiences. So we definitely encourage our
team to use AI coding tools, and so they
use Claude Code and Cursor and all 57 other things. And so there's lots of different experiences.
I've seen tons of benefit. And so for me, I still code. And so I use Cursor, for example,
as my daily driver right now. And so I feel like it's, you know, a good 10% productivity boost,
particularly for mechanical rewrites and stuff like this. But you're a really good programmer. So
that's pretty good. Pretty meaningful. Yes. I represent a very sophisticated programmer. Correct. Yes. And so like,
I really enjoy it because it frees me from a lot of the mechanical stuff.
And so whether it makes me, in aggregate, more productive or not,
it increases my enjoyment, which is good.
It's really important.
It is, actually.
Now, for other folks that are building prototypes, or PMs
that are building, like, wireframe models of things,
like it's transformative because you can literally 10x somebody
and you can make it so they can build something
that they otherwise wouldn't get around to doing.
And so that is transformative.
For production coding, I think that it's kind of hit and miss.
And to me, honestly, I don't know if it's, like,
actually a net win or not
because I've seen many cases where people
say, okay, just let the agent go
and try a thing. And then it grinds
and grinds and grinds and a gazillion tokens
later, like, it doesn't work.
If they would have just done it,
then it would have been done sooner, right?
And so to me, like, again, there's
lots of wins, but there's also some losses.
And so I don't know how things will net out.
I do know it's moving very rapidly, and I do
want us to be using it.
But also, one of the things I'd say is
I don't want people to turn their brains off.
And one of the things I think is really important is that programmers on our team, but also in general, use it as a human assist, not a human replacement.
And I think it's very important that we review the code.
We understand the architecture.
And, at least for production workloads and production applications, vibe coding terrifies me.
Not just because of what it means for jobs, but what does it mean six months from now when you want to change the architecture of something?
Yeah.
How are you going to even understand how anything works?
Right.
And so I do think that it's really important to keep humans in the loop and architects thinking about something and understanding things, not just for security and performance, but for all the other reasons as well.
Well, I talked with Addy Osmani, who's been on the Chrome DevTools team for like 13 years now.
I think he or his team built a lot of the tooling there.
And he wrote this book called Beyond Vibe Coding.
And he said that what he does is he enables like a verbose kind of output for the agent.
And he actually reads through what the thing is doing.
He makes sure that before he commits or reviews something, he understands what's there.
So that he always stays in touch.
So, you know, if this thing turned off, he could jump in.
It's just that, for now, he chooses not to do so.
Well, and I think that, again, the thing that I care about is the architecture for production.
I mean, prototypes are a different deal.
But for production, you keep the architecture
clean and right.
It doesn't need to be perfect,
but it needs to be curated
because I've seen AI coding tools
go crazy with like duplicating things
in different places.
Well, we know that's a problem
because then you want to go change something.
Now you have bugs when you update
two out of the three places, right?
And so there's a lot of these things
where the tools are amazing,
but they still need adult supervision.
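
The duplication problem described above can be made concrete with a small sketch. Everything in it is hypothetical and invented for illustration (the function names, the name-length rule, and the limits 32 and 64): it assumes a limit that was raised from 32 to 64 in two of three copies, and then shows the curated alternative where the rule lives in one place.

```python
# Hypothetical duplicated rule: the limit was raised from 32 to 64,
# but only two of the three copies were updated.
def can_create_user(name):
    return len(name) <= 64


def can_rename_user(name):
    return len(name) <= 64


def can_import_user(name):
    return len(name) <= 32  # stale copy that the update missed


# Curated version: a single definition that every call site shares,
# so a future change happens in exactly one place.
MAX_NAME_LENGTH = 64


def is_valid_name(name):
    return len(name) <= MAX_NAME_LENGTH


if __name__ == "__main__":
    name = "x" * 40
    # The three copies now disagree about the same input.
    print(can_create_user(name), can_rename_user(name), can_import_user(name))
    print(is_valid_name(name))
```

A reviewer who understands the architecture would likely catch the third copy; the point of the sketch is only that the inconsistency is silent until some input lands between the two limits.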
So when it comes to hiring,
you're building something really interesting
and cool and honestly super ambitious.
You already have a pretty big team,
but you're growing.
What is your hiring bar?
And I think you previously mentioned
that you're building a company of elite nerds.
I'm interested in,
Not wrong.
what are the things that you look for in folks?
And for software engineers,
experienced engineers especially, who are listening,
what tactics or strategies would you suggest so that they get to the level
where they could have a shot at a place
like Modular,
or even if not necessarily Modular?
Yeah.
So we hire two different kinds of folks.
One is the super specialized, you know,
compiler nerd
or something like this, or the GPU programming expert that knows how to do
high-performance matrix multiplications, and they've got 10 years of experience,
and so you can go look at that track record.
We also hire people fresh out of school.
Wow.
This is refreshing.
And so I really do appreciate people that are very early career
because they haven't learned all the bad things yet.
And so for me, I think maybe the more interesting thing to talk about
is not the super expert that's already the senior engineer.
Yeah, we are.
It's about folks that are much earlier in their career.
And so what I love to see is I love to see people that are really hungry.
They have not given up and just said AI will do everything for me,
but they have intellectual curiosity.
They're willing to work hard, of course, right?
But then they're fearless because everything, particularly in the AI space,
is changing so rapidly.
But a lot of people will just freeze up and they won't do anything.
Versus if you say, okay, cool, let's go figure it out.
I want to learn how to do this.
I mean, this has been my life's journey.
It has been about saying: that sounds terrifying.
People tell me it's impossible and it's doomed to failure,
but how hard can it be?
Let's just go figure it out
and work through things
and do that.
And I personally really love
when people have contributed
to open source projects.
I think that's the simplest way
to prove that not only can you write code,
but you could actually work with a team,
which is a huge part of software engineering
in reality.
And so I love internships
and things like this
that people have done before
and kind of when talking with people,
you can tell
whether they're excited about what they do
or whether they're just
kind of performatively
kind of going through the motions.
Now, one of the really challenging parts
is in interviews, people get very nervous.
And so one of the things I really believe in is giving people kind of the native tools they're
used to using.
And so, like, let people code in the way that they would be normally coding.
And so today, for example, we want people to be using AI coding for the mechanical pieces, right?
So saying, okay, well, you have to write on a whiteboard and you can't use your native tools
would be very strange.
And so we kind of try to make sure to meet people in something that actually approaches
a real world situation.
One question I've been meaning to ask. Because you've built several languages, this comes up when I ask folks what things they'd love to hear from you. And this was repeated quite a bit. Do you think there is any logic in having a new language that is designed to make it easy for LLMs to code with, with attributes like, for example, better pattern matching or other things that might be the strength of LLMs? And if so, what would these
patterns be? This is really weird. I would have not thought we would talk about things like this,
but now we should. Absolutely. I'm happy to talk about it. And so I sometimes get asked,
so given that AI is writing all of the code, why are you building a programming language?
Right, which is another way of asking the same question, maybe a little bit more aggro. And so to me,
I don't think that optimizing for the LLM is the right thing to do at all. Because, going back to what we were
just talking about, the important thing is reading the code. Always has been, by the way. Right.
Code is read more often than it's written.
AI has made it way easier to crank out the code than it ever has been, and I don't think
we're going back to where we were.
So writing the code is actually not the key thing to me.
It's about reading it.
And so what I care about with Mojo and with these new emerging systems we're building
is really a combination of two things.
One is expressivity.
And so can you express the full power of the hardware?
Yes or no.
Like if the answer is no, then you will never be able to get it.
And so JavaScript will never be a good way to write a CUDA kernel or a GPU kernel.
It just will not because it can't express the things you need to express, right?
No slight against it.
It's a good system, right?
But if that's the goal, then you have to be able to do that.
Now, assembly code can express the full power of the hardware.
And that's not what I'm advocating for, right?
The other thing that goes with it is you need readability.
Right.
And so you need this combination, this intersection between expressivity.
So can you express the important thing, in our case, performance?
And then can you understand the code? Can you build abstractions? Can you build scalable systems?
And that intersection, I think, is the key thing. And so this is where you take Python syntax, for example.
Python's super easy to read. It's actually very standard. A lot of people know it. Let's embrace that. Let's
go all in on that. That's a good thing. And so Mojo embraces Python, or the syntax of Python at least, and says, okay, well, now let's fix the problems with Python, which is basically the entire implementation.
Let's replace the Python interpreter and all that kind of stuff.
Let's keep this good part and make it so we can express that full power of the hardware.
And with that, I think the LLMs will continue to get better and better at dealing with any weirdness.
And what I've seen is that the LLMs are an amazing way to learn coding in a new language.
Yeah, this is interesting because I feel we're starting to see this, including in
how to build good software.
Oftentimes, what is good and efficient for teams is also what LLMs tend to do better
with. Yeah, I also look at these
funny questions. It's like, okay, well,
a lot of people in the agentic coding loops want
to have error messages
that feed back into the agents so that the agents know what to do.
And so, should we make better
error messages for the agents? Make it better
for humans. And my question is like, what is
better for an agent that's not also better
for a human? Let's just have good error messages, right?
The other key thing, and I think the most
important thing for LLM-based
coding and AI tools is to have a
massive amount of open source. And so
for us, if you go to the Modular repo, we have,
I don't know, 700,000 lines of Mojo code that's open source.
And when you open source, you open sort of the whole history as well, by the way.
Exactly. Exactly. And so that's super powerful because now you can say, hey, go index this.
And so now it knows a gigantic swath of everything you can possibly do. And so that, I think, is also really important.
Yeah. Now, to close with, you have mentioned that compilers are cool multiple times here.
I'm slightly biased, I admit. Absolutely. But I'm getting convinced that compilers are cool.
For software engineers who have not built or worked with compilers,
what are some pointers you would give them if they say,
hey, I'd like to get into understanding compilers,
maybe building one, should I try to come up with a language?
Like, what are some things that you point people to?
Yeah, so I'll tell you why I fell in love with compilers.
So if I go back to university,
I was taking the standard computer science program
of basic programming, then data structures,
and then like an operating system class,
GUI class and all this kind of stuff.
The thing I love about compilers is that
you build Project 1,
and so in this case, it was
what's called the lexer, the thing that tokenizes
the source code. And you
had to use some data structures. So I'm like,
oh, okay, cool. I'm using the things I learned,
not just how to write code, but
what a tree is and things like this.
And then you get to the second
part and you build on top of it. And so
now you build a parser, the thing that decides
the syntax of the language, using
the tokenizer. And then you build
the type checker and you build on the first two
and then you build the next thing and you build on those
things. And so what I
loved about compilers as a
class was that I got to build and learn
and iterate, and then if I made a mistake
I had to go back and fix it
as you're building higher and higher and higher.
And so I think that compilers, particularly in
the university setting, reflect more
of what real software development is like, in a way
that a lot of other classes I experienced
did not, because a lot of the other
classes ended up being: build a thing,
turn it in, throw it away,
build another thing and throw it away.
And so that's why I fell in love with it.
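
As an illustration of the layering described above (lexer, then parser, then type checker, each built on the previous stage), here is a minimal Python sketch for a tiny arithmetic language. It is not the coursework from the conversation. The function names `tokenize`, `parse`, and `check`, the tuple-based tree, and the int/float typing rule are all assumptions made for this example.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def tokenize(source):
    """Stage 1, the lexer: turn source text into (kind, text) tokens."""
    tokens, pos = [], 0
    while pos < len(source):
        ch = source[pos]
        match = NUMBER.match(source, pos)
        if ch.isspace():
            pos += 1
        elif match:
            tokens.append(("num", match.group()))
            pos = match.end()
        elif ch in "+-*/()":
            tokens.append(("op", ch))
            pos += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at position {pos}")
    return tokens


def parse(tokens):
    """Stage 2, the parser: build a tree from the lexer's tokens."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("eof", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():  # expr := term (('+' | '-') term)*
        node = term()
        while peek() in (("op", "+"), ("op", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():  # term := atom (('*' | '/') atom)*
        node = atom()
        while peek() in (("op", "*"), ("op", "/")):
            op = advance()[1]
            node = (op, node, atom())
        return node

    def atom():  # atom := number | '(' expr ')'
        kind, text = advance()
        if kind == "num":
            return ("num", text)
        if (kind, text) == ("op", "("):
            node = expr()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    tree = expr()
    if peek()[0] != "eof":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree


def check(node):
    """Stage 3, a toy type checker: walk the parser's tree, return 'int' or 'float'."""
    if node[0] == "num":
        return "float" if "." in node[1] else "int"
    op, left, right = node
    left_type, right_type = check(left), check(right)
    # Assumed rule for this sketch: division or any float operand yields float.
    if op == "/" or "float" in (left_type, right_type):
        return "float"
    return "int"


if __name__ == "__main__":
    tree = parse(tokenize("(1 + 2) * 3.5"))
    print(tree, check(tree))
```

Each stage only consumes the output of the one before it, so a mistake in the lexer surfaces when the parser is built on top, which is the kind of build-and-iterate dependency the passage describes.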
Today, if you want to learn compilers, it's easier than ever.
You can go to LLVM.
There's a great tutorial called the Kaleidoscope tutorial that I wrote.
That's wonderful.
15 years ago or something, it was a long time ago.
The Rust community has a lot of
lovely compiler nerds in it.
And so there are a lot of compiler-based technologies built in Rust these days,
which is also super cool.
And so there's books and lessons and things like this.
I don't literally think everybody needs to become a compiler engineer,
but I think that compilers don't get the credit they deserve,
and it turns out there's a lot of really good jobs in compilers.
And so if it happens to be an interesting place and an interesting side of technology for you,
I really do encourage people to do it because we need more folks in this field.
Wonderful, Chris, this has been really, really interesting.
Learned a lot. Thank you.
Yeah, well, thank you for having me. I hope it was useful.
It was.
It's rare to talk with a software engineer who is as passionate about programming as Chris is,
and I hope you enjoyed this conversation as much as I did.
I especially liked how Chris took us backstage to show how things really
happen at Apple, and how it was Chris's own stubbornness and track record of getting things done
that allowed him to push things through, like open sourcing the Swift programming language
and creating the new language to start with. For more deep dives on AI engineering and stories
on developer tools at other companies, check out the deep dives in The Pragmatic Engineer that I wrote,
linked in the show notes below. If you enjoyed this podcast, please do subscribe on your favorite
podcast platform and on YouTube. A special thank you if you also leave a rating on the show. Thanks and
see you in the next one.
