This Week in Startups - Expanding AI chip capabilities beyond Nvidia with Modular CEO Chris Lattner | E1808
Episode Date: September 14, 2023
This Week in Startups is brought to you by… Roots. Invest in the only real estate investment trust that creates wealth for you and its residents at https://investwithroots.com/TWIST Supergut is the... easiest and tastiest way clinically-proven to regulate digestion, curb cravings, and boost energy. Get 30% off their delicious shakes, bars, and fiber mix at https://Supergut.com with code TWIST. LinkedIn Marketing. To redeem a $100 LinkedIn ad credit and launch your first campaign, go to https://linkedin.com/thisweekinstartups * Today’s show: Modular CEO Chris Lattner joins Jason to discuss the process of building LLMs (8:14), what caused Nvidia to be entrenched in machine learning today (35:37), the AI wrapper debate (41:44), and much more! * Time stamps: (0:00) Modular CEO Chris Lattner joins Jason (2:24) Where hardware and optimization stand today (8:14) What goes into deploying and distributing an AI model into a product (10:32) Roots - Head to https://investwithroots.com/TWIST to sign up and start investing today! 
(12:02) Why companies make their own machine-learning software (14:32) Nvidia’s outlook on what Modular is building (18:46) Chris’s time at Apple (20:05) Strategies for reducing complexity (22:53) Awareness of underlying complexities in the technology stack (25:14) Supergut - Get 30% off with code TWIST at https://supergut.com (26:46) What it will look like in five to ten years; Increase consumption by creating new categories (31:29) The open-source community; RISC-V and Arm (34:15) LinkedIn Marketing - Get a $100 LinkedIn ad credit at https://linkedin.com/thisweekinstartups (35:37) How Nvidia secured its position (38:21) The AlexNet moment (41:44) The AI wrapper debate (44:32) OpenAI moving from non-profit to profit and open to closed-system (46:32) The lack of programmers and the ability to do more with less (52:09) Modular Mojo and other developments * Check out Modular: https://www.modular.com/ FOLLOW Chris: https://twitter.com/clattner_llvm * Read LAUNCH Fund 4 Deal Memo: https://www.launch.co/four Apply for Funding: https://www.launch.co/apply Buy ANGEL: https://www.angelthebook.com Great recent interviews: Steve Huffman, Brian Chesky, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarland, PrayingForExits, Jenny Lefcourt Check out Jason’s suite of newsletters: https://substack.com/@calacanis * Follow Jason: Twitter: https://twitter.com/jason Instagram: https://www.instagram.com/jason LinkedIn: https://www.linkedin.com/in/jasoncalacanis * Follow TWiST: Substack: https://twistartups.substack.com Twitter: https://twitter.com/TWiStartups YouTube: https://www.youtube.com/thisweekin * Subscribe to the Founder University Podcast: https://www.founder.university/podcast
Transcript
If you go back in time, I built a technology called LLVM,
which is this fairly obscure compiler technology
that then is probably on your phone today
and on many of your laptops and in your consoles and things like this.
That technology helped unify a generation of compute
around CPUs in particular.
And so LLVM was great for hardware people
because they could integrate with LLVM,
and then they got all the C++ and all the Swift
and all the other languages and Rust and Julia and
things like this for free.
But machine learning doesn't have that.
And so what Modular is building is it's building that thing that once you plug into it,
you have a full AI stack.
And for a hardware maker, that's a very powerful thing.
This Week in Startups is brought to you by Roots.
Invest in the only real estate investment trust that creates wealth for you and its residents
at investwithroots.com slash twist.
Supergut is the only nutrition brand clinically proven to improve digestion,
balance blood sugar, sustain energy, and manage weight.
Save 25% on the delicious shakes, bars, and prebiotic mix at supergut.com with code twist.
And LinkedIn marketing.
To redeem a free $100 LinkedIn ad credit and launch your first campaign,
go to LinkedIn.com slash This Week in Startups.
All right, everybody, welcome back to This Week in Startups.
We're excited for today's guest because
he's worked at some of the biggest technology companies in the world and working on AI.
His name is Chris Lattner.
His company is Modular.
He's worked at Apple.
He's worked at Tesla.
He's worked at Google.
And now he's got his own startup, as I just said, Modular.
So, as we all know, Nvidia's dominant right now in the AI space.
$16 billion in revenue in Q3.
That's 2x year over year.
They're wildly profitable.
Stock's doubled since 2023. But as we've said on this pod and on All-In, there are going to be competitors coming,
right? Of course. And some startups are going at Nvidia on the hardware front. We had
Lightmatter on recently in episode 1787. And they're trying to use optics, photonics-based
chips, basically to move data around. It's going to make things cooler in data centers and
help with these large AI jobs. Well, Chris is taking a different approach at Modular.
They're going to make it easier for developers to run AI models on non-Nvidia hardware,
and they just raised $100 million, as AI companies are apt to do in 2023.
Chris, welcome to the show.
Well, quite the introduction, Jason.
Thank you for having me.
It's great to be here.
Yeah, great to have you, and you are in the thick of it.
One of the things I hear over and over again from people deep in the AI space,
I had a conversation with Elon about this not recently,
and we see it at OpenAI and other places,
is only a small amount of the hardware that's being purchased
is being used at any given point in time when AI jobs are running.
So for people who are technical,
but maybe not working in the specific field,
why is it that when we push a job,
you know, we're doing ChatGPT-5 or Claude 7.0,
whatever people are doing, they're doing a LaMDA or a Llama.
I mean, there's just so many different things on Hugging Face right now.
Why is it that the hardware is not optimized for these jobs?
Why do we find ourselves in this?
And then what is the actual percentage of the hardware being used,
whether it's an H100, A100, or my M2 on my MacBook Pro?
Yeah, so it's super interesting.
If you zoom into what is AI these days, right?
So many people focus on training.
You have to start with the research.
You have to start with models.
Models are changing all the time.
I mean, just follow what's happening.
It's hard to keep up with the pace of innovation and the model architectures.
But then there's also the inference side of things and the deployment side of things.
And so these two markets, these two problems are actually completely different.
So what you're talking about is you're actually referring to the training side of this.
And modern training jobs, as many people know, have gotten huge, right?
You get tens of thousands of nodes, thousands of GPUs.
These are monstrous jobs.
And so because of that, what you get is these time-sharing systems.
And so it's super funny.
Like, we went from personal computers all the way back to the mainframe or the job sharing.
Like I'm going to put in my punch cards.
Right, that was Perot Systems.
Yeah.
Yeah.
You'd rent time on somebody's mainframe.
Well, yeah.
So we're back in those days.
And so the actually better analogy, if I'm not joking about it, is HPC systems.
And so if you go back 10 years ago or something, you'd get one of these massive supercomputer
systems that a national lab would install.
And then researchers would have to, like, walk up and allocate
time against it, right? And so the big question then is how do you amortize the spend for the
hardware across a lot of work that happens on any one of these massive supercomputers?
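The amortization question can be made concrete with a little arithmetic. The sketch below is only an illustration: the hourly price and the utilization levels are made-up numbers, and `cost_per_useful_gpu_hour` is a hypothetical helper, not something from the conversation. It shows that the effective price of an hour of real work rises as the hardware sits idle.

```python
def cost_per_useful_gpu_hour(hourly_cost: float, utilization: float) -> float:
    """Effective cost of one GPU-hour of real work.

    hourly_cost: what one GPU costs per wall-clock hour (owned or rented).
    utilization: fraction of wall-clock time the GPU is doing useful work.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_cost / utilization


# Assumed, illustrative figure: 2 dollars per GPU-hour.
HOURLY_COST = 2.0

for utilization in (1.0, 0.5, 0.25):
    effective = cost_per_useful_gpu_hour(HOURLY_COST, utilization)
    print(f"{utilization:.0%} busy: ${effective:.2f} per useful GPU-hour")
```

Under these assumed numbers, halving utilization doubles the cost of every useful hour, which is why shared pools and batch scheduling matter for large clusters.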
And training systems today, they're massive supercomputers in every way, shape, and form.
The programming models are very different. The workloads end up being a bit different. And so
there's some differences, of course, but the way they get managed is very similar. Now, what I've
seen is different groups that own these things, manage them sometimes better, sometimes worse.
And one of the challenges you'll see is that, for example, the big research teams may allocate, you know, 20,000 GPUs or something.
But then the question is, how do you fully utilize it?
This is one of the cases where time sharing, like clouds, are actually really great because often you're not training models all the time, right?
Your model training is actually proportional to the research cycle that you've got going on.
And so if you're, you know, one of the massive companies like Google, where you have thousands and thousands of researchers,
what you'll do is you'll have this big hardware pool,
and then you'll have the researchers that are all effectively putting in their slot
so they can use the machines when they come up,
and then they run their batch job for perhaps hours,
perhaps days, perhaps months, right?
And they get allocation for it.
But if you get these smaller groups where sometimes they're on cloud,
and so they're just renting by the hour,
sometimes they build their own data centers,
and then the problem they have is, okay, cool, you have all this hardware.
How are you utilizing?
Is it being productively used?
And so these are major questions.
that I think that the entire industry is struggling with.
But if you go just adjacent to that, that's training.
That's where the models come from.
If you go to production, the character is completely different.
And so here, you're not talking about supercomputers.
Here you're talking about the fact that, you know,
you may have tens of researchers that train a model and they use a massive amount of hardware to do so.
But then you need to deploy that model.
You need to deploy the model.
The problems are completely different.
Right.
Here the problem is you have a billion users.
And a lot of queries and then a lot of follow-up queries.
And people want to, I guess, I'm not sure what it's called when you, well,
there's prompt engineering and the prompts are getting more sophisticated.
So all that creates load on the system.
Yep.
And the load on that system is really different.
Instead of it being one massive computer that is then batch scheduled, what you need is
you need scale out.
And so any one of those systems is actually a single node often.
But now you need thousands and thousands of these nodes.
and those are fully utilized, right?
Because you've got users in 24 time zones,
in all the time zones, right?
And so that's actually a very different problem,
and it's super interesting.
And so if you look at AI today,
it's super fascinating to me
how much energy has been put into the training side.
Everybody's always talking about the research,
models, and the training, and the training, and the training.
Few people talk about what it takes to get that thing into production.
Yeah.
And one of the big challenges that we as an industry are facing today
is that, you know, these systems that people build with,
like TensorFlow and PyTorch and these kinds of things,
were always built by the research team for training.
And so getting that model into production is super difficult,
and this is almost an unsolved problem these days.
And one of the challenges there
in particular is it's not just about cloud.
Often you want to train a model
and then put it on a phone.
Right?
And so it's a very different problem space
and it's much harder than some,
I mean, it's very,
both of these problems are really cool,
but it's super hard.
Explain to folks, after all the training has been done
and then you have this language
model and you then want to load it onto a phone. How does that all work? What is the output and
how would you explain it to a layperson of, hey, we built the model, but now we want to
distribute the model to a bunch of different places and then let you play with it. But what is
required there? So I don't think that it would be in good taste to talk about how we do this because
it is so complicated and nasty and horrible that we cannot go into all the details. But I'll give you a sense.
Because that's how I am.
Right.
So if you take a traditional enterprise,
it's building ML into their products, right?
Often they're not building one model into one product.
Right?
So they have many different kinds of models,
some recommender models for like,
hey, maybe you should look at this in your shopping cart next.
You have classification models.
So you're looking at, okay, well, you like that shirt.
Like, maybe you should pick this shirt.
There's many different kinds of products.
They then get matrixed into many different
kinds of things that they're deploying into.
So often cloud is a big deal,
but then you have mobile apps and a lot of other things.
And so what has ended up happening is that
deploying ML today involves
building this entire matrix of all these point
solutions, because there's no one
thing that allows you to span across all of these
things. And so what you end up using
is like this catastrophic array
of like 15 different tools.
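The "matrix" described here can be sketched by pairing every kind of model with every place it has to run. The lists below are illustrative assumptions (the conversation mentions recommenders, classifiers, cloud, and mobile, but not this exact set), and `deployment_matrix` is a hypothetical helper.

```python
from itertools import product

# Assumed example categories, loosely based on the ones mentioned above.
MODEL_KINDS = ["recommender", "classifier", "language model"]
TARGETS = ["cloud CPU", "cloud GPU", "iOS", "Android"]


def deployment_matrix(models, targets):
    """Return every (model, target) pair that needs a working deployment path."""
    return list(product(models, targets))


cells = deployment_matrix(MODEL_KINDS, TARGETS)
print(f"{len(MODEL_KINDS)} model kinds x {len(TARGETS)} targets = {len(cells)} cells")
for model, target in cells:
    print(f"  {model} -> {target}")
```

Each cell is a combination someone has to make work, and when every target ships its own point solution, the number of tools a team touches grows with the size of this grid.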
And all these tools have different problems.
Like, so I'm
an Apple alumni.
I have a ton of
The easy-to-use programming language for building apps.
And so I love Apple and I love the Apple folks, but to deploy ML onto an Apple platform,
you have to use their point solution called Core ML.
And Core ML is not compatible with all the models,
and so there's all this friction just to get onto an Apple device, right?
And so Apple devices are pretty common out there.
And if that's hard, you just think about what it means for this wide spectrum of different things.
And one of the challenges here, the fundamental incentive
structure problem, is that hardware makers like Apple, like many other hardware makers, always want
to build a solution for their hardware. And nobody's trying to build something that scales across
everything. And so this is what we're focused on.
Hey, everybody. Today I'm joined by Roots CEO, Dan,
welcome to the show. Thanks for having me, Jason. Tell everybody here in the audience, what is Roots
and what makes it different than the other real estate investing platforms? I'm a complete neophyte.
Roots is a REIT with a little twist. Sorry, had to do it. We are the first real
estate portfolio that we know of that builds wealth for both our investors and our residents.
And we've created a unique win-win model that creates partners and not tenants.
Am I as an investor, if I wanted to put money into this, getting dividends, or am I just
getting the growth of it? How does all that work?
When you invest with us, you get to participate in two ways. One is through the distributions of
profits generated at the company. And we pay those out quarterly. Over the last 12 months,
that's equated to about a 6% cash on cash return to our investors just in distributions.
And then the other way everybody participates is each quarter, we reevaluate what's called
our net asset value.
And as that ticks up, our unit price or our share price of our portfolio goes up as well.
And that's how you would basically be able to sell your share at any point and liquidate
your investment and move on to your next piece or leave it in and
keep growing with us.
Head to investwithroots.com slash twist to sign up and start investing today.
That's invest with roots, no spaces, no dashes, dot com slash twist to sign up today.
Because Nvidia has CUDA, right?
That's their software for writing their machine learning apps.
Apple has theirs.
And these two things are just...
Google has theirs.
Tesla has theirs.
Like, everybody builds their own thing.
So if you go back in time,
why does everybody build their own things?
Is it just because it didn't exist before
or because customization is necessary
to get the end result they want?
Well, because they don't have a choice
functionally, right?
And so it's super interesting.
I mean, AI is so important to what we do, right?
Nobody takes a step back and says,
if AI is so important for the industry,
why is all the AI software so bad?
Right?
And so you look at that.
Is it a function of time?
We just were so young in the game?
Yeah, that's
a big aspect of it. So the analogy I give to people is that AI is like an adolescent.
Like, it's like a teenager, right? It's very exciting. It's overconfident.
It's got some wins under its belt. It sometimes rolls over its parents' car and causes a mess, right?
But what's happening right now is everybody just wants AI to grow up. Like, people want to build
AI into their products. They want to not mess with the AI infrastructure. They want to actually be
able to deploy things and build AI-enabled products, right? And right now, if you're one of the
FAANG companies, for example, you can take a team of 50 people and brute force it. But if you're
many other people that should be using AI in their applications, it's so much more difficult.
And to your question, like, why does everybody build their own stack? They don't have a choice.
Like, all of the technologies that exist today are built for a particular piece of hardware
or they're built by a research team. This stuff is not production quality. And if you go,
if you go back in time, I built a technology called LLVM,
which is this fairly obscure compiler technology that then is probably on your phone today
and on many of your laptops and in your consoles and things like this.
That technology helped unify a generation of compute around CPUs in particular.
LLVM was great for hardware people because they could integrate with LLVM,
and then they got all the C++ and all the Swift and all the other languages and Rust and
Julia and things like this for free.
But machine learning doesn't have that.
And so what Modular is building is
it's building that thing that once you plug into it,
you have a full AI stack.
For a hardware maker,
that's a very powerful thing.
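One way to see why a shared layer is powerful for a hardware maker is to count the integrations. The sketch below is a back-of-the-envelope illustration, not a description of how LLVM or Modular is built: the language list comes from the ones named above, the hardware list is an assumption, and both helper functions are hypothetical.

```python
# Languages named in the conversation; hardware targets are assumed examples.
LANGUAGES = ["C++", "Swift", "Rust", "Julia"]
HARDWARE = ["x86-64", "Arm", "RISC-V"]


def integrations_without_shared_layer(n_languages: int, n_targets: int) -> int:
    """Every language needs its own port to every hardware target."""
    return n_languages * n_targets


def integrations_with_shared_layer(n_languages: int, n_targets: int) -> int:
    """Each language and each target plugs into the shared layer exactly once."""
    return n_languages + n_targets


n_lang, n_hw = len(LANGUAGES), len(HARDWARE)
print("point solutions:", integrations_without_shared_layer(n_lang, n_hw))
print("shared layer:   ", integrations_with_shared_layer(n_lang, n_hw))

# Cost of adding one new chip in each world.
new_chip_without = integrations_without_shared_layer(n_lang, n_hw + 1) - integrations_without_shared_layer(n_lang, n_hw)
new_chip_with = integrations_with_shared_layer(n_lang, n_hw + 1) - integrations_with_shared_layer(n_lang, n_hw)
print("new chip costs", new_chip_without, "ports without the layer,", new_chip_with, "with it")
```

In this toy count, a new chip needs one integration with the shared layer instead of one per language, which is the "for free" effect described above.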
And what's NVIDIA's take on what you're doing?
Are they supportive of what you're doing?
Or do they feel like what you're doing,
they're not supportive of because it's going to help,
you know,
people maybe port to other hardware platforms
and maybe take away their dominance
or do you get the sense that they care about their dominance at this point?
I mean,
they seem to have run away with it
right now. Yeah, well, great question. So, I mean, there's this narrative in the industry that we're
here to hurt Nvidia or something. Nvidia is one of our most important partners, right? And
one of the things that I think people forget about is Nvidia is really invested in building
some really crazy exotic next generation products. Yeah. Right. And so what we're interested in doing is
we're interested in expanding the developer ecosystem that can use those products. So we're on a very
complementary set of missions here, right?
And so what we're doing is we're looking at saying,
okay, well, this whole AI thing,
it evolved rapidly.
Again, it's very high potential,
but it's all a mess.
Like, the people who do it, as you know,
are wicked smart.
Some of the most brilliant people in the industry.
But there's other good people, too,
that have good ideas.
And so if we expand out the developer community,
if we 10x the number of people that can participate,
think about the amount of innovation that can happen.
Think about the new use cases and applications.
Yeah, right now people don't
actually know this, but a lot of what's happening in AI is limited to people who can code
in CUDA. Kudo?
What is it?
Yeah, CUDA.
Yeah, CUDA.
And then I guess some people write in C Sharp or C++.
What are the other ways people generally get AI code, you know, down the hardware stack?
Because you're building Mojo, I know, which is, you know, more Python-like, I think.
Yeah, well, we'll talk about that.
So it really, it really varies.
And again, AI is not one thing.
This is another thing that I think people get sometimes distracted by,
but it's not like transformers are one thing, for example.
And so if you look at a Llama model or like Stable Diffusion,
which is a U-Net model, which is a very different architecture,
what you get is a lot of Python on the outside.
The Python handles what's called tokenization of converting input text
into something the model can understand.
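As a rough picture of what tokenization means, the sketch below maps words to integer IDs. This is a deliberately simplified assumption: production models typically use subword tokenizers shipped with the model, not a whitespace split, and `build_vocab` and `tokenize` are hypothetical helpers written for this illustration.

```python
def build_vocab(corpus):
    """Assign each distinct lowercase word an integer ID; ID 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for text in corpus:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def tokenize(text, vocab):
    """Convert text into the list of integer IDs a model would consume."""
    return [vocab.get(word, vocab["<unk>"]) for word in text.lower().split()]


vocab = build_vocab(["the model reads numbers", "the text becomes numbers"])
print(tokenize("the model reads text", vocab))
print(tokenize("the model reads poetry", vocab))  # "poetry" was never seen, so it maps to ID 0
```

The point is only that the model never sees raw text: the Python layer turns strings into numbers before anything reaches PyTorch, TensorFlow, or the accelerator.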
You then get something like PyTorch or TensorFlow involved,
which is itself a gigantic, complicated thing that is awesome in some ways,
but also challenging in other ways.
You get custom CUDA kernels, as you're saying.
So you want to get high performance out of one accelerator.
And so you get C++ because sometimes Python is really slow.
And so what ends up happening is the developer building one of these next generation
models, you have to know all of these different things.
And so practically speaking, no sane human actually can do that.
And so this is why you need teams of experts.
And these teams are super experts in every single different one of these parts of the
problem where somebody knows model architecture and differential equations, somebody knows CUDA,
somebody knows C++, somebody knows all these things. And so only that is what's able to bring
these things together. Which we've seen this movie before in the early days of the web,
setting up a web server itself, getting a Sun Microsystems, you know, server. You know,
it wasn't like today, obviously. And remember when we had apps come out, even pre-iPhone,
if you were trying to build something for Nokia or Docomo or any of these other platforms
around the world. It was really hard.
And there was a limited number of people could do it, which meant
you just didn't see a lot of apps. They would come
very slowly, a couple of apps a year.
They were super interesting.
And they're expensive too, right?
Because the development costs were so high.
Yeah, which means something that's fun
or interesting. The idea that there would be
an app for skiers, like I have an app on my phone
for skiers called Slopes, there's like
probably a half dozen of them. The fact that there's
a solo developer or two-person
development team on their weekend hustle
building an app, it's just a crazy
thought. I mean, you were at Apple when this happened, the concept that an app could be made by one person
in their spare time and get to a million dollars in revenue or even $100,000 in revenue, $10,000
in revenue. There were so many hurdles to that. You had to actually do deals with the carriers.
You had to put up servers yourself. You had to figure out how to get that app on, yeah, getting the
app, the distribution on people's phone was a roadblock. You just think about the genius of Steve Jobs.
The App Store distribution, the payment rails for people buying it. And
then there's a really lightweight, easy app discovery and the ability to write them.
So you're working on Mojo.
This is a programming language.
Well, just before we move on from Apple, right?
So my job at Apple was to lead the developer tools team, right?
I mean, I had many hats, but by the time I left, I was running the developer tools
team with Xcode, the whole iOS app development ecosystem, built the Swift programming
language.
Also supported all of the internal hardware, which Apple has very fancy, very exotic, next-gen
hardware that they're building.
And a major part of the job is to make people more productive.
Make it so more people can participate exactly as you're saying,
because so many people have good ideas for apps, right?
And so if you get more people involved, like the move from Objective-C to Swift,
massively simplified things, made it much easier to learn.
That was a huge movement that then enabled entirely new categories.
And so many people today tell me, you know, I was able to become a programmer because of Swift, right?
And so ML, I believe, has got exactly the same thing going on, right?
Where it's absolutely possible for the most advanced teams to achieve things, right?
But first of all, like complexity, which is really our enemy here, complexity, like if you fill your head with accidental complexity, you don't have space for other stuff.
Yeah.
And so by relieving the accidental complexity, you make the teams of experts even more productive.
But then you're also more inclusive to other people that have good ideas, but either are, you know, repelled by the complexity.
What are the strategies for getting rid of complexity?
I mean, I'm just thinking about playing chess.
You kind of learn some heuristics, some basic sets of moves, chunks of moves that you can
apply in different places.
Or, you know, we have copilots, which, you know, and we have open source.
We have a lot of different ways to help people with complexity.
But when you look at complexity in the world, what do you think of?
Do you have a playbook for reducing complexity?
Yeah, absolutely.
So, and this is one way that Modular is very different than pretty much everybody
in the space, but reduction of complexity comes through abstraction,
and through getting people to be able to work together.
Okay?
And so the idea here is that you look at all the domains of people that are involved,
including all the people putting together the transistors on the chip, right?
There's so many different specialities.
The details can't fit in any one head.
So success comes from teams of people, right?
And then composing on other people's work.
And so a lot of what I think software has been successful,
I mean, you've built some pretty epic systems, right?
Yep.
It comes from being able to take things that other people built
that you don't have to understand
and then build new things on top of it, right?
And so what a lot of folks are doing today in ML systems
and ML ops and a lot of these things,
they say, okay, well, there's so much complexity out here.
What are we going to do?
Well, we're going to throw a layer of Python on top of the stack,
and then you'll deal with our layer,
and look how simple it is.
therefore you don't even know about any of this complexity.
Now, there have been dozens or hundreds of attempts at this.
I mean, there's a lot of stuff out there.
Some of it's really good, but the challenge with that is if you're building atop something like TensorFlow or PyTorch
or, you know, you're trying to get onto novel kinds of hardware and like a TPU or something like that.
Well, you actually get exposed to all this accidental complexity because it all leaks.
And so, yeah, you get this cool demo, but you can't fix performance or scalability or programmability
or security
or like these core
problems that people struggle with
by adding a layer of Python on top of systems
that are fundamentally broken.
Yeah, the thought doesn't work.
And in a way,
what we've seen happen in the modern web
over time,
you have cloud computing,
abstracting away, putting up servers.
And then storage got abstracted.
I mean, GPS got abstracted
away. There's a software development kit,
an SDK, for anything.
There's an API for anything.
And then even building glue between systems has gotten easier.
I used to call it middleware, I guess, back in the day.
I don't know if there's still a term for that.
Enterprise JavaBeans.
Yeah, it was always like weird stuff to try to get you to move data from one system
to the other.
It seems like comical now.
Maybe you can just talk about the complexity in the world writ large and in the technology
stack because you've been at this for a couple of decades.
It is pretty amazing.
When somebody's coming in now,
a 20-year-old developer in school
who is, like, building stuff,
how much do they know about what's actually going on
beneath, you know,
you see the little tip of the iceberg,
are they even aware of, like, the complexity underneath?
Yeah, well, so, I mean, again,
it's hard to make generalizations about all 20-year-olds.
Yeah.
Because there's some variance there, but...
On the average 20-year-old...
On the average 20-year-old, they know Python.
Yep.
They know if you go into computer science,
you know how to train a neural network,
for example, but you don't know how to deploy
it, right?
You get exposed to some other programming.
Maybe you'll get a little bit of C++ or something like that.
But most people coming out of a computer science degree know Python,
and pretty much everybody that is not designed to be a computer scientist,
so there's a lot of other fields out there, know Python.
Right?
And so Python is great because it's super high abstraction.
It's like the ultimate duct tape language where you can bolt together these very powerful libraries.
But Python also has certain challenges when it comes to performance
or dealing with hardware or a lot of the things that inhabit the AI space.
And so running Python on a service with a billion users is not always great.
And so there are challenges there.
And so if you come back to what is Modular doing about this,
what we're tackling instead of adding layers of Python on top of existing systems,
we're saying, let's go explode those systems.
Let's do the hard thing.
Let's go build the system from the bottom up.
And this starts at the hardware.
The hardware, there's a lot of really good hardware out there.
To your point, nobody knows how it works.
I mean, the people that built it do, but most application developers don't know how it works.
And what has happened is that right on top of the hardware, there's all these different layers of effectively middleware, just like you said.
Right.
But each piece of hardware has a different layer of middleware.
And so that means that when you get to the top layer, the part that anybody actually wants to work on is super fragmented.
And it makes sense.
It's the incentive structure of the people building the hardware.
They want to build a thing for themselves.
But the losers are all of us trying to get our jobs done.
Many people in ML don't want to care about the hardware.
They're made to care about it.
You've heard me talk about Supergut a bunch.
This has been a key part of my health journey.
It's an awesome nutrition company that my bestie, David Friedberg, from the All-In Podcast, started.
I love their bars.
I love their shakes, especially the gut balancing chocolate brownie bar.
It is delicious.
They also have an unflavored
prebiotic mix. You can add it to anything. I like to put it in my coffee. You can put it in your
oatmeal. Their products are super helpful for weight loss. Why? Well, Supergut's products
mimic the effects of Ozempic by boosting your GLP-1 hormone. This helps quell hunger and boost
your metabolism, which is a great, great combination, obviously. And Supergut's prebiotic fiber
actually alleviates digestive issues. And obviously, the products all taste great. The best part,
the team at Supergut actually put the work in and scientifically proved their products
work. They conducted a placebo-controlled clinical trial with Stanford last year. That's been published
in the medical journal Diabetes, Obesity and Metabolism. The results were amazing. The participants
in this study, they lost weight, they lowered their blood sugar, they improved their metabolic health,
and they had improved digestion and so much more. Whether you want to improve your gut health,
maybe drop a few pounds like I did, or just feel better throughout the day. And listen, you're busy,
you're traveling. I like to bring Supergut with me. Go to supergut.com and use the code twist. You get 25%
off. Go to supergut.com and use the code twist to get 25% off. I've been on this health journey. I've
lost 40 pounds. A big part of that sincerely was me using Supergut. So go to supergut.com and use
the code twist for 25% off.
What is this hardware going to look like in five or 10 years?
Because we're at this point in time where what OpenAI did with, I think, 3.5 really kind
of captured people's imagination and, you know, being able to actually play with it,
inspired a lot of developers to maybe get in there.
And so here we are, everybody buying up sovereign wealth funds, you know, governments, countries, you know, individuals, companies, startups, everybody buying up all this hardware, racking it, data centers.
And it seems to me, having watched this happen with fiber, you know, we overbuilt fiber massively.
and then all the fiber companies,
WorldCom, etc.
There were a ton of these
went bankrupt.
They became worth literally 98, 99%
less than they were
when they went public and all that
wound of getting bought by Google
and other people at auctions.
Are we in a similar moment right now
where we're building up massive capacity
or do you think there's enough jobs here
to actually use this hardware?
And then the second part of the question,
so there's something about like this moment in time,
where does this all wind up?
If we're sitting here
five years from today, are we looking and going,
hey, wow, somebody just leapfrogged
Nvidia, or there's three choices?
You can go just like you do Android or you can do an iPhone,
or you can pick AWS, Azure, or Google or Rackspace
or right on down the line.
Yeah, well, so great question.
So there are really two different questions.
One question there is the today problem.
And the today problem: everybody's talking about Nvidia
and the stockouts of Nvidia, and wouldn't it be great
if there were other options.
It's super funny because the majority of spend by many metrics is actually on the inference side,
which is still very dominated by CPUs.
Yeah.
And again, like, we talk about the pain point.
Well, the pain point is people try to build these massive systems and there are not enough
GPUs to go around.
But meanwhile, so much AI is in our life.
That's all being served in cloud.
A lot of that's happening.
I mean, some is on GPUs in cloud, but a lot of that's on CPUs.
Right.
And it works totally fine.
And it works totally fine.
If you're on Amazon and it's showing you some additional products like the one you're looking at,
in all likelihood, that is a machine learning job that's being done on a CPU that was written five years ago or 10 years ago.
Or if you do a Google search query, there are dozens of models all talking and doing weird things.
And there's this intricate dance, right?
And so it's really interesting.
If you look at that, your question about is there going to be an oversupply and overabundance?
I have no way to know, right?
My goal is to increase consumption by creating new
categories. And so it has nothing to do with H100 or Nvidia. It's just that AI and the
applications of it are, like, a good thing. It makes people's worlds better. And so if we can
increase the number of cool things and make our lives better, that seems good to me.
Now, your question about where do we go from here, right? So forget about cloud for a second.
Like, so I've been working on the hardware-software boundary for decades now. And the thing is that
when I zoom out and I look at this time, it's been super interesting. You know, people talk
about Moore's Law ended, you know, whatever.
And what is Moore's Law?
Well, different nerds will argue pedantically what that means,
but it really means, you know, back in the day,
we'd get a new laptop, and every year it would be, you know,
2x faster.
Eighteen months to be twice as fast.
Your Pentium chip was twice as fast.
Absolutely, on the same code, right?
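As a rough illustration of the doubling rule of thumb mentioned here, the arithmetic can be sketched in a few lines of Python. This is an editorial sketch, not something from the conversation: the 18-month doubling period is the figure quoted above, and the function name `relative_speed` is hypothetical.

```python
def relative_speed(years: float, doubling_period_years: float = 1.5) -> float:
    """Speed multiple after `years`, assuming performance doubles every period.

    The 1.5-year default is the "18 months" rule of thumb quoted in the
    conversation; it is an assumption, not a measured figure.
    """
    return 2.0 ** (years / doubling_period_years)


# The same unchanged code, run on machines bought a few years apart.
for years in (1.5, 3, 6):
    print(f"after {years} years: {relative_speed(years):.0f}x faster")
```

Under that assumption, six years of waiting buys four doublings, which is why unchanged code kept getting faster for free until single-core scaling slowed down.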
And so what ended up happening, I don't know, 10 years ago-ish,
is we had multi-core CPUs.
Ah, we have more than one of these to deal with,
and then we had GPUs come on the scene.
Yep.
Right.
You look to now, we have massive GPUs.
We have really dedicated AI chips like the Google TPU and Gaudi from Intel and like all
these things.
There's tons of these things.
And we still have CPUs, but these days, CPUs have like 100 cores on them.
Right.
And so to me, again, many people are laser focused on the today problem.
Yeah.
But what happens when you look out five years or 10 years?
Yeah.
Right.
And to me, I look at this as driven by physics.
This is not a question about software or things like this.
Physics is forcing hardware to get weird
and, more importantly, specialized. The rise of wearables,
the rise of personal computing, the rise of, like, AR/VR,
like all these things are a straight line towards very customized chips.
And so that's very interesting, yeah.
Yeah.
And so we're going to have all, I mean,
we're going to have even more crazy hardware in five years than we do today.
And this is where you start to say, like, how can we scale the software?
Right.
Nobody's going to be able to rewrite everything for every new generation of hardware.
That doesn't work.
And this is why we're focused on solving this problem.
What do you think of the open source RISC-V and, you know, AMD licensing models
and then hardware being built by other folks?
Obviously, Nvidia outsources their hardware in terms of how it's being built, and they're a designer
as well, but it's proprietary and it's closed.
So is what happened with Python and other open source and, you know, everything we've seen in the
open source community,
is that likely to happen with hardware?
Or is that, you know...
Great question.
So immediately before Modular, I worked at a company called SiFive,
and they are the inventors of RISC-V.
Yes.
RISC-V is an open source instruction set.
And so what RISC-V allows you to do is it allows any hardware maker
to create a member of the RISC-V family.
And what that means, most importantly, is you get software.
And so that is huge.
Traditionally, you'd have, for example,
Arm owns the Arm instruction set,
and only Arm and its licensees can build Arm-compatible chips.
Or x86, you can have Intel and AMD,
and they're the only ones allowed to build x86 chips.
And so with RISC-V, it allows you to go build,
arbitrary people can invent new things and play there.
And I think that this is causing an explosion of innovation.
And again, the challenge with that,
and the good thing about that is you get an explosion of innovation.
The challenge with that is you get all this crazy hardware, right?
And so there's no software, and so you need software that can scale onto all this innovation.
And so that's really where kind of the industry is at loggerheads.
Yeah, so AMD and these folks, they have blueprints, but they own those blueprints.
They're their patents.
You can't just take them and build a house with them if we're just using an analogy here.
But if you take RISC-V, do they call it "risk five" or "risk V"?
It's "risk five."
The nerdery on that is that there's four things before it.
Yeah.
Yeah.
I kind of got that.
I've heard somebody say "risk V,"
and I'm like,
are you sure it's "risk V," or is it "risk five"?
It's definitely "five."
It's basically caught up to Arm,
I think,
in terms of throughput, or it's close enough.
So with any of these things,
it completely depends on what you measure.
There's advantages to Arm,
there's advantages to RISC-V.
It's all super nuanced,
and a lot of people want to make an overly simplified
"is this thing better than this thing?"
And in tech, it's never really that simple.
And so Arm has got a very strong position.
They certainly have some challenges.
They've got to stay on their toes.
But really, the innovation is the piece that I care about.
And I want to make it so that once these people invent really cool
RISC-V-based silicon or Arm-based silicon or whatever, right,
that they can actually do something about that.
that they can actually do something about that.
Because having cool hardware that nobody uses is really kind of a problem right now.
All right, listen, when you're selling to business to business buyers, you really want to get your pitch in front of decision makers.
Why?
Because upper level execs are usually the ones making purchasing decisions.
Duh.
The problem is, high level folks can be really hard to find and target on most social media platforms.
But on LinkedIn, oh my God, they know all of the CTOs, all of the CFOs, all of the VPs of finance, engineering, HR, recruiting. All those types of
titles are sitting there waiting for you.
And now let's just talk about the funnel.
LinkedIn's about to hit a billion members.
Did you know that?
950 million members at this point in time.
There are 180 million of those 950 who are senior level execs.
There are 10 million C-level executives in that 180 million senior level execs,
which are part of the 950 million members.
I am a C-level executive.
I am on LinkedIn all day long because LinkedIn equals business,
business equals LinkedIn,
and LinkedIn ads are built specifically for B2B marketers.
LinkedIn generates two to five times higher return on ad spend than other social media platforms.
LinkedIn equals business, business equals LinkedIn.
When people are on LinkedIn, they're ready to do business.
It's that simple.
So make business-to-business marketing everything it can be.
And get a $100 credit on your next campaign from me, your boy, JCal.
I'm sending you the hundy. Go to linkedin.com/thisweekinstartups to claim your credit.
That's linkedin.com/thisweekinstartups.
Terms and conditions apply, because they're giving you the hundy.
Tell me how
Nvidia got here to a certain extent.
Yeah.
Because I think we watched this happen where nerds were playing Call of Duty and they
wanted their frame rates to, you know, it doesn't even matter.
It's beyond the just-noticeable perception in biology.
You can't even tell the difference between 120 frames or, you know, 240.
It doesn't even matter.
But these lunatics wanted the best, and I guess
NVIDIA just kept giving them better and better hardware.
And then you had this crazy crypto moment
where everybody started buying all this hardware from NVIDIA
to run jobs.
And now AI. It's kind of a circuitous route, I think.
Maybe you could explain why that's brilliant
and then what the limitations of it are.
Because, again, it's not always one thing.
But I think the history of how they got here
is kind of important, or is it not?
It's totally important.
And I mean, to your audience of people who care about startups, it's super illustrative, right?
Because Nvidia didn't magically step onto success.
It was earned, right?
It wasn't an accident.
And so if you go back, I'm not a super expert in Nvidia history, but my understanding is it's a combination of two really important things.
So, Nvidia, like some of the other companies you're a fan of, went through several phases where they made bet-the-farm bets, had near-death experiences,
and then were right.
And so one of those bets was on programmability.
And so a lot of people were building the Call of Duty accelerator,
and there's a bunch of competition on just make games go faster,
just make games go faster, just make games go faster.
And Jensen and team bet, I think it was on the GeForce 3, on saying,
okay, well, hard coding for graphics is not enough.
Let's make it so you can do more general compute on this hardware.
And so it's not going to be like a CPU.
It's a different thing.
It's a different category
created, but let's do this. And that was a huge bet and a non-obvious bet. Nobody else made
that bet back then. Almost drove them out of business through the complexity of executing on that.
But what it meant is that new kinds of things could run on the graphics card. And that
created new markets. And so one of the things you're pointing out is crypto, right? Well,
they didn't design a crypto accelerator. Crypto wandered up and said, I need tremendous amounts
of compute. And they were there and ready to serve it. And because they had programmability,
they were able to scale into the opportunity.
They talk about luck, right?
Well, how do you get lucky?
Well, part of it is being ready to take advantage of the luck that presents itself.
And I think that is really what happened to them.
If you look at machine learning, right,
a lot of people go back to the seminal moment in machine learning called the AlexNet moment.
Explain.
And AlexNet was when Fei-Fei's team at Stanford created this big data set called ImageNet.
And they created a competition around it.
and that competition was to go find the most accurate predictor and identifier for what was in an image.
And so for a few years, people were working on this using traditional machine learning techniques.
And then these folks invented this deep neural network called AlexNet that then solved ImageNet,
not solved it, but moved it massively forward in terms of prediction.
Now, the way that story is usually told is that it's a combination of two different things.
It's a combination of having a huge amount of data, but then also having GPU compute.
And so we need both data and compute to be able to solve that problem and move that massively forward,
which then catalyzed so much of deep learning today.
But the thing they forget is that nobody had the convolution kernels, the algorithms, to implement these.
That didn't exist on a GPU back then.
So the reason AlexNet happened is a combination of three things, actually.
It's a combination of data, the amount of compute that was available, and then the bet that Jensen and his team made on programmability
to allow some researchers to go invent some new algorithms
and then do it on their platform.
And then fast forward a few years,
it turns out, yeah,
they're lucky that deep learning caught on
and it turned out to be pretty economically important.
But that's what put them in the position
that caused all these things like TensorFlow and PyTorch
and things like that to get built on their platform.
And that's how CUDA got entrenched
into so much of machine learning today.
So the journey of Nvidia,
I mean, you can play this back across so many startups, right?
Are you creating a new category?
are you leaning into the obvious thing
everybody's talking about today?
Are you seeing around the corner
and betting on where technology is going, right?
There's so many of these questions
that I think that, you know,
there's no one right answer,
but it really plays into a lot of the journey.
Well, and to your point about what you're doing with Mojo,
if you enable more people,
the street finds its own use for technology.
Exactly.
It's a Gibson quote. Like, you say,
hey, listen,
you want to try to identify an image
and figure out if it's a hot dog or not?
Sure.
Use our GPU.
You don't need our permission because it's permissionless.
I mean, not crypto permissionless, but
it's your hardware, you own it, do what you want with it.
And it's one of the great, great things about whether it's open source or just open
platforms in general, people are building platforms.
Yep.
And so when you look at this from a playing field, having been at Apple and watched what
happened with open platforms and apps, where do you fall on the, call it the AI wrapper
debate of 2023?
Oh, this company, we have a great company, Roam Around.
They let you type, they're building a vertical
travel itinerary piece of software, and people say,
oh, you can go to ChatGPT and say,
hey, where should I go in San Diego with my kids?
Or, you know, Roam Around's building it,
and they've got a very narrow data set and they're
really tweaking it around travel.
So you have all these verticalized ones.
We invest in a verticalized
screenplay writing software.
So for a writer,
it's kind of like Final Draft, just for that.
And I believe it's like, yeah,
there'll be a lot of these vertical things
because you have the interface
and you have all the kind of features that will go around it.
And sure, ChatGPT could do a version of it,
but it's not going to do, like, a polished version of it.
So the AI wrapper derogatory statement
towards startups building verticalized AI apps
versus one giant language model, Claude or...
Magically solves all the problems.
Magically solves every problem on the planet.
Is that even possible?
Or where do you think this all winds up?
Well, so, I mean, I think that there's many different angles
in terms of what is the better product,
what captures the most value,
in terms of investment hypothesis,
like what is the ROI on these things, right?
So I look at this as saying,
I'm not a believer in a one-size-fits-all solution.
I mean, maybe theoretically, AGI someday will come,
and until then, I will hold on to that thought.
But in the absence of AGI, which magically solves all problems,
I look at AI as being a solution to certain kinds of problems.
Right?
And some people, some of my friends even,
want to say that AI is better than software.
You know, and it's just like a straight replacement.
But that's, in my opinion, objectively false.
What you can look at...
What they mean by that is just having a chat interface with an AI agent
and talking to them, you'll solve more problems than having to write software.
It'll just do whatever the task is.
or are you saying in terms of writing software?
Well, I mean, you know this, Jason.
Like, you know that building a product is way more than like having an algorithm, right?
It's about building a relationship with the customers.
It's about having user interface.
It's about having a revenue model.
It's about having a brand.
It's having all of these things, right?
And so when I look at that, when I look at one of these verticals,
so you talk about the copywriting thing or these things,
these are clearly valuable products.
AI is clearly a valuable
way to implement these products, and it can be a differentiation within that category.
I don't think that makes that product magical.
I think that that makes it comparable to other things in that vertical.
And so AI is a much more efficient and smart and product-focused way of building out that
technology.
But I would look at that as saying AI is an implementation detail of building into that
vertical.
And I think that has a huge amount of value.
And so if you're looking as an investment hypothesis, I would not value that as an AI
company per se.
I would value it as a vertical, consumer vertical, whatever it is company.
And now they're doing it in a smart way using the best tech they have available.
Yeah, just like there's going to be some, you know, the Yelp app, the Yelp app is so much better than using the website, right?
And they just use that new technology to make it a better experience.
Everybody's also looking at the David versus Goliath thing, right?
And so everybody wants the little guys to take down the big guys.
but the big guys have all these other things going for them, including distribution and many of these other things.
Well, listen, you're on the inside of all this. I got to ask you, what is the inside track amongst your peers who are deep in the AI game and have been in it for a long time?
What's their take on what OpenAI did? This open source, you know, or, you know, open, it's in the name.
And that, hey, this is too important. This technology is way too important for any major company to have a wrap
on it. Really, the world needs us to go out there and really make sure that it's not just DeepMind
buried inside some Google, you know, corridor in some building on a campus.
Sure.
We're going to build this. And then they got to 3.5 and they're like, whatever, three.
And they're like, you know what? We were wrong. I don't know if they ever said that. But this is way
too powerful. We're going to be a closed AI. Do people look at that as just a money grab, as
cynicism or as sincere.
How does the industry, and I'm not saying
necessarily you, but do people
look at that and go, it's a money grab?
They went from a non-profit to a for-profit.
That's all it is.
You know, the people there want to make money,
which is fine, we all do.
You're raising venture capital.
It doesn't come without expectations.
So what's the take on that crazy move
to go from a nonprofit to a for-profit
from a open system to a closed system?
Honestly, this isn't my area of specialization.
I mean, I'd much rather talk about things.
What do you think about this weirdness?
My opinion is, what do you expect?
They took VC money to get a return.
Yeah.
The end.
Yeah.
Well, I mean, and so, I mean, I think that things that appear too good to be true sometimes are, right?
And so if you're expecting, if you're expecting somebody out of the goodness of their heart to dump billions of dollars of compute into building a free product, then, well, you're paying for it somehow.
Maybe it's with your data.
Maybe it's some other way.
I mean, I think this is generally true in the world.
And I think people are getting smarter about that.
And so, I mean, I don't know.
I mean, I think the surprise is
surprising, but I don't know too much about the details on how they decided to do that or what it means.
Yeah, there's tradeoffs everywhere.
You look at the impact on society.
I am, you know, I'm an investor in a lot of companies.
And what I'm seeing on the front line of startups and inside really nimble organizations that are the tip of the spear in terms of using technology, not just to build their product, but to build their businesses: they're building 12-person businesses
with four people.
They are getting a lot done with less.
And it happened, boom, in one year.
This is year one.
I mean, people still forget that it was last fall
that 3.5 came out and kind of blew people's minds,
let alone 4.0 and whatever else is coming next.
So when you look at the impact on the world,
knowing what you know from the seat you're in,
is what we saw this year,
which is to say,
I think people got 30 or 40% more efficient at their jobs,
if they know how to use this technology easily.
Is that going to compound or is it going to be the same?
And then impact on society.
Yeah, also, I don't know, I don't know the math on that, but the impact's going to be huge, right?
But the huge impact is also going to be spread out over time, right?
The impact, as you say, you have seen it, but you're zeroed into a very specific part of the problem.
We still can't hire programmers.
There's not enough programmers out there to implement all the stuff that needs to be implemented.
And so while it is true that it's an important
part of the ecosystem, it turns out that there's a big part that it isn't. One of my questions is, when you have disruptive technology, how do you think about technology diffusion? How long does it take something that should be disruptive, where everybody knows it's a 10x improvement or whatever, to actually get out into the ecosystem? Because sure, the neural network algorithms change every week, but we humans don't. It takes a long time for us to learn new habits. And it takes time for all the planning cycles and things like this
to change. One of the things I think people forget is that as a coder,
people focus on, okay, I'm going to study up, I'm going to put the semicolons in the
right place, and I've worked on programming languages forever.
But so much of coding is working as part of a team.
Right? And so the way I look at this is I look at it as saying,
okay, imagine you had the amazingly awesome coder robot.
Right? And we're not amazingly awesome yet. We're promising, but we're not amazingly awesome
yet.
You still,
that's like adding a member to your team.
Right?
And so adding one member
to a four-person team is huge.
Huge lift.
Particularly if they're really good.
You still need to review the code.
You still need to integrate it into the product.
You still need to decide your product strategy.
You have to understand the relationship with the customer.
You have to,
so you're improving one really important part of the problem.
You still have to do all the other work.
Yeah.
Now ChatGPT and things like this can help with some of that.
They can help with graphic design, and, like, AI is
good at many different pieces.
But I think that it will take time for us all to figure out how best to utilize this,
and is it cumulative or is it disruptive or how does that work out over time?
How much faster are developers getting in your estimation?
Like with these copilots, it feels like they're getting 10, 20, 30% faster year over year?
I don't know if it's cumulative is the problem, right?
So because what I've seen is sort of what I was getting at is like,
I've seen a lot of boilerplate get automated.
I haven't seen a lot of the actually
interesting part of product design get automated.
Fascinating.
Yeah, so that's where the human creativity will be.
Yep.
Yeah.
And so this is where, like, yeah, if you take, I don't know, go back in the day, XML or something.
Like, if you take something super boilerplatey, then AI automation's amazing, right?
But there are also other better ways to do that.
You know, so that's a different way to look at the question.
We also have a little bit of a corollary for this.
I look back on my career and it's like, it was two decades before everybody got a PC
on their desk and in their home.
It was literally from like 1980 to 2000.
By the time you got to 2000,
the idea that somebody didn't have a computer at work was like,
really?
I mean,
you'd have to look really hard in an organization in 2000
to find somebody with a desk without a desktop computer on it.
Or cell phones, right?
And then you look at cell phones,
two decades.
Usually, with disruptive technology,
diffusion takes time, right? Now,
I think this may go much faster than hardware transitions did
because the inherent time delays in manufacturing and stuff like that
are much lower,
but it'll be similar.
Well, I mean, now we, but I think that's, you just nailed the point, which is then you look at something like Google, Uber or, you know, some other software-based platforms that don't require, you know, hardware that are built on top of them.
Those things all took 10 years.
So, you know, I think maybe this next group, you know, maybe we go from 20 years to deploy, to 10 years to deploy and hit the masses.
And maybe now it's three, four, five.
Well, as you look at startups, right?
I mean, I think that I've seen so many of these, I'm sure you've seen probably 100x more, but so many of these folks that are like, look, I built a thing.
It's a thin layer on top of ChatGPT.
I hacked it together in a month.
I'm going to make massive amounts of money,
and it's going to be amazing, right?
Yeah.
In my experience, which is obviously small selection size,
but if you can build something in a month,
so can everybody else.
Yeah.
There's no moat, by the way.
Exactly.
And so if it works,
then everybody's going to be after you, right?
And so that's one of the challenges.
And for me, this is where the things I work on can take years, right?
And so what I do is I say, okay, well,
this is going to be a 10 or 15 or
20-year journey. How do I break it down into milestones? How do I have
usefully viable things that are maybe not the big win?
Everybody wants to jump to the end, but how do I make sure we're making
progress in delivering useful value and learning and iterating and cycling,
building up to something that's really quite huge. And to me,
that's a lot more interesting. What's your next one? What's the next milestone? What's
the waypoint that you're working towards? Yeah. So why don't we go back to Modular?
Because I don't think we've talked much about products and where we are.
Yeah. So Modular, what we're doing is we're tackling all this complexity, right?
This industry is a mess.
We have all these people, all these companies, all this stuff happening,
and just keeping track of it is a mess,
but also you have all these infighting groups.
Like, none of the LLM companies get along.
None of the hardware people get along, none of the cloud people get along.
Nobody gets along in this space, right?
And so as a consequence of that, all that complexity is being forced on us.
And so Modular is rebuilding this from the bottom up and providing a unified thing
that simplifies this for people.
Mojo, which you brought up, is one of the major pieces of this.
What Mojo is, is a programming language.
Well, who in their right mind invents a new programming language?
Well, I've been there, done that.
I've built OpenCL.
I built one of the most widely used implementations of C++,
I built the Swift programming language from scratch, right?
And so why do you do that?
Well, you do that because you want to build and help and solve a problem
that you can't solve any other way.
Like, building a programming language should never be, in anybody's right mind,
the first thing you jump to.
But here's the problem we faced, which is that everybody in machine learning uses Python.
People generally love it, right?
Python is, I mean, my kids know Python, right?
It's ubiquitous.
And people don't consider it to be broken.
But then you run into AI where now you have high-performance GPUs and you have crazy accelerators
and you have all this kind of stuff going on and you have C++.
And you realize that Python is really great at composing opaque things that other people made.
But it doesn't give you the hackability to actually go customize and change things.
And so what Mojo does is Mojo says, okay, well, let's take this problem.
And let's do a very hard tech project of building a new programming language,
inventing all new compilers and runtimes and very low-level system stuff that allows Python to scale.
Let's embrace Python and its entire ecosystem.
Because what I've learned in my experience with this kind of stuff is that generally humans love to learn things.
We all love to grow.
We like learning new techniques.
We want to put new things in our toolbox.
It's all great.
But we hate resetting to zero so that we can then learn.
And so what Mojo allows you to do is, if you know Python,
you can walk right in, the things you already know continue to work.
But now if you want to write some high performance code, you can do so.
And not everything needs to be high performance.
You can choose where you care about applying the time.
And that allows you to scale.
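To make the trade-off described here concrete, the following is a small editorial sketch in plain Python. It is not Mojo code and not from the conversation, and the function names are hypothetical. It contrasts a hand-written loop, which is easy to customize but interpreted, with a version that composes built-ins whose inner loops run in C under CPython.

```python
import operator
import timeit


def dot_pure_python(xs, ys):
    """Dot product as an explicit Python loop: easy to change, but interpreted."""
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def dot_composed(xs, ys):
    """Same result by composing built-ins whose loops run in C (in CPython)."""
    return sum(map(operator.mul, xs, ys))


xs = list(range(100_000))
ys = list(range(100_000))
assert dot_pure_python(xs, ys) == dot_composed(xs, ys)

# Timings depend on the machine and interpreter, so none are quoted here.
for fn in (dot_pure_python, dot_composed):
    seconds = timeit.timeit(lambda: fn(xs, ys), number=5)
    print(f"{fn.__name__}: {seconds:.3f}s for 5 runs")
```

The composed version is typically the faster of the two, but changing what happens inside its loop means leaving Python for a lower-level language. That gap between the two styles is the one the conversation says Mojo is aiming to close.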
And so a big part of what Modular does is, our number one mandate is
meet the consumer where they are.
Right?
And guess what?
A lot of developers are on Python.
We love Python.
We want to make it better.
We're not trying to go, like,
make a completely different system
that has nothing to do with Python
and hope it ends up being better.
It's a different approach.
AI is what you're talking about.
Huge mess.
Like all these different fighting systems,
there's no thing to plug into.
None of the stuff is compatible.
So what Modular provides is this thing called the AI engine.
The AI engine is a drop-in compatible replacement
for TensorFlow and PyTorch.
And so if you're using PyTorch, if you're using TensorFlow,
you do not have to rewrite your code.
Turns out who wants to rewrite their code?
Nobody stands up, right?
And so what we can do is we can be a drop in replacement that then provides a ton of value.
And so for a lot of enterprises, it has value in terms of
consolidating, eliminating all the point solutions.
And so many people have a little bit of TensorFlow, a little bit of PyTorch.
So that's a huge four.
Now they have a little bit of CPU, a little bit of GPU.
They have a little bit of this, a little bit of that.
They have different kinds of models and different kinds of specialized things,
and we can consolidate that into one simple thing that turns out is commercially supported.
Who wants to run their own mail server these days?
Do you want to build and run your own cobbled together storage thing?
Exactly. It doesn't make any sense.
Again, AI needs to grow up.
It's programmable and extensible.
Do you want to give up your product strategy to somebody else?
Well, no.
It turns out that people want to take models and then customize.
You want to make it work right for what you're doing.
And so having the ability to hack the system is actually super important.
It's accessible via hardware.
And with all these different pieces, the Mojo and engine story comes together.
How hard is it to make it compatible with each different hardware platform?
How long does that take?
It's super hard.
So, I mean, if you want me to talk about my backstory,
like I've been working on these super exotic, esoteric compilers and systems and GPUs and
accelerators and things for decades now, right?
And so a lot of what brought Modular to exist is this realization that if we keep building
one-off solutions to each of these things, we as a software industry will never scale.
And so a lot of the core tech, a lot of the core invention at Modular, and the reason that
what we have is interesting, is we enable people to bring up hardware much faster.
And so, for example, just on the CPU front, lots of people use
Intel CPUs. They're really great.
They're pervasively available
in cloud. It turns out
that PyTorch, for example,
is super optimized by Intel for Intel
CPUs.
Also turns out that you can get AMD
CPUs in cloud.
Turns out their instance types are usually much
cheaper for the same amount of
performance horsepower.
But guess what? For some reason, it doesn't run super
effectively on AMD CPUs. Oh, wow. Go figure.
Go figure, right? And
so it turns out Modular has massive
performance uplifts on Intel, even bigger uplifts on AMD,
but then you can also go to these other instance types like Graviton,
which are Arm-based cloud servers,
and they're even less expensive,
and our performance uplifts are even bigger, right?
And so what we can do is we can provide the ability to move your workload
to the place that makes sense for your thing.
And for us, bringing up Graviton,
just in terms of bringing up an entire machine learning stack,
it took us four hours.
Wow.
For a completely new architecture.
And that's one of the things that nobody in the industry,
in the AI infra industry, has: the ability to bring up the entire stack quickly
and then do performance too.
Most of the time, the problem you have is that you have to do all this incremental work
to get new kinds of models to run.
And so that's one of the reasons why you get all this fragmentation.
There's all these translators.
You have Apple, who decided they would get off Intel.
They never were on AMD, but Windows was on both.
And they started doing these M1, M2 chips.
They're pretty extraordinary in terms of running a laptop or a desktop, in terms of performance, video,
and of course, you know, battery life.
They're optimized for, you know, a very consumer bent, let's say.
And that's the world I lived in for years at Apple, right?
Right, right.
Helping with hardware transitions, helping the Watch get from 32-bit Arm
to 64-bit Arm, all the complexity that goes into that,
that Apple makes magic for developers so that nobody has to know about it.
Yeah.
Are they, do you think they're going to play a role
here?
Do you think their chips are so high performance that they've got a shot at taking on some
machine learning and, you know, AI jobs, in all sincerity?
Or is it just, because I was just watching somebody, you know, putting Llama, they were,
you know, trying to build some models on their M2 and they were just like, wow, this is
pretty extraordinary.
Yeah.
So what I've seen out there is that, so I've been out of Apple for a long time, so I don't speak
for Apple.
I know nothing about the roadmap, et cetera, et cetera, et cetera, et cetera.
Disclaimer, disclaimer.
I don't think they're interested in the training market.
Their hardware is completely irrelevant there, in my opinion.
And they're not even trying because they don't think it's an interesting market.
It's not consumer aligned.
It's very low margin compared to this.
I mean, Nvidia is excepted, I guess.
But that's not their strong point.
What they're really focusing on is the client.
And so you look at it, there's all these llama.cpp and things like this where people are running LLMs on their laptop.
Apple's all over that.
They're super into that.
And it turns out that, again, you look at the shift that we started from,
there's this training part of the problem and then the inference part of the problem.
What we've seen is this rise of pre-trained models.
And so training a model is actually becoming less important over time, maybe,
at least the number of people that participate in that can go down.
And, you know, if Meta keeps launching, like, amazing models that they train themselves, right?
That are good enough or great enough.
Yeah.
Then you're on the inference side and, yeah, running it on your desktop
becomes super interesting.
Right.
And inference is the part
that you integrate into your product.
Right.
And so that becomes the interesting thing.
You want to run ChatGPT on your phone.
You don't want to train ChatGPT unless you're crazy.
Yeah.
Yeah.
Amazing.
Well, listen,
great start.
Really excited to see where you take it.
I know you're on a hiring binge right now.
And you really want to bring talent on board.
Yes.
What's your pitch to developers on why to come work on this problem?
And what are you looking for?
And what's the culture like at Modular?
Yeah, so what we're doing is we're taking on a really hard technology problem.
Right.
So this is a part of the problem in a layer of the stack that very few people understand.
And honestly, it's things that people want to build on top of instead of having to understand.
Right.
But now for the specific kinds of hardware, software, cloud folks that care about super scale,
turns out there's a lot of money being spent in the space.
Turns out there's a great set of opportunities in front of us.
It's a really exciting time in the domain.
One of the things that's really unusual about Modular
is that we don't run from demo to demo to demo to demo.
We actually build high-quality production stuff,
and we care about building things right.
And what I found is that if you build things right
and deliberately, strategically,
and you put down the bricks one after the other,
you can build some pretty epic things.
And you look at Mojo, for example.
We're building potentially the successor to Python.
Amazing.
Right?
We love Python.
Python's never going to go away,
but this thing can take Python and give it superpowers.
And as it does that, right,
the opportunity to impact hundreds of millions of developers is profound, right?
And you look at AI.
How many developers can AI impact?
Uncountable.
All, right?
All.
100%.
I mean, and the fact that we might have, you know,
a larger aperture of people who could participate in developing, right?
Exactly.
It wasn't open to as many people.
And now with these tools, it clearly.
Exactly.
And so Modular, right,
what we're doing is we're focusing on this layer of the stack that we think we contribute
to.
So we're not building the LLM.
We want to help those people do that.
We're not building the cloud.
We're not building the hardware.
We're helping solve this problem that we think is really useful for people
and it will allow other people to build on the platform.
And as we're building this thing out, our platform is opening up.
As an open platform, we think we're going to be able to help lots and lots and lots of people,
which is super fun.
And you want to build it right.
So I was just looking at your careers page.
Go to modular.com slash careers.
If you want to build important things and you want to build them right
and enable a lot more people
to participate in the AI future.
Listen, you've been a great guest.
Please come on again.
And continued success with it.
I'm a huge fan of yours, Jason,
so thank you for having me.
I appreciate that.
All right, everybody.
We'll see you next time
on This Week in Startups.
