This Week in Startups - Expanding AI chip capabilities beyond Nvidia with Modular CEO Chris Lattner | E1808

Starting point is 00:00:00 If you go back in time, I built a technology called LLVM, which is this fairly obscure compiler technology that is probably on your phone today and on many of your laptops and in your consoles and things like this. That technology helped unify a generation of compute around CPUs in particular. And so LLVM was great for hardware people because they could integrate with LLVM,

Starting point is 00:00:21 and then they got all the C++ and all the Swift and all the other languages and Rust and Julia and things like this for free. But machine learning doesn't have that. And so what Modular is building is that thing where, once you plug into it, you have a full AI stack. And for a hardware maker, that's a very powerful thing. This Week in Startups is brought to you by Roots.

Starting point is 00:00:43 Invest in the only real estate investment trust that creates wealth for you and its residents at investwithroots.com slash twist. Supergut is the only nutrition brand clinically proven to improve digestion, balance blood sugar, sustain energy, and manage weight. Save 25% on the delicious shakes, bars, and prebiotic mix at supergut.com with code twist. And LinkedIn marketing. To redeem a free $100 LinkedIn ad credit and launch your first campaign, go to LinkedIn.com slash This Week in Startups.

Starting point is 00:01:23 All right, everybody, welcome back to This Week in Startups. We're excited for today's guest because he's worked at some of the biggest technology companies in the world, working on AI. His name is Chris Lattner. His company is Modular. He's worked at Apple. He's worked at Tesla. He's worked at Google.

Starting point is 00:01:41 And now he's got his own startup, as I just said, Modular. So, as we all know, Nvidia's dominant right now in the AI space. $16 billion in revenue in Q3. That's 2x year over year. They're wildly profitable. Stock's doubled since 2023. But as we've said on this pod and on All-In, there's going to be competitors coming, right? Of course. And some startups are going at Nvidia on the hardware front. We had

Starting point is 00:02:08 Lightmatter on recently in episode 1787. And they're trying to use optics, photonics-based chips basically, to move data around. It's going to make things cooler in data centers and help with these large AI jobs. Well, Chris is taking a different approach at Modular. They're going to make it easier for developers to run AI models on non-Nvidia hardware, and they just raised $100 million, as AI companies are apt to do in 2023. Chris, welcome to the show. Well, quite the introduction, Jason. Thank you for having me.

Starting point is 00:02:45 It's great to be here. Yeah, great to have you, and you are in the thick of it. One of the things I hear over and over again from people deep in the AI space, I had a conversation with Elon about this not recently, and we see it at OpenAI and other places, is only a small amount of the hardware that's being purchased is being used at any given point in time when AI jobs are running. So for people who are technical,

Starting point is 00:03:12 but maybe not working in the specific field, why is it that when we push a job, you know, we're doing ChatGPT-5 or Claude 7.0, whatever people are doing, they're doing a LaMDA or a Llama. I mean, there's just so many different things on Hugging Face right now. Why is it that the hardware is not optimized for these jobs? Why do we find ourselves in this? And then what is the actual percentage of the hardware being used,

Starting point is 00:03:38 whether it's an H100, A100, or my M2 on my MacBook Pro? Yeah, so it's super interesting. If you zoom into what is AI these days, right? So many people focus on training. You have to start with the research. You have to start with models. Models are changing all the time. I mean, just follow what's happening.

Starting point is 00:03:56 It's hard to keep up with the pace of innovation and the model architectures. But then there's also the inference side of things and the deployment side of things. And so these two markets, these two problems are actually completely different. So what you're talking about is you're actually referring to the training side of this. And modern training jobs, as many people know, have gotten huge, right? You get tens of thousands of nodes, thousands of GPUs. These are monstrous jobs. And so because of that, what you get is these time sharing systems.

Starting point is 00:04:23 And so it's super funny. Like, we went from personal computers all the way back to the mainframe or the job sharing. Like, I'm going to put in my punch cards. Right, that was Perot Systems. Yeah. Yeah. You'd rent time on somebody's mainframe. Well, yeah.

Starting point is 00:04:38 So we're back in those days. And so actually the better analogy, if I'm not joking about it, is HPC systems. And so if you go back 10 years or something, you'd get one of these massive supercomputer systems that a national lab would install. And then researchers would have to, like, walk up and allocate time against it, right? And so the big question then is how do you amortize the spend for the hardware across a lot of work that happens on any one of these massive supercomputers? And training systems today, they're massive supercomputers in every way, shape, and form.

Starting point is 00:05:09 The programming models are very different. The workloads end up being a bit different. And so there's some differences, of course, but the way they get managed is very similar. Now, what I've seen is different groups that own these things manage them sometimes better, sometimes worse. And one of the challenges you'll see is that, for example, the big research teams may allocate, you know, 20,000 GPUs or something. But then the question is, how do you fully utilize it? This is one of the cases where time sharing, like clouds, is actually really great because often you're not training models all the time, right? Your model training is actually proportional to the research cycle that you've got going on. And so if you're, you know, one of the massive companies like Google, where you have thousands and thousands of researchers,

Starting point is 00:05:52 what you'll do is you'll have this big hardware pool, and then you'll have the researchers that are all effectively putting in their slot so they can use the machines when they come up, and then they run their batch job for perhaps hours, perhaps days, perhaps months, right? And they get allocation for it. But if you get these smaller groups where sometimes they're on cloud, and so they're just renting by the hour,

Starting point is 00:06:11 sometimes they build their own data centers, and then the problem they have is, okay, cool, you have all this hardware. How are you utilizing it? Is it being productively used? And so these are major questions that I think the entire industry is struggling with. But if you go just adjacent to that, that's training. That's where the models come from.

Starting point is 00:06:29 If you go to production, the character is completely different. And so here, you're not talking about supercomputers. Here you're talking about the fact that, you know, you may have tens of researchers that train a model and they use a massive amount of hardware to do so. But then you need to deploy that model. You need to deploy the model. The problems are completely different. Right.

Starting point is 00:06:50 Here the problem is you have a billion users. And a lot of queries and then a lot of follow-up queries. And people want to, I guess, I'm not sure what it's called when you, well, there's prompt engineering and the prompts are getting more sophisticated. So all that creates load on the system. Yep. And the load on that system is really different. Instead of it being one massive computer that is then batch scheduled, what you need is

Starting point is 00:07:11 you need scale out. And so any one of those systems is actually a single node often. But now you need thousands and thousands of these nodes, and those are fully utilized, right? Because you've got users in 24 time zones, in all the time zones, right? And so that's actually a very different problem, and it's super interesting.

Starting point is 00:07:29 And so if you look at AI today, it's super fascinating to me how much energy has been put into the training side. Everybody's always talking about the research, models, and the training, and the training, and the training. Few people talk about what it takes to get that thing into production. Yeah. And one of the big challenges that we as an industry are facing today

Starting point is 00:07:45 is that, you know, these systems that people build with, like TensorFlow and PyTorch and these kinds of things, were always built by the research team for training. And so getting that model into production is super difficult

Starting point is 00:07:56 and this is almost an unsolved problem these days. And one of the challenges there in particular is it's not just about cloud. Often you want to train a model and then put it on a phone. Right? And so it's a very different problem space

Starting point is 00:08:07 and it's much harder than some, I mean, it's very, both of these problems are really cool, but it's super hard. Explain to folks, after all the training has been done and then you have this language model and you then want to load it onto a phone. How does that all work? What is the output and how would you explain it to a layperson of, hey, we built the model, but now we want to

Starting point is 00:08:32 distribute the model to a bunch of different places and then let you play with it. But what is required there? So I don't think that it would be in good taste to talk about how we do this because it is so complicated and nasty and horrible that we cannot go into all the details. But I'll give you a sense. Because that's how I am. Right. So if you take a traditional enterprise, it's building ML into their products, right? Often they're not building one model into one product.

Starting point is 00:09:00 Right? So they have many different kinds of models, some recommender models for like, hey, maybe you should look at this in your shopping cart next. You have classification models. So you're looking at, okay, well, you like that shirt. Like, maybe you should pick this shirt. There's many different kinds of products.

Starting point is 00:09:16 They then get matrixed into many different kinds of things that they're deploying into. So often cloud is a big deal, but then you have mobile apps and a lot of other things. And so what has ended up happening is that deploying ML today involves building this entire matrix of all these point solutions, because there's no one

Starting point is 00:09:33 thing that allows you to span across all of these things. And so what you end up using is like this catastrophic array of like 15 different tools. And all these tools have different problems. Like, so I'm an Apple alumni. I have a ton of

Starting point is 00:09:49 The easy-to-use programming language for building apps. And so I love Apple and I love the Apple folks, but to deploy ML onto an Apple platform, you have to use their point solution called Core ML. And Core ML is not compatible with all the models, and so there's all this friction just to get onto an Apple device, right? And so Apple devices are pretty common out there. And if that's hard, you just think about what it means for this wide spectrum of different things. And one of the challenges here, the fundamental, the incentives,

Starting point is 00:10:19 structure problem is that hardware makers like Apple, like many other hardware makers, always want to build a solution for their hardware. And nobody's trying to build something that scales across everything. And so this is what we're focused on. Hey, everybody. Today I'm joined by Roots CEO Dan. Welcome to the show. Thanks for having me, Jason. Tell everybody here in the audience, what is Roots and what makes it different than the other real estate investing platforms? I'm a complete neophyte. Roots is a REIT with a little twist. Sorry, had to do it. We are the first real estate portfolio that we know of that builds wealth for both our investors and our residents. And we've created a unique win-win model that creates partners and not tenants.

Starting point is 00:11:00 Am I as an investor, if I wanted to put money into this, getting dividends, or am I just getting the growth of it? How does all that work? When you invest with us, you get to participate in two ways. One is through the distributions of profits generated at the company. And we pay those out quarterly. Over the last 12 months, that's equated to about a 6% cash on cash return to our investors just in distributions. And then the other way everybody participates is each quarter, we reevaluate what's called our net asset value. And as that ticks up, our unit price or our share price of our portfolio goes up as well.

Starting point is 00:11:38 And that's how you would basically be able to sell your share at any point and liquidate your investment and move on to your next piece or leave it in and keep growing with us. Head to investwithroots.com slash twist to sign up and start investing today. That's invest with roots, no spaces, no dashes, dot com slash twist to sign up today. Because Nvidia has CUDA, right? That's their software for writing their machine learning apps. Apple has theirs.

Starting point is 00:12:10 And these two things are just... Google has theirs. Tesla has theirs. Like, everybody builds their own thing. So if you go back in time, why does everybody build their own things? Is it just because it didn't exist before or because customization is necessary

Starting point is 00:12:26 to get the end result they want? Well, because they don't have a choice functionally, right? And so it's super interesting. I mean, AI is so important to what we do, right? Nobody takes a step back and says, if AI is so important for the industry, why is all the AI software so bad?

Starting point is 00:12:43 Right? And so you look at that. Is it a function of time? We just were so young in the game? Yeah, that's a big aspect of it. So the analogy I give to people is that AI is like an adolescent. Like, it's like a teenager, right? It's very exciting. It's overconfident. It's got some wins under its belt. It sometimes rolls over its parents' car and causes a mess, right?

Starting point is 00:13:03 But what's happening right now is everybody just wants AI to grow up. Like, people want to build AI into their products. They want to not mess with the AI infrastructure. They want to actually be able to deploy things and build AI-enabled products, right? And right now, if you're one of the FAANG companies, for example, you can take a team of 50 people and brute force it. But if you're one of the many other people that should be using AI in their applications, it's so much more difficult. And to your question, like, why does everybody build their own stack? They don't have a choice. Like, all of the technologies that exist today are built for a particular piece of hardware or they're built by a research team. This stuff is not production quality. And if you go,

Starting point is 00:13:43 if you go back in time, I built a technology called LLVM, which is this fairly obscure compiler technology that is probably on your phone today and on many of your laptops and in your consoles and things like this. That technology helped unify a generation of compute around CPUs in particular. LLVM was great for hardware people because they could integrate with LLVM, and then they got all the C++ and all the Swift and all the other languages and Rust and Julia and things like this for free. But machine learning doesn't have that.

Starting point is 00:14:11 And so what Modular is building is that thing where, once you plug into it, you have a full AI stack. For a hardware maker, that's a very powerful thing. And what's Nvidia's take on what you're doing? Are they supportive of what you're doing? Or do they feel like what you're doing,

Starting point is 00:14:28 they're not supportive of because it's going to help, you know, people maybe port to other hardware platforms and maybe take away their dominance? Or do you get the sense that they care about their dominance at this point? I mean, they seem to have run away with it right now. Yeah, well, great question. So, I mean, there's this narrative in the industry that we're

Starting point is 00:14:45 here to hurt Nvidia or something. Nvidia is one of our most important partners, right? And one of the things that I think people forget about is Nvidia is really invested in building some really crazy, exotic next generation products. Yeah. Right. And so what we're interested in doing is we're interested in expanding the developer ecosystem that can use those products. So we're on a very complementary set of missions here, right? And so what we're doing is we're looking at saying, okay, well, this whole AI thing, it evolved rapidly.

Starting point is 00:15:17 Again, it's very high potential, but it's all a mess. Like, the people who do it, as you know, are wicked smart. Some of the most brilliant people in the industry. But there's other good people, too, that have good ideas. And so if we expand out the developer community,

Starting point is 00:15:30 if we 10x the number of people that can participate, think about the amount of innovation that can happen. Think about the new use cases and applications. Yeah, right now people don't actually know this, but a lot of what's happening in AI is limited to people who can code in CUDA. Kuda, Kudo, what is it? Yeah, CUDA.

Starting point is 00:15:49 Yeah, CUDA. And then I guess some people write in C# or C++. What are the other ways people generally get AI code, you know, down the hardware stack? Because you're building Mojo, I know, which is, you know, more Python-like, I think. Yeah, well, we'll talk about that. So it really, it really varies. And again, AI is not one thing. This is another thing that I think people get sometimes distracted by,

Starting point is 00:16:11 but it's not like transformers are one thing, for example. And so if you look at a Llama model or, like, Stable Diffusion, which is a U-Net model, which is a very different architecture, what you get is a lot of Python on the outside. The Python handles what's called tokenization, converting input text into something the model can understand. You then get something like PyTorch or TensorFlow involved, which is itself a gigantic, complicated thing that is awesome in some ways,

Starting point is 00:16:38 but also challenging in other ways. You get custom CUDA kernels, as you're saying. So you want to get high performance out of one accelerator. And so you get C++ because sometimes Python is really slow. And so what ends up happening is, as the developer building one of these next generation models, you have to know all of these different things. And so practically speaking, no sane human actually can do that. And so this is why you need teams of experts.

Starting point is 00:17:03 And these teams are super experts in every single different one of these parts of the problem, where somebody knows model architecture and differential equations, somebody knows CUDA, somebody knows C++, somebody knows all these things. And so only that is what's able to bring these things together. Which, we've seen this movie before in the early days of the web, setting up a web server itself, getting a Sun Microsystems, you know, server. You know, it wasn't like today, obviously. And remember when we had apps come out, even pre-iPhone, if you were trying to build something for Nokia or DoCoMo or any of these other platforms around the world. It was really hard.

Starting point is 00:17:39 And there was a limited number of people who could do it, which meant you just didn't see a lot of apps. They would come very slowly, a couple of apps a year. They were super interesting. And they're expensive too, right? Because the development costs were so high. Yeah, which means something that's fun or interesting. The idea that there would be

Starting point is 00:17:56 an app for skiers, like I have an app on my phone for skiers called Slopes, there's like probably a half dozen of them. The fact that there's a solo developer or two-person development team on their weekend hustle building an app, it's just a crazy thought. I mean, you were at Apple when this happened, the concept that an app could be made by one person in their spare time and get to a million dollars in revenue or even $100,000 in revenue, $10,000 in

Starting point is 00:18:18 revenue. There were so many hurdles to that. You had to actually do deals with the carriers. You had to put up servers yourself. You had to figure out how to get that app on, yeah, getting the app, the distribution on people's phones was a roadblock. You just think about the genius of Steve Jobs. The App Store distribution, the payment rails for people buying it. And then there's really lightweight, easy app discovery and the ability to write them. So you're working on Mojo. This is a programming language. Well, just before we move on from Apple, right?

Starting point is 00:18:48 So my job at Apple was to lead the developer tools team, right? I mean, I had many hats, but by the time I left, I was running the developer tools team with Xcode, the whole iOS app development ecosystem, built the Swift programming language. Also supported all of the internal hardware; Apple has very fancy, very exotic, next-gen hardware that they're building. And a major part of the job is to make people more productive. Make it so more people can participate, exactly as you're saying,

Starting point is 00:19:16 because so many people have good ideas for apps, right? And so if you get more people involved, like the move from Objective-C to Swift massively simplified things, made it much easier to learn. That was a huge movement that then enabled entirely new categories. And so many people today tell me, you know, I was able to become a programmer because of Swift, right? And so ML, I believe, has got exactly the same thing going on, right? Where it's absolutely possible for the most advanced teams to achieve things, right? But first of all, like complexity, which is really our enemy here, complexity, like if you fill your head with accidental complexity, you don't have space for other stuff.

Starting point is 00:19:53 Yeah. And so by relieving the accidental complexity, you make the teams of experts even more productive. But then you're also more inclusive to other people that have good ideas, but are, you know, repelled by the complexity. What are the strategies for getting rid of complexity? I mean, I'm just thinking about playing chess. You kind of learn some heuristics, some basic sets of moves, chunks of moves that you can apply in different places. Or, you know, we have copilots, which, you know, and we have open source.

Starting point is 00:20:21 We have a lot of different ways to help people with complexity. But when you look at complexity in the world, what do you think of? Do you have a playbook for reducing complexity? Yeah, absolutely. So, and this is one way that Modular is very different than pretty much everybody in the space, but reduction of complexity comes through abstraction, and through getting people to be able to work together.

Starting point is 00:20:45 Okay? And so the idea here is that you look at all the domains of people that are involved, including all the people putting together the transistors on the chip, right? There's so many different specialities. The details can't fit in any one head. So success comes from teams of people, right? And then composing on other people's work. And so a lot of what I think software has been successful,

Starting point is 00:21:08 I mean, you've built some pretty epic systems, right? Yep. It comes from being able to take things that other people built that you don't have to understand and then build new things on top of it, right? And so what a lot of folks are doing today in ML systems and MLOps and a lot of these things, they say, okay, well, there's so much complexity out here.

Starting point is 00:21:28 What are we going to do? Well, we're going to throw a layer of Python on top of the stack, and then you'll deal with our layer, and look how simple it is. Therefore you don't even know about any of this complexity. Now, there have been dozens or hundreds of attempts at this. I mean, there's a lot of stuff out there. Some of it's really good, but the challenge with that is if you're building on top of something like TensorFlow or PyTorch

Starting point is 00:21:47 or, you know, you're trying to get onto novel kinds of hardware, like a TPU or something like that. Well, you actually get exposed to all this accidental complexity because it all leaks. And so, yeah, you get this cool demo, but you can't fix performance or scalability or programmability or security or like these core problems that people struggle with by adding a layer of Python on top of systems that are fundamentally broken.

Starting point is 00:22:14 Yeah, the thought doesn't work. And in a way, what we've seen happen in the modern web over time, you have cloud computing, abstracting away, putting up servers. And then storage got abstracted. I mean, GPS got abstracted

Starting point is 00:22:32 away. There's a software development kit, an SDK, for anything. There's an API for anything. And then even building glue between systems has gotten easier. I used to call it middleware, I guess, back in the day. I don't know if there's still a term for that. Enterprise JavaBeans. Yeah, it was always like weird stuff to try to get you to move data from one system

Starting point is 00:22:50 to the other. It seems like comical now. Maybe you can just talk about the complexity in the world writ large and in the technology stack because you've been at this for a couple of decades. It is pretty amazing. When somebody's coming in now, a 20-year-old developer in school who is, like, building stuff,

Starting point is 00:23:08 how much do they know about what's actually going on beneath, you know, you see the little tip of the iceberg, are they even aware of, like, the complexity underneath? Yeah, well, so, I mean, again, it's hard to make generalizations about all 20-year-olds. Yeah. Because there's some variance there, but...

Starting point is 00:23:24 On the average 20-year-old... On the average 20-year-old, they know Python. Yep. They know, if you go into computer science, you know how to train a neural network, for example, but you don't know how to deploy it, right? You get exposed to some other programming.

Starting point is 00:23:37 Maybe you'll get a little bit of C++ or something like that. But most people coming out of a computer science degree know Python, and pretty much everybody that is not designed to be a computer scientist, so there's a lot of other fields out there, knows Python. Right? And so Python is great because it's super high abstraction. It's like the ultimate duct tape language where you can bolt together these very powerful libraries. But Python also has certain challenges when it comes to performance

Starting point is 00:24:04 or dealing with hardware or a lot of the things that inhabit the AI space. And so running Python on a service with a billion users is not always great. And so there are challenges there. And so if you come back to what is Modular doing about this, what we're tackling, instead of adding layers of Python on top of existing systems, we're saying, let's go explode those systems. Let's do the hard thing. Let's go build the system from the bottom up.

Starting point is 00:24:33 And this starts at the hardware. The hardware, there's a lot of really good hardware out there. To your point, nobody knows how it works. I mean, the people that built it do, but most application developers don't know how it works. And what has happened is that right on top of the hardware, there's all these different layers of effectively middleware, just like you said. Right. But each piece of hardware has a different layer of middleware. And so that means that when you get to the top layer, the part that anybody actually wants to work on is super fragmented.

Starting point is 00:24:59 And it makes sense. It's the incentive structure of the people building the hardware. They want to build a thing for themselves. But the losers are all of us trying to get our jobs done. Many people in ML don't want to care about the hardware. They're made to care about it. You've heard me talk about Supergut a bunch. This has been a key part of my health journey.

Starting point is 00:25:19 It's an awesome nutrition company that my bestie, David Friedberg, from the All-In Podcast, started. I love their bars. I love their shakes, especially the gut balancing chocolate brownie bar. It is delicious. They also have an unflavored prebiotic mix. You can add it to anything. I like to put it in my coffee. You can put it in your oatmeal. Their products are super helpful for weight loss. Why? Well, Supergut's products mimic the effects of Ozempic by boosting your GLP-1 hormone. This helps quell hunger and boost

Starting point is 00:25:48 your metabolism, which is a great, great combination, obviously. And Supergut's prebiotic fiber actually alleviates digestive issues. And obviously, the products all taste great. The best part, the team at Supergut actually put the work in and scientifically proved their products work. They conducted a placebo-controlled clinical trial with Stanford last year. That's been published in the medical journal Diabetes, Obesity and Metabolism. The results were amazing. The participants in this study, they lost weight, they lowered their blood sugar, they improved their metabolic health, and they had improved digestion and so much more. Whether you want to improve your gut health, maybe drop a few pounds like I did, or just feel better throughout the day. And listen, you're busy,

Starting point is 00:26:26 you're traveling. I like to bring Supergut with me. Go to Supergut.com and use the code twist. You get 25% off. Go to supergut.com and use the code twist to get 25% off. I've been on this health journey. I've lost 40 pounds. A big part of that sincerely was me using Supergut. So go to supergut.com and use the code twist for 25% off. What is this hardware going to look like in five or 10 years? Because we're at this point in time where what OpenAI did with, I think, 3.5 really kind of captured people's imagination and, you know, being able to actually play with it, inspired a lot of developers to maybe get in there. And so here we are, everybody buying up, sovereign wealth funds, you know, governments, countries, you know, individuals, companies, startups, everybody buying up all this hardware, racking it, data centers.

Starting point is 00:27:18 And it seems to me, having watched this happen with fiber, you know, we overbuilt fiber massively, and then all the fiber companies, WorldCom, etc., there were a ton of these, went bankrupt. They became worth literally 98, 99% less than they were when they went public and all that

Starting point is 00:27:38 wound up getting bought by Google and other people at auctions. Are we in a similar moment right now where we're building up massive capacity or do you think there's enough jobs here to actually use this hardware? And then the second part of the question, so there's something about like this moment in time,

Starting point is 00:27:53 where does this all wind up? If we're sitting here, five years from today, are we looking and going, hey, wow, there's somebody just leapfrogged Nvidia or there's three choices. You can go just like you do Android or you can do an iPhone or you can pick AWS Azure or Google or Rackspace or right on down the line.

Starting point is 00:28:13 Yeah, well, so great question. So there are really two different questions. Two different questions. One question there is the today problem. And the today problem, everybody's talking about Nvidia and the stockouts of Nvidia, and wouldn't it be great if there were other options. It's super funny because the majority of spend by many metrics is actually on the inference side,

Starting point is 00:28:32 which is still very dominated by CPUs. Yeah. And again, like, we talk about the pain point. Well, the pain point is people try to build these massive systems and there are not enough GPUs to go around. But meanwhile, so much AI is in our life. That's all being served in cloud. A lot of that's happening.

Starting point is 00:28:47 I mean, some is on GPUs in cloud, but a lot of that's on CPUs. Right. And it works totally fine. And it works totally fine. If you're on Amazon and it's showing you some additional products like the one you're looking at, in all likelihood, that is a machine learning job that's being done on a CPU that was written five years ago or 10 years ago. Or if you do a Google search query, there's dozens of models all talking and doing weird things. And there's this intricate dance, right?

Starting point is 00:29:11 And so it's really interesting. If you look at that, your question about is there going to be an oversupply and overabundance? I have no way to know, right? My goal is to increase consumption by creating new categories. And so, and it has nothing to do with H100 or Nvidia. It's just about AI and the applications of it are like a good thing. It makes people's worlds better. And so if we can increase the number of cool things and make our lives better, that seems good to me. Now, your question about where do we go from here, right? So forget about cloud for a second.

Starting point is 00:29:43 Like, so I've been working at the hardware-software boundary for decades now. And the thing that, when I zoom out and I look at this time, it's been super interesting. You know, people talk about Moore's Law ended, you know, whatever. And what is Moore's Law? Well, different nerds will argue pedantically what that means, but it really means, you know, back in the day, we'd get a new laptop, and every year it would be, you know, 2x faster.

Starting point is 00:30:06 18 months to be twice as fast. Your Pentium chip was twice as fast. Absolutely, on the same code, right? And so what ended up happening, I don't know, 10 years ago-ish, is we had multi-core CPUs. Ah, we have more than one of these to deal with, and then we had GPUs come on the scene. Yep.
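
As a rough illustration of the "twice as fast every 18 months" rule of thumb mentioned above, here is a minimal Python sketch. The function name `speedup_after` and the sample year values are made up for illustration, and the formula is just the compounding arithmetic implied by the rule, not a statement about any real chip.

```python
def speedup_after(years: float, doubling_months: float = 18.0) -> float:
    """Relative speed after `years`, assuming speed doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


# Hypothetical sample points: one, two, and four doubling periods.
for years in (1.5, 3, 6):
    print(f"{years} years -> {speedup_after(years):.0f}x")
```

Under that assumption, the same unmodified code would run about 16 times faster on a machine bought six years later, which is the free ride that ended when single-core scaling slowed down.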

Starting point is 00:30:22 Right. You look to now, we have massive GPUs. We have really dedicated AI chips like the Google TPU and Gaudi from Intel and like all these things. There's tons of these things. And we still have CPUs, but these days, CPUs have like 100 cores on them. Right. And so to me, again, many people are laser focused on the today problem.

Starting point is 00:30:43 Yeah. But what happens when you look out five years or 10 years? Yeah. Right. And to me, I look at this as driven by physics. This is not a question about software or things like this. Physics is forcing hardware to get weird and, more importantly, specialized. The rise of wearables,

Starting point is 00:30:59 the rise of personal computing, the rise of all, like, AR/VR, like all these things are a straight line towards very customized chips. And so that's very interesting, yeah. Yeah. And so we're going to have all, I mean, we're going to have even more crazy hardware in five years than we do today. And this is where you start to say, like, how can we scale the software? Right.

Starting point is 00:31:21 Nobody's going to be able to rewrite everything for every new generation of hardware. That doesn't work. And this is why we're focused on solving this problem. What do you think of the open source RISC-V and, you know, AMD licensing models and then hardware being built by other folks? Obviously, Nvidia outsources their hardware in terms of how it's being, and they're a designer as well, but it's proprietary and it's closed. So is what happened with Python and other open source and, you know, everything we've seen in the

Starting point is 00:31:49 open source community? Is that likely to happen with hardware? Or is that, you know, great question. So immediately before Modular, I worked at a company called SiFive, and they are the inventors of RISC-V. Yes. RISC-V is an open source instruction set.

Starting point is 00:32:09 And so what RISC-V allows you to do is it allows any hardware maker to create a member of the RISC-V family. And what that means, most importantly, is you get software. And so that is huge. Traditionally, you'd have, for example, Arm owns the Arm instruction set, and only Arm and its licensees can build Arm-compatible chips. Or x86, you can have Intel and AMD,

Starting point is 00:32:33 and they're the only ones allowed to build x86 chips. And so with RISC-V, it allows you to go build, arbitrary people can invent new things and play there. And I think that this is causing an explosion of innovation. And again, the challenge with that, and the good thing about that is you get an explosion of innovation. The challenge with that is you get all this crazy hardware, right? And so there's no software, and so you need software that can scale on to all this innovation.

Starting point is 00:32:56 And so that's really where the industry is kind of at loggerheads. Yeah, so AMD and these folks, they have blueprints, but they own those blueprints. They're their patents. You can't just take them and build a house with them if we're just using an analogy here. But if you take RISC-V, do they call it "RISC five" or "RISC V"? It's "RISC five." The nerdery on that is that there were four things before it. Yeah.

Starting point is 00:33:21 Yeah. I kind of got that. I've heard somebody say "RISC V," and I'm like, are you sure it's "RISC V," or is it "RISC five"? It's definitely five. It's basically caught up to Arm, I think,

Starting point is 00:33:32 in terms of throughput or it's close enough. So with any of these things, it completely depends on what you measure. There's advantages to Arm, there's advantages to RISC-V. It's all super nuanced, and a lot of people want to make overly simplified does this thing better than this thing.

Starting point is 00:33:49 And in tech, it's never really that simple. And so Arm has got a very strong position. They certainly have some challenges. They've got to stay on their toes. But really, the innovation is the piece that I care about. And I want to make it so that once these people invent really cool RISC-V-based silicon or Arm-based silicon or whatever, right, that they can actually do something about that.

Starting point is 00:34:10 Because having cool hardware that nobody uses is really kind of a problem right now. All right, listen, when you're selling to business to business buyers, you really want to get your pitch in front of decision makers. Why? Because upper level execs are usually the ones making purchasing decisions. Duh. The problem is, high level folks can be really hard to find and target on most social media platforms. But on LinkedIn, oh my God, they know all of the CTOs, all of the CFOs, all of the VPs of finance, engineering, HR, recruiting, all those types of titles are sitting there waiting for you.

Starting point is 00:34:46 And now let's just talk about the funnel. LinkedIn's about to hit a billion members. Did you know that? 950 million members at this point in time. There are 180 million of those 950 who are senior level execs. There are 10 million C-level executives in that 180 million senior level execs, which are part of the 950 million members. I am a C-level executive.

Starting point is 00:35:07 I am on LinkedIn all day long because LinkedIn equals business, business equals LinkedIn. and LinkedIn ads are built specifically for B2B marketers. LinkedIn generates two to five times higher return on ad spend than other social media platforms. LinkedIn equals business, business equals LinkedIn. When people are on LinkedIn, they're ready to do business. It's that simple. So make business to business marketing, everything it can be.

Starting point is 00:35:30 And get a $100 credit on your next campaign from me, your boy, JCal. I'm sending you the hundy, LinkedIn.com slash this week in startups to claim your credit. That's LinkedIn.com slash this week in startups, terms and conditions apply because they're giving you the hundy. Tell me how Nvidia got here to a certain extent. Yeah. Because I think we watched this happen where nerds were playing Call of Duty and they

Starting point is 00:35:58 wanted their frame rates to, you know, it doesn't even matter. It's beyond the just noticeable perception in biology. You can't even tell the difference between 120 frames or 120, 240. It doesn't even matter. But these lunatics wanted the best, and I guess NVIDIA just kept giving them better and better hardware. And then you had this crazy crypto moment where everybody started buying all this hardware from NVIDIA

Starting point is 00:36:23 to run jobs. And now AI is kind of circuitous route, I think. Maybe you could explain why that's brilliant and then what the limitations of it are. Because, again, it's not always one thing. But I think the history of how they got here is kind of important, or is it not? It's totally important.

Starting point is 00:36:42 And I mean, to your audience of people who care about startups, it's super illustrative, right? Because Nvidia didn't magically stumble onto success. It was earned, right? It wasn't an accident. And so if you go back, I'm not a super expert in Nvidia history, but my understanding is it's a combination of two really important things. So, Nvidia, like some of the other companies you're a fan of, went through several phases where they made bet-the-farm bets, had near-death experiences, and then were right. And so one of those bets was on programmability.

Starting point is 00:37:15 And so a lot of people were building the Call of Duty accelerator, and there's a bunch of competition on just make games go faster, just make games go faster, just make games go faster. And Jensen and team bet, I think it was the GeForce 3, on saying, okay, well, hard coding for graphics is not enough. Let's make it so you can do more general compute on this hardware. And so it's not going to be like a CPU. It's a different thing.

Starting point is 00:37:38 It's a different category created, but let's do this. And that was a huge bet and a non-obvious bet. Nobody else made that bet back then. Almost drove them out of business through the complexity of executing on that. But what it meant is that new kinds of things could run on the graphics card. And that created new markets. And so one of the things you're pointing out is crypto, right? Well, they didn't design a crypto accelerator. Crypto wandered up and said, I need tremendous amounts of compute. And they were there and ready to serve it. And because they had programmability, they're able to scale into the opportunity.

Starting point is 00:38:12 They talk about luck, right? Well, how do you get lucky? Well, part of it is being ready to take advantage of the luck that presents itself. And I think that is really what happened to them. If you look at machine learning, right, a lot of people go back to the seminal moment in machine learning called the AlexNet moment. Explain. And AlexNet was when Fei-Fei's team at Stanford created this big data set called ImageNet.

Starting point is 00:38:37 And they created a competition around it, and that competition was to go find the most accurate predictor and identifier for what was in an image. And so for a few years, people were working on this using traditional machine learning techniques. And then these folks invented this deep neural network called AlexNet that then solved ImageNet, not solved it, but made a massive leap forward in terms of prediction. Now, the way that story is usually told is that it's a combination of two different things. It's a combination of having a huge amount of data, but then also having GPU compute. And so we need both data and compute to be able to solve that problem and make that massive leap forward,

Starting point is 00:39:13 which then catalyzed so much of deep learning today. But the thing they forget is that nobody had the convolution kernels, the algorithms, to implement Reson. That didn't exist on a GPU back then. So the reason AlexNet happened is a combination of three things, actually. It's a combination of data, the amount of compute that was available, and then the bet that Jensen and his team made on programmability to allow some researchers to go invent some new algorithms and then do it on their platform. And then fast forward a few years,

Starting point is 00:39:42 it turns out, yeah, they're lucky that deep learning caught on and it turned out to be pretty economically important. But that's what put them in the position that caused all these things like TensorFlow and PyTorch and things like that to get built on their platform. And that's how CUDA got entrenched into so much of machine learning today.
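
For readers unfamiliar with the "convolution kernels" mentioned above, here is a minimal pure-Python sketch of the underlying operation. It is an illustrative assumption of what such a kernel computes, not AlexNet's or CUDA's actual code; the function name `conv2d` and the tiny `image` and `edge` values are made up. Each output value depends only on a small window of the input, so the values can be computed independently of one another, which is the kind of work a programmable GPU handles well.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation (what deep learning calls convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Weighted sum over the kernel-sized window anchored at (r, c).
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge = [[-1, 1]]  # responds where brightness increases from left to right

print(conv2d(image, edge))  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```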

Starting point is 00:39:57 So the journey of Nvidia, I mean, you can play this back across so many startups, right? Are you creating a new category? Are you leaning into the obvious thing everybody's talking about today? Are you seeing around the corner and betting on where technology is going, right? There's so many of these questions

Starting point is 00:40:12 that I think that, you know, there's no one right answer, but it really plays into a lot of the journey. Well, and to your point about what you're doing, Mojo, if you enable more people, the street finds its own use for technology. Exactly. The Gibson quote. Like, you say,

Starting point is 00:40:27 hey, listen, you want to do some of the, you want to try to identify an image and figure out if it's a hot dog or not? Sure. Use our GPU. You don't need our permission because it's permissionless. I mean, not crypto permissionless, but it's your hardware, you own it, do what you want with it.

Starting point is 00:40:41 And it's one of the great, great things about whether it's open source or just open platforms in general, people are building platforms. Yep. And so when you look at this from a playing field, having been at Apple and watched what happened with open platforms and apps, where do you fall on the, call it the AI wrapper debate of 2023. Oh, this company,

Starting point is 00:41:05 we have a great company, Roam Around. They let you type, they're building a vertical itinerary, travel itinerary piece of software and say, oh, you can go to ChatGPT

Starting point is 00:41:14 and say, hey, where should I go in San Diego with my kids? Or, you know, Roam Around's building it and they've got a very narrow data set and they're really tweaking it

Starting point is 00:41:22 around travel. So you have all these verticalized ones. I have a, we invest in a verticalized screenplay writing software. So where a writer, It's kind of like final draft just for that.

Starting point is 00:41:34 And I believe it's like, yeah, there'll be a lot of these vertical things because you have the interface and you have all the kind of features that will go around it. And sure, ChatGPT could do a version of it, but it's not going to do like a polished version of it. So the AI wrapper derogatory statement towards startups building verticalized AI apps

Starting point is 00:41:51 versus one giant language model quad or magically solves all the problems. Magically solves every problem on the planet. Is that even possible? or where do you think this all winds up? Well, so, I mean, I think that there's many different angles in terms of what is the better product, what captures the most value,

Starting point is 00:42:12 in terms of investment hypothesis, like what is the ROI on these things, right? So when I look at this as saying, I'm not a believer in a one-size-fits-all solution. I mean, maybe theoretically, AGI someday will come, and until then, I will hold on to that thought. but in the absence of AGI which magically solves all problems, I look at AI as being a solution to certain kinds of problems.

Starting point is 00:42:37 Right? And some people, some of my friends even, want to say that AI is better than software. You know, and it's just like a straight replacement. But that's, in my opinion, objectively false. What you can look at... What they mean by that is just having a chat interface with an AI agent and talking to them, you'll solve more problems than having to write software.

Starting point is 00:42:57 It'll just do whatever the task is. or are you saying in terms of writing software? Well, I mean, you know this, Jason. Like, you know that building a product is way more than like having an algorithm, right? It's about building a relationship with the customers. It's about having user interface. It's about having a revenue model. It's about having a brand.

Starting point is 00:43:16 It's having all of these things, right? And so when I look at that, when I look at one of these verticals, so you talk about the copywriting thing or these things, these are clearly valuable products. AI is clearly a valuable way to implement these products and it can be differentiation within that category. I don't think that makes that product magical. I think that that makes it comparable to other things in that vertical.

Starting point is 00:43:41 And so AI is a much more efficient and smart and product-focused way of building out that technology. But I would look at that as saying AI is an implementation detail of building into that vertical. And I think that has a huge amount of value. And so if you're looking as an investment hypothesis, I would not value that as an AI company per se. I would value it as a vertical, consumer vertical, whatever it is company.

Starting point is 00:44:02 And now they're doing it in a smart way using the best tech they have available. Yeah, just like there's going to be some, you know, the Yelp app, the Yelp app is so much better than using the website, right? And they just use that new technology to make it a better experience. Everybody's also looking at the David versus Goliath thing, right? And so everybody wants the little guys to take down the big guys, but the big guys have all these other things going for them, including distribution and many of these other things. Well, listen, you're on the inside of all this. I got to ask you, what is the inside track amongst people of your peers who are deep in the AI game and have been in it for a long time? What's their take on what OpenAI did? This open source, you know, or, you know, open, it's in the name.

Starting point is 00:44:49 And that, hey, we're going to, this is too important. This technology is way too important for any major company to have a wrap on it. It's really the world needs us to go out there and really make sure that it's not just DeepMind buried inside some Google, you know, corridor in some building on a campus. Sure. We're going to build this. And then they got to 3.5 and they're like, whatever, three. And they're like, you know what? We were wrong. I don't know if they ever said that. But this is way too powerful. We're going to be a closed AI. Do people look at that as just a money grab, as cynicism, or as sincere?

Starting point is 00:45:26 How does the industry, and I'm not saying necessarily you, but do people look at that and go, it's a money grab? They went from a non-profit to a for-profit. That's all it is. You know, the people there want to make money, which is fine, we all do. You're raising venture capital.

Starting point is 00:45:38 It doesn't come without expectations. So what's the take on that crazy move to go from a nonprofit to a for-profit from a open system to a closed system? Honestly, this isn't my area of specialization. I mean, I'd much rather talk about things. What do you think about this weirdness? My opinion is, what do you expect?

Starting point is 00:45:55 They took VC money to get a return. Yeah. The end. Yeah. Well, I mean, and so, I mean, I think that things that appear too good to be true sometimes are, right? And so if you're expecting somebody out of the goodness of their heart to dump billions of dollars of compute into building a free product, then, well, you're paying for it somehow. Maybe it's with your data. Maybe it's some other way.

Starting point is 00:46:15 I mean, I think this is generally true in the world. And I think people are getting smarter about that. And so, I mean, I don't know. I mean, I think the surprise is surprising, but I don't know too much about the details on how they decided to do that or what it means. Yeah, there's tradeoffs everywhere. You look at the impact on society. I am, you know, I'm an investor in a lot of companies.

Starting point is 00:46:37 And what I'm seeing on the front line of startups and inside really nimble organizations that are the tip of the spear in terms of using technology, not just to build their product, but to build their businesses, they're building 12-person businesses with four people. They are getting a lot done with less. And it happened, boom, in one year. This is year one. I mean, people still forget that it was last fall that 3.5 came out and kind of blew people's minds, let alone 4.0 and whatever else is coming next.

Starting point is 00:47:10 So when you look at the impact on the world, knowing what you know from the seat you're in, is what we saw this year, which is to say, I think people got 30 or 40% more efficient at their jobs, if they know how to use this technology easily. Is that going to compound or is it going to be the same? And then impact on society.

Starting point is 00:47:29 Yeah, also, I don't know, I don't know the math on that, but the impact's going to be huge, right? But the huge impact is also going to be spread out over time, right? The impact, as you say, you have seen it, but you're zeroed into a very specific part of the problem. We still can't hire programmers. There's not enough programmers out there to implement all the stuff that needs to be implemented. And so while it is true, it's an important part of the ecosystem, it turns out that there's a big part that it isn't. One of my questions is that when you have disruptive technology, how do you think about technology diffusion? How long does it take something that should be disruptive? And everybody knows it's a 10x improvement or whatever. How long does it take to actually get out into the ecosystem? Because sure, the neural network algorithms change every week, but we humans don't. It takes a long time for us to learn new habits. And it takes time for all the planning cycles and things like this to change. One of the things I think people forget is that as a coder,

Starting point is 00:48:27 people focus on, okay, I'm going to study up, I'm going to put the semicolons in the right place, and I've worked on programming languages forever. But so much of coding is working as part of a team. Right? And so the way I look at this is I look at it as saying, okay, imagine you had the amazingly awesome coder robot. Right? And we're not amazingly awesome yet. We're promising, but we're not amazingly awesome yet. You still,

Starting point is 00:48:51 that's like adding a member to your team. Right? And so adding one member to a four-person team is huge. Huge lift. Particularly if they're really good. We still need to review the code. You still need to integrate it into the product.

Starting point is 00:49:03 You still need to decide your product strategy. You have to understand the relationship with the customer. You have to, so you're improving one really important part of the problem. You still have to do all the other work. Yeah. Now ChatGPT and things like this can help with some of that. They can help with graphic design and like AI is good,

Starting point is 00:49:18 good at many different pieces. but I think that it will take time for us all to figure out how best utilize this and is it cumulative or is it disruptive or how does that work out over time? How much faster are developers getting in your estimation? Like with these co-pilots and it feels like they're getting 10, 20, 30% faster year over year? I don't know if it's cumulative is the problem, right? So because what I've seen is sort of what I was getting at is like, I've seen a lot of boilerplate get automated.

Starting point is 00:49:47 I haven't seen a lot of the actually interesting part of product design get automated. Fascinating. Yeah, so that's where the human creativity will be. Yep. Yeah. And so this is where like, yeah, if you take, I don't know, back in the day, XML or something. Like, if you take something super boilerplatey, then AI automation's amazing, right?

Starting point is 00:50:05 But there are also other better ways to do that. You know, so that's a different way to look at the question. We also have a little bit of a corollary for this. I look back on my career and it's like, it was two decades before everybody got a PC on their desk and in their home. It was literally from like 1980 to 2000. By the time you got to 2000, the idea that somebody didn't have a computer at work was like,

Starting point is 00:50:24 really? I mean, you'd have to look really hard in an organization in 2000 to find somebody with a desk without a desktop computer on it. Or cell phones, right? And then you look at cell phones, two decades. Usually disruptive technology.

Starting point is 00:50:37 Diffusion takes time, right? Now, I think this may go much faster than hardware transitions did because the inherent time delays in manufacturing and stuff like that are much lower, but it'll be similar. Well, I mean, now we, but I think that's, you just nailed the point, which is then you look at something like Google, Uber or, you know, some other software-based platforms that don't require, you know, hardware that are built on top of them. Those things all took 10 years. So, you know, I think maybe this next group is, you know, maybe go from 20 years to deploy, 10 years to deploy, and hit the masses.

Starting point is 00:51:08 And maybe now it's three, four, five. Well, as you look at startups, right? I mean, I think that I've seen so many of these, I'm sure you've seen probably 100x more, but so many of these folks that are like, look, I built a thing. It's a thin layer on top of ChatGPT. I hacked it together in a month. I'm going to make massive amounts of money, and it's going to be amazing, right? Yeah.

Starting point is 00:51:25 In my experience, which is obviously small selection size, but if you can build something in a month, so can everybody else. Yeah. There's no moat, by the way. Exactly. And so if it works, then everybody's going to be after you, right?

Starting point is 00:51:39 And so that's one of the challenges. And for me, this is where I, the things I work on can take years, right? And so what I do is I say, okay, well, this is going to be a 10 or 15 or 20 year journey, how do I break it down in milestones? How do I have usefully viable things that are maybe not the big win? Everybody wants to jump to the end, but how do I make sure we're making progress in delivering useful value and learning and iterating and cycling,

Starting point is 00:52:02 building up to something that's really quite huge. And to me, that's a lot more interesting. What's your next one? What's the next milestone? What's the waypoint that you're working towards? Yeah. So why don't we go back to Modular? Because I don't think we've talked much about products and where we are. Yeah. So Modular, what we're doing is we're tackling all this complexity, right? This industry is a mess. We have all these people, all these companies, all this stuff happening, and just keeping track of it is a mess,

Starting point is 00:52:24 but also you have all these infighting groups, like none of the LLM companies get along. None of the hardware people get along, none of the cloud people get along. Nobody gets along in this space, right? And so as a consequence of that, all that complexity is being forced on us. And so Modular is rebuilding this from the bottom up and providing a unified thing that simplifies this for people. Mojo, which you brought up, is one of the major pieces of this.

Starting point is 00:52:46 What Mojo is, is a programming language. Well, who in their right mind invents a new programming language? Well, I've been there, done that. I've built OpenCL. I built one of the most widely used implementations of C++. I built the Swift programming language from scratch, right? And so why do you do that? Well, you do that because you want to build and help and solve a problem

Starting point is 00:53:08 that you can't solve any other way. Like building a programming language should never be, in anybody's right mind, the first thing you jump to. But here's the problem we faced, which is that everybody in machine learning uses Python. People generally love it, right? Python is, I mean, my kids know Python, right? It's ubiquitous. And people don't consider it to be broken.

Starting point is 00:53:31 But then you run into AI where now you have high-performance GPUs and you have crazy accelerators and you have all this kind of stuff going on and you have C++. And you realize that Python is really great at composing opaque things that other people made. But it doesn't give you the hackability to actually go customize and change things. And so what Mojo does is Mojo says, okay, well, let's take this problem. And let's do a very hard tech project of building a new programming language, inventing all new compilers and runtimes and very low-level system stuff that allows Python to scale. Let's embrace Python and its entire ecosystem.

Starting point is 00:54:09 Because what I've learned in my experience with this kind of stuff is that generally humans love to learn things. We all love to grow. We like learning new techniques. We want to put new things in our toolbox. It's all great. But we hate resetting to zero so that we can then learn. And so what Mojo allows you to do is, if you know Python, you can walk right in, the things you already know continue to work.

Starting point is 00:54:31 But now if you want to write some high performance code, you can do so. And not everything needs to be high performance. You can choose where you care about applying the time. And that allows you to scale. And so a big part about what modular does is our number one mandate is meet the consumer where they are. Right? And guess what?

Starting point is 00:54:50 A lot of developers are on Python. We love Python. We want to make it better. We're not trying to go, like, make a completely different system that has nothing to do with Python and hope it ends up being better. It's a different approach.
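
The trade-off described above, composing opaque pre-built pieces versus writing the hot loop yourself, can be seen in plain Python. This is a minimal sketch under stated assumptions: it is ordinary CPython, not Mojo, and the function names `dot_loop` and `dot_composed` are made up for illustration. It makes no claim about Mojo's syntax or performance.

```python
import operator
import timeit


def dot_loop(xs, ys):
    """Hand-written loop: easy to customize, but every step runs in the interpreter."""
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def dot_composed(xs, ys):
    """Composes built-ins implemented in C in CPython: opaque, but usually faster."""
    return sum(map(operator.mul, xs, ys))


xs = list(range(100_000))
ys = list(range(100_000))

# Both versions compute the same dot product.
assert dot_loop(xs, ys) == dot_composed(xs, ys)

# Timings vary by machine, so no expected numbers are given here.
for fn in (dot_loop, dot_composed):
    seconds = timeit.timeit(lambda: fn(xs, ys), number=5)
    print(f"{fn.__name__}: {seconds:.3f}s for 5 runs")
```

The composed version is typically quicker because the heavy lifting happens in compiled code, but changing what happens inside that inner step means dropping out of Python, which is the gap the speaker says Mojo is meant to close.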

Starting point is 00:55:01 AI is what you're talking about. Huge mess. Like all these different fighting systems, there's no thing to plug into. None of the stuff is compatible. So what Modular provides is this thing called the AI engine. The AI engine is a drop-in compatible replacement for TensorFlow and PyTorch.

Starting point is 00:55:15 And so if you're using PyTorch, if you're using TensorFlow, you do not have to rewrite your code. Turns out who wants to rewrite their code? Nobody stands up, right? And so what we can do is we can be a drop-in replacement that then provides a ton of value. And so for a lot of enterprises, it has value in terms of consolidating, eliminating all the point solutions. And so many people have a little bit of TensorFlow, a little bit of PyTorch.

Starting point is 00:55:40 So that's a huge four. Now they have a little bit of CPU, a little bit of GPU. They have a little bit of this, a little bit of that. They have different kinds of models and different kinds of specialized things, and we can consolidate that into one simple thing that, it turns out, is commercially supported. Who wants to run their own mail server these days? Do you want to build and run your own cobbled-together storage thing? Exactly. It doesn't make any sense.

Starting point is 00:56:04 Again, AI needs to grow up. It's programmable and extensible. Do you want to give up your product strategy to somebody else? Well, no. It turns out that people want to take models and then customize. You want to make it work right for what you're doing. And so having the ability to hack the system is actually super important. It's accessible via hardware.

Starting point is 00:56:23 And all these different pieces, the Mojo and engine story comes together. How hard is it to make it compatible with each different hardware platform? How long does that take? It's super hard. So, I mean, if you want me to talk about my backstory, like I've been working on these super exotic, esoteric compilers and systems and GPUs and accelerators and things for decades now, right? And so a lot of what brought Modular to exist is this realization that if we keep building

Starting point is 00:56:52 one-off solutions to each of these things, we as a software industry will never scale. And so a lot of the core tech, a lot of the core invention at Modular, and the reason that what we have is interesting is we enable people to bring up hardware much faster. And so, for example, we have just on the CPU front as an example, lots of people use Intel CPUs. They're really great. They're pervasively available in cloud. It turns out that PyTorch, for example,

Starting point is 00:57:22 is super optimized by Intel for Intel CPUs. Also turns out that you can get AMD CPUs in cloud. Turns out their instance types are usually much cheaper for the same amount of performance horsepower. But guess what? For some reason, it doesn't run super

Starting point is 00:57:37 effectively on AMD CPUs. Oh, wow. Go figure. Go figure, right? And so it turns out Modular has massive performance uplifts on Intel, even bigger uplifts on AMD, but then you can also go to these other instance types like Graviton, which are Arm-based cloud servers, and they're even less expensive, and our performance uplifts are even bigger, right?

Starting point is 00:57:57 And so what we can do is we can provide the ability to move your workload to the place that makes sense for your thing. And for us, bringing up Graviton, just in terms of bringing up an entire machine learning stack, it took us four hours. Wow. For a completely new architecture. And that's one of the things that nobody in the industry,

Starting point is 00:58:16 in the AI infra industry has is the ability to bring up the entire stack quickly and then do performance too. Most of the time, the problem you have is that you have to do all this incremental work to get new kinds of models to run. And so that's one of the reasons why you get all this fragmentation. There's all these translators. You have Apple decided they would get off Intel. They never were on AMD, but Windows was on both.

Starting point is 00:58:40 And they started doing these M1, M2 chips. They're pretty extraordinary in terms of running a laptop or a desktop in terms of performance video. And of course, you know, battery life. They're optimized for what, you know, a very consumer bent, let's say. And that's the world I lived in for years at Apple, right? Right, right. They're helping with hardware transitions, helping the watch get to 32-bit Arm, to 64-bit Arm, to all the complexity that goes into that,

Starting point is 00:59:06 that Apple makes magic for developers so that nobody has to know about it. Yeah. Are they, do you think they're going to play a role here? Do you think their chips are so high performance that they've got a shot at taking on some machine learning and, you know, AI jobs and sincerity? Or is it just because I was just watching somebody, you know, putting Llama, they were, you know, trying to build some models on their M2 and they were just like, wow, this is

Starting point is 00:59:32 pretty extraordinary. Yeah. So what I've seen out there is that, so I've been out of Apple for a long time, so I don't speak for Apple. I know nothing about the roadmap, et cetera, et cetera, et cetera, et cetera. Disclaimer, disclaimer. I don't think they're interested in the training market. Their hardware is completely irrelevant there, in my opinion.

Starting point is 00:59:47 And they're not even trying because they don't think it's an interesting market. It's not consumer aligned. It's very low margin compared to this. I mean, Nvidia is excepted, I guess. But that's not their strong point. What they're really focusing on is the client. And so you look at it, there's all these llama.cpp and things like this where people are running LLMs on their laptop. Apple's all over that.

Starting point is 01:00:10 They're super into that. And it turns out that, again, you look at the shift that we started from, there's this training part of the problem and then the inference part of the problem. What we've seen is this rise of pre-trained models. And so training a model is actually becoming actually less important over time, maybe, at least the number of people that participate in that can go down. And, you know, if Meta keeps launching, like, amazing models that they train themselves, right? That are good enough or great enough.

Starting point is 01:00:36 Yeah. Then you're on the inference side and, yeah, running it on your desktop becomes super interesting. Right. And inference is the part that you integrate into your product. Right. And so that becomes the interesting thing.

Starting point is 01:00:48 You want to run ChatGPT on your phone. You don't want to train ChatGPT unless you're crazy. Yeah. Yeah. Amazing. Well, listen, great start. Really excited to see where you take it.

Starting point is 01:00:59 I know you're on a hiring binge right now. And you really want to bring talent on board. Yes. Pitch to developers of why to come work on this problem. And what are you looking for? And what's the culture like at Modular? Yeah, so what we're doing is we're taking on a really hard technology problem. Right.

Starting point is 01:01:15 So this is a part of the problem in a layer of the stack that very few people understand. And honestly, it's things that people want to build on top of instead of having to understand. Right. But now for the specific kinds of hardware, software, cloud folks that care about super scale, turns out there's a lot of money being spent in the space. Turns out there's a great set of opportunities in front of us. It's a really exciting time in the domain. One of the things that's really unusual about Modular

Starting point is 01:01:43 is that we don't run from demo to demo to demo to demo. We actually build high-quality production stuff, and we care about building things right. And what I found is that if you build things right and deliberately, strategically, and you put down the bricks one after the other, you can build some pretty epic things. And you look at Mojo, for example.

Starting point is 01:02:00 We're building potentially the successor to Python. Amazing. Right? We love Python. Python's never going to go away, but this thing can take Python and give it superpowers. And as it does that, right, the opportunity to impact hundreds of millions of developers is profound, right?

Starting point is 01:02:15 And you look at AI. How many developers can AI impact? Uncountable. All, right? All. 100%. I mean, in the fact that we might have, you know, a larger aperture of people who could participate in developing, right?

Starting point is 01:02:30 Exactly. It wasn't open to as many people. And now with these tools, it clearly. Exactly. And so, and so Modular, right, what we're doing is we're focusing on this layer of the stack that we think we contribute to. So we're not building the LLM.

Starting point is 01:02:42 We want to help those people do that. We're not building the cloud. We're not building the hardware. We're helping solve this problem that we think is really useful for people and it will allow other people to build on the platform. And as we're building this thing out, our platform is opening. As an open platform, we think we're going to be able to help lots and lots and lots of people, which is super fun.

Starting point is 01:02:59 And you want to build it right. So I was just looking at your careers page. Go to modular.com slash careers. If you want to build important things and you want to build them right, and enable a lot more people to participate in the AI future. Listen, you've been a great guest. Please come on again.

Starting point is 01:03:14 And continued success with it. I'm a huge fan of yours, Jason, so thank you for having me. I appreciate that. All right, everybody. We'll see you next time on This Week in Startups.
