a16z Podcast - The State of AI with Marc & Ben
Episode Date: June 14, 2024

In this latest episode on the State of AI, Ben and Marc discuss how small AI startups can compete with Big Tech's massive compute and data scale advantages, reveal why data is overrated as a sellable asset, and unpack all the ways the AI boom compares to the internet boom.

Subscribe to the Ben & Marc podcast: https://link.chtbl.com/benandmarc
Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Transcript
Most of the content created on the internet is created by average people.
And so kind of the content on average, as a whole, on average, is average.
The test for whether your idea is good is how much can you charge for it?
Can you charge the value?
Or are you just charging the amount of work it's going to take the customer to put their own wrapper on top of OpenAI?
The paradox here would be the cost of developing any given piece of software falls.
But the reaction to that is a massive surge of demand for software capabilities.
And I think this is one of the things that's always been underestimated about humans is our ability
to come up with new things we need.
There's no large marketplace for data.
In fact, what there are is there are very small markets for data.
In this wave of AI, big tech has a big compute and data advantage.
But is that advantage big enough to drown out all the other startups trying to rise up?
Well, in this episode, a16z co-founders Marc Andreessen
and Ben Horowitz, who both, by the way, had a front row seat to several prior tech waves,
tackle the state of AI.
So what are the characteristics that will define successful AI companies?
And is proprietary data the new oil, or how much is it really worth?
How good are these models realistically going to get?
And what would it take to get 100 times better?
Marc and Ben discuss all this and more,
including whether the venture capital model needs a refresh to match the rate of change
happening all around it.
And of course, if you want to hear more from Ben and Marc, make sure to subscribe to the Ben and Marc podcast.
All right, let's get started.
It is kind of the darkest side of capitalism when a company is so greedy, they're willing
to destroy the country and maybe the world to like just get a little extra profit.
And they do it like the really kind of nasty thing is they claim, oh, it's for safety.
You know, we've created an alien that we can't control.
But we're not going to stop working on it.
We're going to keep building it as fast as we can,
and we're going to buy every freaking GPU on the planet.
But we need the government to come in and stop it from being open.
This is literally the current position of Google and Microsoft right now.
It's crazy.
The content here is for informational purposes only,
should not be taken as legal, business, tax, or investment advice,
or be used to evaluate any investment or security
and is not directed at any investors or potential investors in any a16z fund.
Please note that A16Z and its affiliates may maintain investments in the companies discussed in this podcast.
For more details, including a link to our investments, please see A16Z.com slash disclosures.
Hey, folks, welcome back.
We have an exciting show today.
We are going to be discussing the very hot topic of AI.
We are going to focus on the state of AI as it exists right now in April of 2024.
And we are focusing specifically on the intersection of AI and company building.
So hopefully this will be relevant to anybody working on a startup or anybody at a larger company.
We have as usual solicited questions on X, formerly known as Twitter,
and the questions have been fantastic.
So we have a full lineup of listener questions, and we will dive right in.
So first questions, so three questions on the same topic.
So Michael asks, in anticipation of upcoming AI capabilities,
what should founders be focusing on building right now?
Gwen asks, how can small AI startups compete with established players with massive compute and data scale advantages?
And Alastair MacLay asks, for startups building on top of OpenAI, etc., what are the key characteristics of those companies that will benefit from future exponential improvements in the base models versus those that will get killed by them?
So let me start with one point, Ben, and then we'll jump right to you.
So Sam Altman recently gave an interview, I think maybe with Lex Fridman or one of the podcasts.
And he said something I thought was actually quite helpful.
Let's see, Ben, if you agree with it.
He said something along the lines of, you want to assume that the big foundation models coming out of the big AI companies are going to get a lot better. So you want to assume they're going to get like a hundred times better. And as a startup founder, you want to then think, okay, if the current foundation models get 100 times better, is my reaction, oh, that's great for me and for my startup, because I'm much better off as a result? Or is your reaction the opposite? Is it, oh, shit, I'm in real trouble? So let me just
stop right there, Ben, and see what you think of that as general advice. Well, I think generally
that's right, but there are some nuances to it, right? So I think that, from Sam's perspective, he was probably discouraging people from building foundation models, which I don't know that I would entirely agree with, in that a lot of the startups building foundation models are doing very well.
And there's many reasons for that.
One is there are architectural differences, which lead to differences in how smart a model is, how fast a model is, how good a model is in a domain.
And that goes for not just text models,
but image models as well.
There are different domains, different kinds of images that respond to prompts differently. If you ask Midjourney and Ideogram the same question, they react very differently, you know, depending on the use cases that they're tuned for.
And then there's this whole field of distillation where, you know, Sam can go build the
biggest, smartest model in the world, and then you can walk up as a startup and kind of do
a distilled version of it and get a model that's very, very smart at a lot less cost.
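For readers who want a concrete picture of the distillation idea Ben just mentioned, here is a minimal, hedged sketch in PyTorch-style Python. It is not any lab's actual recipe; the student, teacher, batch, and optimizer objects are placeholders the caller would supply.

import torch
import torch.nn.functional as F

def distillation_step(student, teacher, batch, optimizer, temperature=2.0):
    # One step of classic knowledge distillation: the small, cheap student
    # is trained to match the big teacher's softened output distribution.
    with torch.no_grad():
        teacher_logits = teacher(batch)      # large, expensive model
    student_logits = student(batch)          # small model you can afford to serve

    # Soften both distributions and minimize their KL divergence.
    loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()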
So there are things that, yes, the big company models are going to get way better,
kind of way better at what they are.
So you need to deal with that.
So if you're trying to go head-to-head full frontal assault, you probably have a real problem,
just because they have so much money.
But if you're doing something that's different enough or like, you know, different domain and so forth,
for example, you know, at Databricks, they've got a foundation model, but they're using it in a very specific way in conjunction with their kind of leading data platform. So, okay, now if you're an enterprise and you need a model that knows all the nuances of how your enterprise data model works and what things mean, and needs access control, and needs to use your specific data and domain knowledge and so forth,
then it doesn't really hurt them if Sam's model gets way better. Similarly,
ElevenLabs with their voice model has kind of embedded itself with everybody.
Everybody uses it as part of kind of the AI stack.
And so it's got kind of a developer hook into it.
And then, you know, they're going very, very fast at what they do
and really being very focused in their area.
So there are things that I would say are, like, extremely promising, that are kind of ostensibly competing, but not really competing, with OpenAI or Google or Microsoft. So I think his framing is a little more coarse-grained than how I would interpret it if I were building a startup.
Right.
Let's dig into this a little bit more.
So let's start with the question of, do we think the big models, the God models, are going to get 100 times better?
I kind of think so.
And then I'm not sure.
So if you think about the language models, let's do those because those are probably ones that people are most familiar with.
I think if you look at the very top models, you know, Claude and OpenAI and Mistral and Llama,
The only people who I feel like really can tell the difference as users amongst those models are the people who study them.
You know, like they're getting pretty close.
So, you know, you would expect if we're talking 100x better, that one of them might be separating from each other a lot more.
But the improvement, so 100 times better in what way?
Like for the normal person using it in a normal way, like asking it questions and finding out stuff?
Well, let's say some combination of just, like, breadth of knowledge and capability.
Yeah, like I think in some of them, they are, yeah.
Right, but then also just combined with, like, sophistication of the answers, you know, sophistication of the output, the quality of the output, lack of hallucination, factual grounding.
Well, that, I think, is for sure going to get 100 times better.
Like that, yeah, I mean, they're on a path for that.
The thing that cuts against that, right, is the alignment problem, where, okay, yeah, they're getting smarter, but they're not allowed to say what they know.
And then that alignment also kind of makes them dumber in other ways.
And so you do have that thing.
The other kind of question that's come up lately is, kind of, do we need a breakthrough to go from what we have now, which I would categorize as artificial human intelligence as opposed to artificial general intelligence, meaning it's kind of the artificial version of us.
We've structured the world in a certain way using our language and our ideas and our stuff, and it's learned that very well, amazing. And it can do kind of a lot of the stuff that we can do. But are we then the asymptote, or do you need a breakthrough to get to some kind of higher intelligence or general intelligence?
And I think if we're the asymptote,
then in some ways it won't get 100 times better
because it's already, like, pretty good relative to us.
But, yeah, like, it'll know more things.
It'll hallucinate less.
On all those dimensions, it'll be 100 times better, I think.
You know, there's this graph floating around.
I forget exactly what the axes are,
but it basically shows the improvement across the different models.
To your point, it shows an asymptote against the current tests
that people are using that's sort of like at or slightly above human levels,
which is what you would expect if you're being trained on entirely human data.
Now, the counterargument on that is, are the tests just too simple, right?
It's a little bit like the question people have around the SAT, which is, if you have a lot of people getting 800s, you know, on both math and verbal on the SAT, is the scale too constrained? Do you need a test that can actually test for Einstein?
Right, right.
It's memorized the tests that we have.
And it's great at them.
Right.
But you can imagine an SAT that, like, really can detect gradations of people who have, like, ultra-high IQs, who are ultra good at math or something. You could imagine tests for AI. You know, you can imagine tests that test for reasoning above human levels.
Yeah, well, maybe the AI needs to write the test.
Yeah, and then there's a related question that comes up a lot.
It's an argument we've been having internally, which is also where I'll start to take some sort of more provocative and probably more bullish, or, as you would put it, sort of science-fictiony,
predictions on some of this stuff.
So there's this question that comes up, which is like, okay, you take an LLM, you train it on
the internet.
What is the internet data?
What is the internet data corpus?
It's an average of everything, right?
It's a representation of human activity.
Representation of human activity is going to kind of, you know, because of the sort of
distribution of intelligence of the population, you know, most of it's somewhere in the
middle.
And so the data set on average sort of represents the average human.
You're teaching it to be very average, yeah.
Yeah, you're teaching to be very average.
It's just because most of the content created on the internet is created by average people. And so kind of the content on average, you know, as a whole, on average, is average. And so therefore, the answers are average, right? You're going to get back an answer that sort of represents the kind of thing an average 100-IQ person would produce. You know, kind of by definition, the average human is 100 IQ; IQ is indexed to 100 at the center of the bell curve.
And so by definition, you're kind of getting back the average.
I actually argue like that may be the case for the default prompt today.
Like you just ask the thing, does the Earth revolve around the sun or something?
You get like the average answer to that and maybe that's fine.
This gets to the point as well, okay, the average of the data might be of an average person, but the data set also contains all of the things
written and thought by all the really smart people. All that stuff is in there, right? And all the
current people who are like that, their stuff is in there. And so then it's sort of like a
prompting question, which is, like, how do you prompt it in order to basically navigate to a different part of what they call the latent space, to navigate to a different
part of the data set that basically is like the super genius part. And, you know, the way these things
work is if you craft the prompt in a different way, it actually leads it down a different path
inside the data set, gives you a different kind of answer. And here's another example of this.
If you ask it, write code to do X, write code to sort a list or whatever, render an image,
it will give you average code to do that.
If you say, write me secure code to do that, it will actually write better code with fewer
security holes, which is very interesting, right?
Because it's accessing a different part of the training data, which is secure code.
And if you ask, you know, write this image generation thing the way John Carmack would write it,
you get a much better result because it's tapping into the part of the latent space
represented by John Carmack's code, who's the best graphics programmer in the world.
And so you can imagine prompt crafting in many different domains such that you're kind of unlocking the latent super genius, even if that's not the default answer.
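As a rough illustration of the prompt-crafting idea Marc is describing, here is a small, purely illustrative Python sketch: the same task gets wrapped with different framing to steer a model toward a different slice of its training data. The craft_prompt helper and the commented-out ask_model call are hypothetical, not any specific vendor's API.

def craft_prompt(task, persona=None, constraints=None):
    # Wrap a task with framing meant to steer an LLM toward a specific
    # part of its latent space (e.g. security-reviewed or expert-level code).
    parts = []
    if persona:
        parts.append(f"Answer the way {persona} would.")
    if constraints:
        parts.append("Requirements: " + "; ".join(constraints))
    parts.append(task)
    return "\n".join(parts)

task = "Write a Python function that renders an image from a list of pixels."

default_prompt = craft_prompt(task)
expert_prompt = craft_prompt(
    task,
    persona="a world-class graphics programmer",
    constraints=["write secure code", "avoid buffer overruns and integer overflow"],
)

# response = ask_model(expert_prompt)  # placeholder for whatever LLM call you use;
# the claim in the episode is that the expert framing tends to yield better output.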
Yeah, now, so I think that's correct.
I think there's still a potential limit to its smartness in that.
So we had this conversation in the firm the other day where you have, there's the world,
which is very complex.
And intelligence kind of is, you know, how well can you understand, describe, or present
the world?
But our current iteration of artificial intelligence consists of humans structuring the world
and then feeding that structure that we've come up with into the AI.
And so the AI kind of is good at predicting how humans have structured the world
as opposed to how the world actually is, which is something more probably complicated,
maybe irreducible or what have you.
So do we just get to a limit where, like, it can be really smart,
but its limit is going to be the smartest humans as opposed to smarter than the smartest humans.
And then kind of related, is it going to be able to figure out brand new things,
you know, new laws of physics and so forth?
Now, of course, there are like one in three billion humans that can do that or whatever.
That's a very rare kind of intelligence.
So it still makes the AI extremely useful.
But they play a different role if they're kind of artificial humans
than if they're like artificial, you know, super duper mega humans.
Yeah.
So let me make the sort of extreme bull case for the hundred times better. Because, okay, so the cynic would say that Sam Altman would be saying they're going to get 100 times better precisely if they're not going to.
Yeah, yeah, yeah, yeah.
Right?
Because he'd be saying that basically in order to scare people into not competing.
Well, I think that whether or not they are going to get 100 times better, Sam would be very likely to say that. Like, Sam, for those of you who don't know him, he's a very smart guy, but for sure he's a competitive genius. There's no question about that. So you have to take that into account. Right. So if they weren't going to get a lot better,
he would say that. But of course, if they were going to get a lot better, to your point,
he would also say that. Yes. Why not? Right. And so let me make the bull case that they are going
to get 100 times better or maybe even, you know, on an upper curve for a long time. And there's like
enormous controversy, I think, on every one of the things I'm about to say, but you can find
very smart people in the space who believe basically everything I'm about to say. So one is,
there is generalized learning happening inside the neural networks. And we know that because
we now have introspection techniques where you can actually go inside and look inside the
neural networks to look at the neural circuitry that is being evolved as part of the training
process. And, you know, these things are evolving, you know, general computation functions.
There was a case recently where somebody trained one of these on a chess database and, you know,
just by training on lots of chess games, it actually imputed a world model of a chessboard, you know, inside the neural network, and, you know, it was able to do original moves.
And so the neural network training process does seem to work.
And then specifically, not only that, but, you know, Meta and others recently have
been talking about how so-called overtraining actually works, which is basically continuing
to train the same model against the same data for longer, you know, putting more and more
compute cycles against it.
You know, I've talked to some very smart people in the field, including there,
who basically think that actually that works quite well.
The diminishing returns people were worried about from more training don't seem to be materializing. And they proved it in the new Llama release, right?
That's the primary technique they use.
Yeah, exactly.
Like one guy in the space basically told me,
basically he's like, yeah,
we don't necessarily need more data at this point
to make these things better.
We maybe just need more compute cycles.
We just train it 100 times more
and it may just get actually a lot better.
And then, on the labeling, it turns out that supervised learning
ends up being a huge boost to these things.
Yeah.
So we've got that.
We've got all of the kind of, you know,
let's say rumors and reports
of various kinds of self-improvement loops,
you know, that are kind of underway.
And most of the sort of super advanced practitioners
in the field think that there's now
some form of self-improvement loop that works,
which basically is, you basically get an AI to do what's called chain of thought.
You get it to basically go step by step to solve a problem.
You get it to the point where it knows how to do that.
And then you basically retrain the AI on the answers.
And so you're kind of basically doing sort of a forklift upgrade across cycles of the reasoning capability.
And so a lot of the experts think that sort of thing's starting to work now.
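A hedged sketch of the kind of self-improvement loop Marc is gesturing at, sometimes described as rejection sampling plus fine-tuning: sample step-by-step solutions, keep the ones a verifier accepts, and retrain on them. The generate, verify, and fine_tune callables are placeholders the caller supplies; this is not any lab's published method.

def self_improvement_round(model, problems, generate, verify, fine_tune,
                           samples_per_problem=8):
    # One round of the loop: sample chain-of-thought solutions, keep only
    # those whose final answer the verifier accepts, then retrain the model
    # on its own best reasoning traces (the "forklift upgrade").
    kept = []
    for problem in problems:
        for _ in range(samples_per_problem):
            reasoning, answer = generate(model, problem)   # step-by-step sample
            if verify(problem, answer):                    # e.g. run tests, check the math
                kept.append((problem, reasoning, answer))
    return fine_tune(model, kept)                          # next-generation model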
And then there's still a raging debate about synthetic data,
but there's quite a few people who are actually quite bullish on that.
Yeah.
And then there's even this tradeoff.
There's this kind of dynamic where like LLMs might be okay at writing code,
but they might be really good at validating code.
You know, they might actually be better at validating code than they are at writing it.
That would be a big help.
Yeah, well, but that also means, like, AIs may be able to self-validate their own code.
Yeah, yeah.
They can validate their own code.
And we have this anthropomorphic bias that's very deceptive with these things because
you think of the model as an it.
And so it's like, how could you have an it that's better at validating code than writing code? But it's not an it.
What it is is, it's this giant latent space.
It's this giant neural network.
And the theory would be there are totally different parts of the neural network for writing
code and validating code.
And there's no consistency requirement whatsoever that the network would be equally good at both of those
things. And so if it's better at one of those things, right? So the thing that it's good at might be
able to make the thing that it's bad at better and better. Right, right, right, right. Sure, sure.
Right. Sort of a self-improvement thing. And so then on top of that, there's all the other things
coming, right, which is, there's all these practical things, which is, there's an enormous chip constraint right now. So every AI that anybody uses today, its capabilities are basically being gated by the availability of chips. But like that will resolve over time.
You know, there's also, to your point about data labeling, there is a lot of data in these
things now, but there is a lot more data out in the world. And there's, you know, at least in theory,
some of the leading AI companies are actually paying to generate new data. And by the way,
even like the open source data sets are getting much better. And so there's a lot of like data
improvements that are coming. And then, you know, there's just the amount of money pouring into
the space to be able to underwrite all this. And then by the way, there's also just the
systems, right, which is, a lot of the current systems, you know, were built by scientists. And now the really world-class engineers are showing up and tuning them up and getting them to work better. Which makes training, by the way, way more efficient as well, not just inference, but also training.
Yeah, exactly. And then even, you know, another improvement area is basically Microsoft
released their Phi small language model yesterday. And apparently it's competitive. It's a very
small model competitive with much larger models. And the big thing they say that they did was they
basically optimized the training set. So they basically de-duplicated the training set.
They took out all the copies and they really optimized on a small amount of training data,
on a small amount of high-quality training data, as opposed to the larger amounts of low-quality
data that most people train on. You add all these up and you've got eight or ten different combinations of sort of practical and theoretical improvement vectors that are all in
play. And it's hard for me to imagine that some combination of those doesn't lead to like
really dramatic improvement from here. I definitely agree. I think that's for sure going to happen.
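On the training-set optimization point from the Phi discussion a moment ago, here is a toy sketch of the simplest form of deduplication: exact-match removal by hashing lightly normalized text. Real pipelines reportedly also do near-duplicate and quality filtering, which this version does not attempt.

import hashlib

def deduplicate(documents):
    # Drop exact duplicate documents by hashing lightly normalized text.
    seen = set()
    unique = []
    for doc in documents:
        digest = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

corpus = ["The cat sat.", "the cat sat.", "Something new."]
print(deduplicate(corpus))  # -> ['The cat sat.', 'Something new.']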
Right. Like, so back to Sam's proposition, I think if you were a startup and you were like, okay, in two years I can get as good as GPT-4, you shouldn't do that. Right. When you hit it, that would be a bad mistake. Right. Right. Well, this also goes to, you know, a lot of what
entrepreneurs are afraid of. Well, I'll give you an example. So a lot of entrepreneurs have this thing they're trying to figure out, which is, okay, I really think I know how to build a SaaS app that harnesses an LLM to do really good marketing collateral, let's just take a very simple example like that. And so I build a whole system for that. Will it just turn out that the big models in six months will be even better at making marketing collateral just from a simple prompt, such that my apparently sophisticated system is just irrelevant because the big model just does it?
Yeah. Yeah. Why don't we just talk about that? Like, apps. You know, another way you can think about it is
the criticism of a lot of current AI app companies
is they're, quote-unquote, you know, GPT wrappers. They're sort of thin layers of wrapper
around the core model, which means the core model
could commoditize them or displace them.
But the counter argument, of course, is it's a little bit
like calling all, you know, old software apps,
you know, database wrappers, you know, wrappers around a database.
It turns out like actually wrappers around a database
is like most modern software, and a lot of that
actually turned out to be really valuable.
And it turns out there's a lot of things to build
around the core engine.
So, yeah, so Ben, how do we think about that
when we run into companies thinking about building apps?
Yeah, you know, it's a very tricky question, because there's also this correctness gap, right? So, you know, why do we have co-pilots? Where are the pilots? Right, where are the AI pilots? There are no AI pilots.
They're only AI co-pilots.
There's a human in the loop on absolutely everything.
And that really kind of comes down to this,
you know, you can't trust the AI to be correct
in drawing a picture or writing a program or, you know,
even, like, writing a court brief without making up citations. You know, all these things kind of require a human, and it kind of turns out to be, like, fairly dangerous not to. And then I think what's happening
a lot with the application layer is people saying, well, to make it really useful, I need to turn
this co-pilot into a pilot. And can I do that? And so that's an interesting and hard problem.
And then there's a question of, is that better done at the model level or at some layer
on top that, you know, kind of teases the correct answer out of the model, you know,
by doing things like using code validation or what have you, or is that just something that
the models will be able to do? I think that's one open question. And then, you know, as you get
into kind of domains and, you know, potentially wrappers on things, I think there's a different
dimension than what the models are good at, which is, what is the process flow, which is kind of the parallel in the database world, too. So on the database kind of analogy, there is, like, the part of the task
in a law firm that's writing the brief, but there's 50 other tasks and things that have to be
integrated into the way a company works, like the process flow, the orchestration of it. And maybe
there are, you know, on a lot of these things, like if you're doing video production, there's
many tools, or music even, right?
Like, okay, who's going to write the lyrics?
Which AI will write the lyrics and which
AI will figure out the music?
And then, like, how does that all
come together and how do we integrate it
and so forth? And those
things tend to
just require a
real understanding of
the end customer and so forth
in a way. And that's
typically been how, like, applications have
been different than platforms in the past.
It's like there's real knowledge
about how the customer using it wants to function
that doesn't have anything to do
with the kind of intel or is just different
than what the platform is designed to do.
And to get that out of the platform
for a kind of company or a person
turns out to be really, really hard.
And so those things, I think, are likely to work,
you know, especially if the process is very complex.
And it's something that's funny.
As a firm, you know, we're a little more hardcore
technology-oriented. And we've always struggled with those, you know, in terms of, oh, this is
like some process application for, like, plumbers to figure this out. And we're like, well,
where's the technology? But, you know, a lot of it is how do you encode, you know, some level of
domain expertise and kind of how things work in the actual world back into the software?
I often tell founders that you can think about this in terms of price; you can
kind of work backwards from pricing a little bit, which is to say sort of business value
and what you can charge for, which is, you know, the natural thing for any technologists to do
is to kind of say, I have this new technological capability and I'm going to sell it to people
and, like, what am I going to charge for it? It's going to be somewhere between, you know, my cost
of providing it and then, you know, whatever markup I think I can justify, you know,
and if I have a monopoly providing it, maybe the markup's infinite. But, you know, it's kind of this,
it's a technology forward, you know, kind of supply forward, you know, pricing model.
there's a completely different pricing model for kind of business value backwards and or sort of,
you know, so-called value pricing, value-based pricing. And that's, you know, to your point,
that's basically a pricing model that says, okay, what's the business value to the customer of the
thing? And if the business value is, you know, a million dollars, then can I charge 10% of that
and get $100,000, right, or whatever. And then, you know, why does it cost $100,000 as compared to $5,000? Because, well, because to the customer it's worth a million dollars. And so they'll pay
10% for it.
Yeah, actually, so a great example of that, like, we've got a company in our portfolio,
Crest AI, that does things like debt collection.
Okay, so if I can collect way more debt with way fewer people with my, you know, co-pilot-type solution, then what's that worth? Well, it's worth a heck of a lot more than just buying an OpenAI license, because an OpenAI license is not going to easily collect debts or kind of enable your debt collectors to be massively more efficient or that kind of thing.
So it's bridging that gap between the value.
And I think you had a really important point.
The test for whether your idea is good is how much can you charge for it?
Can you charge the value?
Or are you just charging the amount of work it's going to take the customer to put their
own wrapper on top of OpenAI. Like, that's the real test to me of, like, how deep
and how important is what you've done. Yeah. And so to your point on, like, the kinds of businesses that technology investors have had a hard time, you know, thinking about, you know, maybe accurately: it's the vendor that has built something that is a specific solution to a business problem, where it turns out the business problem is very valuable to the customer. And so therefore, they will pay a percentage of the value provided back in the price of the software. And that actually turns out, you can have businesses that are not
very technologically differentiated that are actually extremely lucrative. And then because that
business is so lucrative, they can actually afford to go think very deeply about how technology
integrates into the business, what else they can do. You know, this is like the story of a salesforce.com,
for example, right? And by the way, there's kind of a chance, a theory, that the models are all getting really good. There are open source models that are awesome. You know, Llama, Mistral, like, these are great models.
And so the actual layer where the value is going to accrue is going to be like tools,
orchestration, that kind of thing, because you can just plug in whatever the best model
is at the time, whereas the models are going to be competing, you know, in a death battle
with each other and be commoditized down to the, you know, the cheapest one wins and that kind of
thing. So, you know, you could argue that the best thing to do is to kind of connect the power
to the people. Right. Right. So that actually takes us to the next question, and this is a two and one
question. So Michael asks, and I'll say these are diametrically opposed, which is why I paired
them. So Michael asks, why are VCs making huge investments in generative AI startups when it's
clear these startups won't be profitable anytime soon, which is a loaded, loaded question,
but we'll take it. And then Kaiser asks, if AI deflates the cost of building a startup,
how will the structure of tech investment change? And of course, Ben, this goes to exactly what
you just said. So it's basically the questions are diametrically opposed because if you squint
out of your left eye, right, what you see is basically the amount of money being invested
in the foundation model companies kind of going up to the right at a furious pace. You know,
these companies are raising hundreds of millions, billions, billions of dollars. And it's just like,
oh my God, look at these sort of capital, you know, sort of, I don't know, infernos, you know,
that hopefully will result in value at the end of the process. But my God, look at how much money
is being invested in these things. If you squint through your right eye, you know, you think, wow,
that now all of a sudden it's, like, much easier to build software. It's much easier to have
a software company. It's much easier to, like, have a small number of programmers writing
complex software because they've got all these AI co-pilots and all these automated, you know,
software development capabilities that are coming online. And so on the other side, the cost of
building an AI like application startup might, you know, crash and it might just be that like
the, you know, the Salesforce.com might cost, you know, a tenth or a hundredth or a thousandth of the amount of money that it took to build, you know, the old database-driven Salesforce.com. And so, yeah,
so what do we think of that dichotomy, which is you can actually look, you can actually look
out of either eye and see either cost to the moon as like for startup funding or cost actually
going to zero? Yeah, well, like, so it is interesting. I mean, we actually have companies in both
camps, right? Like I think probably the companies that have gotten to profitability, the fastest,
maybe in the history of the firm have been AI companies. There have been, you know, AI companies
in the portfolio where the revenue grows so fast that it actually kind of runs out ahead of
the cost. And then there are, like, you know, people who are in the foundation model race who
are raising hundreds of millions, you know, even billions of dollars to kind of keep pace and so forth.
They also are kind of generating revenue at a fast rate.
The headcount in all of them is small.
So I would say, you know, where AI money goes, and even, you know, like if you look at OpenAI,
which is the big spender in startup world, which, you know, we are also investors in, and, you know, headcount-wise, they're pretty small against their revenue. Like, it is not a big-company headcount, if you look at the revenue level and how fast they've gotten there. It's pretty small. Now, the total expenses are ginormous, but they're going
into the model creation. So it's an interesting thing. I mean, I'm not entirely sure how to
think about it. But I think, like, if you're not building a foundation model, it will make you
more efficient and probably gets you to profitability quicker. Right. So the counter, and this is a very
bullish counterargument, but the counterargument to that would be basically that falling costs for, like,
new software companies are a mirage.
And the reason for that is this thing in economics
called the Jevons paradox,
which I'm going to read from Wikipedia.
So the Jevons paradox occurs
when technological progress increases the efficiency
with which a resource is used,
reducing the amount of that resource necessary
for any one use.
But the falling cost induces increases in demand,
right, elasticity, enough that the resource use
overall is increased rather than reduced.
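A toy calculation, with made-up numbers, of how that can play out for software: if the cost of building an application falls 10x but demand for applications rises 30x, total spending on software development goes up, not down.

# Toy Jevons-paradox arithmetic with made-up numbers.
cost_per_app_before, apps_built_before = 1_000_000, 100    # $1M each, 100 apps
cost_per_app_after, apps_built_after = 100_000, 3_000      # 10x cheaper, 30x more apps

spend_before = cost_per_app_before * apps_built_before     # $100M
spend_after = cost_per_app_after * apps_built_after        # $300M
print(spend_after > spend_before)  # True: total resource use rises as unit cost falls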
Yeah, that's certainly possible.
Right.
And so you see versions of this. For example, you build a new freeway, and it actually makes traffic jams worse, right? Because basically what happens is, oh, it's great, now there's more roads, now we can have more people live here, we can make these companies bigger, and now there's more traffic than ever, and now the traffic's even worse.
Or the classic example is coal consumption during the Industrial Revolution: as the price of coal dropped, people used so much more coal that the overall consumption actually increased. And people were getting a lot more power, but the result was the use of a lot more coal, hence the paradox.
And so the paradox here would be, yes, the cost of developing any given piece of software falls,
but the reaction to that is a massive surge of demand for software capabilities.
And so the result of that actually is, although it looks like the price of starting software companies is going to fall, actually what's going to happen is it's going to rise, for the high-quality reason that you're going to be able to do so much more.
Yeah.
Right?
With software, the products are going to be so much better,
and the roadmap is going to be so amazing of the things you can do,
and the customers are going to be so happy with it that they're going to want more and more and more.
So the result of it... And by the way, another example of the Jevons paradox playing out in another related industry is Hollywood. You know, CGI in theory should have reduced the price of making movies; in reality it has increased it, because audience expectations went up.
Yeah.
And now you go to a Hollywood movie
and it's wall-to-wall CGI.
And so, you know, movies are more expensive
to make than ever.
And so the result of it, you know, the result in Hollywood is at least much more, let's say, visually elaborate movies. Whether they're better or not is another question, but, like, much more visually elaborate, compelling, kind of visually stunning movies through CGI. The version here would be much better software.
Yeah.
Like radically better software to the end user, which causes end users to want a lot more
software, which causes actually the price of development to rise.
You know, if you just think about like a simple case like travel, like, okay, booking a trip
through Expedia is like complicated.
You're likely to get it wrong.
You're clicking on menus and this and that and the other.
And like, you know, da-da-da-da.
An AI version of that would be, like, you know, send me to Paris, put me in a hotel I'll love at the best price, you know, send me on the best possible kind of airline ticket.
And then, you know, like, make it, like, really special for me.
And, like, maybe you need a human to go, okay, like, we're going to, you know,
or maybe the AI gets more complicated and says, okay, well, we know the person loves chocolate
and we're going to, like, you know, FedEx in the best chocolate in the world from Switzerland
into this hotel in Paris and this and that and the other.
And so, like, the quality could get to levels that we can't even imagine today, just because, you know, the software tools aren't yet what they're going to be.
So, yeah, that's right.
Yeah, I kind of buy that, actually.
I think I bought into the argument.
You're both.
How about, yeah, or how about I'm going to land in whatever, Boston at 6 o'clock?
I want to have dinner at 7 with a table full of, like, super interesting people.
Yeah, right, right, right, right.
You know.
Yeah.
Right.
Yeah, yeah, yeah.
No travel agent would do that for you today, nor would you want them to.
Yeah, no.
No.
Right, well, and then you think about it, it's got to be integrated into my personal AI, and, like, you know, there are just, like, unlimited kinds of ideas that you can do.
And I think this is one of the kind of things that's always been underestimated about humans
is, like, our ability to come up with new things we need.
Like, that has been unlimited.
And there's a very kind of famous case where John Maynard Keynes, who was the kind of prominent economist in the first half of the last century, had this thing that he predicted, which is, like, because of automation, nobody would ever work a 40-hour work week.
You know, like because once their needs were met, needs being like shelter and food.
And, you know, I don't even know if transportation was in there.
Like, that was it, it was over, and you would never work past the need for shelter and food. Like, why would you? There's no reason to. But of course, needs expanded. So then everybody needed a refrigerator, everybody needed not just one car but a car for everybody in the family, everybody needed a television set, everybody needed, like, glorious vacations, everybody, you know. So what are we going to need next? I'm quite sure that I can't imagine it, but, like, somebody's going to imagine it, and it's quickly going to become a need.
Yeah, that's right.
By the way, as Keynes famously said,
his essay, I think, was Economic Possibilities for our Grandchildren,
which was basically that.
Yeah.
You just articulated.
So Karl Marx had another version of that.
I just pulled up the quote.
So that society, when, you know,
when the Marxist utopia, socialism is achieved,
society regulates the general production,
thus makes it possible for me to do blah, blah,
to hunt in the morning,
fish in the afternoon,
rear cattle in the evening,
criticize after dinner.
What a glorious life.
What a glorious life.
Like, if I could just list four things
that I do not want to do,
it's hunt, fish, rear cattle, and criticize.
Yeah, yeah.
Right?
And by the way, it says a lot about Marx
that those were his four things.
Well, criticizing being his favorite thing,
I think, is basically communism in a nutshell.
Yeah, exactly.
I don't want to get too political, but yes.
Yes, 100%.
And so, yeah, so it's this, this,
Yeah, what Keynes and Marx had in common is just this incredibly constricted view of what people want to do. And then correspondingly, you know, the other thing is just, like, you know, people want to have a mission.
I mean, probably some people just want to fish and hunt.
Yeah.
But, you know, a lot of people want to have a mission.
They want to have a cause.
They want to have a purpose.
It's actually a good thing in life, it turns out, you know.
It turns out.
Yeah.
In a startling turn of events.
Okay.
So, yeah, so, yeah, I think that, I've long felt, you know, a little bit of the software eats the world thing from a decade ago.
I've always thought that basically demand for software is sort of perfectly elastic, possibly to infinity.
And the theory there basically is if you just continuously bring down the cost of software,
you know, which has been happening over time, then basically demand, you know, basically perfectly correlates upward. And the reason is because, you know, kind of, as we've been discussing, there's always something else to do in software.
There's always something else to automate. There's always something else to improve.
There's always something to make better. And, you know, in the moment, with the constraints that you have today, you may not, you know, think about what that is, but the minute you don't have those constraints, you'll imagine what it is.
Well, I'll just give you an example. I mean, so I'll give you an example that's playing out
with AI right now, right? So there have been, and we have, you know, we have companies that do
this, you know, there have been companies that have made AI, you know, that have made software
systems for doing security cameras forever, right? And it's like, and for a long time, it was
like a big deal to have software that would do, like, you know, have different security
camera feeds and store them on a DVR and be able to replay them and have an interface that lets
you do that. Well, it's like, you know, AI security cameras all of a sudden can have,
like, actual, semantic knowledge of what's happening in the environment. And so they
can say, you know, hey, that's Ben.
And then they can say, oh, hey, you know, that's Ben, but he's carrying a gun.
Yeah.
Right.
Right.
And by the way, that's Ben and he's carrying a gun, but that's because, like, he hunts on,
you know, on Thursdays and Fridays as compared to that's Mary.
And she never carries a gun and, like, you know, like something is wrong.
And she's really mad, right?
She's got a, yeah, really steamed expression on her face and we should probably
be worried about it, right?
And so there's, like, an entirely new set of capabilities you can do, just as one example,
for security systems that were never possible pre-AI.
And a security system that actually has a semantic understanding of the world is obviously much more sophisticated than the one that doesn't, and might actually be
more expensive to make, right?
Right.
Well, and just imagine health care, right?
Like, you could wake up every morning and have a complete diagnostic, you know, like, how am I doing today?
Like, what are all my levels of everything?
And, you know, how should I interpret them? You know, this is one thing where AI is really good, you know, medical diagnosis, because it's a super high-dimensional problem. But if you can get access to, you know, your continuous glucose reading,
you know, maybe sequence your blood now and again, this and that and the other, yeah, you've got
an incredible kind of view of things and who doesn't want to be healthier, you know, like now we have
a scale. That's basically what we do. You know, maybe check your heart rate or something,
but like pretty primitive stuff compared to where we could go. Yeah, that's right. Okay, good. All right,
So let's go to the next topic.
So on the topic of data.
So Major Tom asks, as these AI models allow for us to copy existing app functionality at minimal cost,
proprietary data seems to be the most important moat.
How do you think that will affect proprietary data value?
What other moats do you think companies can focus on building in this new environment?
And then Jeff Weisshopped asks, how should companies protect sensitive data, trade secrets, proprietary data, individual privacy in the brave new world of AI?
So let me start with a provocative statement.
Ben, see if you agree with it, which is, you know, you sort of hear a lot, this sort of statement or cliche is like data is the new oil.
And so it's like, okay, you know, data is the key input to training AI, making all this stuff work.
And so, you know, therefore, you know, data is basically the new resource.
It's the limiting resource.
It's the super valuable thing.
And so, you know, whoever has the best data is going to win and you see that directly in how you train AI's.
And then, you know, you also have like a lot of companies, of course, that are now trying to figure out what to do with AI.
And a very common thing you'll hear from companies is, well, we have proprietary data, right?
So I'm a hospital chain or I'm a, you know, whatever, any kind of business, insurance company or whatever.
And I've got all this proprietary data that I can apply, you know, that I'll be able to, you know, build things with my proprietary data with AI that won't just, you know, be something that anybody will be able to have.
Let me argue that basically, let's see, I'm arguing that in, like, almost every case like that, it's not true.
It's basically what the Internet kids would call cope.
It's simply not true.
And the reason it's just not true is because the amount of data available on the Internet, and just generally in the environment, is just a million times greater.
And so while it may not, you know,
while it may not be true that I have your specific medical information,
I have so much medical information off the Internet for so many people
and so many different scenarios that it just swamps the value of, quote, your data.
You know, it's just like overwhelming.
And so your proprietary data as, you know, company X will be a little bit useful in the margin,
but it's not actually going to move the needle.
And it's not really going to be a barrier to entry in most cases.
And then let me cite as proof for my belief that this is mostly cope: there has never been, nor is there now, any sort of, basically any level of, rich or sophisticated marketplace for data, market for data. There's no large marketplace for data. In fact, what there are is there are very small markets for data. So there are these businesses called data brokers that will sell you, you know, large numbers of, like, you know, pieces of information about users on the internet or something, and they're just small businesses. Like, they're just not large. It just turns out, like, information on lots of people is just
not very valuable. And so if the data actually had value, you know, it would have a market price
and you would see a transacting and you actually very specifically don't see that, which is sort of
a, yeah, sort of quantitative proof that the data actually is not nearly as valuable as people
think it is. Where I agree, so I agree that the idea that you just say, here's a bunch of data and I can sell it without doing anything to the data, is, like, massively overrated.
I definitely agree with that.
And, like, maybe I can imagine some exceptions, like some, you know, special-population genomic databases or something, that were very hard to acquire, that are useful in some way, that's not just, like, living on the Internet or something like that. I could imagine that, where it's super highly structured, very general purpose, and not widely available. But for most data in companies, it's not like that. It tends to be either widely available or not general purpose.
It's kind of specific.
Having said that, right,
like companies have made great use of data,
for example, a company that you're familiar with,
meta, uses its data to kind of great ends itself,
feeding it into its own AI systems,
optimizing its products in incredible ways.
And I think that, you know, us, Andreessen Horowitz,
actually, you know, so we just raised $7.2 billion.
And it's not a huge deal, but we took our data and we put it into an AI system.
And our LPs were able... there are a million questions investors have about everything we've done, our track record, every company we've invested in, and so forth. And for any of those questions, they could just ask the AI. They could wake up at 3 o'clock in the morning and go, do I really want to trust these guys, and go in and ask the AI a question? And boom, they'd get an answer back instantly. They didn't have to wait for us and so forth.
So we really kind of improved our investor relations product tremendously through use of our data.
And I think that almost every company can improve its competitiveness through use of its own data.
But the idea that it's collected some data that it can go like sell or that is oil or what have you,
that's, yeah, that's probably not true.
I would say.
And, you know, it's kind of interesting
because a lot of the data
that you would think
would be the most valuable
would be like your own code base.
Right, your software that you've written.
So much of that lives in GitHub.
Nobody is actually... I don't know of any company. We work with, you know, whatever, a thousand software companies. And do we know any that are, like, building their own programming model on their own code? Like, and would that be a good idea? Probably not, just because there's so much code out there that the systems have been trained on. So, like, that's not so much of an advantage. So I think it's a very specific kind of data that would have value. Well, let's make it
actionable then. If I'm running a big company, like if I'm running an insurance company or a bank or a hospital chain or something like that, or, you know, a consumer packaged goods company, Pepsi or something, like, how should I validate that I actually have a valuable proprietary data asset that I should really be focusing on using?
versus in the alternate, by the way,
maybe there's other things.
Maybe I should be taking all the effort I would spend
on trying to optimize use of that data,
and maybe I should use it entirely trying to build things
using internet data instead.
Yeah, so I think, I mean, look, if you're, right,
if you're in the insurance business, then, like,
all your actuarial data is both interesting,
and I don't know that anybody publishes their actuarial data.
And so, like, I'm not sure how you would train the model
on stuff off of the internet.
You know, similarly...
Can I challenge that one?
So that would be a good thing.
That would be a good test case.
So I'm an insurance company.
I've got records on 10 million people
and, you know, the actuarial tables
and when they get sick and when they die.
Okay, that's great.
But like, there's lots and lots of actuarial,
general actuarial data on the internet
for large-scale populations, you know,
because governments collect the data
and they process it and they publish reports.
And there's lots of academic studies.
And so, like, is your large data set,
giving you any additional actuarial information
that the much larger data set on the internet
isn't already providing you?
Like are your insurance clients
actually actuarially any different
than just everybody?
I think so, because on intake, you know, when you get insurance,
they give you like a blood test,
they've got all these things.
They know if you're a smoker and so forth.
And in the, I think in the general data set,
like, yeah, you know who dies,
but you don't know what the fuck they did coming in.
And so what you realize,
are looking for is like, okay, for this profile of person with these kinds of lab results,
how long are they live? And that's where the value is. And I think that, you know, interesting,
like, you know, I was thinking about like a company like Coinbase where, right, they have
incredibly valuable assets in the terms of money. They have to stop people from breaking in.
They've done a massive amount of work on that. They've seen all kinds of break in types. I'm sure they
have tons of data on that.
It's probably, like, weirdly specific to people trying to break into crypto exchanges.
And so, you know, like, I think it could be very useful for them.
I don't think they could sell it to anybody.
But, you know, I think every company's got data that if, you know, fed into an intelligent
system would help their business.
And I think almost nobody has data that they could just go sell.
And then there's this kind of in-between question, which is, what data would you want to let Microsoft or Google or OpenAI
or anybody get their grubby little fingers on?
And that I'm not sure.
That I think is the question that enterprises are wrestling with more. It's not so much, should we go, like, sell our data, but should we train our own model just so we can maximize the value,
or should we feed it into the big model?
And if we feed it into the big model, do all of our competitors now have the thing that we just did?
And, you know, or could we trust the big company to not do that to us?
Which I kind of think the answer on trusting the big company not to F with your data is probably, I wouldn't do that.
Well, I mean, yes.
If your competitiveness depends on that, you probably shouldn't do that.
Well, there are at least reports that certain big companies are using all kinds of data that they shouldn't be using to train their models already.
So, yep, I think, like, I think those reports are very likely true. Right, or why don't they have open data, right? Like, this is, you know, we've talked about this before, but, you know, the same companies that are saying they're not stealing all the data from people, or taking it in an unauthorized way, refuse to, say, open their data. Like, why not tell us where your data came from? And in fact, they're trying to shut down all openness. No open source, no open weights, no open data, no open nothing, and go to the government and try and get them to do that.
You know, if you're not a thief, then why are you doing that?
Right, right, right.
What are you hiding?
By the way, there's other twist and turns here.
So, for example, in the insurance example, I kind of deliberately loaded it because you may
know it's actually illegal to use genetic data for insurance purposes, right?
So there's this thing called the GINA law, the Genetic Information Nondiscrimination Act of 2008. And basically it bans insurers in the U.S. from actually using genetic data for the purpose of doing, you know, health assessment, actual actuarial assessment of it. Which, by the way, because now the genomics are getting really good, like, that data probably actually is, you know, among the most accurate data you could have if you were actually trying to predict, like, when people are going to get sick and die. And they're literally, they're literally not allowed to use it.
Yeah, I think that this is an interesting, like, weird misapplication of good intentions in a policy way that's probably going to kill more people than ever get saved by every kind of health, FDA, et cetera, policy that we have. Which is, you know, in a world of AI, having access to data on all humans, why they get sick, what their genetics were, et cetera, et cetera, is the most valuable thing. You know, people talk about data being the new oil. Like, that is the new oil, the health care oil. You know, if you could match those up, then we'd never not know why we're sick. You know, you could make everybody much healthier, all these kinds of things.
But, you know, to kind of stop the insurance company from kind of overcharging people who are
more likely to die, we've kind of locked up all this data. A kind of better idea would be
to just go, okay, for the people who are likely to, like, we subsidize health care, like,
massively for individuals anyway, just, like, differentially, you know, subsidize.
And, you know, and then, like, you solve the problem and you don't lock up all the data.
But, you know, it's typical of politics and policy.
I mean, most of them are like that, I think, yeah.
Well, there's this interesting question in insurance, like, basically, one of the questions people have asked about insurance is, like, if you had perfectly predictive information on, like, individual outcomes, does the whole concept of insurance actually still work, right? Because the whole theory of insurance is risk pooling, right? It's precisely the fact that you don't know what's going to happen in the specific case that means you build these statistical models, and then you risk pool, and then you have variable payouts depending on exactly what happens. But if you literally knew what was going to happen in every case, because, for example, you had all of this predictive genomic data, then all of a sudden it wouldn't make sense to risk pool, because you'd just say, well, no, this person is going to cost X, that person is going to cost Y. There's no...
Health insurance already doesn't make sense in that way, right? Like, the idea of insurance kind of, like, started with crop insurance, where, like, okay, you know, my crop fails, and so we all put money in a pool in case, like, my crop fails, so that, you know, we can cover it. It's kind of designed to risk pool for a catastrophic, unlikely incident.
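To make the risk-pooling arithmetic concrete, here is a minimal Python sketch with invented numbers (the population size, loss probability, and premium load are made up for illustration, not anything cited in the episode). It shows why a pooled premium works when nobody knows in advance who will suffer the catastrophic loss, and why "premiums" collapse into per-person billing once outcomes are perfectly predictable:

```python
import random

random.seed(0)

# Invented illustration: 10,000 people, each with a 1% chance of a $100,000 catastrophic loss.
people = 10_000
loss_probability = 0.01
loss_amount = 100_000

# Risk pooling: everyone pays a premium near the average expected loss.
expected_loss = loss_probability * loss_amount   # $1,000 per person
premium = expected_loss * 1.05                   # small load for admin and margin

losses = [loss_amount if random.random() < loss_probability else 0 for _ in range(people)]
print("pooled premiums collected:", premium * people)
print("total losses paid out:    ", sum(losses))
# The pool works because nobody knows in advance *which* individuals will have the loss.

# Perfect prediction: if each person's outcome were known exactly, the "fair" individual
# premium would just be that person's own cost: $100,000 for the unlucky, $0 for everyone else.
perfect_premiums = losses
print("individual premiums under perfect prediction:", sorted(set(perfect_premiums)))
# At that point there is nothing left to pool; it is no longer insurance, just billing.
```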
Like, everybody's got to go to the doctor all the fucking time.
And some people get sicker than others and that kind of thing.
But, like, the way our health insurance works is, like, all medical gets, you know, paid for through this insurance system, which is this layer of loss and bureaucracy and giant companies and all this stuff, when, like, if we're going to pay for people's health care, just pay for people's health care. Like, what are we doing, right? And if you want to disincent people from, like, going for nonsense reasons, then just up the co-pay. Like, it's like, what are we doing? Just...
Well, and then from a justice standpoint, from a fairness standpoint, like, would it make sense for me, you know, if I knew that you were going to be more expensive than me? Like, you know, if everybody knows what future health care cost is per person... Yeah. ...or has a very good predictive model for it, you know, societal willingness to all pool in the way that we do today might really diminish. Yeah, yeah. Well, and then, like, you could also,
if you knew, like, there's things that you do genetically, and maybe we give everybody a pass on
that. It's like you can't control your genetics. But then, like, there's things you do behaviorally
that, like, dramatically increases your chance of getting sick. And so maybe, you know, we incentivize
people to stay healthy instead of just, like, paying for them not to die. There's a lot of systemic
fixes we could do to the healthcare system. It couldn't be designed in a more ridiculous way, I think. Well, it couldn't be designed in a more ridiculous way. It's actually more ridiculous than in some other countries, but it's pretty crazy here. Nathan Odie asks, what are the strongest
common themes between the current state of AI and Web 1.0? And so let me start there. Let me give you
a theory, Ben, and see what you think. So I get this question, you know, because of my role, and, you know, Ben, you were with me at Netscape. We get this question a lot because of our role early on with the internet.
And there's an, you know, the internet boom was like a major, major event in technology,
and it's still within a lot of, you know, people's memories.
And so, you know, the sort of, you know, people like to reason from analogy.
So it's like, okay, the AI boom must be like the internet boom, starting an AI company,
must be like starting an internet company.
And so, you know, what is this like?
And we actually got a bunch of questions like that, you know, that are kind of analogy
questions like that.
I actually think, you know, and then, Ben, you know, you and I were there for the internet
boom.
So, you know, we live through that and the bust and the boom and the bust.
So I actually think that the analogy doesn't really work. It works in certain ways, but it doesn't really work for the most part.
And the reason is because the internet,
the internet was a network,
whereas AI is a computer.
Yep.
Okay, yeah.
So people understand what we're saying.
It's more like the PC boom.
Or the PC boom, or even I would say the microprocessor,
like my best analogy is to the microprocessor.
Yeah.
Or even to the original computers,
like back to the mainframe era.
And the reason is because, yeah,
look, what the internet did was the internet,
you know, obviously it was a network,
but the network connected together many existing computers,
and then, of course, people built many other new kinds of computers
to connect to the Internet.
But fundamentally, the Internet was a network.
And that's important because most of the sort of industry dynamics,
competitive dynamics, startup dynamics around the Internet,
had to do with basically building,
either building networks or building applications that run on top of networks.
And this, you know, the Internet generation of startups
was very consumed by network effects.
And, you know, all these positive feedback loops that you get
when you connect a lot of people together.
And, you know, things like, you know, so-called Metcalfe's law, which is, sort of, the value of a network, you know, expands, you know, kind of the way it expands is as you add more people to it.
And then, you know, there were all these fights, you know, all the social networks or whatever fighting to try to get network effects and try to steal each other's users because of the network effects.
And so it's kind of, you know, it's dominated by network effects, which is what you expect from a network business.
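As a rough illustration of the Metcalfe's-law intuition, here is a tiny Python sketch using the count of possible pairwise connections as a stand-in for network value; the quadratic proxy is the textbook simplification, not a claim from the episode:

```python
# Rough proxy for Metcalfe's law: network "value" scales with possible pairwise connections.
def metcalfe_value(users: int) -> int:
    return users * (users - 1) // 2

for users in [10, 100, 1_000, 10_000]:
    print(f"{users:>6} users -> {metcalfe_value(users):>12,} possible connections")
# Value grows roughly with the square of the user count, which is why network businesses
# fight so hard to add users and to keep competitors from pulling users away.
```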
AI, like there are some network effects in AI that we can talk about, but it's more like a microprocessor.
It's more like a chip.
It's more like a computer.
It's a system that basically, right, if data comes in, data gets processed, data comes out, things happen.
That's a computer.
It's an information processing system.
It's a computer.
It's a new kind of computer.
It's a, you know, we like to say the sort of computers up until now have been what are called von Neumann machines, which is to say they're deterministic computers, which is they're like, you know, hyper literal.
And they do exactly the same thing every time.
And if they make a mistake, it's, yes, the programmer's fault.
But they're very limited in their ability to interact with people and understand the world.
We think of AI and large language models as a new kind of computer, a probabilistic computer, a neural network-based computer that, you know, by the way, is not very accurate and doesn't give you the same result every time and, in fact, might actually argue with you and tell you that it doesn't want to answer your question.
Yeah, yeah, which makes it very different in nature than the old computers, and it makes composability, you know, the ability to build big things out of little things, more complex.
Right, but the capabilities are new and different and valuable and important, because it can understand language and images and, you know, do all these things that you see when you use it. All the things we could never solve with deterministic computers, we can now go after, right?
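As a toy contrast between the two kinds of computer being described, here is a small Python sketch; the "probabilistic computer" side is just temperature-based sampling over a made-up word distribution, a stand-in for how a language model picks its next token rather than any particular model's actual implementation:

```python
import math
import random

def deterministic_add(a: int, b: int) -> int:
    # A von Neumann-style computation: same inputs, same output, every single time.
    return a + b

def sample_next_word(weights: dict[str, float], temperature: float = 1.0) -> str:
    # A toy "probabilistic computer": sample one word from a softmax over candidates.
    # Higher temperature flattens the distribution and increases run-to-run variability.
    words = list(weights)
    exps = [math.exp(weights[w] / temperature) for w in words]
    total = sum(exps)
    probs = [x / total for x in exps]
    return random.choices(words, weights=probs, k=1)[0]

print(deterministic_add(2, 2), deterministic_add(2, 2))         # always 4 4

candidates = {"cat": 2.0, "dog": 1.5, "refuse to answer": 0.5}  # invented weights
print([sample_next_word(candidates) for _ in range(5)])         # varies run to run
```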
Yeah, exactly. And so I think, Ben, I think the analogy, and I think the lessons learned, are much more likely to be drawn from the early days of the computer industry, or from the early days of the microprocessor, than the early days of the internet. Does that sound right?
I think so, yeah.
I definitely think so.
And that doesn't mean there's no boom and bust and all that because that's just the nature of technology.
You know, people get too excited and then they get too depressed.
So there will be some of that, I'm sure.
There will be an overbuild-out, you know, potentially, eventually, of chips and power and that kind of thing.
You know, we start with the shortage.
But I agree.
Like, I think networks are fundamentally different in the nature of how they evolved than computers.
And kind of just the adoption curve and all those kinds of things will be different.
Yeah, so then, and this kind of goes to where, how I think the industry is going to
unfold. And so this is kind of my best theory for kind of what happens from here.
It's kind of this, you know, this giant question of like, you know, is the industry going
to be a few God models or, you know, a very large number of models of different sizes and so
forth? So the computer, like famously, you know, the original computers, like the original IBM
mainframes, you know, the big computers, you know, they were very, very large and expensive
and there were only a few of them. And the prevailing view, actually, for a long time was
that's all there would ever be.
And there was this famous statement by Thomas Watson, Sr.,
who was the creator of IBM,
which was the dominant company
for the first, like, you know, 50 years
of the computer industry.
And he said, and I don't know if this is actually true, but he supposedly said, I don't know that the world will ever need more than five computers.
And I think the reason for that,
it was literally, it was like the government's going to have two,
and then there's like three big insurance companies,
and then that's it.
Who else would need to do all that math?
Exactly.
Yeah, who else would need to, who else needs to keep track of huge amounts of numbers, who else needs that level of, you know, calculation capability? It's just not a relevant, you know, it's just not a relevant concept.
And by the way, they were, like, big and expensive.
And so who else can afford them, right?
And who else can afford all the headcount required to manage them and maintain them?
I mean, and this is in the days, I mean, these things were big.
These things were so big that you'd have an entire building that got built around a computer, right?
And they'd have, like, they'd famously have all these guys in white lab coats, literally, like,
taking care of the computer because everything had to be kept super clean or the computer
would stop working.
And so, you know, it was this thing where, you know, today we have the idea of an AI god model, which is, like, a big foundation model. Back then, the idea was, like, a god mainframe, like there would just be a few of these things. And by the way, if you watch old science fiction, it almost always has this sort of conceit. It's like, okay, there's a big supercomputer, and it either is, like, doing the right thing or doing the wrong thing, and if it's doing the wrong thing, you know, that's often the plot of the science fiction movie, you have to go in and try to figure out how to fix it or defeat it. So it's sort of this idea of, like, a single top-down thing. And that held for a long time, like that held for, you know, the first few decades. And then, you know, computers started to get smaller. So then you had so-called minicomputers, which was the next phase.
And so that was a computer that, you know, didn't cost $50 million.
Instead, it cost, you know, $500,000.
But even still, $500,000 is a lot of money.
People aren't putting mini-computers in their homes.
And so it's like mid-sized companies can buy mini-computers, but certainly people can't.
And then, of course, with the PC, they shrunk down to like $2,500.
And then, you know, with the smartphone, they shrunk down to $500.
And then, you know, sitting here today, obviously, you have computers of every shape,
size description all the way down to, you know, computers that cost,
a penny, you know, you've got a computer in your thermostat that, you know, basically
controls the temperature in the room, and it, you know, probably cost a penny, and it's probably
some embedded ARM chip with firmware on it. And there's, you know, many billions of those
all around the world. You buy a new car today. It has something, new cars today have something on the
order of 200 computers in them, maybe more at this point. And so you just basically assume with
the chip today, sitting here today, you just kind of assume that everything has a chip in it.
You assume that everything, by the way, draws electricity or has a battery because it needs
to power the chip. And then increasingly you assume that everything's on the internet, because
basically all computers are assumed to be on the internet or they will be.
And so, as a consequence, what you have is, the computer industry today is this massive pyramid.
And you still have a small number of like these supercomputer clusters or these giant mainframes
that are like the God model, you know, the God mainframes.
And then you've got, you know, a larger number of minicomputers.
You've got a larger number of PCs.
You've got a much larger number of smartphones.
And then you've got a giant number of embedded systems.
And it turns out like the computer industry is all of those things.
And, you know, what is it?
What size of computer do you want is based on, well, what exactly are you trying to do and who are you and what do you need?
And so if that analogy holds, it basically means actually we are going to have AI models of every conceivable shape, size, description capability, right, based on trained on lots of different kinds of data running at very different kinds of scale, very different privacy, different policies, different security policies.
You know, you're just going to have like enormous variability and variety, and it's going to be an entire ecosystem and not just a couple of companies.
Yeah, let me see what you think of that.
Well, I think that's right.
And I also think the other thing that's interesting about this era of computing, if you look at prior eras of computing from the mainframe to the smartphone, is that a huge source of lock-in was basically the difficulty of using them.
So, you know, nobody ever got fired for buying IBM because, like, you know,
you had people trained on them.
You know, people knew how to use the operating system.
like it was, you know, it was just kind of like a safe choice due to the massive complexity of, like, dealing with a computer.
And then even with the smartphone, like, you know, why is the Apple smartphone so dominant, you know, what makes it so powerful? Well, because, like, switching off of it is so expensive and complicated and so forth.
It's an interesting question with AI because AI is the easiest computer to use by far.
It speaks English. It's like talking to a person. And so, like, what is the lock-in there? And so
are you completely free to use the size, price, choice, speed that you need for your particular
task? Or are you locked into the God model? And, you know, I think it's still a bit of an open
question, but it's pretty interesting. And that, that thing could be very different than prior
generations.
Yeah.
Yeah, that makes sense.
And then just to complete the question, what would we say?
So, Ben, what would you say are the lessons learned from the Internet era that we lived through that would apply, that people should think about?
I think a big one is probably just the boom-bust nature of it that, like, you know, the demand,
the interest in the Internet, the recognition of what it could be was so high that money just
kind of poured in in buckets. And, you know, and then the underlying thing, which in the internet age was the telecom infrastructure and fiber and so forth, got just unlimited funding, and unlimited fiber was built out, and then eventually we had a fiber glut and all the telecom companies went bankrupt. And that was great fun, but, you know, like, we ended up in a good place. And I think that something like that is probably pretty likely to happen in AI, where, like, you know, every company is going to get funded. We don't need that many AI companies, so a lot of them are going to bust. There are going to be huge, you know, huge investor losses. There will be an overbuild-out of chips for sure at some point. And then, you know, we're going to have too many chips and, you know, some chip companies will go bankrupt for sure. And then, you know, I think probably the same thing with data centers and so forth. Like, we'll be behind, behind, behind, and then we'll overbuild at some point. So that will all be very interesting. I think that, and that's kind of, that's every new
technology. So Carlota Perez has done, you know, amazing work on this, where, like, that is just the nature of a new technology: you underbuild it, then you overbuild, and, you know, there's a hype cycle that funds the build-out, and a lot of money is lost, but we get the infrastructure, and that's awesome, because that's when it really gets adopted and changes the world. I want to say, you know, with the internet, the other, the other kind
of big kind of thing is the internet went through a couple of phases, right? Like, it went through
a very open phase, which was unbelievably great. It was probably one of the greatest boons to the economy. It, you know, it certainly created tremendous growth and power in America, both, you know, kind of economic power and soft cultural power and these kinds of things. And then, you know, it became closed with the next-generation architecture, with, you know, kind of discovery on the internet being owned entirely by Google and, you know, kind of other things being owned by other companies. And, you know, AI, I think, could go either way. So it could be very open, or, like, you know, with kind of misguided regulation, we could actually force ourselves away from something that, you know, is open source, open weights, anybody can build it, we'll have a plethora of this technology, we'll use all of American innovation to compete. Or, you know, we'll cut it all off, we'll force it into the hands of the companies that kind of own the internet today. And, you know, we'll put ourselves at a huge disadvantage, I think, competitively against China in particular, but really everybody in the world. So I think that's something that, definitely, you know, we're involved with trying to make sure doesn't happen, but it's a real possibility right now.
Yeah.
There's sort of an irony is that networks used to be all proprietary, and then they opened up.
Yeah, yeah, yeah, right.
LAN Manager, AppleTalk, NetBEUI, NetBIOS.
Yeah, exactly.
And so these were all the early proprietary networks from individual vendors, and then the Internet appeared, and kind of TCP/IP, and everything opened up.
AI is trying to go the other way.
I mean, the big companies are trying to take AI the other way. It started out as, like, open, just like, basically, just like the research. Everything was open source in AI, yeah.
Right, right.
And now they're trying to,
they're trying to lock it down.
So it's a,
it's a fairly nefarious turn of events.
Yeah, yeah, very nefarious.
You know, like,
and it's remarkable to me.
I mean, it is kind of
the darkest side of capitalism
when a company is so greedy,
they're willing to destroy the country
and maybe the world to like just get a little extra profit.
But, you know, and they do it.
Like the, the really kind of nasty thing
is they claim, oh, it's for safety.
You know, we've created an alien that we can't control.
But we're not going to stop work on it.
We're going to keep building it as fast as we can,
and we're going to buy every freaking GPU on the planet.
But we need the government to come in and stop it from being open.
This is literally the current position of Google and Microsoft right now.
It's crazy.
And we're not going to secure it.
So we're going to make sure that, like, Chinese spies can just, like,
steal our chip plans, take them out of the country,
and we won't even realize for six months.
Yeah, yeah, it has nothing to do with security. It only has to do with monopoly.
Yes. The other thing, you know, just going back to your point on speculation.
So there's this critique that we hear a lot, right, which is like, okay, you idiots, basically
it's like you idiots, you idiots, you idiots, entrepreneurs, investors. You idiots, it's just like
there's a speculative bubble with every new technology. Like, basically, like, when are you people
going to learn to not do that? Yeah. And there's an old joke, there's an old joke that relates to this, which is: the four most dangerous words in investing are, this time is different. And the twelve most dangerous words in investing are, the four most dangerous words in investing are this time is different, right?
Like, so like, does history repeat?
Does it not repeat?
My sense of it, and you referenced Carlota Perez's book, which I agree is good, although I don't think it works as well anymore, we can talk about that sometime, but, you know, it's a good, at least, background piece on this.
You know, it's just like, it's just incontrovertibly true.
Basically, every significant technology advance in history was greeted by some kind
of financial bubble, basically since financial markets had existed.
And this, you know, by the way, this includes like everything from, you know, radio and
television, the railroads, you know, lots and lots of prior technologies.
By the way, there was actually a so-called, there was an electronics boom-bust in the 60s, it was called the Tronics boom. Every company had the name Tronics in it. And so, you know, there was that. So, you know, there was, like, a laser boom-bust cycle. There were all these, like, boom-bust cycles.
And so basically it's like any new technology, that's what economists call a general purpose technology, which is to say something that can be used in lots of different ways.
Like it inspires sort of a speculative mania.
And, you know, and look, the critique is like,
okay, why do you need to have the speculative mania?
Why do you need to have the cycle? Because, like, you know, some people invest in the things, they lose a lot of money. And then there's this bust cycle that, you know, causes everybody to get depressed, and maybe it delays the rollout.
And it's like two things.
Number one is like, well, you just don't know.
Like, if it's a general purpose technology like AI is
and it's potentially useful in many ways, like nobody actually knows up front,
like what the successful use cases are going to be
or what successful companies are going to be.
Like, you actually have to, you have to learn by doing.
You're going to have some misses.
That's venture capital.
Yeah.
Yeah, exactly.
Yeah, exactly.
So, yeah, the true venture capital model,
kind of wires this in, right?
We basically, in core venture capital, the kind that we do, we sort of assume that
half the companies fail, half the projects fail.
And, you know, if any of us, if we or any of our...
Failed completely, like, lose money.
Yeah.
Lose money, exactly, yeah.
And so, like, and of course, if we or any of our competitors, you know, could figure out
how to do the 50% that work without doing the 50% that don't work, we would do that.
But, you know, here we sit 60 years into the field and, like, nobody's figured that out.
So there is, there is that unpredictability to it.
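For a back-of-the-envelope feel for that venture math, here is a small Python sketch with invented numbers (portfolio size, check size, and return multiples are all hypothetical): in a power-law portfolio, half the checks can go to zero and the fund can still work, as long as a winner or two returns many multiples.

```python
# Hypothetical 20-company portfolio, $1M per check (all numbers invented for illustration).
# Half fail completely; most of the rest return modest multiples; one is an outlier.
outcomes = [0] * 10 + [1, 1, 2, 2, 3, 3, 5, 8, 15, 60]   # multiple of invested capital

invested = len(outcomes) * 1.0                            # $ millions
returned = float(sum(outcomes))                           # $ millions
print(f"invested: ${invested:.0f}M, returned: ${returned:.0f}M, multiple: {returned / invested:.1f}x")
# Even with a 50% total-loss rate, the single 60x outcome carries the fund. The catch,
# as discussed above, is that nobody can tell in advance which half is which.
```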
And then the other kind of interesting way to think about this is, like, okay, what would it mean to have a society in which a new technology did not inspire speculation? And it would mean having a society that basically is just, like, inherently, like, super
pessimistic about both the prospects of the new technology, but also the prospects of entrepreneurship
and, you know, people inventing new things and doing new things. And of course, there are many
societies like that on planet Earth, you know, that just like fundamentally like don't have
the spirit of invention and adventure that, you know, that a place like Silicon Valley does.
And, you know, are they better off or worse off? And, you know, generally speaking, they're
worse off. They're just, you know, less future oriented, less, less, less focused on building
things, less focused on figuring out how to get growth. And so I think there's a, at least my sense,
there's a comes-with-the-territory thing. Like, we would all prefer to avoid the downside of a speculative boom-bust cycle, but, like, it seems to come with the territory every single time. And at least I have not, no society I'm aware of has ever figured out how to capture the good without also having the bad. Yeah. And, like, why would you? I mean,
it's kind of like, you know, the whole western United States was built off the gold rush. And like
every kind of treatment in like popular culture of the gold rush kind of focuses on the people
who didn't make any money. But there were people who made a lot of money, you know, and found gold.
And, you know, in the internet bubble, which, you know, was completely ridiculed by, you know,
kind of every, every movie, if you go back and watch any movie between like 2001 and 2004,
they're all like how only morons did dot com and this and that and the other.
And there were all these funny documentaries and so forth.
But like, that's when Amazon got started.
You know, that's when eBay got started.
That's when Google got started.
You know, these companies, you know, were started in the bubble in the kind of time
this great speculation, there was gold in those companies.
And if you hit any one of those, like you funded, you know, probably the next set of
companies, you know, which included things like, you know, Facebook and X and, you know,
Snap and all these things.
And so, yeah, I mean, like, that's just the nature of it.
I mean, like, that's what makes it exciting.
And, you know, it's just a, it's an amazing kind of thing that, you know, look,
the transfer of money from people who have excess money to people who are trying to do new things and make the world a better place is the greatest thing in the world. And if some of the people with excess money lose some of that excess money in trying to make the world a better place, why are you mad about that? Like, that's the thing that I could never understand.
Like, why would you be mad at young, ambitious people
trying to improve the world, getting funded,
and some of that being misguided?
Like, why is that bad?
Right, right.
As compared to, yeah, as compared to, especially as compared to everything else in the world and all the people who are not trying to.
So you'd rather like, you know, lots of mansions and boats and jets.
Right.
Right.
Like, what are you talking about?
Right.
Right.
Exactly.
Donate money to ruinous...
Yeah, ruinous causes.
Right.
Such as ones that are on the news right now.
Okay.
So, all right.
We're at an hour and 20 minutes.
We made it all the way through four questions.
We're doing good.
Great. So let's call it here. Thank you, everybody, for joining us. And I believe we should do
a part two of this, if not parts three through six, because we have a lot more questions to go. But
thanks everybody for joining us today. All right. Thank you.