The a16z Show - The State of AI with Marc & Ben

Starting point is 00:00:00 Most of the content created on the internet is created by average people. And so kind of the content on average, as a whole, on average, is average. The test for whether your idea is good is how much can you charge for it? Can you charge the value? Or are you just charging the amount of work it's going to take the customer to put their own wrapper on top of OpenAI? The paradox here would be the cost of developing any given piece of software falls. But the reaction to that is a massive surge of demand for software capabilities. And I think this is one of the things that's always been underestimated about humans

Starting point is 00:00:37 is our ability to come up with new things we need. There's no large marketplace for data. In fact, what there are is there are very small markets for data. In this wave of AI, big tech has a big compute and data advantage. But is that advantage big enough to drown out all the other startups trying to rise up? Well, in this episode, a16z co-founders Marc Andreessen and Ben Horowitz, who both, by the way, had a front row seat to several prior tech waves, tackle the state of AI. So what are the characteristics that will define successful AI companies?

Starting point is 00:01:12 And is proprietary data the new oil, or how much is it really worth? How good are these models realistically going to get? And what would it take to get 100 times better? Marc and Ben discuss all this and more, including whether the venture capital model needs a refresh to match the rate of change happening all around it. And of course, if you want to hear more from Ben and Marc, make sure to subscribe to the Ben and Marc podcast. All right, let's get started.

Starting point is 00:01:41 It is kind of the darkest side of capitalism when a company is so greedy, they're willing to destroy the country and maybe the world to just get a little extra profit. And they do it, like the really kind of nasty thing is they claim, oh, it's for safety. You know, we've created an alien that we can't control. But we're not going to stop working on it.

Starting point is 00:01:59 We're going to keep building it as fast as we can, and we're going to buy every freaking GPU on the planet. But we need the government to come in and stop it from being open. This is literally the current position of Google and Microsoft right now. It's crazy. The content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security

Starting point is 00:02:25 and is not directed at any investor or potential investors in any a16z fund. Please note that a16z and its affiliates may maintain investments in the companies discussed in this podcast. For more details, including a link to our investments, please see a16z.com/disclosures. Hey, folks, welcome back. We have an exciting show today. We are going to be discussing the very hot topic of AI. We are going to focus on the state of AI as it exists right now in April of 2024.

Starting point is 00:02:53 And we are focusing specifically on the intersection of AI and company building. So hopefully this will be relevant to anybody working on a startup or anybody at a larger company. We have as usual solicited questions on X, formerly known as Twitter, and the questions have been fantastic. So we have a full lineup of listener questions, and we will dive right in.

Starting point is 00:03:10 So first questions, so three questions on the same topic. So Michael asks, in anticipation of upcoming AI capabilities, what should founders be focusing on building right now? Gwen asks, how can small AI startups compete with established players

Starting point is 00:03:24 with massive compute and data scale advantages? And Alastair McLeay asks, for startups building on top of OpenAI, et cetera, what are the key characteristics of those companies that will benefit from future exponential improvements in the base models versus those that will get killed by them? So let me start with one point, Ben, and then we'll jump right to you. So Sam Altman recently gave an interview, I think maybe with Lex Fridman or one of the podcasts. And he actually said something I thought was actually quite helpful.

Starting point is 00:03:47 Let's see, Ben, if you agree with it. He said something along the lines of, you want to assume that the big foundation models coming out of the big AI companies are going to get a lot better. So you want to assume they're going to get like 100 times better. And as a startup founder, you want to then think, okay, if the current foundation models get 100 times better, is my reaction, oh, that's great for me and for my startup because I'm much better off as a result? Or is your reaction the opposite? Is it, oh, shit, I'm in real trouble?

Starting point is 00:04:13 So let me just stop right there, Ben, and see what you think of that as general advice. Well, I think generally that's right, but there's some nuances to it, right? So I think that from Sam's perspective, he was probably discouraging people from building foundation models, which I don't know that I would entirely agree with, in that a lot of the startups building foundation models are doing very well. And there's many reasons for that. One is there are architectural differences, which lead to how smart is a model. There's how fast is a model.

Starting point is 00:04:45 There's how good is a model in a domain. And that goes for not just text models, but image models as well. There are different domains, different kinds of images that respond to prompts differently. If you ask Midjourney and Ideogram the same question, they react very differently, you know, depending on the use cases that they're tuned for. And then there's this whole field of distillation where, you know, Sam can go build the biggest, smartest model in the world, and then you can walk up as a startup and kind of do a distilled version of it and get a model very, very smart at a lot less cost. So there are things that, yes, the big company models are going to get way

Starting point is 00:05:27 better, kind of way better at what they are. So you need to deal with that. So if you're trying to go head-to-head, full frontal assault, you probably have a real problem just because they have so much money. But if you're doing something that's different enough or like, you know, a different domain and so forth, for example, you know, at Databricks, they've got a foundation model, but they're using it in a very specific way in conjunction with their kind of leading data platform. So, okay, now if you're an enterprise and you need a model that knows all the nuances of how your enterprise data model works and what things mean and what needs access control and what needs to use your specific data and domain knowledge and so forth, then it doesn't really

Starting point is 00:06:19 hurt them if Sam's model gets way better. Similarly, ElevenLabs with their voice model has kind of embedded into everybody. Everybody uses it as part of kind of the AI stack. And so it's got kind of a developer hook into it. And then, you know, they're going very, very fast at what they do and really being very focused in their area. So there are things that I would say are extremely promising that are kind of ostensibly, but not really, competing with OpenAI or Google or Microsoft. So I think it sounds a little more coarse-grained than I would interpret it if I was building a startup. Right.

Starting point is 00:06:58 Let's dig into this a little bit more. So let's start with the question of, do we think the big models, the God models, are going to get 100 times better? I kind of think so. And then I'm not sure. So if you think about the language models, let's do those because those are probably the ones people are most familiar with. I think if you look at the very top models, you know, Claude and OpenAI and Mistral and

Starting point is 00:07:20 Llama, the only people who I feel like really can tell the difference as users amongst those models are the people who study them. They're getting pretty close. So you would expect, if we're talking 100x better, that they might be separating from each other a lot more. But the improvement, so 100 times better in what way?

Starting point is 00:07:42 Like for the normal person, using it in a normal way, like asking it questions and finding out stuff? Let's say some combination of just breadth of knowledge and capability? Yeah, like I think in some of them they are, yeah. Right, but then also just combined with like sophistication of the answers, sophistication of the output, the quality of the output, you know, lack of hallucination, factual grounding.

Starting point is 00:08:05 Well, that I think is for sure going to get 100 times better. Like that, yeah, I mean, they're on a path for that. The things that are, so against that, right, the alignment problem where, okay, yeah, they're getting smarter, but they're not allowed to say what they know. And then that alignment also kind of makes them dumber in other ways. And so you do have that thing. The other kind of question that's come up lately, which is kind of do we need a breakthrough to go from what we have now, which I would categorize as artificial human intelligence

Starting point is 00:08:40 as opposed to artificial general intelligence, meaning it's kind of the artificial version of us. We've structured the world in a certain way using our language and our ideas and our stuff. And it's learned that very well, amazing. And it can do kind of a lot of the stuff that we can do, but are we then the asymptote, or do you need a breakthrough to get to some kind of higher intelligence or general intelligence? And I think if we're the asymptote, then in some ways it won't get 100 times better because it's already like pretty good relative to us. But yeah, like it'll know more things, it'll hallucinate less.

Starting point is 00:09:21 On all those dimensions, it'll be a hundred times better, I think. You know, there's this graph floating around. I forget exactly what the axes are, but it basically shows the improvement across the different models. To your point, it shows an asymptote against the current tests that people are using

Starting point is 00:09:33 that's sort of like at or slightly above human levels, which is what you would think if you're being trained on entirely human data. Now, the counter argument on that is, are the tests just too simple, right? It's a little bit like the question people have around the SAT, which is if you have a lot of people getting 800s,

Starting point is 00:09:46 you know, on both math and verbal on the SAT, is the scale too constrained? Do you need a test that can actually test for Einstein? Right, right, right. It's memorized the tests that we have. And it's great at them. Right. But you can imagine an SAT that really can detect gradations of people who have like ultra high IQs

Starting point is 00:10:01 who are ultra good at math or something. You can imagine tests for AI. You know, you can imagine tests that test for reasoning above human levels, one assumes. Yeah, well, maybe the AI needs to write the test. Yeah, and then there's a related question that comes up a lot. It's an argument we've been having internally, which is also where I'll start to take some sort of more provocative and probably more bullish, or as you would put it, sort of science fictiony predictions on some of this stuff. So there's this question that comes up, which is like, okay, you take

Starting point is 00:10:24 an LLM, you train it on the internet. What is the internet data? What is the internet data corpus? It's an average of everything, right? It's a representation of sort of human activity. Representation of human activity is going to kind of, you know, because of the sort of distribution of intelligence of the population, you know, most of it's somewhere in the middle. And so the data set on average sort of represents the average human. You're teaching it to be very average, yeah. Yeah, you're teaching it to be very average. It's just because most of the content created on the internet is created by average people. And so kind of the content on average, you know, as a whole on average is average, and so therefore the answers are average, right? You're going to get back an answer that sort of

Starting point is 00:10:54 represents the kind of thing that an average 100 IQ, you know, kind of by definition, the average human is 100 IQ. IQ is indexed to 100 at the center of the bell curve. And so by definition, you're kind of getting back the average. I'd actually argue like that may be the case for the default prompt today. Like, you just ask the thing, does the earth revolve around the sun or something? You get like the average answer to that and maybe that's fine. This gets to the point as well, okay, the average data might be of an average person, but the data set also contains all of the things written and thought by all the really smart people. All that stuff is in there, right?

Starting point is 00:11:21 And all the current people who are like that, their stuff is in there. And so then it's sort of like a prompting question, which is like how do you prompt it in order to get basically, in order to basically navigate to a different part of what they call the latent space to navigate to a different part of the dataset that basically is like the super genius part. And, you know, the way these things work is if you craft the prompt in a different way, it actually leads it down a different path inside the dataset, gives you a different kind of answer.

Starting point is 00:11:41 And here's another example of this. If you ask it, write code to do X, write code to sort a list or whatever, render an image, it will give you average code to do that. If you say, write me secure code to do that, it will actually write better code with fewer security holes, which is very interesting, right? Because it's accessing a different part of the training data, which is secure code. And if you ask, you know, write this image generation thing the way John Carmack would write it, you get a much better result because it's tapping into the part of the latent space

Starting point is 00:12:04 represented by John Carmack's code, who's the best graphics programmer in the world. And so you can imagine prompt crafting in many different domains such that you're kind of unlocking the latent super genius, even if that's not the default answer? Yeah, now, so I think that's correct. I think there's still a potential limit to its smartness in the... So we had this conversation in the firm the other day where you have... There's the world, which is very complex. And intelligence kind of is, you know, how well can you understand, describe, or represent the world?

Starting point is 00:12:36 But our current iteration of artificial intelligence consists of humans structuring the world and then feeding that structure that we've come up with into the AI. And so the AI kind of is good at predicting how humans have structured the world as opposed to how the world actually is, which is something probably more complicated, maybe irreducible or what have you. So do we just get to a limit where like it can be really smart, but its limit is going to be the smartest humans as opposed to smarter than the smartest humans? And then kind of related, is it going to be able to figure out brand new things, you know, new laws of physics and so forth? Now, of course, there are like one in three billion humans that can do that or whatever.

Starting point is 00:13:28 That's a very rare kind of intelligence. So it still makes the AIs extremely useful. But they play a different role if they're kind of artificial humans than if they're like artificial, you know, super duper mega humans. Yeah. So let me make the sort of extreme bull case for the hundred. Because, okay, so the cynic would say that Sam Altman would be saying they're going to get 100 times better precisely if they're not going to.

Starting point is 00:13:55 Yeah, yeah, yeah, yeah. Right? Because he'd be saying that basically in order to scare people into not competing. Well, I think that whether or not they are going to get 100 times better, Sam would be very likely to say that. Like, Sam, for those of you who don't know him, he's a very smart guy, but for sure he's a competitive genius. There's no question about that. So you have to take that into account. Right. So if they weren't going to get a lot better, he would say that.

Starting point is 00:14:20 But of course, if they were going to get a lot better, to your point, he would also say that. Yes. Why not? Right. And so let me make the bull case that they are going to get 100 times better or maybe even, you know, on an upper curve for a long time. And there's like enormous controversy, I think, on every one of the things I'm about to say. But you can find very smart people in the space who believe basically everything I'm about to say. So one is there is generalized learning happening inside the neural networks. And we know that because we now have introspection techniques where you can actually go inside and look inside the neural networks

Starting point is 00:14:47 to look at the neural circuitry that is being evolved as part of the training process. And these things are evolving general computation functions. There was a case recently where somebody trained one of these on a chess database, and just by training on lots of chess games it actually imputed a world model of a chess board inside the neural network and was able to do original moves. And so the neural network training process

Starting point is 00:15:06 does seem to work. And then specifically, not only that, but, you know, Meta and others recently have been talking about how so-called overtraining actually works, which is basically continuing to train the same model against the same data for longer, you know, putting more and more compute cycles against it. You know, I've talked to some very smart people in the field, including there, who basically think that actually that works quite well. The diminishing returns people were worried about, about more training. And they proved it in the new Llama release, right? That's the primary technique they use. Yeah, exactly. Like one guy in the space basically told me, it basically is like, yeah, we don't

Starting point is 00:15:35 necessarily need more data at this point to make these things better. We maybe just need more compute cycles. We just train it 100 times more and it may just get actually a lot better. So, on the labeling, it turns out that supervised learning ends up being a huge boost to these things. Yeah. So we've got that. We've got all of the kind of, you know, let's say rumors and reports of various kinds of self-improvement loops, you know, that are kind of underway. And most of the sort of super advanced practitioners in the field think that there's now some form of self-improvement loop that works, which basically is you basically get an AI to do what's called chain of thought. You get it to basically go step by step to solve a problem.

Starting point is 00:16:06 You get it to the point where it knows how to do that. And then you basically retrain AI on the answers. And so you're kind of basically doing a sort of a forklift upgrade across cycles of the reasoning capability. And so a lot of the experts think that sort of thing's starting to work now. And then there's still a raging debate about synthetic data, but there's quite a few people who are actually quite bullish on that. Yeah. And then there's even this tradeoff. There's this kind of dynamic where like LLMs might be okay at writing code, but they might be really good at validating code.
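As a rough illustration of the loop described here, the following is a toy Python sketch, not how any lab actually does it. It runs a stand-in "model" step by step on each problem and collects the reasoning traces that would go into the next round of training. The names `collect_reasoning_traces`, `toy_solver`, and `check_sum` are hypothetical, and the correctness filter is an added assumption: the conversation only mentions retraining on the answers.

```python
from typing import Callable, Dict, List, Tuple

Problem = Tuple[int, int]  # toy problem: add two integers


def toy_solver(problem: Problem) -> Tuple[List[str], int]:
    """Hypothetical stand-in for a model doing chain of thought.

    It writes out its steps, and it is deliberately wrong (off by one)
    whenever the true sum exceeds 10, to mimic an imperfect model.
    """
    a, b = problem
    answer = a + b
    if answer > 10:
        answer += 1  # simulated reasoning error
    steps = [f"start with {a}", f"add {b}", f"result is {answer}"]
    return steps, answer


def check_sum(problem: Problem, answer: int) -> bool:
    """Assumed verifier: here we can check the answer exactly."""
    return answer == problem[0] + problem[1]


def collect_reasoning_traces(
    problems: List[Problem],
    solve: Callable[[Problem], Tuple[List[str], int]],
    is_correct: Callable[[Problem, int], bool],
) -> List[Dict[str, object]]:
    """Keep step-by-step traces whose final answer passes the check.

    The kept traces are what a later training round would consume.
    """
    kept = []
    for problem in problems:
        steps, answer = solve(problem)
        if is_correct(problem, answer):
            kept.append({"problem": problem, "steps": steps, "answer": answer})
    return kept


traces = collect_reasoning_traces([(1, 2), (7, 8)], toy_solver, check_sum)
for trace in traces:
    print(trace["problem"], trace["answer"])
```

In this toy run only the (1, 2) trace survives the check. The actual retraining step is left out on purpose, since nothing in the conversation specifies how it is done.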

Starting point is 00:16:30 They might actually be better at validating code than they are at writing it. That would be a big help. Yeah, well, but that also means like AIs may be able to self-validate their own code. Yeah, yeah. They can validate their own code. And we have this anthropomorphic bias that's very deceptive with these things because you think of the model as an it. And so it's like, how could you have an it that's better at validating code

Starting point is 00:16:47 than writing code? But it's not an it. What it is is this giant latent space. It's this giant neural network. And the theory would be there are totally different parts of the neural network for writing code and validating code. And there's no consistency requirement whatsoever that the network would be equally good at both of those things.

Starting point is 00:17:00 And so if it's better at one of those things, right? So the thing that it's good at might be able to make the thing that it's bad at better and better. Right, right, right, right. Sure, sure. Right, sort of a self-improvement thing. And so then on top of that, there's all the other things coming, right, which is, there's all these practical things, which is there's an enormous chip constraint right now. So for every AI that anybody uses today, its capabilities are basically being gated by the availability of chips. But like that will resolve over time.

Starting point is 00:17:25 You know, there's also, to your point about data labeling, there is a lot of data in these things now, but there is a lot more data out in the world. And there's, you know, at least in theory, some of the leading AI companies are actually paying to generate new data. And by the way, even like the open source data sets are getting much better. And so there's a lot of like data improvements that are coming. And then, you know, there's just the amount of money pouring into the space to be able to underwrite all this. And then by the way, there's also just the systems engineering work that's happening, right, which is a lot of the current systems, you know, were basically built by scientists. And now the really world-class engineers are showing up and tuning them up and getting

Starting point is 00:17:51 them to work better. And, you know, maybe that's not a... Which makes training, by the way, way, way more efficient as well, not just inference, but also training. Yeah, exactly. And then even, you know, another improvement area is basically Microsoft released their Phi small language model yesterday. And apparently it's competitive. It's a very small model competitive with much larger models. And the big thing they say that they did was they basically optimized the training set. So they basically de-duplicated the training set.

Starting point is 00:18:16 They took out all the copies and they really optimized on a small amount of training data, on a small amount of high-quality training data, as opposed to the larger amounts of low-quality data that most people train on. You add all these up and you've got eight or ten different combinations of sort of practical and theoretical improvement vectors that are all in play. And it's hard for me to imagine that some combination of those doesn't lead to like really dramatic improvement from here. I definitely agree. I think that's for sure going to happen. Right. Like if you were, so back to Sam's proposition, I think if you were a startup and you were like, okay, in two years I can get as good as GPT-4, you shouldn't do that.
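To make "they took out all the copies" concrete, here is a minimal Python sketch of exact-duplicate removal over a list of documents. It is only an illustration under stated assumptions, not Microsoft's actual pipeline: the function names `normalize` and `deduplicate` are hypothetical, and treating documents as duplicates when they match after lowercasing and whitespace collapsing is a simplification chosen for the example.

```python
import hashlib
from typing import List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def deduplicate(documents: List[str]) -> List[str]:
    """Return documents with exact (post-normalization) copies removed.

    The first occurrence of each document is kept, in original order.
    """
    seen = set()
    unique = []
    for doc in documents:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


corpus = ["Hello  world", "hello world", "Something else"]
cleaned = deduplicate(corpus)
print(len(corpus), "->", len(cleaned))
```

This only catches exact copies. Near-duplicate detection and quality filtering, which the "high-quality training data" remark also implies, would need more than this sketch.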

Starting point is 00:18:49 Right. Right. That would be a bad mistake. Right. Well, this also goes to, you know, a lot of entrepreneurs are afraid of, well, I'll give you an example. So for a lot of entrepreneurs, here's this thing they're trying to figure out, which is okay, I really think I know how to build a SaaS app that harnesses an LLM to do really good marketing collateral. Let's just make a very, very simple thing.

Starting point is 00:19:05 And so I build a whole system for that. Will it just turn out to be that the big models in six months will be even better at making marketing collateral just from a simple prompt such that my apparently sophisticated system is just irrelevant because the big model just does it? Yeah. So let's just talk about that. Like apps, you know, another way you can think about it is the criticism of a lot of current AI app companies is they're quote-unquote, you know, GPT wrappers.

Starting point is 00:19:28 They're sort of thin layers of wrapper around the core model, which means the core model could commoditize them or displace them. But the counter argument, of course, is it's a little bit like calling all old software apps database wrappers, you know, wrappers around a database. It turns out like actually wrappers around a database is like most modern software, and a lot of that actually turned out to be really valuable. And it turns out there's a lot of things to build around the core engine. So, yeah, so, Ben, how do we think about that when we run into companies thinking about building apps?

Starting point is 00:19:51 Yeah, you know, it's a very tricky question because there's also this correctness gap, right? So, you know, why do we have co-pilots? Where are the pilots? Right? Where are the AI? There's no AI pilots. There are only AI co-pilots. There's a human in the loop on absolutely everything.

Starting point is 00:20:09 And that really kind of comes down to this, you know, you can't trust the AI to be correct in drawing a picture or writing a program or, you know, even like writing a court brief without making up citations. You know, all these things kind of require a human, and it kind of turns out to be like fairly dangerous to not. And then I think that so what's happening a lot with the application layer is people saying, well, to make it really useful, I need to turn this co-pilot into a pilot. And can I do that? And so that's an interesting and hard problem. And then there's a question of, is that better done at the model level or at some layer on top

Starting point is 00:20:54 that, you know, kind of teases the correct answer out of the model, you know, by doing things like using code validation or what have you, or is that just something that the models will be able to do? I think that's one open question. And then, you know, as you get into kind of domains and, you know, potentially wrappers on things, I think there's a different dimension than what the models are good at, which is what is the process flow, which is kind of, in the database world, where all this is. So on the database kind of analogy, there is like the part of the task in a law firm that's writing the brief, but there's 50 other tasks and things that have to be integrated into

Starting point is 00:21:36 the way a company works, like the process flow, the orchestration of it. And maybe there are, you know, on a lot of these things, like if you're doing video production, there's many tools, or music even, right? Like, okay, who's going to write the lyrics? Which AI will write the lyrics and which AI will figure out the music? And then, like, how does that all come together and how do we integrate it and so forth? And those things tend to just require a real understanding of the end customer and so forth in a way. And that's typically been how like applications have been different than platforms in the past. There's real knowledge about how the customer using it wants to function that doesn't have anything to do with the kind of intelligence, or is just different than what the platform is designed to do.

Starting point is 00:22:29 And to get that out of the platform for a kind of company or a person turns out to be really, really hard. And so those things, I think, are likely to work, you know, especially if the process is very complex. And it's something that's funny. As a firm, you know, we're a little more hardcore technology oriented. And we've always struggled with those, you know, in terms of, oh, this is like some process application for, like, plumbers to figure out this, and we're like, well, where's the technology? But, you know, a lot of it is how do you encode, you know, some level of domain expertise and kind of how things work in the actual world back into the software?

Starting point is 00:23:12 I often tell founders that you can think about this in terms of price. You can kind of work backwards from pricing a little bit, which is to say sort of business value and what you can charge for, which is, you know, the natural thing for any technologist to do is to kind of say, I have this new technological capability and I'm going to sell it to people, and like, what am I going to charge for it? It's going to be somewhere between, you know, my cost of providing it, and then, you know, whatever markup I think I can justify, you know,

Starting point is 00:23:33 and if I have a monopoly providing it, maybe the markup's infinite. But, you know, it's kind of this, it's a sort of technology forward, you know, kind of supply forward, you know, pricing model. There's a completely different pricing model for kind of business value backwards and or sort of, you know, so-called value pricing, value-based pricing. And that's, you know, to your point, that's basically a pricing model that says, okay, what's the business value to the customer of the thing?

Starting point is 00:23:55 And if the business value is, you know, a million dollars, then can I charge 10% of that and get $100,000, right, or whatever? And then, you know, why does it cost $100,000 as compared to $5,000? Well, because to the customer it's worth a million dollars. And so they'll pay 10% for it. Yeah, actually, so a great example of that, like we've got a company in our portfolio, Cresta, that does things like debt collection. Okay. So if I can collect way more debt with way fewer people with my, you know, it's a co-pilot type solution, then what's that worth? Well, it's worth a heck of a lot more than just buying an OpenAI license because an OpenAI license is not going to easily collect debts or kind of enable your debt collectors to be massively more efficient or that kind of thing.
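The two pricing models being contrasted can be written out as simple arithmetic. The Python sketch below is only an illustration: the function names are hypothetical, the $1,000,000 value and 10% share come from the conversation, and the $4,000 cost with a 25% markup is an assumed pair of numbers chosen to land on the $5,000 figure mentioned.

```python
def cost_plus_price(cost_to_provide: int, markup_percent: int) -> int:
    """Supply-forward pricing: what it costs the vendor, plus a markup."""
    return cost_to_provide + cost_to_provide * markup_percent // 100


def value_based_price(customer_value: int, share_percent: int) -> int:
    """Value-backward pricing: a share of what it is worth to the customer."""
    return customer_value * share_percent // 100


# Assumed vendor cost and markup, chosen to match the $5,000 comparison.
supply_forward = cost_plus_price(4_000, 25)

# Figures from the conversation: $1,000,000 of business value, 10% share.
value_backward = value_based_price(1_000_000, 10)

print(supply_forward, value_backward)
```

The gap between the two numbers is the point of the test described in the conversation: if customers will only pay something close to the cost-plus figure, the product is priced like a wrapper rather than like the business value it delivers.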

Starting point is 00:24:53 So it's bridging that gap between the value. And I think you had a really important point. The test for whether your idea is good is how much can you charge for it? Can you charge the value? Or are you just charging the amount of work it's going to take the customer to put their own wrapper on top of OpenAI? Like that's the real test to me of like how deep and how important is what you've done. Yeah. And so to your point on like the kinds of business,

Starting point is 00:25:22 you know, the kinds of businesses that technology investors have had a hard time with, you know, kind of thinking about, you know, maybe accurately, is sort of, it's the company that is, it's a vendor that has built something where it is a specific solution to a business problem, where it turns out the business problem is very valuable to the customer. And so therefore, they will pay a percentage of the value provided back in the form of the price for the software. And it actually turns out you can have businesses that are not

Starting point is 00:25:47 very technologically differentiated that are actually extremely lucrative. Yeah. And then because that business is so lucrative, they can actually afford to go think very deeply about how technology integrates into the business, what else they can do. This is like the story of a Salesforce.com, for example, right? And by the way, there's kind of a chance, a theory,

Starting point is 00:26:07 that the models are all getting really good. There are open source models that are awesome. You know, Llama, Mistral, like these are great models. And so the actual layer where the value is going to accrue is going to be like tools, orchestration, that kind of thing, because you can just plug in whatever the best model is at the time, whereas the models are going to be competing, you know, in a death battle with each other

Starting point is 00:26:34 and, you know, be commoditized down to the, you know, the cheapest one wins and that kind of thing. So, you know, you could argue that the best thing to do is to kind of connect the power to the people. Right. Right. So that actually takes us to the next question, and this is a two-in-one question. So Michael asks, and I'll say these are diametrically opposed, which is why I paired them. So Michael asks, why are VCs making huge investments in generative AI startups when it's clear these startups won't be profitable anytime soon, which is a loaded question, but we'll take it. And then Kaiser asks, if AI deflates the cost of building a startup, how will the structure of tech investment change? And, of course, Ben, this goes to exactly what you just said.

Starting point is 00:27:16 So it's basically the questions are diametrically opposed because if you squint out of your left eye, right, what you see is basically the amount of money being invested in the foundation model companies kind of going up and to the right at a furious pace. You know, these companies are raising hundreds of millions, billions, tens of billions of dollars. And it's just like, oh, my God, look at these sort of capital, you know, sort of, I don't know, infernos that hopefully will result in value at the end of the process. But my God, look at how much money is being invested in these things. If you squint through your right eye, you know, you think, wow, now all of a sudden it's like much easier to build software. It's much easier to have a software company. It's much easier to like have a small number of programmers writing complex software because they've got all these

Starting point is 00:27:50 AI co-pilots and all these automated software development capabilities that are coming online. And so on the other side, the cost of building an AI-like application startup might crash. And it might just be that like the Salesforce, the AI Salesforce.com might cost, you know, a 10th or 100th or a thousandth the amount of money that it took to build, you know, the old database-driven Salesforce.com. And so, yeah, so what do we think of that dichotomy, which is you can actually look out of either eye and see either cost to the moon, like for startup funding, or cost actually going to zero?

Starting point is 00:28:20 Yeah, well, like, so it is interesting. I mean, we actually have companies in both camps, right? Like I think probably the companies that have gotten to profitability, the fastest, maybe in the history of the firm have been AI companies. There have been, you know, AI companies in the portfolio where the revenue grows so fast that it actually kind of runs out ahead of the cost. And then there are, like, you know, people who are in the foundation model race who are raising hundreds of millions,

Starting point is 00:28:51 even billions of dollars to kind of keep pace and so forth. They also are kind of generating revenue at a fast rate. The headcount in all of them is small. So I would say, you know, where AI money goes, and even, you know, like if you look at OpenAI, which is the big spender in startup world, which, you know, we are also investors in, you know, headcount-wise, they're pretty

Starting point is 00:29:18 small against their revenue. Like, it is not a big company headcount. Like, if you look at the revenue level and how fast they've gotten there, it's pretty small. Now, the total expenses are ginormous, but they're going into the model creation. So it's an interesting thing. I mean, I'm not entirely sure how to think about it. But I think, like, if you're not building a foundation model, it will make you more efficient and probably get to profitability quicker. Right. So the counter, and this is a very bullish counterargument, but the counterargument to that would be basically that falling costs for like building new software companies

Starting point is 00:29:56 are a mirage. And the reason for that is this thing in economics called the Jevons Paradox, which I'm going to read from Wikipedia. So the Jevons Paradox occurs when technological progress increases the efficiency with which a resource is used, reducing the amount of that resource necessary for any one use. But the falling cost induces increases in demand, right, elasticity, enough that the resource use

Starting point is 00:30:18 overall is increased rather than reduced. Yeah, that's certainly possible. Right. And so this is, you see versions of this, for example, you build a new freeway, and it actually makes traffic jams worse, right? Because basically what happens is, oh, it's great, now there's more roads, now we can have more people live here, we can have more people that, you know, we can make these companies bigger, and now there's more traffic than ever, and now the traffic's even worse. Or you saw the classic example is during the Industrial Revolution, coal consumption. As the price of coal drops, people use so much more coal that

Starting point is 00:30:46 actually the overall consumption actually increased. People are getting a lot more power, but the result was the use of a lot more coal in the paradox. And so the paradox here would be, yes, the cost of developing any given piece of software falls, but the reaction to that is a massive surge of demand for software capabilities. And so the result of that actually is,

Starting point is 00:31:06 although it looks like, for starting software companies, the price is going to fall, actually what's going to happen is it's going to rise, for the high-quality reason that you're going to be able to do so much more. Yeah. Right? With software, the products are going to be so much better, and the roadmap is going to be so amazing of the things you can do, and the customers are going to be so happy with it that they're going to want more and more and more. Yeah. So the result of it, and by the way, another example of Jevons

Starting point is 00:31:26 paradox playing out in another related industry is in Hollywood. You know, CGI in theory should have reduced the price of making movies. In reality, it has increased it, because audience expectations went up. Yeah. And now you go to a Hollywood movie and it's wall-to-wall CG. And so, you know, movies are more expensive to make than ever. And so the result of it, you know, so but the result in Hollywood is at least much more, let's say, visually elaborate, you know, movies, whether they're better or not is another question, but like much more visually elaborate, compelling, kind of visually stunning movies through CGI. The version here would be much better software. Yeah. Like radically better software to the end user, which causes end users to want a lot more software,

Starting point is 00:31:59 which causes actually the price of development to rise. You know, if you just think about, like, a simple case like travel, like, okay, booking a trip through Expedia is like complicated. You're likely to get it wrong. You're clicking on menus and this and that and the other. Like, you know, da-da-da-da. An AI version of that would be like, you know, send me to Paris, put me in a hotel I'll love at the best price, you know, send me on the best possible kind of airline, an airline ticket. And then, you know, like, make it like really special for me. And like maybe you need a human to go, okay, like we're going to, you know, or maybe the AI gets more complicated and says, okay, well, we know the person loves chocolate, and we're going to like FedEx in the best chocolate in the world from Switzerland

Starting point is 00:32:46 into this hotel in Paris and this and that and the other. And so like the quality, you can, the quality could get to levels that we can't even imagine today just because, you know, the software tools aren't what they're going to be. So. Yeah, that's right.

Starting point is 00:33:02 Yeah, I kind of buy that actually. I think I bought into your argument. You're both. How about, yeah, or how about I'm going to land in whatever, Boston at 6 o'clock. I want to have dinner at seven with a table full of like super interesting people. Yeah, right, right, right, right. You know.

Starting point is 00:33:17 Yeah. Right? Yeah, yeah, yeah. Yeah. No travel agent would do that for you today, nor would you want them to. Yeah. No. No.

Starting point is 00:33:27 Right. Well, and then you think about it, it's got to be integrated into my personal AI and like, and this. Yeah, there's just like unlimited kind of ideas that you can do. And I think this is one of the kind of things that's always been underestimated about humans is like our ability to come up with new things we need. Like that has been unlimited. And there's a very kind of famous case where John Maynard Keynes, who was the kind of prominent

Starting point is 00:33:56 economist in the kind of first half of last century, had this thing that he predicted, which is like nobody, because of automation, nobody would ever work a 40-hour work week, you know, like because once their needs were met, needs being like shelter and food, and, you know, I don't even know if transportation was in there. Like, that was it. It was over and, like, you would never work past the need for shelter and food. Like, why would you? Like, there's no reason to.

Starting point is 00:34:25 But, of course, needs expanded. So then everybody needed a refrigerator. Everybody needed not just one car, but a car for everybody in the family. Everybody needed a television set. Everybody needed, like, glorious vacations. everybody, you know. So what are we going to need next? I'm quite sure that I can't imagine it, but like somebody's going to imagine it, and it's quickly going to become a need. Yeah, that's right. By the way, as Keynes famously said, it was his essay, I think, was economic

Starting point is 00:34:53 prospects for our grandchildren, which was basically that. Yeah. You know, what you just articulated. So Karl Marx had another version of that. I just pulled up the quote. So that society, when, you know, when the Marxist utopia, socialism is achieved, society regulates the general production, thus makes it possible for me to do, blah, blah, to hunt in the morning, fish in the afternoon, rear cattle in the evening,

Starting point is 00:35:15 criticize after dinner. What a glorious life. What a glorious life. Like, if I could just list four things that I do not want to do, it's hunt, fish, rear cattle, and criticize. Right? And by the way, it says a lot about Marx that those were his four things.

Starting point is 00:35:31 Well, criticizing being his favorite thing, I think it's basically communism in a nutshell. Yeah, exactly. I don't want to get too political, but yes. Yes, 100%. And so, yeah, so it's this, yeah, what they have, what Keynes and Marx had in common is just this incredibly constricted,

Starting point is 00:35:49 it's incredibly constricted view of what people want to do. And then correspondingly, you know, the other thing is just like, you know, people, people want to have a mission. I mean, probably some people just want to fish and hunt. Yeah. But, you know, a lot of people want to have a mission. They want to have a cause.

Starting point is 00:36:00 They want to have a purpose. They want to be useful. They want to be productive. It's actually a good thing in life, it turns out. It turns, it turns out in a startling turn of events. Okay, so yeah, so yeah, I think that I've long felt, you know, a little bit of the software eats the world thing a decade ago. I've always thought, I've always thought that basically demand for software is sort of perfectly

Starting point is 00:36:18 elastic, possibly to infinity. And the theory there basically is if you just continuously bring down the cost of software, you know, which has been happening over time, then basically demand, you know, basically is like basically perfectly correlates upward. And the reason is because, you know, kind of, as we've been discussing, but it's kind of there's always something else to do in software. There's always something else to automate. There's always something else to optimize.

Starting point is 00:36:37 There's always something else to improve. There's always something to make better. And, you know, in the moment with the constraints that you have today, you may not, you know, think of what that is. But the minute you don't have those constraints, you'll imagine what it is. I'll just give you an example. I mean, so I'll give you an example of playing out with AI right now, right? So there have been, you know, we have companies that do this.

Starting point is 00:36:54 You know, there have been companies that have made AI, you know, that have made software systems for doing security cameras forever, right? And it's like, for a long time, it was like a big deal to have software that would do like, you know, have different security camera feeds and store them on a DVR and be able to replay them and have an interface that lets you do that. Well, it's like, you know, AI security cameras all of a sudden can have like actual like semantic knowledge

Starting point is 00:37:13 of what's happening in the environment. And so they can say, you know, hey, that's Ben. And then they can say, oh, hey, you know, that's Ben, but he's carrying a gun. Yeah. Right? Right. And by the way, that's Ben and he's carrying a gun.

Starting point is 00:37:23 But that's because like he hunts on, you know, on Thursdays and Fridays as compared to that's Mary. And she never carries a gun and like, you know, like something is wrong. And she's really mad. Right. She's got a really steamed expression on her face, and we should probably be worried about it. Right. And so there's like an entirely new set of capabilities you can do, just as one example, for security systems that were never possible pre-AI. And a security system that actually has a semantic understanding of the world is obviously much more sophisticated than the one that doesn't and might actually be more expensive to make, right? Right. Well, and just imagine health care, right? Like you could wake up every morning and have a complete diagnostic, you know, like, how am I doing today?

Starting point is 00:38:02 like what are all my levels of everything. And, you know, how should I interpret them? You know, better than, you know, this is one thing where AI is really good is, you know, medical diagnosis because it's a super high dimensional problem. But if you can get access to, you know, your continuous glucose reading, you know, maybe sequence your blood now and again, this and that and the other, yeah, you've got an incredible kind of view of things. And who doesn't want to be healthier?

Starting point is 00:38:30 You know, like now we have a scale. That's basically what we do. You know, maybe check your heart rate or something, but like pretty primitive stuff compared to where we could go. Yeah, that's right. Okay, good. All right. So let's go to the next topic. So on the topic of data, so Major Tom asks, as these AI models allow for us to copy existing app functionality at minimal cost, proprietary data seems to be the most important moat. How do you think that will affect proprietary data value? What other moats do you think companies can focus on building in this new environment? And then Jeff Weishaupt asks, how should companies protect sensitive data, trade secrets, proprietary data, individual privacy in the brave new world of AI?

Starting point is 00:39:09 So let me start with a provocative, let me start with a provocative statement, Ben, see if you agree with it, which is, you know, you sort of hear a lot, this sort of statement or cliche is like data is the new oil. And so it's like, okay, you know, data is the key input to training AI, making all this stuff work. And so, you know, therefore, you know, data is basically the new resource. It's the limiting resource. It's the super valuable thing. And so, you know, whoever has the best data is going to win. and you see that directly in how you train AIs. And then you also have like a lot of companies, of course,

Starting point is 00:39:36 that are now trying to figure out what to do with AI. And a very common thing you'll hear from companies is, well, we have proprietary data, right? So I'm a, you know, I'm a hospital chain or I'm a, you know, whatever. Any kind of business, insurance company or whatever. And I've got all this proprietary data that I can apply, you know, that I'll be able to, you know, build things with my proprietary data with AI that won't just, you know, be something that anybody will be able to have.

Starting point is 00:39:56 Let me argue that basically, I mean, in almost every case like that, it's not true. It's basically what the Internet kids would call cope. It's simply not true. And the reason it's just not true is because the amount of data available on the Internet and just generally in the environment is just a million times greater. And so while it may not, you know, while it may not be true that I have your specific medical information, I have so much medical information off the Internet for so many people in so many different scenarios

Starting point is 00:40:25 that it just swamps the value of, quote, data, you know, just, it's just like overwhelming. And so your, your proprietary data as, you know, company X will be a little bit useful on the margin, but it's not actually going to move the needle. And it's not really going to be a barrier to entry in most cases. And then let me cite as proof for the, for my belief that this is mostly cope is there has never been, nor is there now, any sort of, basically any level of sort of rich or sophisticated marketplace for data, market for data. There's no, there's no, there's no large marketplace for data. And in fact, in fact, what there are is there are very small markets for data.

Starting point is 00:40:59 So there are these businesses called data brokers that will sell you, you know, large numbers of like, you know, information about users on the Internet or something. And they're just small businesses. Like, they're just not large. It just turns out, like, information on lots of people is just not very valuable. And so if the data actually had value, you know, it would have a market price and you would see it transacting.

Starting point is 00:41:16 And you actually very specifically don't see that, which is sort of a, you know, yeah, sort of quantitative proof that the data actually is not nearly as valuable as people think it is. Where I agree, so I agree that the data, like, just as, here's a bunch of data, and I can sell it without doing anything to the data, is, like, massively overrated. Like, I definitely agree with that. And, like, maybe I can imagine some exceptions, like some, you know, special population genomic databases or something that are, that were very hard to acquire, that are useful in some way. That's, you know, that's not just like living on the Internet or something like that, I could imagine.

Starting point is 00:41:59 that's super highly structured, very general purpose, and not widely available. But for most data in companies, it's not like that, in that it's either widely available or not general purpose. It's kind of specific. Having said that, right, like companies have made great use of data, for example, a company that you're familiar with, Meta, uses its data to kind of great ends itself, feeding it into its own AI systems, optimizing its products in incredible ways. And I think that, you know, us, Andreessen Horowitz, actually, you know, so we just raised $7.2 billion. And it's not a huge deal. But we took our data and we put it into an AI system. And our LPs were able, there's a million questions investors have about

Starting point is 00:42:50 everything we've done, our track record, every company we've invested in and so forth. And for any of those questions, they could just ask the AI. They could wake up at 3 o'clock in the morning and go, do I really want to trust these guys, and go in and ask the AI a question? And boom, they'd get an answer back instantly. They wouldn't have to wait for us and so forth. So we really kind of improved our investor relations product tremendously through use of our data. And I think that almost every company can improve its competitiveness through use of its own data.

Starting point is 00:43:22 But the idea that it's collected some data that it can go, like, sell, or that is oil or what have you, that's, yeah, that's probably not true, I would say. And, you know, it's kind of interesting because a lot of the data that you would think would be the most valuable would be like your own code base. Right, your software that you've written. So much of that lives in GitHub. Nobody is actually, I don't know of any company.

Starting point is 00:43:52 We work with, you know, whatever, a thousand software companies. And do we know any that's like building their own programming model on their own code? Like, or, and would that be a good idea? Probably not, just because there's so much code out there that the systems have been trained on. So like that's, you know, not so much of an advantage. So I think it's a very specific kind of data that would have value. Well, let's ask, let's make it actionable then. If I'm, if I'm running a big company, like if I'm running an insurance company or a bank or a hospital chain or something like that, like how, or you know, a consumer packaged goods company, Pepsi or something.

Starting point is 00:44:29 Like what, how should I validate? Like, how should I validate that I actually have a valuable proprietary data asset that I should really be focusing on using versus maybe, versus in the alternate, by the way, maybe there's other things, like maybe I should be taking all the effort I would spend on trying to optimize use of that data. And maybe I should use it entirely trying to build things using Internet data instead. Yeah. So, so I think, I mean, look, if you're, right, if you're in the insurance business,

Starting point is 00:44:53 then like all your actuarial data is both interesting, and I don't know that anybody publishes their actual actuarial data. And so, like, I'm not sure how you would train the model on stuff off of the internet. You know, similarly. That's a good. Let me, can I challenge that one? So that would be a good, good test case.

Starting point is 00:45:13 So I'm an insurance company. I've got records on 10 million people and, you know, the actuarial tables and when they get sick and when they die. Okay, that's great. But, like, there's lots and lots of actuarial, general actuarial data on the internet for large-scale population. because governments collect the data and they process it and they publish reports. And there's lots of academic studies.

Starting point is 00:45:32 And so, like, is your large data set giving you any additional actuarial information that the much larger data set on the Internet isn't already providing you? Like, are your insurance clients actually actuarially any different than just everybody? I think so because on intake on the, you know, when you get insurance, They give you like a blood test. They've got all these things. I know if you're a smoker and so forth. And in the, I think in the general data set, like, yeah, you know who dies, but you don't

Starting point is 00:46:02 know what the fuck they did coming in. And so what you really are looking for is like, okay, for this profile of person with these kinds of lab results, how long do they live? And that's where the value is. And I think that, you know, interesting, like, you know, I was thinking about like a company like Coinbase where, right, they have incredibly valuable assets in terms of money. They have to stop people from breaking in.

Starting point is 00:46:29 They've done a massive amount of work on that. They've seen all kinds of break-in types. I'm sure they have tons of data on that. It's probably, like, weirdly specific to people trying to break into crypto exchanges. And so, you know, like, I think it could be very useful for them. I don't think they could sell it to anybody. But, you know, I think every company's got data that, if, you know, fed into an intelligent system, would help their business.

Starting point is 00:46:57 And I think almost nobody has data that they could just go sell. And then there's this kind of in-between question, which is, what data would you want to let Microsoft or Google or OpenAI or anybody get their grubby little fingers on? And that I'm not sure. That I think is the question that enterprises are wrestling with more than, it's not so much should we go like sell our data, but should we train our own model just so we can maximize the value? Or should we feed it into the big model? And if we feed it into the big model, do all of our competitors now have the thing that we just did?

Starting point is 00:47:40 And, you know, or could we trust the big company to not do that to us? Which I kind of think the answer on trusting the big company not to F with your data is probably I wouldn't do that. If your competitiveness depends on that, you probably shouldn't do that. Well, there are at least reports that certain big companies are using all kinds of data that they shouldn't be using to train their models already. Yep. I think those reports are very likely true. Right.

Starting point is 00:48:10 Or they'd have open data, right? Like, we've talked about this before, but, you know, the same companies that are saying they're not stealing all the data from people or taking it in an unauthorized way refuse to, say, open their data. Like, why not tell us where your data came from? And in fact, they're trying to shut down all openness, no open source, no open weights, no open data, no, no open nothing. And go to the government and try and get them to do that.

Starting point is 00:48:36 You know, if you're not a thief, then why are you doing that? Right, right, right. What are you hiding? By the way, there's other twists and turns here. So, for example, in the insurance example, I kind of deliberately loaded it because you may know it's actually illegal to use genetic data for insurance purposes, right? So there's this thing called the GINA law, the Genetic Information Nondiscrimination Act of 2008.

Starting point is 00:48:57 And basically it bans health insurers in the U.S. from actually using genetic data for the purpose of doing health assessment, actuarial assessment of it, which, by the way, because now that genomics are getting really good, like that data probably actually is among the most accurate data you could have if you were actually trying to predict like when people are going to get sick and die. And they're literally not allowed to use it. Yeah. It is, I think that this is an interesting, like weird misapplication of good intentions

Starting point is 00:49:27 in a policy way that's probably going to kill more people than ever get saved by every kind of health, FDA, et cetera, policy that we have, which is, you know, in a world of AI, having access to data on all humans, why they get sick, what their genetics were, et cetera, et cetera, et cetera, is the most, that is, you know, you want to talk about data being the new oil? Like, that is the new oil. That's the health care oil is, you know, if you could match those up, then we'd never not know why we're sick. You know, you could make everybody much healthier, all these kinds of things. But, you know, to kind of stop the insurance company from kind of overcharging people who are more likely to die, we've kind of locked up all this data,

Starting point is 00:50:20 a kind of better idea would be to just go, okay, for the people who are likely to, like we subsidize healthcare, like massively for individuals anyway, just like differentially subsidize. And, you know, and then like you solve the problem and you don't lock up all the data. But, you know, it's typical of politics and policy. I mean, most of them are like that,

Starting point is 00:50:46 I think, yeah. Well, there's this interesting question with insurance. Like, basically, one of the questions people have asked about insurance is, like, if you had perfectly predictive information on, like, individual outcomes, does the whole concept of insurance actually still work, right? Because the whole theory of insurance is risk pooling, right? It's precisely the fact that you don't know what's going to happen in the specific case that means you build these statistical models and then you risk pool and then you have variable

Starting point is 00:51:09 payouts depending on exactly what happens. But if you literally knew what was going to happen in every case, because, for example, you have all of this predictive genomic data, then all of a sudden it wouldn't make sense to risk pool because you just say, well, no, this person is going to cost X, that person is going to cost Y. There's no. Health insurance already doesn't make sense in that way, right?

Starting point is 00:51:27 Like, in essence, the idea of insurance is kind of like the, it started with crop insurance where like, okay, you know, my crop fails. And so we all put money in a pool in case like my crop fails so that, you know, we can cover it. It's kind of designed to risk pool for a catastrophic, unlikely incident. Like, everybody's got to go to the doctor all the fucking time. And some people get sicker than others and that kind of thing.

Starting point is 00:51:56 But like, the way our health insurance works is like all medical gets, you know, paid for through this insurance system, which is this layer of loss and bureaucracy and giant companies and all this stuff, when like, if we're going to pay for people's health care, just pay for people's health care. Like, what are we doing? Right? Like, and if you want to disincent people from, like,

Starting point is 00:52:20 going for nonsense reasons, then just up the co-pay, like, it's, like, what are we doing? Just, well, and then from a justice standpoint, from a fairness standpoint, like, would it make sense for me, you know, would it make sense for me, to pay more for your health care if I knew that you were going to be more expensive than, like, you know, I'm directly, you know, if you, if everybody,

Starting point is 00:52:40 knows what future health care cost is per person. Yeah. There's a very good predictive model for it. You know, societal willingness to all pool in the way that we do today might really diminish. Yeah, yeah. Well, and then like you could also, if you knew, like there's things that you do genetically and maybe we give everybody a pass on that. It's like you can't control your genetics.

Starting point is 00:52:57 But then like there's things you do behaviorally that like dramatically increase your chance of getting sick. And so maybe, you know, we incentivize people to stay healthy instead of just like paying for them not to die. There's a lot of systemic fixes we could do to the health care system. It couldn't be designed in a more ridiculous way, I think. Well, it could be designed in a more ridiculous way. It's actually more ridiculous in some other countries, but it's pretty crazy here. Nathan Odie asks, what are the strongest common themes between the current state of AI and Web 1.0? So let me start there. Let me give you a theory, Ben, and see what you think. So I guess it's

Starting point is 00:53:37 question, you know, because of my role in, you know, Ben, you with me at Netscape. You know, we get this question a lot because of our role early on with the internet. And so there's an, you know, the internet boom was like a major, major event in technology. And it's still within a lot of, you know, people's memories. And so, you know, the sort of, you know, people like to reason from analogy. So it's like, okay, the AI boom must be like the internet boom. Starting an AI company must be like starting an internet company. And so, you know, what is this like?

Starting point is 00:54:00 And we actually got a bunch of questions like that, you know, that are kind of analogy questions like that. I actually think, you know, and then, you know, you and I were there for the internet boom. So we, you know, we lived through that and the bust and the boom and the bust. So I actually think that the analogy doesn't really work for the most part. It works in certain ways, but it doesn't really work for the most part. And the reason is because the Internet was a network, whereas AI is a computer. Yep. Okay, yeah.

Starting point is 00:54:27 So people understand what we're saying. It's more like the PC boom. Or the PC boom, even I would say the microprocessor, like my best analogy is to the microprocessor. Yeah. Or even to the like the original computers, like back to the mainframe era. And the reason is because, yeah, look, what the internet did was the internet, you know, obviously it was a network, but the network connected together many existing computers. And then, of course, people built many other new kinds of computers to connect with the

Starting point is 00:54:50 internet. But fundamentally, the internet was a network. And then, and that's important because most of the sort of industry dynamics, competitive dynamics, startup dynamics around the internet had to do with basically building, either building networks or building applications that run on top of networks. And this, you know, the internet generation of startups was very consumed by network effects and, you know, all these positive feedback loops that you get when you connect a lot of people together. And, you know, things like, you know, so-called Metcalfe's law, which is sort of the value of a network, you know, expands, you know, kind of the way it expands is as you add more people to it. And then, you know, there were all these fights, you know, these fights, you know,

Starting point is 00:55:23 all the social networks or whatever fighting to try to get network effects and try to steal each other's users because of the network effects. And so it's kind of, you know, it's dominated by network effects, which is what you expect from a network business. AI, like there are some network effects in AI that we can talk about, but it's more like a microprocessor. It's more like a chip. It's more like a computer. It's a system that basically, right, if data

Starting point is 00:55:46 comes in, data gets processed, data comes out, things happen, that's a computer. It's an information processing system. It's a computer. It's a new kind of computer. It's a, you know, we like to say the sort of computers up until now have been what are called Von Neumann machines, which is to say they're deterministic computers, which is they're like, you know,

Starting point is 00:56:02 hyper literal and they do exactly the same thing every time. And if they make a mistake, it's the programmer's fault, but they're very limited in their ability to interact with people and understand the world. We think of AI and large language models as a new kind of computer, a probabilistic computer, a neural network-based computer that, you know, by the way, is not very accurate and doesn't give you the same result every time and, in fact, might actually argue with you and tell you that it doesn't want to answer your question. Yeah, yeah, which makes it very different in nature than the old computers.

Starting point is 00:56:31 And it makes kind of composability, you know, the ability to build things, big things out of little things, more complex. Right. But the capabilities are new and different and valuable and important because it can understand language and images and, you know, do all these things that you see when you use it. All the problems we could never solve with deterministic computers we can now go after, right? Yeah, exactly.

Starting point is 00:56:56 And so I think, Ben, I think the analogy and I think the lessons learned are much more likely to be drawn from the early days of the computer industry or from the early days of the microprocessor than the early days of the internet. Does that sound right? I think so, yeah, I definitely think so. And that doesn't mean there's no boom and bust and all that because that's just the nature of technology. You know, people get too excited and then they get too depressed. So there will be some of that, I'm sure.

Starting point is 00:57:20 There will be over-build-outs, you know, potentially of, eventually of chips and power and that kind of thing. You know, we start with a shortage. But I agree. Like, I think networks are fundamentally different in the nature of how they evolved than computers, and the kind of just the adoption curve and all those kinds of things will be different. Yeah, so then this kind of goes to how I think the industry is going to unfold.

Starting point is 00:57:43 And so this is kind of my best theory for kind of what happens from here. It's kind of this, you know, this giant question of like, you know, is the industry going to be a few God models or, you know, a very large number of models of different sizes and so forth? So the computer, like famously, you know,

Starting point is 00:57:57 the original computers, like the original IBM mainframes, you know, the big computers, you know, they were very, very large and expensive and there were only a few of them. And the prevailing view, actually, for a long time was that's all there would ever be. And there was this famous statement by Thomas Watson Sr. who was the creator of IBM, you know, which was the dominant company for the first, like, you know, 50 years of the computer industry. And he said, he said, I didn't believe this is actually true. He said, I don't know, I don't know that the world will

Starting point is 00:58:22 ever need more than five computers. And I think the reason for that, it was literally, it was like the government's going to have two and then there's like three big insurance companies and then that's it. Who else would need to do all that math? Exactly. Yeah, who else would need to, who else needs to keep track of huge amounts of numbers? Who else needs that level of, you know, calculation capability?

Starting point is 00:58:41 It's just not a relevant, you know, it's just not a relevant concept. And by the way, they were, like, big and expensive. And so who else can afford them, right? And who else can afford all the headcount required to manage them and maintain them? I mean, and this is in the days, I mean, these things were big. These things were so big that you'd have an entire building that got built around a computer. Right?

Starting point is 00:58:56 And they'd have, like, they'd famously have all these guys in white lab coats, literally, taking care of the computer because everything had to be kept super clean or the computer would stop working. And so, you know, it was this thing where, you know, today we have the idea of an AI God model, which is like a big foundation model. Then we had the idea of like a God mainframe. Like there would just be a few of these things. And by the way, if you watch old science fiction, it almost always has this sort of conceit. It's like, okay, there's a big supercomputer and it either is like doing the right thing or doing the wrong thing. And if it's doing the wrong thing, you know, that's often the plot of the science fiction movies is you have to go in and try

Starting point is 00:59:28 to figure out how to fix it or defeat it. So it's sort of, it's sort of, you know, it's sort of this idea of like a single top-down thing. Of course, and that held for a long time. Like, that held for, you know, the first few decades. And then, you know, even when computers, computers started to get smaller. So then you had so-called minicomputers was the next phase. And so that was a computer that, you know, didn't cost $50,000. Instead, it cost, you know, $500,000. But even still, $500,000 is a lot of money. People aren't putting minicomputers in their homes. And so it's like mid-sized companies can buy minicomputers, but certainly people can't. And then, of course, with the PC, they shrunk down to like $2,500. And then with the smartphone, they

Starting point is 01:00:00 shrunk down to $500. And then, you know, sitting here today, obviously, you have computers of every shape, size, description, all the way down to, you know, computers that cost a penny. You know, you've got a computer in your thermostat that, you know, basically controls the temperature in the room, and it, you know, probably cost a penny, and it's probably some embedded ARM chip with firmware on it. And there's, you know, many billions of those all around the world. You buy a new car today. It has something, new cars today have something on the order of 200 computers in them, maybe more at this point. And so you just basically assume with the chip today, sitting here today, you just kind of assume that everything has a chip

Starting point is 01:00:30 in it. You assume that everything, by the way, draws electricity or has a battery because it needs to power the chip. And then increasingly you assume that everything's on the internet because basically all computers are assumed to be on the internet or they will be. And so as a consequence, what you have is the computer industry today is this massive pyramid. And you still have a small number of like these supercomputer clusters or these giant mainframes that are like the God model, you know, the God mainframes. And then you've got, you know, a larger number of minicomputers. You've got a larger number of PCs. You've got a much larger number of smartphones. And then you've got a giant number of embedded systems. And it turns out, like, the computer industry is all of those

Starting point is 01:01:03 things. And, you know, what, what is it, what, you know, what size of computer do you want is based on, well, what exactly are you trying to do and who are you and what do you need? And so if that analogy holds, it basically means, actually we are going to have AI models of every conceivable shape, size, description, capability, right, trained on lots of different kinds of data, running at very different kinds of scale, very different privacy, different policies, different, you know, have like enormous variability and variety, and it's going to be an entire ecosystem and not just a couple of companies. Yeah, let me see what you think of that. Well, I think that's right. And I also think that the other thing that's interesting about this era of computing, if you look

Starting point is 01:01:42 at prior eras of computing from the mainframe to the smartphone, a huge source of lock-in was basically the difficulty of using them. So, you know, nobody ever got fired for buying IBM because, like, you know, you had people trained on them. You know, people knew how to use the operating system. Like it was, you know, it was just kind of like a safe choice due to the massive complexity of like dealing with a computer. And then even with the smartphone, like the, you know, why is the Apple computer smartphone so dominant? You know, what makes it so powerful as well? Because like switching off of it is so expensive and complicated and so forth.

Starting point is 01:02:27 It's an interesting question with AI because AI is the easiest computer to use by far. It speaks English. It's like talking to a person. And so, like, what is the lock-in there? And so are you completely free to use the size, price, choice, speed that you need for your particular task? Or are you locked into the God model? And, you know, I think it's still a bit of an open question, but it's pretty interesting. And that thing could be very different than prior generations.

Starting point is 01:03:01 Yeah, yeah, that makes sense. And then just to complete the question, what would we say? So, Ben, what would you say are lessons learned from the Internet era that we lived through that would apply, that people should think about? I think a big one is probably just the boom-bust nature of it that, like, you know, the demand, the interest in the Internet, the recognition of what it could be was so high that money just kind of poured in in buckets. And, you know, and then the underlying thing, which in the Internet age was the telecom infrastructure and fiber and so forth, got just unlimited

Starting point is 01:03:39 funding and unlimited fiber was built out. And then eventually we had a fiber glut. And all the telecom companies went bankrupt. And that was great fun. But, you know, like, we ended in a good place. And I think that that's something like that's probably pretty likely to happen in AI, where, like, you know, every company is going to get funded. We don't need that many AI companies. So a lot of them are going to bust. There's going to be a huge, you know, huge investor losses. There will be an overbuild-out of chips for sure at some point.

Starting point is 01:04:11 And then, you know, we're going to have too many chips. And, you know, some chip companies will go bankrupt for sure. And then, you know, and I think probably the same thing with data centers and so forth. Like, we'll be behind, behind, behind, and then we'll overbuild at some point. So that'll all be very interesting. I think that, and that's kind of the, that's every new technology. So Carlota Perez has a great kind of, has done, you know, amazing work on this where, like, that is just the nature of a new technology is that you underbuild it, then you overbuild. And, you know, and there's a hype cycle that funds the build-out.

Starting point is 01:04:48 And a lot of money is lost, but we get the infrastructure and that's awesome because that's when it really gets adopted and changes the world. I want to say, you know, with the Internet, the other kind of big kind of thing is the Internet went through a couple of phases, right? Like it went through a very open phase, which was unbelievably great. It was probably one of the greatest boons to the economy. It, you know, it certainly created tremendous growth and power in America, both, you know, kind of economic power and soft, cultural power and these kinds of things. And then, you know, it became closed with the next generation architecture with, you know, kind of discovery on the internet being owned entirely by Google and, you know, kind of other things, you know, being owned by other companies.

Starting point is 01:05:34 And, you know, AI, I think, could go either way. So it could be very open or, like, you know, with kind of misguided regulation, we could actually force our way away from something that, you know, is open source, open weights, anybody can build it. We'll either have a plethora of this technology and, like, use all of American innovation to compete, or, you know, we'll cut it all off, we'll force it into the hands of the companies that kind of own the internet today. And, you know, and we'll put ourselves

Starting point is 01:06:08 at a huge disadvantage, I think, competitively against China in particular, but everybody in the world. So I think that's something that definitely, you know, that we're involved with trying to make sure it doesn't happen, but is a real possibility right now. Yeah. There's sort of an irony, which is that networks used to be all proprietary and then they opened up. Yeah, yeah, yeah, right. LAN Manager, AppleTalk, NetBEUI, NetBIOS. Yeah, exactly.

Starting point is 01:06:35 And so these are all the early proprietary networks from individual specific vendors. And then the Internet appeared and kind of TCP/IP and everything opened up. The AI is trying to go the other. I mean, the big companies are trying to take AI the other way. It started out as like open, just like basically just like the research. Everything was open source in AI. Right, right. And now they're trying to, they're trying to lock it down.

Starting point is 01:06:52 So it's a, it's a fairly nefarious turn of events. Yeah, very nefarious. You know, like, it's remarkable to me. I mean, it is kind of the darkest side of capitalism when a company is so greedy, they're willing to destroy the country and maybe the world to, like, just get a little extra profit. But, you know, and they do it. Like, the really kind of nasty thing is they claim, oh, it's for safety.

Starting point is 01:07:18 You know, we've created an alien that we can't control. But we're not going to stop working on it. We're going to keep building it as fast as we can and we're going to buy every freaking GPU on the planet. But we need the government to come in and stop it from being open. This is literally the current position of Google and Microsoft right now. It's crazy. And we're not going to secure it.

Starting point is 01:07:40 So we're going to make sure that Chinese spies can just steal our chip plans, take them out of the country, and we won't even realize for six months. Yeah, it has nothing to do with security. It only has to do with monopoly. Yes. The other, you know, just been going back on your point of speculation. So there's this critique that we hear a lot, right? Which is like, okay, you idiots.

Starting point is 01:07:57 Basically, it's like you idiots, you idiots, you idiots, entrepreneurs, investors. You idiots, it's like there's a speculative bubble with every new technology. Like, basically, like, when are you people going to learn to not do that? Yeah. And there's an old joke. There's an old joke that relates to this, which is the four most dangerous words in investing are "this time is different." The 12 most dangerous words in investing are "the four most dangerous words in investing are this time is different." Right?

Starting point is 01:08:18 Like, so like, does history repeat? Does it not repeat? My sense of it, and you referenced Carlota Perez's book, which I agree is good, although I don't think it works as well anymore, which we can talk about sometime, but is a good, at least background piece on this. It is just like, it's just incontrovertibly true. Basically, every significant technology advance in history was greeted by some kind of financial bubble, basically since financial markets have existed. And this, you know, by the way, this includes like everything from, you know, radio and television, the railroads, you know, lots and lots

Starting point is 01:08:46 of prior. By the way, there was actually a so-called, there was an electronics boom-bust in the 60s called the 'tronics. Every company had the name 'tronics. And so, you know, there was that. So, you know, there was like a laser boom-bust cycle. There were all these like boom-bust cycles. And so basically it's like any new technology that's what economists call a general purpose technology, which is to say something that can be used in lots of different ways. Like it inspires sort of a speculative mania. And, you know, and look, the critique is like, okay, why do you need to have this speculative mania? Why do you need to have the cycle? Because like, you know, people, some people invest in the things, they lose a lot of money. And then there's this bust cycle that causes everybody to get depressed, maybe delays the rollout. And it's like two things. Number one is like, well, you just don't know.

Starting point is 01:09:26 Like if it's a general purpose technology like AI is and it's potentially useful in many ways, like nobody actually knows up front, like what the successful use cases are going to be or what successful companies are going to be. Like you actually have to, you have to learn by doing. You're going to have some misses.

Starting point is 01:09:39 That's venture capital. Yeah. We, yeah, exactly. Yeah, exactly. So, yeah, the true venture capital model kind of wires this in, right? We basically, in core venture capital,

Starting point is 01:09:49 the kind that we do, we sort of assume that half the companies fail, half the projects fail. And, you know, if any of us, if we or any of our... Failed completely, like lose money. Yeah. Like lose money. Exactly. Yeah. And so like, and of, and of course, if we or any of our competitors, you know, could figure out how to do the 50% that work without doing the 50% that don't work, we would do that. But, you know, here we sit 60 years into the field and, like, nobody's figured that out. So there is, there is that, that unpredictability to it. And then the other, the other kind of interesting way to think about this is like, okay,

Starting point is 01:10:16 what would it mean to have a society in which a new technology did not inspire speculation? And it would mean having a society that basically is just like inherently like super pessimistic about both the prospects of the new technology but also the prospects of entrepreneurship and, you know, people inventing new things and doing new things. And of course, there are many societies like that on planet Earth, you know, that just like fundamentally like don't have the spirit of invention and adventure that, you know, that a place like Silicon Valley does. And, you know, are they better off or worse off? And, you know, generally speaking, they're worse off. They're just, you know, less future oriented, less, less, less focused on building things,

Starting point is 01:10:50 less focused on figuring out how to get growth. And so I think there's a, at least my sense, there's a comes-with-the-territory thing. Like we, we would all prefer to avoid the downside of a speculative boom-bust cycle, but like it seems to come with the territory every single time. And I, at least I have not, nobody I'm aware, no society I'm aware of has ever figured out how to capture the good without also having the bad. Yeah. And like, why would you? I mean, it's kind of like, you know, the whole Western United States was built off the gold rush, and like every kind of treatment in like popular culture of the gold rush kind of focuses on the people who didn't make any money. But there were people who made a lot of money and found gold.

Starting point is 01:11:30 And, you know, in the internet bubble, which, you know, was completely ridiculed by, you know, kind of every movie, if you go back and watch any movie between like 2001 and 2004, they're all like how only morons did a dot-com and this and that and the other, and there were all these funny documentaries and so forth. But like, that's when Amazon got started. You know, that's when eBay got started. That's when Google got started. You know, these companies, you know,

Starting point is 01:12:02 were started in the bubble, in the kind of time of this great speculation. There was gold in those companies. And if you hit any one of those, like you funded, you know, probably the next set of companies, you know, which included things like, you know, you know, Facebook and X and, you know, Snap and all these things. And so, yeah, I mean, like, that's just the nature of it.

Starting point is 01:12:24 I mean, like, that's what makes it exciting. And, you know, it's just a, it's an amazing kind of thing that, you know, look, the transfer of money from people who have excess money to people who are trying to do new things and make the world a better place is the greatest thing in the world. And if we, some of the people with excess money, lose some of that excess money in trying to make the world a better place, like, why are you mad about that? Like, that's the thing that I could never have seen. Like, why would you be mad at, you know, young, ambitious people trying to improve the world getting funded and some of that being misguided? Like, why is that bad?

Starting point is 01:13:08 Right, right. As compared to, yeah, especially as compared to everything else in the world and the people who are not trying to do that. So you'd rather like, we just buy like, you know, lots of mansions and boats and jets. Right. Like, what do you talk? Right. Right.

Starting point is 01:13:22 Exactly. Donate money to ruinous... Yeah, ruinous causes. Right. Such as ones that are on the news right now. Okay. So, all right. We're at an hour 20.

Starting point is 01:13:32 We made it all the way through four questions. We're doing good. We're doing great. So let's call it here. Thank you, everybody, for joining us. And I believe we should do a part two of this, if not parts three through six, because we have a lot more questions to go, but thanks everybody for joining us today.

Starting point is 01:13:45 All right. Thank you.
