No Priors: Artificial Intelligence | Technology | Startups - Listener Q&A: 2024 Tech Market Predictions, Long Term Implications of Today’s GPU Crunch, and Will AI Agents Bring Us Happiness?
Episode Date: August 10, 2023
This week on the podcast, Sarah Guo and Elad Gil answer listener questions on the state of technology and artificial intelligence. Sarah and Elad also talk about the 2024 tech market, what type of companies may reach their highest valuation ever and the (former) unicorns that may go bust. Plus, how do Sarah and Elad define happiness? Hint: it’s a use case for a specialized AI agent. Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil
Show Links: Cerebras Systems signs $100 million AI supercomputer deal with UAE's G42 | Reuters Our World in Data
Show Notes: [0:00:37] - Impact of GPU Bottleneck in the near and long term [0:10:30] - Timeline for existing incumbent enterprises to use AI in products [0:11:50] - Vertical versus broad applications for AI Agents [0:19:33] - 2024 tech market predictions & how founders should think about valuations
Transcript
Hey, everyone. Welcome to No Priors. I'm Sarah Guo.
I'm Elad Gil.
This week on No Priors, we're back with another episode where we answer your questions about tech, AI, and everything in between.
I think we have a lot of different questions that people brought up this week that they were hoping we could cover, and some topics that we thought would be kind of interesting.
I want to go to one of our listener questions, and I think a topic that's really popular with many of the companies that you and I work with,
in terms of access to compute for much smaller-scale experiments.
What, uh, what's going on with the GPU crunch?
Yeah, the companies that you and I work with, many of them are companies that, you know,
they need to use very specific infrastructure to train and serve large models, right?
Um, these work on GPUs.
And the structure of the industry is like, um, it's just not very robust, right?
So you have a very small number of producers,
Nvidia and AMD generally.
And then Nvidia is very far ahead on the high-end processors
that are most efficient for large-scale training and inference.
Then you have the pandemic supply disruption,
which we haven't fully recovered from.
If you actually look at the supply chain,
you go from the actual designers to the reliance on a few major foundries
like TSMC.
you know, expansion of this capacity is not easy, right?
New fabs are billions of dollars.
Yield is a very complicated thing.
You can think of it as a massive precision manufacturing problem where temperature, pressure,
chemical concentration, tool imperfections, new processes, materials issues, like anything can
make production have lower yield or lower quality, right?
And so, like, if you think about the speed with which the industry,
driven by both large and small players, has decided that they want to do AI, like the physical
processes cannot keep up with that demand.
It's as if, you know, half the companies in the world over a year-long period decided,
like, yeah, we need supercomputers, not superconductors, but gigantic networked GPUs.
So, like, what is the actual gap?
So to your point, it sounds like much of the AI world is
dependent on GPUs in order to train and then do inference on these big AI models.
And the big suppliers are basically Nvidia, AMD, and then there's like a long tail of
smaller folks. What is the delta between the amount of capacity that exists today and that's needed?
Are we off by 2x, 10x, some other number?
It's hard to say because right now, there's no way to explore like the price elasticity of these
things, right? So, you know, just very specifically, like, the industry is kind of looking at
deliveries in small quantity in September, larger quantities in December, January. Most of the
large cloud providers are sold out for any scale for at least through April of next year. And so
you have, like, really interesting dynamics, like large cloud players who, you know, are the biggest
consumers of these GPUs already, like a Microsoft going and buying from other providers
for near-term supply, right? So I think one question that I ask you is like, hey, do you think
this is a long-term thing? Do you think it's a very short-term thing? But I think it just goes back to
like the fundamental dynamics are, do you expect the demand for these chips to continue
increasing at a pace that outpaces the ability to scale a very physical, like, real-world
process, right? Just to even be more specific, one of the challenges, like I was talking to Jensen
about this, and a bonder, like not part of the GPU itself, but like a critical tool in the
manufacturing and assembly of GPUs is very specialized. And so the ability to build any of these
tools as well to enable these processes is a blocker. If you look at the demand,
and from large labs today to continue increasing model scale and training time by
magnitudes, I think it's hard to see that dynamic going away. What do you think?
I feel like there's a couple different sort of second order implications of the fact that
we're seeing this giant GPU bottleneck. I think the first one is that we're seeing new
sorts of models that are dependent on GPU access or ownership as ways to create all sorts of
really interesting monetization and potentially, eventually, cloud services.
So that's things like CoreWeave or FoundryML or other companies that are basically
providing now GPUs in different ways, in some cases through aggregation or federating
different sources of GPUs.
In some cases, it's just having these large GPU clouds and being able to use them in really
interesting ways.
And one of the interesting, I think, side notes is that GPUs used to be very heavily used
for crypto mining.
And while crypto is down, it may actually be more economical
to just rent them out for AI training purposes
or inference purposes.
So I think that's one really interesting
almost like sectoral shift
in terms of existing GPU capacity.
The second is that a lot of the different players
that are startups who've built their own semiconductors
specifically for AI training,
I think are starting to see a lot of really strong pull.
So for example, Cerebras,
and I think we're going to have Andrew from Cerebras
on our podcast in a couple weeks,
they just signed a $100 million deal
with UAE's G42 for building
nine supercomputers using their chips, which are optimized for AI.
Amazing.
And so I think they and Groq and other sort of semiconductor providers are going to find
really strong pull during this period where people are desperate for any solution
and they're willing to take the extra steps to really be able to utilize other forms of silicon.
And so I think it creates a bit of an opening for other players in the market.
And so it does seem like it's going to have these really interesting sort of cascading effects
on members of the startup ecosystem and, you know,
new players that are working against all this.
Two sort of second order things are like, what do you do when scaling is blocked on capacity?
Like, you try to be more efficient.
It's not been an area of massive focus to date because people have been chasing the state
of the art following Chinchilla scaling as the simplest path forward.
But there are really interesting lines of research that are undervalued today, unless
the hardware supply crunch continues, including
dynamically figuring out or routing to efficient models.
So think of the FrugalGPT work, or generally distillation, or even just a more intelligent choice of data for your pre-training or your fine-tuning training mix so you can use less compute, right, for the same or for improved quality.
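The cascade idea behind that FrugalGPT-style routing can be sketched roughly as follows. This is a toy illustration, not the paper's actual implementation: the two model functions and the confidence scorer below are hard-coded stand-ins for real LLM calls.

```python
# Rough sketch of cascade routing: try a cheap model first and escalate
# to an expensive one only when a scorer distrusts the cheap answer.
# cheap_model, expensive_model, and confidence are placeholder stubs.

def cheap_model(prompt: str) -> str:
    return "draft answer to: " + prompt

def expensive_model(prompt: str) -> str:
    return "careful answer to: " + prompt

def confidence(answer: str) -> float:
    # Placeholder heuristic; a real router might use a small trained
    # scorer or the model's own token probabilities.
    return 0.9 if len(answer) > 20 else 0.3

def route(prompt: str, threshold: float = 0.8) -> str:
    answer = cheap_model(prompt)
    if confidence(answer) >= threshold:
        return answer               # cheap answer was good enough
    return expensive_model(prompt)  # escalate only when needed
```

The same skeleton extends to a chain of several models ordered by cost, which is closer to what the FrugalGPT work evaluates; the point is that most queries never touch the most expensive model.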
And I think, like, everybody's been on this one path, and an interesting second order effect is, does it spread people out in a lot
of different directions in terms of chasing performance?
I personally don't think the supply crunch goes away immediately.
And like a part of the dynamic is just, you know, how much more people want to scale.
And another part is like, you know, if this stuff is actually useful, then inference, like, inference already dominates OpenAI's compute usage, right?
And so that demand will continue to go up.
Yeah, I do think demand will only rocket.
from here, at least in the short run.
And so the real question is the degree to which the semiconductor industry adjusts to that.
And the reality is that people really view Nvidia's chips as the most advanced on the market
right now.
And so that means that a lot of it is just a bottleneck and how much can Nvidia scale up
manufacturing.
And there's other players like AMD, there's the startups we mentioned, Cerebras, Groq, and
others.
But a lot of the capacity is just going to be how much can Nvidia and maybe AMD scale up in the
short run at least. And so that may just cause some ongoing bottlenecks, assuming again, that we continue
to see this very rapid growth in AI and AI applications. I'm working on a blog post right now actually
about this because it feels to me that we're still in the very, very early innings of this wave
of AI adoption, right? It's not a continuum where we had CNNs and RNNs and now suddenly we have
Transformers. Transformers created a whole new capability set. And we're only, you know, eight months
since ChatGPT and, I think, five months since GPT-4.
And so the only people who've really adopted this technology yet are the AI-native companies
like OpenAI and Midjourney and a few other folks.
And then we had the first wave of startups come, the Perplexities and Harveys and Characters
of the world, as well as the first wave of incumbents adopting it, Notion and Zapier and sort
of the very, very early founder-driven adopters.
And so we've had zero real enterprise adoption in terms of real products at
scale, or close to zero. And, you know, most enterprises, big businesses, take six months,
nine months to go through their planning cycles. And then they'll spend a year prototyping. And then
finally, they'll launch these AI apps. And so we're probably a year or two years before we
really start to see large scale AI applications by existing incumbent enterprises in real
products live everywhere. So from a ramp perspective, one can imagine that a lot of the future
ramp in AI is coming in about one to two years
or something like that. So there's still a lot of room, I think, for the hype cycle, for
increasing ongoing excitement, sometimes irrationally so, and then also for sort of adoption
of semiconductors and other underlying infrastructure. So there's still a lot to come, it feels like.
I agree with you. And I still think we're really early in, let's say, like the collective
exploration of applications and constraints, right? Like, you had the people who were on the
bleeding edge out of just personal interest. I think ChatGPT is looked at correctly as the
starting gun for people to begin developing these AI applications generally. But if you think about
how long it takes to ship actual interesting products to market, and then the buildup of some
collective understanding of, like, how to make these models more useful in different applications,
and then, you know, turn them into workflows
and then advance the state of the art
given a particular workflow
if you have a hypothesis on value.
Like, that all takes time.
So I think we're in inning one.
Yeah, it's all been demos so far, yeah.
So I guess related to that,
a lot of the interest and excitement right now
is around agents.
You know, I spoke recently,
there's a group called the AGI House,
which, you know, hosts these different hackathons
in the Bay Area and stuff like that.
And they had me come and help kick off
like an agent hackathon they had.
and things like that.
What do you think happens
with the agent world?
Like what form does that take
and is it a handful
of very broad agents?
Is it highly specialized ones?
Like, what do you think is coming there?
Yeah.
It's such a like powerful broad idea
that I think both will happen.
Right.
And so like the overall idea is
you don't just talk to a chatbot
or a query interface.
You have some sort of planning mechanism
that is model-driven, that allows it to take actions autonomously
and complete a more sophisticated task, often using other tools, and then return that result
or report back on its work to an end user, right?
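That plan-act-report loop can be sketched in a few lines. In this toy version the planner and the tools are hard-coded stand-ins; in a real agent, a model would produce the decomposition and choose the tools.

```python
# Toy illustration of an agent loop: a planner decomposes the task into
# steps, each step invokes a tool, and the final result is reported back.
# plan() and the TOOLS entries are stand-ins for model-driven components.

def plan(task: str) -> list:
    # A real agent would ask a model for this decomposition.
    return ["search", "summarize"]

TOOLS = {
    "search": lambda task: f"notes gathered for: {task}",
    "summarize": lambda notes: f"summary of ({notes})",
}

def run_agent(task: str) -> str:
    result = task
    for step in plan(task):
        result = TOOLS[step](result)  # act: call the tool for this step
    return result                     # report back to the end user
```

A production agent would loop until the model signals the task is complete and would handle tool failures and replanning; this only shows the basic shape of the loop.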
And so, you know, I think that is going to range from the pure consumer applications.
So things like Inflection, which is going to, you know, have personal AIs that do
more for you. MultiOn, which is working on, like, web agents. And then, you know, I think,
like, there's been very recently more attention or just more understanding of how powerful it is
to have agents that in some way write executable code, right? Because you can programmatically
use many more tools. You can call APIs. And I think if you want to do a task that is not a single
query, but requires multiple steps in analytics or in enterprise automation, or even within
companies that we work with, like Harvey, like a single legal task is actually a composition
of thoughts, planning, attempts at research, writing that an associate might do. And so I think
it's going to be a pretty dominant paradigm. Yeah, it's kind of interesting because if you look at past
technology waves and you ask about specialization versus sort of broadness, you know, are you
building a broad-based platform that you can use for anything or a vertical application that really
helps you with one or two things well? Most of the things that really work are these vertical
applications that help you really well. Now, some of them broaden and grow into the broad-based
platform for everything, right? Even in consumer, that's true. Like, Facebook started off as a college
network. And in fact, it started with like five colleges, and then they added all colleges, and then
later they added the ability to add your work email as a way to register, and then they opened
it up to everybody, and then they started building the platforms on top of it in gaming and other
things, right? But it kind of happened sequentially. And there's kind of counterexamples to that,
you know, Google was a very broad-based thing from day one. It helped you discover information
on the web, right? You needed a tool for that. But it feels like in the agent world, a lot of the
people that I hear talking about ideas have these very broad, sort of abstract ideas. And so an idea
would be, I'm going to build an agent that is going to be your assistant.
And you're like, okay, well, what is it going to help me with?
And they say, everything. It's going to make you happy.
And you say, well, I'd love to be happy. But at the same time, you know, starting with a very
targeted, focused initial use case tends to be the best way to build product, A, because you
know who you're building it for, B, you can really nail the use case. And there's the old sort
of YCism, which I think, which is really good, which is it's better to delight a small number
of people than to have a very large number of people
indifferent to your product.
And so I think my bias
for the agent world is
if you're building an agent, start with something really
targeted. If it's an assistant
to help you, what exactly does the assistant do?
Does it do background information searches on all
the meetings you have that day?
Does it specifically help with certain forms of scheduling?
Does it help with other aspects of your day planning
or synthesis of what you've done or follow-up action
items or whatever it may be, but choose one or two
things and do them very well.
versus do everything.
And then eventually you may build the thing that you start off that does one thing very well,
but then broadens into everything.
But usually starting with everything means you're not really doing anything deeply or well.
And so I think that's, that to me is one of the main patterns, at least,
in terms of prior ways of technology development.
I very much feel like this is like a very classic tension between what I consider to be,
like, I don't know, the like infrastructure platform engineering, like even research
agenda driven approach that is like, oh, you don't understand. Like, the technology is general.
We don't want to be taken off the research path that pollutes our data mix in a way that it is
not a general purpose technology anymore, right? Or, you know, it can do anything, why
limit it? Or even getting feedback from users, because when you release this stuff, it is broadly capable
and they're doing everything with it. Some things much more successfully than others. And
I think more of a like a product engineering, like traditional like startup mindset that is like
actually complete the task, right? And I definitely think the overall exploration has been
skewed to one side, not as productively, today. And even if you think
from the research agenda, one of the reasons it is interesting to have more focus on
accomplishing the specific task is like, you want to be happy, Elad. All I want to do is never write boilerplate code again, right? That's how I define happiness.
Okay, great. Then we're still the same. But if you think about, okay, let's complete one task. If I ask, you know, an agent to just fix all the bugs in my software, then my ability to successfully
complete that task includes a lot of like bug fixing specific techniques, right? Like you could do
test time search and then see if all of the different things that you generated actually
execute, as one very simplistic example, right? And so I think there are a lot of ways to
advance in the research in very specific tasks that are much more tractable. But maybe I'm not
thinking big enough. No, that makes sense. I think I would add one third piece to that framework
you have, which is the research-driven versus product-driven, I think there's a third approach,
which is infrastructure or tooling-driven. And that's why you're like, I'm not going to build
the agents, but I'm going to build the infrastructure that allows anybody else to build them
rapidly. Now, sometimes those types of businesses or approaches work really well, and sometimes
those things are solely an outgrowth of a vertical product that works really well, that you then
open up the infrastructure for everybody else to use. And it's very case-by-case dependent.
It's the difference between Stripe, where it's just like, we need to build payments for
everybody, everybody keeps building it over and over again, and the Facebook
auth platform, which only existed because you got to hundreds of millions of users, and you
could open up auth as like a third-party service. And so I think as people think through that
third angle of building infrastructure for others, they need to understand whether that
infrastructure will be an outgrowth of an existing product area and benefit from the
characteristics of the market liquidity of that product, or whether it's just a piece of
infrastructure, everybody keeps building over and over, and therefore it's a really good thing
to just provide to the world. So I think it's kind of an interesting future topic.
We are on a, you know, a couple month bull run at this point.
2024 tech markets.
What's coming?
Like, will people be able to fundraise?
Will funds be able to fundraise?
Are customers purchasing?
You know, I think there's going to be basically four markets next year in some sense.
One market is just AI, and I think AI will continue to run in different ways.
And it'll look very expensive at the time.
And a handful of companies will look really cheap in hindsight, just like with every other
technology wave.
And I think that's separable from the rest of tech that existed prior to the AI wave.
For companies that fundraised in 2021, prior to being like AI companies, a subset of them,
I think if I were to sort of divvy up that pie of those companies, sort of mid to late stage private tech companies, not in AI,
and what's going to happen to them next year and in 2025, I think a third of them,
or a third of unicorns, I should say,
are just going to go under, be fire sales, whatever.
they won't be able to ever raise money again.
A third will be at the highest valuation
they'll ever be at in the lifetime of the company.
They'll reach their terminal value.
And there's examples from 2014 of companies
that went through that same wave.
They raised in 2014, they went public a few years later,
and then they never surpassed their market cap again.
And then I think lastly,
there'll be a third of companies that grow past it.
And so I do think there's going to be a lot of carnage next year
and a lot of companies going under.
And as those companies go under,
three things will happen.
one, it'll be much easier to hire people.
People are already seeing that at startups.
It's easier to hire again.
Second, it should have follow-on effects and ramifications for commercial real estate.
And we'll see a second shoe drop there.
And then third, the venture capital community will be impacted because a lot of the
things that they've been using to fundraise new funds or do other things with will
suddenly go to zero.
Their big unicorn success will go from a multi-billion dollar or a billion dollar company
to basically a company that isn't worth anything.
And so I think that's going to have knock-on effects to the venture ecosystem.
But I think that'll take like two, three years to play out because all these things
are a bit time delayed.
But yeah, I think the other shoe still hasn't dropped in private tech markets.
And a lot of it is just companies raise so much money in 2021.
They still have lots of money.
So everything still feels like it's continuing to go.
But at some point that money is going to run out.
So I think it's going to be a pretty bumpy 2024 and 2025, potentially.
Yeah. My advice to companies that, you know, raised at very healthy valuations during that period of time, and that are actually building businesses, is to try to completely disassociate from that valuation. Because people will put themselves into all sorts of contortions to do a flat or up round to a valuation that makes no sense, right? And if you don't have the historical context of that making no sense, it's an extremely
painful sort of realization to have. But if you look at, there's this one analysis of actually
the very best technology companies and the ones that endured from the internet bubble and how
long it took those companies to reach the valuations they were at before the bubble burst.
And it's a decade, right? And it's like startups don't have a decade to try to, you know,
get to at-par valuations.
Yeah. I'm actually less worried about valuation. I think valuation is ephemeral, right?
Effectively every or roughly every tech company in public markets did a downround over the last year and a half, right?
They all lost, or many, many companies lost 30 to 90% of their value, right?
And effectively, they just did a downround in public markets because every day you're repricing a public stock.
I'm more worried about the people who burn tons of cash and they don't have a lot of revenue to show for it.
And then when they're going to go out to raise more money, people say, well, you burned $50 million.
You burned $100 million to generate five or ten million dollars of revenue. And so the issue isn't
that your valuation is off. We can always reset valuation. It's the fact that you burned all this
money and you don't have anything much to show for it. And that's where I think the real issues
will happen. Because you can always reprice things and people will be forced to and, you know,
it'll just happen. But I think it's the underlying business case and business model that's
going to be the real issue. Yeah. I guess the unforced error there, for companies who
actually have the time to make the decision, the thing you want to avoid is not adjusting
your cost profile or, you know, holding onto that valuation until it's too late. Yeah. Or just deciding
it's the wrong business and it's not working. And, you know, the most important, precious thing
for you as a founder is your time. And I think people forget that. You have this golden period in
your life where you don't have hopefully a lot of other complications in terms of sick family members
or school-related issues or whatever it is.
And you can take risk and you have a low-cost basis
and you can do all these things.
And that's the moment when you can best take risk
to start a company for many people, not for all.
And you're really giving up the best years of your life
working on things that potentially may not work.
Thanks for the discussion.
It's a lot of fun.
Yeah, super fun.
Thanks to everyone who sent us your questions.
Find us on Twitter at NoPriorsPod.
Subscribe to our YouTube channel
if you want to see our faces, follow the show on Apple Podcasts, Spotify, or wherever you listen.
That way you get a new episode every week.
And sign up for emails or find transcripts for every episode at no-priors.com.