No Priors: Artificial Intelligence | Technology | Startups - Hyperscaler strategy in AI, the application landscape heats up, and what we know now about agents with Sarah and Elad

Episode Date: April 11, 2024

This week on a host-only episode of No Priors, Sarah and Elad discuss the AI wave as compared to the internet wave, the current state of AI investing, the foundation model landscape, voice and video AI, advances in agentic systems, prosumer applications, and the Microsoft/Inflection deal. Have a question for our next host-only episode or feedback for our team? Reach out to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil  Show Notes:  (0:00) Intro (0:32) How to think about scaling in 2024 (3:21) Microsoft/Inflection deal (5:28) Voice cloning (7:02) Investing climate (12:50) Whitespace in AI (16:36) AI video landscape (19:54) Agentic user experiences (22:21) Prosumer as the first wave of application AI

Transcript
Starting point is 00:00:00 Today on No Priors, we are going to have a host-only discussion. There's so much going on over the last couple weeks in AI. We thought we'd take a big, deep breath and a step back and talk through some of the really big changes that seem to be happening in the landscape. Sarah, there's been a lot of new models that have come out over the last, even just a week or two. There's Claude, Grok, Databricks, a variety of folks that launched things. What do you think is going on? Yeah, I think it's a huge update for most people's priors versus a year ago, right? I think
Starting point is 00:00:37 it's very likely at this point that you end this year with a handful of GPT-4-level models, and that some of those are open source, right? And so I think Mistral first, but then also Databricks with DBRX, they change the point of view on what you can do with, you know, a relatively small amount of compute, tens of millions of dollars of compute, and then also from a scale perspective, the Databricks team in particular just declared a very strong point of view that they call Mosaic's Law, where a model of a certain capability will require a quarter of the, you know, dollar capital investment every year due to a bunch of improvements on the hardware and algorithmic side. And I don't know if that's grounded in any particular technical belief.
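A quick back-of-the-envelope on the Mosaic's Law claim above, taken at face value: a cost that quarters every year compounds very quickly. The sketch below is illustrative only, and the $100 million starting figure is a hypothetical, not a number from the episode.

```python
# "Mosaic's Law" as stated above: the dollar cost to train a model of a fixed
# capability falls to one quarter of its prior value each year.
def cost_after(years: int, starting_cost_usd: float, yearly_factor: float = 0.25) -> float:
    """Cost to reach the same capability after `years` of compounding."""
    return starting_cost_usd * (yearly_factor ** years)

start = 100e6  # hypothetical $100M training run today, NOT a figure from the episode
for year in range(5):
    print(f"year {year}: ${cost_after(year, start):,.0f}")
# year 0: $100,000,000
# year 1: $25,000,000
# year 2: $6,250,000
# year 3: $1,562,500
# year 4: $390,625
```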
Starting point is 00:01:29 but I do think that the model landscape completely shifts versus what people expected to be, I think most people expected to be quite monopolistic or at least oligopolistic a year ago, right? And I think there's still a really big question at the state of the art, because if you go up one level of scale in terms of capital investment, if you're still, you know, the dominant factor is compute scaling. I think that question remains. But there's an awful lot you seem to be able to do. do with the GPT4 level model. So I think like the net impact of that is pretty good from the application or the sort of enterprise adoption side. Yeah, it definitely feels like the most cutting edge smartest models in some sense you're going to end up with an oligopy, at least in the next couple years, just because of the scale of capital needed. But also just how far ahead you start to be as you have a model that can help you build the future models, right? Even just things like data labeling or certain forms of reinforcement learning through eye feedback or other things like that.
Starting point is 00:02:30 And so as you get better and better model capabilities, you start bootstrapping the next generation of models, although obviously you have to do other breakthroughs to get there. And then to your point, I think under that, you have this broader swath of different models and companies and things that are available. And one could argue part of what that's going to do is just kind of flip some of the value capture, the revenue, the margin, the people, whatever metric you want to use, over to the clouds, because they're going to be hosting all these things, right? So whether it's Llama or whether it's Claude or whether it's one of these other entrants, there's just going to be a lot of
Starting point is 00:03:05 room, I think, for the clouds to make money over time as well, which I think is a little bit under-discussed in terms of, you know, who captures value in this market besides the model providers. Related to the clouds, how do you think about the recent Inflection/Microsoft deal? I think the first reaction is, like, they're true believers in AI at Microsoft and saw a live player, right? And so I think the sort of obvious observations with Microsoft here would be: they see a product opportunity, and they need AI-aware product leadership and research leadership to go after it across Microsoft properties, despite all of the initial real traction around Copilot in the code domain.
Starting point is 00:03:52 I think we're still far short of what revenue Microsoft actually expects to drive in terms of across its productivity suite and in search. And they're ambitious to go after that. And I think this is a leadership change that supports that. Now, they're clearly still working with Open AI given like direct statements from both companies and the Stargate data center effort. But it's also hard from the outside not to see this as somewhat of a hedge, right? Not in a criticism of open AI, but, you know, if you are a true believer, that this is the most important technical driver for your company and then you're reliant on an outside player, that's not a position that a trillion dollar company likes to have. You know, Mustafa has had more capital and more compute available to him than the vast, vast majority of entrepreneurs and research teams. And I think one big argument you can make from Microsoft is just like you have direct access to that if you're focused on the research, right? And so I think it supports what you said where the spend required at the perhaps not even this generation, but the next generation really requires a certain level of sponsorship that is challenging for most independent players. But it's a directly opposing view to the, for example, the Databricks narrative today.
Starting point is 00:05:16 Can I ask you like a more domain specific question? So Open AI just announced voice cloning, right? And the interesting thing here is you have companies like 11 labs with really great traction, other competitors out there focused on different feature sets like latency that are progressing. But let's go ahead and assume for argument's sake, given the Open AI announcement. of voice cloning that both open AI and deep mind and maybe others have very, very good voice, video, image, song models. And the question has been like, you know, will they release that? And what does that do to the market beyond text APIs? Yeah, I mean, it seems like a lot of the hesitancy, as far as I can tell for these companies to go aggressively after the voice side is just regulatory slash societal concerns, right? I think one of the concerns people have on the voice cloning side is do you end up with different types of deep fakes or other things? where it's much harder to tell with a voice, what's going on. There's obvious ways around that. I think you can do an attestation where when you upload a voice for the first time, the person actually who owns the voice in some sense, his voice it is can actually do some form of attestation or other things, or there's other ways to do verification.
Starting point is 00:06:31 My sense of the market is that multiple players have this technology, but they've been holding it back. And in some cases, it may have had it for a year or two now. Because, you know, there's also been open source versions of this like tortoise and things that the Suna team was working on earlier and things like that. So I'm surprised in some sense by how little competition there's been. Okay, let's characterize like the rest of the investing landscape and then Elad himself, you know, driving the rest of the landscape. Like do investors keep funding in general foundation model training efforts from here or more specialized ones?
Starting point is 00:07:01 Can you talk about what you think those dynamics will be? You know, it's interesting because if you look at the scale of capital that's gone into foundation models, venture capitalists have put hundreds of millions of dollars into individual companies, but then the big cloud providers or big tech companies, including Nvidia, have put billions into companies. And so most of the funding of this market is actually being done by the hyperscalers and a few other big tech companies. And that's true in China as well, right? It's the really big pre-existing internet companies that are funding everything. And so the VCs are almost a bootstrap. And the bootstrap is sometimes tens of millions and sometimes it's hundreds of millions, but to get to real scale, it comes from other places.
Starting point is 00:07:39 And to the points earlier on the cloud side, there's a strong incentive for the clouds to keep funding these things as long as it drives cloud revenue. So for example, Microsoft's last quarter, they mentioned Azure revenue, which was about $25 billion for the quarter, grew by, I think it was 5% due to AI-related products, which would be another billion, billion, and a half a quarter in revenue off of AI. So that's, you know, $6 billion, $5 billion annualized, and it's probably still growing. And so if you look at it from that perspective, there's a strong incentive to fund these things because they're driving so much utilization and usage. So I definitely think we'll see more funding going into the market. I think on the foundation model side from
Starting point is 00:08:17 a venture capital or angel investor perspective, I think we're going to see fewer new language models, but we should see models in a lot of other areas. And we have new things happening in music. We talked a little about text-to-speech with ElevenLabs, but then there's a bunch of other areas around video, image gen, physics models, biology models, material science, robotics, etc. And so there's this broad swath of other types of foundation models that are starting to get funded, or that are accelerating in terms of the funding cycles there. And so one can anticipate we'll see a similar thing there, where we'll probably have venture capitalists do the first set of rounds, and then it'll shift over time to large strategic players who really view these as things that are beneficial.
Starting point is 00:09:02 And then there may be other areas where, you know, people are doing really interesting things. Applied intuition is a good example of a company that's doing simulation software. And, you know, they've been doing really interesting things in terms of like modeling behavior there for years now, right? So I just think there's a lot of, a lot of room to still do lots of interesting things on the foundation model side. But I do think it's going to continue to shift over time. What domains do you find this interesting? Or what's your framework for figuring out which of these things are going to be not only important societally, but also good businesses? One basic way to look at this, which is what are the capabilities that we are still missing or struggling with, right? And so one thing that I've been interested in for a long time is just how do you operate on time series with more general knowledge and reasoning, right? There are so many ways in which being able to better understand time series would be really, really valuable, right? And it's a very unsolved problem. If you look at anomaly detection, anything that is. And infrastructure.
Starting point is 00:10:11 monitoring, a security, a health care, consumer behavior use case. There's domains and then there's sort of other dimensions like context windows and how you handle a particular type of data. That's one domain where I just feel like there's huge commercial applicability and interesting architectural approaches that could allow you to break through. Then I think there's like we take the existing advancements in language model. And all people from all fields applying machine learning are now paying attention to this and, you know, working at some of the best labs. And if they go look at the domains that they're, they've been traditionally focused on. So for example, like robotics and biotech are two areas where I've been spending a bunch of time. You know, there's something in the water where a bunch of very smart people are showing leading results on traditional benchmarks with these approaches. And, you know, the sort of core of it is that, you get several smart teams at once thinking that a domain can be solved with a foundation model that has more generality and then some cleverness and approach, right? And I don't mean
Starting point is 00:11:23 that in a trivial way because, like, for example, in robotics, you know, most machine learning people will look at it as a data collection problem where your internet data, even video data of a bunch of actions just isn't enough. We need embodied action data. Like, you know, controls data in some way. And then people who have very, very different ideas on data collection, both you mentioned simulation, but also like real world efficient collection is kind of a core question for a number of these companies. And then also different ideas in terms of how to split up the value chain from, are you doing software that will apply to many different types of hardware? Are you doing a, you know, a verticalized company? And I am inclined to believe that, you know, a lot of these don't. remains are going to be solved, then it's just a question of, like, picking the product path through. I think there's a set of things that are just intellectually interesting. I think, for example, the biotech models were really cool, where it seemed like in some examples, long context windows make protein folding easier, which is really neat. And then there's the societal
Starting point is 00:12:28 implication side, and there may be some things that never make money, but are incredibly societally useful. And then lastly, there's, like, what are big commercial applications? The thing that I find very fascinating is, you know, there's one or two of these areas where I've seen, you know, a dozen teams all enter at once. And there's so much white space and AI right now. There's so many different things to do. I've actually started incubating a few things again, just because there's, you know, people just are working on certain areas that seem kind of obvious. And I've looked for companies. I'd rather back somebody than, you know, incubate something, which is much harder. But it's surprising to me that half a dozen people or a dozen people were all really smart
Starting point is 00:13:05 and really talented and really vetted in the field. Well, we'll jump on one thing. And then there are these wide open opportunities somewhere else. And so it's a very odd market right now, or you don't see the fast follows for certain things that are clearly working that you'd expect. What's your explanation of that? I think it's a mix of what people view is societally significant and therefore they want to work at it, but also I feel like any set of startup waves always have these memetic actions where there's almost these memes of what to build that spread, and that's happened in prior technology
Starting point is 00:13:35 ways too. And the memes are often correct, not always, but often. great example of that would be in the mobile wave. I knew of literally a dozen different photo apps where people had photo upload from their phone and they'd go viral and they'd grow like crazy and then they'd all die because there was no sustainability. And then Instagram came out with like the compact format, the filters. There's actually a company called Camera Plus the ad filters before that, but they were charging 20 bucks a download because they just wanted to monetize it. It was sort of this indie dev shop. It never wanted to grow very big that was doing
Starting point is 00:14:08 it. And so, you know, Instagram, it had similarly started working on filters, a common format, and then a feed and more of a social product. And that's the thing that became sticky and, you know, sustained. But there was at least a dozen other ones I knew of where I knew the founders who were doing it, right? And so that was memetic but correct. But it took the right product substantiation to do it. In robotics, the correct product substantiation is harder or in biotech, it's harder or in LLMs. It's harder, you know, because these are very big mass markets. And I think people are just excited by the scale of commercial opportunity for these things, too, right? At the same time, there's other markets where there just aren't that many players. And so the question is why is that and what's the difference? And it's fascinating to watch this happen again. And again, memetic things are often correct. So, you know, there's a dozen search engines before Google. There's a dozen social networks before Facebook, etc. It's really interesting because it's an entire market driven by technologists right now, right? Like everybody's getting nerd sniped into a, a few areas. And that, as you said, may be right. But I think driving factors for me is just
Starting point is 00:15:16 like how much cross-pollination there is between people who would understand the edge of research and then like particular domains. Right. The ability to operate on financial or accounting data with these models seems like a very useful commercial capability. And actually, perhaps much more tenable and like a sort of linear path to commercial value than some of the things that we're describing. Like if you go talk to a experienced executive in the last generation of great biotech platforms or, you know, successful robotics players, these are not, they're not easy industries traditionally, right? But there's a lot of accounting and finance software in the world. There's a lot of, you know, professional accounting services. And I just think it's like there's just not that much interaction between accountants and controllers and like, you know, and systems engineers and research scientists.
Starting point is 00:16:19 So that's part of my explanation of it. Yeah. It's also just there's been a few technology breakthroughs across these different fields that suggest that some of these things were more tractable than people thought. And so I think there's also a technology why now or at least some proof points that make people excited to go and, you know, build out these. things that have now been discovered and shown to be possible. What do you think are some of the other areas that, you know, is worth looking at big changes? I mean, video may be an example. I know you're involved with a couple companies there. Do you want to talk about the video landscape and some of the shifts you're seen? Yeah, I actually think that it is kind of a
Starting point is 00:16:59 mistake to look at it as just a modality to be solved. And, like, we're going to have some of the Sora team from OpenAI on No Priors, so we should certainly ask them. But I think video, and the control of video for a commercial application, is a little different than video generation and understanding as it belongs in a general multimodal model. And we'll see, you know, if the persistence of that is two years versus 10 years. This is one of these areas where it is so obvious that the demand is unbounded, right, for commercially viable or just shareable high-quality video generation. And then it's not one form, right? Like, many people will cut it into A-roll and B-roll. And today, despite the extraordinary advances from one-
Starting point is 00:17:55 second, you know, very small clips with obvious artifacts, to where we are with things like Pika and Sora today, we still have a very, very long way to go, both in terms of interfaces and controllability, like length, quality, et cetera. And so I think it's an area that, like, deserves a lot more investment. But I think the set of things that you might want to make those assets commercially viable is actually, like, a very deep product problem with, like, specific research involved. And as you mentioned, one of the companies that I'm an investor in is a company called HeyGen that has grown really, really quickly over the last year, with sort of prosumer and commercial traction, focusing just on video avatars, or the ability to have, like, you know,
Starting point is 00:18:49 people, a clone of you or a spokesperson for an organization, speak on film to a camera, uh, not film or a camera, right, generated pixels. But it is very cool to see the ways that you can progress this if you are focused on it, because we get so many requests from end users who have very little idea how, or no idea how, any of the technology behind the scenes works. And, you know, people are very creative and they want full control. And so, like, one of the releases from the company this past week was, uh, you know, I have video of an avatar moving, walking around, going through gestures, and I want to replace what they were saying. Like, one of the things that was very much learned, you mentioned the last
Starting point is 00:19:35 generation of, like, mobile image apps. One thing you learned from the last generation of video applications was, like, if you take certain dimensions of creativity away, or you just control them, and you lower the required quality of different dimensions, you make it much, much easier for people to create, right? And so I think that's one path that companies like HeyGen are going down. I want to ask you about one thing that I think, like, Devin and Cognition have really woken up a lot of engineers and product teams to, which is, like, how much space there is in exploring agents and different model user experiences. Yeah, you know, it's interesting. I was actually working on a post on this, in terms of, it feels like there's this shift
Starting point is 00:20:30 in terms of agentic UIs and what they look like. Because I think a lot of what people were doing before was modeled on either ChatGPT or Copilot, where it's either forms of, like, chat, or just, like, auto-complete of a different nature, or sort of inline, or things like that. And I think the Devin UI was really interesting from the perspective that it was a new way to think
Starting point is 00:20:49 about how you display information in terms of what an agent is doing. And even just seeing, you know, in Devon they have four taps, right? There's the plan that you're doing, the shell, the code that's being written. And then there's a chat interface, right? Here are the steps that I'm doing, and so you know what's coming.
Starting point is 00:21:05 You can see the code that's being written, and you can kind of redirect the agent to do other things if you think it's going down the wrong path or what is browsing. It'll prompt you for things like API keys or tokens, and so you can interact with it and restear it along the way. And I've started to see that UI now pop up in other use cases where people whose products were demoed to me a week or two before have suddenly shipped it more down this direction if they're doing anything agentic, because you realize that most people don't want to just sit there and wait and wonder if the agent is actually doing what they want. They want to be able to see it
Starting point is 00:21:38 and maybe interrogate it or interfere and put things on the right path so it gets it done faster. And it's the way to think about agents today, I feel, is almost like a junior intern. They're very eager. They're trying really hard to please, but they still have a lot to learn or you kind of need to give some direction. And so this is a mechanism by which you kind of almost get that update email from the intern saying, hey, here. is what I'm up to, and you say, oh, actually, could you go do this other thing a little bit different? Or have you considered doing these three things? And so I think there's a lot of really interesting things that'll be coming as we rethink UIs, and then eventually the entire UI will
Starting point is 00:22:10 go away once agents get good enough, right? And so I think this is kind of the intermediary step of human in the loop. And eventually, as agents get smarter and smarter, and that's going to be through breakthroughs in the base models, but also breakthroughs in reasoning and other areas, we'll start to see, actually, I think, some of the UI go away over time. And so I think we're in the sort of early form of this stuff. And it's very exciting to just see these new paradigms being created and happening. You know, one thing that you mentioned that I think is kind of interesting is that a lot of the big use cases in AI surprisingly are prosumer ones, right? That's that's Hey, Jen, that's chat to BT, that's perplexity. That's a variety of things. Soonow. How do you
Starting point is 00:22:52 think about the whole consumer market? Like, do you think the first real wave of AI is prosumer in some sense? I think it structurally has to be, right, just based on like the pace of ability to adopt things in the enterprise. Pursumer applications, they get to carve this path of direct user value, which we think we can create a lot of with AI capabilities. That is neither like, oh, we need to fight existing incredibly strategic and embedded, for example, like consumer social networks with their network effects, but actually have these capabilities that. that generate their own distribution. And it's not that you don't need to be smart about distribution to consumers and prosumers. But really, like, these companies are growing on the backs of just great product that people want. It is very, very hard to get to millions of enterprise users in a year simply because of the decision making and, like, security processes and roadmap involved in getting a large customer to change something internally and the risk tolerance of all of those versus, like, I want to do something that is $10 a month.
Starting point is 00:23:57 valuable to me or makes me more productive is a much faster decision. So I think structurally it's something we should expect. I also think it's just interesting that the Canva numbers are quite public now. And, you know, a billion of its 1.7 billion in ARR is prosumer. And so I think, you know, I'm just sort of respecting the data here in the argument for some of these prosumer companies is that you can often grow into a more professional set of use cases. And the increase in capabilities is creating huge markets where they didn't exist before. Right. And you should see many more of these companies. And I do believe that argument. I think they're going to, I think they're going to be sneaky big markets. Yeah. It's kind of notable because if you look at the first internet wave,
Starting point is 00:24:51 if you look at the 90s, the first wave was all consumer. And then the second wave was B2B. And so they used to talk about it about B2C versus B2B, right? And I think we have this odd parallel or analog here, where the very first adopters of this technology are consumers and proservers, and then there is some enterprise-related stuff like what Harvey's doing or others, but, you know, the initial waves seem to be more driven by people who are using it in their personal and professional lives first, which is very similar to what happened with email on the internet more broadly. And so it's an interesting parallel, and maybe this is just a dynamic of truly fundamental technology ships. And, you know, you could argue that was also a lot of the wave of
Starting point is 00:25:29 mobile, right? All the really big new companies, or most of them were consumer companies, between Uber and Instacard and all the rest, right? And so WhatsApp, et cetera. So very interesting parallels. Find us on Twitter at No Pryor's Pod. Subscribe to our YouTube channel if you want to see our faces, follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week. And sign up for emails or find transcripts for every episode at no-dash priors.com.
