No Priors: Artificial Intelligence | Technology | Startups - DeepSeek, Deep Research, and 2025 Predictions with Sarah and Elad

Episode Date: February 7, 2025

This week on No Priors, Sarah and Elad celebrate the 100th episode! They dive into the biggest AI stories of 2025, breaking down DeepSeek—truth vs. hype, the rapid consumer adoption, and the real cost of training the models. They debate model commoditization and the value of being a frontier model provider vs. building on existing work. Plus, they unpack OpenAI’s new Deep Research release and the latest on Stargate. Finally, they share bold predictions for 2025, covering robots, autonomous vehicles, local AI models, emerging data-generation strategies, and reasoning breakthroughs. Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil

Show Notes:
0:00 Introduction
0:19 DeepSeek
5:38 Are models commoditizing?
8:33 DeepSeek’s consumer adoption
9:16 OpenAI’s Deep Research release
13:30 Stargate
15:04 Elad & Sarah’s Predictions for 2025

Transcript
Starting point is 00:00:00 Hey, listeners, welcome back to No Priors. This episode marks a special milestone. Today is our 100th show. Thank you so much for tuning in each week with me and Elad. And it's been an exciting last couple of weeks in AI, so we have lots to talk about. Why don't we start with the news of the hour or, you know, really the last month at this point: DeepSeek. Elad, what's your overall reaction?
Starting point is 00:00:28 DeepSeek is one of those things which is both really important in some ways, and then also kind of what you'd expect would happen from a trendline perspective. And I think there was a lot of interest around DeepSeek for sort of three reasons. Number one, it was a state-of-the-art Chinese model that seemed to have really caught up with a number of things on the reasoning side and in other areas relative to some of the Western models. And it was open source. Number two, there was a claim that it was done very cheaply. So I think the paper talked about like a $5.5 million run as sort of the final cost.
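For readers who want the arithmetic behind that headline number: the DeepSeek-V3 technical report quotes roughly 2.788 million H800 GPU-hours for the final run at an assumed $2 per GPU-hour rental rate. The rate is the paper's own assumption, not an actual invoice, so treat this as a sanity check on the quoted figure rather than a measured cost:

```python
# Back-of-envelope check on the widely quoted DeepSeek-V3 training cost.
# Both inputs are the figures stated in the technical report; the hourly
# rental rate is the paper's assumption, not a real bill.
gpu_hours = 2.788e6          # reported H800 GPU-hours for the final run
cost_per_gpu_hour = 2.0      # assumed rental price in USD

total_cost = gpu_hours * cost_per_gpu_hour
print(f"${total_cost:,.0f}")  # $5,576,000, i.e. the "$5.5 million" headline
```

As the discussion below notes, this covers only the final run, not the experimentation, tooling, and data work leading up to it.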
Starting point is 00:01:02 And then lastly, I think there's this broader narrative of who's really behind it and what's going on and some perception of mystery, which may or may not be real. And as you kind of walk through each one of those things, I think on the first one, you know, state of the art, open source model with some recent capabilities built in. And they actually did some really nice work. You read through the paper, there's some novel techniques in RL that they worked on and that, you know, I know some other labs just starting to adapt. I think some other labs had also come up with some similar things over time, but I think it was clear they'd done some real work there. On the cost side, everybody that I've at least talked to who's savvy to, it basically views every sort of final run for a model of this type to roughly be in that kind of dollar
Starting point is 00:01:42 range, you know, $5 to $10 million or something like that. And really the question is how much work when in behind that before they distilled down this smaller model. And my sense is everybody thinks that they were spending hundreds of millions of dollars on compute leading up to this. And so from that perspective, it wasn't really novel. And I think that sort of 20% drop in invidia stock and everything else that happened is news of this model spread was a bit unwarranted. And then the last one, which is sort of speculation of what's going on. Is it really a hedge fund as something else happening? Like, you know, felt a little bit,
Starting point is 00:02:14 well, speculative. There's all sorts of reasons that it is exactly what they say it is. And then there's some circumstances in which you can interpret things more broadly. So that's kind of my read on I mean, what do you think? Yeah, I think it's interesting sort of the delayed reaction to it. But to your point, it's also like what you might expect, especially given historical precedent with like GPT 335 and then chat GPT. So like deep seek v3, like the base model, big AI model pre-trained on a lot of internet data to predict next tokens.
Starting point is 00:02:45 Like that was out in December, right? And Nvidia stock did not crash based on that news. So I think it's just interesting to recognize that, like, people obviously do not just want raw likelihood of next word in a streaming way and the work of post-training and making it more useful for human feedback or more specific data, like high-quality examples of prompts and responses, just like we've seen with the chat models like chat GPT, the instruction fine-tuning that made this such a breakthrough experience. Like, that really mattered. And then the, as you said, the, like, narrative violation, release of R1, reasoning model as a parallel model to, like, Open AIs 01. I think that was also the breakthrough moment in terms of people's understanding of this. Well, it's also like 20 years of China, America, technology dominance narrative, right? Yes.
Starting point is 00:03:37 Like, I think it was also kind of this, like, iced around U.S. versus China, you know, worse, you know, the West is far ahead. And so, you know, will they ever catch up, et cetera? And this kind of showed that Chinese models can get their. really fast, but I do think the cost things is a huge component of it. And again, I think cost me have been in some sense misstated or misunderstood at least. It's not clear to me that like final model runs at scale are in this price range. But I think what you were saying before, I completely agree with. Experimentation tends to be a multiple. Like you need to have tooling and data work and the experimentation and the pre-training run and data generation cost and post training and
Starting point is 00:04:17 inference, right? I'm sure I'm missing something here. It seems very unlikely that there hasn't been a large multiple of $6 million spent in total. But I think there was also a narrative violation here in that, like, even at a multiple of $6 million, it's not like a multi-billion dollar entry price or a Stargate-sized entry price to go compete. And I think that is something that really shook the market. That should be expected, because if you look at the cost of training a GPT four level model today versus two years ago, it's a massive drop-in cost. And if you look at, for example, inference costs for a GPT4 level model, somebody in my team kind of worked out.
Starting point is 00:04:56 And in the last 18 months, we saw 180x decrease in cost per token for equivalent level models, 180x, not 180%, 180 times. So the cost collapse on these things is already quite clear. That's true in terms of training equivalent models. That's true in terms of inference. And so, again, I kind of just view this roughly on trend and maybe it's a little bit better and they've come up with some advanced techniques, which, you know, they absolutely have. But it does feel to me a little bit overstated from the perspective of how radical it is. I do think it's striking what they did.
Starting point is 00:05:29 And it kind of pushes U.S. open source forward as well, which I think will be really important. But I think people need to really look at the broader picture of these curves that are already happening. Do you think it's proof that models are commoditizing the fact that they are so much cheaper for a given level of capability over? the last 18 months? There's this really great website called Artificial Analysis.A.I that actually allows you to look at the various models and their relative performance
Starting point is 00:05:55 across a variety of different benchmarks. And the people who run this actually do the benchmarks themselves. They'll go ahead and retest it versus just take the paper at face value. And you see that for a variety of different areas, these models have been growing closer and closer in performance.
Starting point is 00:06:11 And there's different aspects of reasoning and knowledge, scientific reasoning and knowledge, quantitative reasoning in math, coding, multilinguality, cost per token, relative to performance. And they kind of graph us all out for you, and they show you by provider, by state of the art model, how do things compare. And things are getting closer over time versus more dispersed over time. So I think in general, the trend line is already in this direction
Starting point is 00:06:33 where it seems like a lot of people have moved closer and closer to Colvency than they were, say, 18 months ago, where I think there was enormous disparities. And obviously there are certain areas where different models are quite a bit ahead still, but on the average, things are starting to net out a little bit more. And that may change, right? Maybe somebody comes out with an amazing breakthrough model and they leapfrog everybody else for a while. But it does seem like the market has gotten closer than it was even just like a year ago. What do you think is the value of being the leader at the frontier?
Starting point is 00:07:01 I think there's three or four different types of value. I mean, one is capturing market share. So do you just get more people using you and then they kind of stick because they're used to it? Or they've optimized prompts for other things for what you're doing or other tooling. I think the second thing is if you're actually using the model to help advance the next model, then having something is dramatically better to make a difference. So that could be data labeling. It could be artificial data generation. It could be other aspects of post-training.
Starting point is 00:07:26 So I think there's lots of things that you could start doing when you have a really good model to help you. It could be coding and sort of coding tools. It could be all sorts of things. There is an argument that some people make that at some point, as you move closer and closer to some form of lift-off, that the more state of the art the model is, the more it bootstraps into the next model faster and then it just accelerates for you and you stay ahead.
Starting point is 00:07:47 I don't know if that's true or not. I'm just saying that's something that some people speculate on sometimes. Are there other things that you can think of? No, I think one thing you mentioned, maybe if I just extend it, is like kind of underpriced or like not yet understood enough the market as a theory, which is the idea that if you have a like high quality enough base model
Starting point is 00:08:06 to be doing synthetic data generation for like a next generation of model, that is actually like a big leveler, right? And if you believe that there will be continued availability of like more and more powerful base models, that that's a big leveler of the playing field in terms of having like, you know, self-improving models. And so that's an interesting thing that people have not really, really talked about. There are different ways to have value from being at the frontier. One of the things that was really interesting to me was that like the deep seek mobile app became like,
Starting point is 00:08:38 you know, cop contender in the app store for a little bit. I think there's one belief that most, like cheapest, most capable model in the market actually matters to consumers
Starting point is 00:08:47 and they can tell and that will drive consumer adoption. And that's what happened. And like, that's why you need to have the soda model to create these new experiences. And there's a competing view
Starting point is 00:08:56 which is just like, well, like this whole drama is quite interesting and people are trying it as much because like they want to see what the leading Chinese AI model
Starting point is 00:09:06 is like if it's as good as Open AI and Anthropic and such. I definitely believe that leading capability can lead to novel product that draws consumer attention. But I think in this case, it's more the latter. Two other things that kind of happened this past week was on the opening eye side, one is they released deep research. So speaking of really interesting advancements and capabilities.
Starting point is 00:09:26 And then secondly, they announced Stargate, which was, you know, a massive series of investments across AI infrastructure that was announced with Trump at the White House. What are your views on those two things, which in some sense kind of overlap in terms of OpenAI really advancing different aspects of state of the art in terms of what's happening right now? Deep Research is a really cool product. I encourage everybody to try it. The biggest deal to me is that it immediately raises the bar for a number of different types of knowledge work where I might have hired a median intern or analyst before. I mean, we don't do that here. But where one could hire a median analyst or intern, I'm going to immediately comp a bunch of their work to what you could do with Deep Research and, like, your ability to do better with Deep Research. And, like, the comp is hard. I'd say it is a really valuable product. I expect other people to adopt this pattern too.
Starting point is 00:10:20 But I think it's a really novel innovation. Kudos to the team. I would say I think it is more useful, at least to me upon first blush. I'm sure they're working on this. In domains, I understand less to do surveying. and to like make sure I have a comprehensive view and understand who the experts are versus like in an area where I feel like I have a lot of depth. I take issue with its implicit authority ranking and its ability to determine like what
Starting point is 00:10:45 ideas out there, what on the web is good and not when it's doing its search. From at least my initial prompting and experimentation in domain, I'm like, oh man, like you're really going to have to audit the outputs here. It will orient you, but you can't take as given like many of the claims here. It's the AI form of Murray Jellman Amnesia, which was coined by the guy who wrote Jurassic Park. I can never remember his name is pronounced Jellman or Gellman. Murray Gellman was a physicist who came up with quarks and a few other things. He was a Nobel Prize winner and was considered widely brilliant.
Starting point is 00:11:19 And it was named after him by Michael Crichton, which was basically the idea is if you're reading a page in the New York Times about something you really understand. And you're like, oh, this is like so dumb and how could they write this? And I don't believe it. And then you just turn the page and you look at something you don't. know anything about and you assume that they get it all right, why would you do that? You instantly forgot that they got everything you know about wrong. Why would they get the other thing right? Maybe they also got that wrong. And so it's this really interesting kind of cognitive dissonance around, you know, what is this thing actually know or not know? And, you know,
Starting point is 00:11:49 if it's getting sort of expertise wrong in a domain, I understand, does that mean? They're also getting it wrong in domains I don't understand, but of course we never apply that as people. We just assume, of course, it's right in the domains that we don't understand, which I think is really interesting psychologically, but it also has real implications in terms of how people will use AI in the future in general, because these things will become the definitive source of a lot of people's primary information, right? It's in some senses really overlapping with some of the search use cases in really deep ways, and you have something where the sources traditionally have been less evident. I know that people are working on different ways to
Starting point is 00:12:24 surface what the primary sources are for some of these things, but it does have really interesting implications for how you think about knowledge in the modern era. As you're using, AI, especially as you're using agents, so they just go and do stuff and then report back and you don't even know what they did. So I think it's a very interesting topic. I'm not sure how you solve that from like a UX perspective, or maybe it's like somewhat unsolvable given, you know, it also reflects like what is knowledge on the web. It really does feel like a really dangerous thing from sort of a propaganda and censorship perspective. And so, you know, social networks were kind of V1 of that or maybe certain aspects of the web were V1 and social networks for B2. And this is kind of
Starting point is 00:12:57 the big version because it's a mix of search. It's like if you mixed Google, with Twitter, with Facebook, with everything else that you're using, with all the media outputs or media outlets, all into one single device that you interrogate. That's kind of where these AIs are going. And so the ability to control the output of these things is extremely powerful, but also very dangerous. So, you know, that's why I'm kind of happy
Starting point is 00:13:19 that we're in this multi-AI world, multi-company world, there's a way to offset that. And that's where open source becomes incredibly important. If you worry about civil liberties, what do you think about Stargate? Maybe there's like a couple different implied questions in like Stargate, right? One is how much does it matter in the race, like to continue to have access to the largest infrastructure?
Starting point is 00:13:44 I'm going to skip the question about like whether or not it's real. Like there's a lot of money involved here. I think another question is how deep are the capital markets to continue funding this stuff? Maybe a final one is like just the involvement of different sovereigns or quasi-sovereigns. in this. Like, I don't know if I have a strong opinion on the latter two. The way I think about the dynamic of like how much does the capital matter and like the implied like how like do we continue to see scaling on pre-training be a dominant factor is I think of it really as like uncertainty rather than than risk, right? Like if you think about capabilities as emergent and
Starting point is 00:14:26 people not being sure what sort of algorithmic efficiencies counteract, like the, you know, improvements that will come for more scale and the things you can do to generate new data to improve in other vectors and what we're going to get out of test time scaling. Like, I just think it's very hard to predict. But I fail to see a scenario where anybody trying to build AGI, any of the large research labs wouldn't want the biggest cluster that they could have if it was free, right? Or if the capital was available to them. and that to me says more than anything else.
Starting point is 00:14:58 Like, we are going to get more out of pre-training. Is it going to be as efficient? Like, I think that's unlikely. We're like a little bit delayed on this, but we'll just give ourselves a free pass given it's episode 100. Predictions for 2025. Happy New Year.
Starting point is 00:15:14 It's February, but I'm still telling everyone Happy New Year. It's like the Larry David episode. Yeah, basically, there's some statute of limitations on how late in the year you can say Happy New Year. We're now a month in, so of course, we're way over that. We should probably say Happy Valentine's Day, even though we're like two weeks early.
Starting point is 00:15:28 No, a lot. You don't, like, what was the vibe? The vibe for 2025 is you can just do things. You can just say, Happy New Year. Happy New Year a lot. Yeah, I can do what the fuck I want a whole year. It's going to be amazing. Yeah, so on 2025, I think there's a few things that are likely to happen.
Starting point is 00:15:42 First, the foundation model market should at least partially consolidate. And it may be in the sort of ancillary areas. So that's image gen, video, voice, you know, a few other areas like that, maybe some secondary LLMs or foundation models will also consolidate. So I do think we're going to see a lot of consolidation, particularly if the FTC is a little bit more friendly than the prior regime. We'll also see some expansion of sort of new races in physics, biology, and materials and the like.
Starting point is 00:16:09 So I think that that will happen alongside just general scaling of foundation models will continue. And that includes reasoning and that includes other things. So that's one big area. I think a second area is we're going to see vertical AI apps continue to work at scale. It's Harvey for legal, Gacogon and Sierra for customer success and, you know, a variety of folks for code gen, for medical scribing, et cetera. So I think it'll be the era of vertical apps. And I think a subset of those will start adding more and more agentic things to them. You know, some folks like cognition already doing that.
Starting point is 00:16:42 Third would be self-driving will get a lot of attention. Obviously, Tesla and Waymo are starting to see really interesting adoption on full-site driving, on robo taxis, etc. Applied intuition, I think, is kind of a dark course to watch. more generally on the automotive stock. And then I guess fourth is that some consumer things, I think, will get really large-scale experiments happening in a way that hasn't happened until now. So I'm starting to see consumer startups.
Starting point is 00:17:06 I'm starting to see more consumer applications from incumbents. Like, I actually think we're going to see a little bit of a resurgence in consumer. I may take a while, but I think that'll happen. And then lastly, I think there's things that we all know will happen. And they're really early. But we may start to see some interesting behavior around agents. maybe some early robot stuff, but it'll be one of those things where it's more going to be the glimmer of how this thing will work versus the whole thing. But I think some of those
Starting point is 00:17:33 developments will be very exciting. So those would be my five predictions for 25. How about you, what do you got? We agree on a number of different things. I think the whole like definition for agent is super fuzzy. But if we just think of it as like do multi-step tasks successfully in some sort of end user environment and take action beyond just generate content. Like, we're already seeing that. I think we're going to see that more broadly as people figure, you know, reasoning models get better and product companies or vertically integrated companies, they get better at handling failure cases and managing state intelligently.
Starting point is 00:18:09 And so we're already seeing that in security and support and SRE. And I think that will continue to happen. This already happened in Kodgen, as you were sort of alluding to. But I think companies doing co-pilot products will naturally extend. to agents. They'll just try to do more, right, and take more on. I think one of the inputs to broader consumer experimentation, as you describe, is just like way more capable, like small, low latency models. I don't think we have like any monotonic movement toward compute at the edge. Like, I think when people are like, edge compute, for the sake of edge compute, I'm like,
Starting point is 00:18:45 nobody cared, right? But if you can make that transparent to the user and it's free, then I think your ability to ship things that are free is obviously unlocked, and I think that's cool. I also think there will be a lot of web apps, you know, so I don't think it has to necessarily be on-device. Consumer products, I undoubtedly to your point, there will be some, but I also just think it's just going to be things running on the internet that just become part of your application stack on your browser that will do really interesting over time.
Starting point is 00:19:14 Yeah, well, stuff in the browser can also use the GPU, but I just think that the ability to run locally might be a big unlock for them. I don't know if you and I disagree on timeline. I think we're going to see technical proof of breakthroughs in robotics and in generalization this year, though not deployments. I think one thing that's like maybe mispriced just because it's very new is like people don't really know how to think about reasoning. I would claim that one thing is as much improvement in reliability as complexity of task. One like mistake that entrepreneurs and investors make that I have made is like you look at something and it's not working. and like the issue is it is like a technical issue
Starting point is 00:19:55 and then you like assume it's not going to work but I think in AI you have to like keep looking again and again because stuff can begin to work really quickly. Maybe one last one with that I'm just, I've seen like small examples of with our like embed program and also broadly in the portfolio is because you have this diffusion of innovation like not just with customers but with the types of entrepreneurs that go take something on and we're like beyond just the tip of the spear now and more and more people
Starting point is 00:20:21 are like, I can do stuff with AI, I think we're going to get more smart data generation strategies for different domains where you need, like, domain knowledge as well as understanding of AI. So examples here could be like biology and material science. Like, you needed the set of scientists who are capable of innovating on data capture, which might literally be like a biotech innovation versus a computer science innovation to understand the potential of deep learning and that that the bottleneck was data. and then the type of data you were looking for. And I think that is happening.
Starting point is 00:20:55 And so I think that's really exciting. This may be the year where we see something really interesting happen on the health side as an example where you need specialized data, but it's not as hard as, you know, the atomic world of, you know, biomolecular design or something. Anything else we should talk about? The facial hair.
Starting point is 00:21:13 Okay, should I bring it back? I liked the beard. I like the beard and hat era. Oh, interesting. Yeah, maybe I should go back to that. The last question for today, we're on episode 100. What do you think the state of the world will be relative to AI when we're at episode 200? I don't think we're part of this anymore.
Starting point is 00:21:29 I think it's just like two agents going back and forth teaching us stuff and like you and I are no longer the hosts or the choosers of topics. We're just like nodes into the network. Will they be as good looking as us? Yeah, and they'll be better computers. We'll see. Still like some art more than some mid journey art of them. There's some beautiful things on there. Okay.
Starting point is 00:21:52 Episode 200, that's like what? Well, it's almost two years if it's weekly, so. I think we're either in the RLHF farm or we're like sitting on a beach in Abiza post-abundance. That's a prediction. You heard it here first. Well, hopefully I'll see it at Episode 200 or in Abiza. I think the third alternative is not as great. Okay.
Starting point is 00:22:08 And all the listeners too. Thanks, guys. All right. Thanks, everybody. Find us on Twitter at No Pryor's Pod. Subscribe to our YouTube channel if you want to see our faces. follow the show on Apple Podcasts, Spotify, or wherever you listen. That way you get a new episode every week.
Starting point is 00:22:24 And sign up for emails or find transcripts for every episode at no dash priors.com.
