No Priors: Artificial Intelligence | Technology | Startups - Humans&: Bridging IQ and EQ in Machine Learning with Eric Zelikman
Episode Date: October 9, 2025

The AI industry is obsessed with making models smarter. But what if they're building the wrong kind of intelligence? In launching his new venture, humans&, Eric Zelikman sees an opportunity to shift the focus from pure IQ to building models with EQ. Sarah Guo is joined by Eric Zelikman, formerly of Stanford and xAI, who shares his journey from AI researcher to founder. Eric talks about the challenges of building human-centric AI, integrating long-term memory in models, and the importance of creating AI systems that work collaboratively with humans to unlock their full potential. Plus, Eric shares his views on abundance and what he's looking for in talent for humans&.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @ericzelikman

Chapters:
00:00 – Eric Zelikman Introduction
00:29 – Eric's Early Interest in AI
01:29 – Challenges in AI and Automation
02:25 – Research Contributions
06:14 – Q-STaR and Scaling Up AI
08:14 – Current State of AI Models
15:23 – Human-Centric AI and Future Directions
22:08 – Eric's New Venture: humans&
35:33 – Recruitment Goals for humans&
36:57 – Conclusion
Transcript
Hi, listeners.
Welcome back to No Priors.
Today, we're here with Eric Zelikman, previously of Stanford and xAI.
We're going to talk about the contributions he's made to research, reasoning, and scaling up RL,
as well as his new company, humans&.
Eric, thank you so much for doing this.
Thank you.
You have had an amazing impact as a researcher, including starting from just your time at Stanford.
I want to hear about that.
But first, what's the background of how you got interested in machine learning at all?
I guess, going back really far, I've been motivated by this question: you have all of these people out there with things they're really talented at, things they're really passionate about. There's just so much talent out there.
And I've always been a little bit disappointed that so much of that talent doesn't get used, just because everyone has circumstances and situations where they can't actually pursue those things. And so for me, all of humanity is not living up to its full potential.
The thing I've always been excited about is: how do you actually build technology that frees people up to do the things they're passionate about? How do you allow people to actually focus on those things?
Originally, I thought of automation as the most natural way of doing that: you automate away the parts that people don't want to do, and that frees people up to do the things they do want to do.
But I've increasingly realized that it's actually pretty complex. If you want to empower people to do what they want to do, you have to really understand what people actually want to do, and building systems that understand people's goals and outcomes is actually really hard.
Yeah. Did you have this human-centric perspective when you were choosing research problems to work on originally?
I guess at the very beginning, when I was choosing research problems, I was just interested in: how do you actually make these things half decent?
Okay, so it was more about increasing capability at all, first.
Yeah. I think for me, when I looked at AI, or language models, back in 2021 or whatever, I thought: these things aren't very smart. They can't do that much. And there was some early work around then showing that, for example, you could use chain of thought to get models to answer more smartly. But it was still only a small step improvement at the time; the benefit was about as much as you can really get with just prompting. And so back then I was thinking about how you actually make them half decent at actually solving these harder problems.
Can you give a broad intuition for STaR? We have everything from a researcher audience to a business audience here.
I guess the intuition is: if you have a model and it's able to solve slightly harder questions by thinking about them, then what if you actually teach it, like, hey, this solution that you came up with got you to the right answer, good job. Or, if the model didn't get it right, then you basically don't reward it. The original version of STaR, yeah, didn't have a baseline at the time. We compared it to REINFORCE, which is a popular algorithm in reinforcement learning, a very simple policy gradient method.
But yeah, at the time, it was a very simple algorithm.
You iteratively generate solutions.
If the solutions get you to the right answer, you learn from them.
If they don't, you don't.
And then you just keep doing this as the model solves harder and harder problems
and then learns from those harder and harder problems.
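For readers who want that loop in code, here is a minimal sketch of the idea as described above. It is not the paper's actual implementation; `generate_solution`, `check_answer`, and `finetune` are hypothetical placeholders for a sampling routine, an answer checker, and a fine-tuning step.

```python
# Minimal sketch of a STaR-style self-improvement loop, as described above.
# The helper callables are hypothetical placeholders, not the paper's code.

def star_iteration(model, problems, generate_solution, check_answer, finetune,
                   samples_per_problem=4):
    """One round: sample solutions, keep only the ones that reach the right
    answer, and fine-tune on those kept reasoning traces."""
    kept = []
    for problem, gold_answer in problems:
        for _ in range(samples_per_problem):
            reasoning, answer = generate_solution(model, problem)
            if check_answer(answer, gold_answer):
                # "Reward" here just means keeping the successful trace as data.
                kept.append((problem, reasoning, answer))
                break  # one good trace per problem is enough in this sketch
    # Learn only from successful traces; failures are simply dropped here.
    return finetune(model, kept), kept


def star_loop(model, problems, generate_solution, check_answer, finetune,
              num_rounds=5):
    """Repeat the iteration so the model gradually solves harder problems
    and then learns from them."""
    for _ in range(num_rounds):
        model, kept = star_iteration(model, problems, generate_solution,
                                     check_answer, finetune)
        if not kept:  # nothing new was solved this round
            break
    return model
```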
At what point in the research, if at all,
were you surprised by how well it worked?
Or did you have some intuition for this being something scalable?
There was one experiment that I remember doing, though this was quite a while ago at this point.
We looked at, I think it was n-digit addition or multiplication.
Sorry, it's been a while.
And one thing that was really interesting was that, back then, this was a task that was considered hard for language models.
Yeah, of course.
It was considered like one of the examples of why they were still so stupid.
Yeah.
Exactly. And one of the really interesting things for me was that as you trained for more and more iterations, the number of digits it was able to do kept increasing. And I think that was one of the big surprises for me: oh wow, there's no obvious plateau here.
And did you go directly from that to generally this should scale?
I think there were a few things, though. There was one part of it that we introduced
because we observed that there was a bunch of the data that the model wasn't learning from.
And so we proposed another variant of this, where we said,
oh, what if you actually take the ones where it fails and you basically ask it to reason about why it should have gotten it right?
And then you train as if it got it right.
This version was a way of extending beyond the parts of the data that it couldn't see.
Because if you only train on the positive examples,
you end up at a kind of plateau where there's just no more data that it can actually solve.
And so back then we thought, what if we just show it the problems that it didn't solve and try to teach it from those?
But another thing that other work has done since then is: oh, what if you just sample a lot?
And that also seems to work in those works.
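As a rough illustration of the rationalization variant Eric describes, here is a hedged sketch; it is not the paper's code, and `generate_with_hint` and `check_answer` are hypothetical helpers.

```python
# Rough sketch of the "rationalization" variant: for problems the model failed,
# show it the correct answer as a hint, ask it to explain why that answer is
# right, and train on that explanation as if the model had solved it.
# `generate_with_hint` is a hypothetical helper, not the paper's actual code.

def rationalize_failures(model, failed_problems, generate_with_hint, check_answer):
    """Turn unsolved problems into training data by conditioning on the answer."""
    rationalized = []
    for problem, gold_answer in failed_problems:
        # The hint includes the correct answer, so the model only has to
        # produce a plausible reasoning path toward it.
        reasoning, answer = generate_with_hint(model, problem, hint=gold_answer)
        if check_answer(answer, gold_answer):
            # Store it without the hint, as if the model had gotten it right.
            rationalized.append((problem, reasoning, answer))
    return rationalized
```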
STaR has become a broadly used part of the reasoning paradigm since you published.
Can you also describe, I think this was sort of your last published work, Quiet-STaR?
So Quiet-STaR was the last thing that I did back at Stanford.
And it was really fun.
I guess we showed a few things that were kind of cool.
One of the main goals of that paper was to show that you could actually scale this up to pre-training scale
by using basically pre-training-style data.
Now there are a bunch of works that have come out recently around RL pre-training and things
like that. And that's, in some ways, similar to some of what we showed in
the Quiet-STaR work. Instead of having question-answer pairs, if you just have
arbitrary chunks of text, for example, and the model tries to predict what's going to
come next, which is the standard language modeling objective, can you actually get models
that more generally learn to reason? One of the cooler things that I think is kind of overlooked
about the original Quiet-STaR paper is that we showed a bunch of key improvements over
the original STaR recipe that were necessary to actually do this kind of thing. That was, for example,
showing that it's really valuable for this algorithm to be online, and showing that it's really
valuable to have a baseline, where for harder problems you learn
more and for easier problems you don't learn quite as much. And I think
there were a bunch of nuggets in there that even at the time I don't think I fully appreciated as, oh wow, that's actually a cool improvement over the original thing.
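A loose sketch of that idea, to make the "online, with a baseline" point concrete. This is an interpretation rather than the paper's implementation; `sample_thought`, `logprob_continuation`, and `reinforce_update` are hypothetical placeholders.

```python
# Rough sketch of a Quiet-STaR-style step on pre-training text: sample a
# "thought" before predicting the continuation, and use how much the thought
# improves next-token prediction, relative to a no-thought baseline, as the
# learning signal. Helper functions here are hypothetical placeholders.

def quiet_star_step(model, text_prefix, continuation,
                    sample_thought, logprob_continuation, reinforce_update):
    # Baseline: how well does the model predict the continuation with no thought?
    baseline = logprob_continuation(model, text_prefix, continuation)

    # Sample an internal rationale and score the same continuation given it.
    thought = sample_thought(model, text_prefix)
    with_thought = logprob_continuation(model, text_prefix, continuation,
                                        thought=thought)

    # Advantage: positive when thinking helped. Harder spans (low baseline)
    # leave more room for improvement, so they tend to drive larger updates.
    advantage = with_thought - baseline

    # Online update: reinforce the thought in proportion to the advantage,
    # alongside the standard next-token prediction loss on the continuation.
    reinforce_update(model, text_prefix, thought, continuation, advantage)
    return advantage
```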
So you ended up going to Grok, sorry, xAI, for several years.
And you worked on a bunch of different paradigms:
pre-training data for Grok 2,
the overall reasoning recipe for Grok 3,
and, I'm sure I'm missing things, tool use and agentic infrastructure for Grok 4.
If you level-set us today, how smart are models?
They can obviously do n-digit arithmetic at this point.
In terms of IQ stuff, if you're able to pose the problem very well, like some very advanced physics problem or math problem, I would say they're
reasonably smart. I think a lot of the failures that people see...
Give me a human comparison. What is reasonably smart?
I think it's hard to compare directly because it's very jagged.
Yeah. It's true that some of these, for example, some of the HLE questions that these models are able to solve, are genuinely things that are non-trivial for actual PhD researchers.
I'm not saying they're open problems or anything, but they are pretty non-trivial.
Yeah.
Also, one interesting category of these, and I spend a lot
of time looking at the HLE questions.
One interesting category of them.
Sorry, Humanity's Last Exam, for anybody who isn't looking at these evals.
No, great.
Yeah, so, looking at these Humanity's Last Exam questions, one category
that is actually quite big are these trick questions where,
basically, if you're familiar with the area, you'll recognize: oh, they're trying to get you
to assume something, but actually, if you think more carefully about this problem,
that assumption doesn't hold. And a bunch of the questions turn out to be those kinds of
problems. So I think they're pretty smart, but they're also more easily tripped up
by some of these tricky things. But also, I think one of the core things is
that they're not smart emotionally, or they're not smart on the level of actually
understanding what people care about, or how to actually help people
accomplish the things that they care about.
I want to talk about this and your next mission,
but just on this topic: given the jagged intelligence even within the IQ domain, which I think
almost everybody in the industry has been focused on until now, what would you recommend for
people who are not researchers to develop some sort of intuition for that surface? Because that
seems very important to making them useful.
Yeah. One thing that I think is really important to keep in mind is that the more context you can give the current generation of models, the better off you are. Their answers are super sensitive to whatever additional information you can give them.
I would generally say that existing models are particularly good at handling questions
that are easy to answer in a closed form.
If there's a simple numerical answer to what you're asking, or a
simple way of choosing from a set of things, that makes it easier for the model.
It's obviously all dependent, but if you can imagine it being easy to check your answer, that actually makes it easier for
the models.
What do you think is the most dominant explanation for attempts to use models
in far more verifiable domains, like code, still failing at sophisticated tasks?
Is it just that the wrong context has been fed to them?
Is it that the context window is simply not large enough to support the scratch pad and continual testing?
Why in those domains, and what is the biggest challenge?
Part of it is, I think, a balance.
When people want to give users these models, it's actually important that they're not annoyingly slow.
And so I think there are actually a number of problems where, if you gave the models more time,
they would actually be able to answer better. But, for example, in the coding context,
you kind of have to be reasonably responsive. At least, it depends on the kind of
setup, right? If you look at products like OpenAI's Codex, which is
kind of this longer-running background thing, versus something like Cursor, which is more
interactive, you have a bit more luxury with those background approaches to tackle
harder problems, I'd say. Yeah, I think it's a tricky question. A lot of it depends on how far
the distribution of what you're asking is from the distribution that the models were actually
trained with. If you happen to be asking a problem that's very similar to the kinds of
problems it's seen before, then it'll do great. And if you're asking a problem
that's very much out of domain, it won't. So to some extent this question is hard
to answer concretely unless you know basically what the RL data
for a lot of these specific tasks is.
Right. And today, obviously, none of the model or coding-agent companies are going to release a capability map for you of what their
RL data looks like, which would be very useful. Because, intuitively, if you just
look outside of the pre-training internet data sets, there are types of problems and
types of code bases that are much further out of distribution. And so when engineers try in those
scenarios, they obviously get a dumb agent back, right?
Yeah, and another thing that matters a lot is just how verifiable the things you're
trying to get the model to do are. Obviously, there's been a ton of work out there
on making models less dependent on verifiable rewards. Lots of cool published papers.
But I believe most people would say that there's still a gap between how well these models
perform on verifiable tasks versus non-verifiable tasks.
Yeah, absolutely.
Last real question on IQ,
but because it is where 90-plus percent of industry energy,
literally energy and compute, is focused,
how would you characterize where we are in scaling
and the obvious opportunity to improve from here?
There are still meaningful dimensions of scaling
that haven't been, I think, fully explored
in terms of IQ.
I think there are a lot of cool efforts out there.
There's a lot of cool stuff
that can still be done on the capability axis.
I do think that as you start thinking about some of these new axes of
scaling, it's very natural to realize that there are ways to do them
that incorporate people, and there are ways to do them that leave people
out more and more. And being very mindful of: oh, hey, I'm designing this new algorithm, and it's
going to scale the IQ of this model by X amount. To effectively
keep people in the loop is actually a very active decision. And so,
I think in general, if you're thinking about these things, that's important.
Wouldn't it be fair to claim that the instinct of many labs is to try to get people out of the loop as much as possible, from a scaling perspective?
Because it's very messy, right?
If I want to recruit people to, for example, collect complex reasoning traces from them on tasks that are not in distribution for me yet,
that is not as simple to execute on for an organization as just more
rollouts, right? And so why is that important at all from a capabilities perspective?
Yeah. That's a good transition to, like, what are you doing?
Yeah. I'd say the main thing is just that, as you have these models that expand
in terms of the horizon that they're automating, the recent
or recent-ish IMO results are a good example of this, you have these models that go
on for hours of reasoning without any kind of human intervention.
And this has been an increasing measure of success, I would say, for these labs.
So, for example, there's the METR benchmark that everyone
likes to share whenever there's a new model. And it's like, oh, we went from being able to
have these models complete two-hour tasks autonomously, without human
intervention, to 2.5-hour tasks without human intervention. And obviously there are questions about
what those numbers actually mean and whether we should take them at face value.
But regardless, this is the metric that people are looking at more and more
to measure progress. But as we get these models that increasingly
remove people from the interaction, you end up with people having less say in the
things that get built. I think if you have a model that goes off
and does its own thing for eight hours and comes back to you with something that
is somewhat there, that's a weird regime where people probably feel
less real agency over the things that they're building. And I also
anticipate that people feel like they don't really understand the things that are being built.
That's already true.
I think it's already true.
20,000 lines of generated code looks good to me.
Yeah, it's just that you make these PRs and they're, like, 100,000 lines.
And I think in general this is kind of going to be part of the trend.
So do you think that it's important to have humans in the loop of producing the output or the reasoning
because the ceiling is higher with humans in the loop,
because it is more efficient because we can error-correct
when models are off path, or philosophically
because people want that, or some combination of all three?
Yeah, I think it's probably some combination.
I think another thing that I think about is that
the most natural thing to do as you automate away
the existing set of tasks is:
you look at the world GDP, you carve out the parts
that are easiest to replace with these models,
and those are the things that you target.
Like, oh, wow, coding is an X-billion-dollar market.
Let's automate all of that.
Or this other segment is an X-billion-dollar market.
Let's automate all of that.
But I actually think that if you empower people,
if you have models that really understand what people are trying to accomplish
and really support them in accomplishing those things,
you have the potential to actually grow that pie instead of just replacing all of those segments.
And I think in general, if the purpose of these models is to replace the person for this chunk of work,
you end up with a lot less real innovation on what's possible.
Yeah, I think if you actually have models that really understand what people's goals are and really empower them more,
you end up in a very different situation.
Because we're going to push those capabilities into areas that are out of distribution for them.
Okay.
Yeah.
I think...
Is that accurate?
I'm just...
Yeah, no, I'd say so.
I think when I say that I'd like to work on models that empower people instead of replacing them, people are like, oh, yeah, sure.
But I'd rather, you know, work on curing cancer or something.
Obviously, that's a really important goal, right?
Building models that are able to solve
humanity's most difficult and most fundamental problems is incredibly important.
But I also think, and I'm sure that many of the researchers in the field disagree,
and in the long run we'll see what plays out,
but I personally strongly believe that we're much more likely to solve a lot of these fundamental human problems
by building models that are really good at collaborating with large groups of people,
that are really good at understanding different people's goals, different people's ambitions,
different people's values, understanding different people's weaknesses, and how to coordinate
with these large groups of people to make everyone more effective.
And I think the vision of this AI that goes off on its own for 20 hours, does its own thing,
and comes back with the answer to life, the universe,
and everything, I think that this is less likely.
I guess we'll have to see, but I think it's less likely.
So that goes to: you are starting a new
company, humans&. I remember being actually quite fundamentally surprised, given all of your
work on IQ and reasoning and coding and scale, that you were interested in essentially EQ. And you also thought
of EQ, and tell me if this is a wrong characterization, as more than the emotional or interactive
capabilities of models, which today have really shown up only in things like Character.AI or companionship
tools; you thought of it also as enablement from a productivity perspective, right? So
tell me about where this thread came from. Yeah, I guess I've been thinking about this kind of stuff
for some time now.
Like even back in my PhD,
I think one of my, I guess,
less well-known works was actually about,
we showed that you can train language models
to simulate different kinds of students.
Right.
For tests.
Yeah, yeah.
And by simulating students,
you can actually design better tests for those students.
And that was like a really cool finding.
Like, hey, if you have models
that are really good at modeling people,
you can actually design systems that are better for people.
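A hedged sketch of how that kind of idea could work in practice (illustrative only, not the actual study design; `simulate_student_answer`, `check_answer`, and the student profiles are hypothetical):

```python
# Illustrative sketch: use language-model-simulated students to pick test
# questions that best distinguish different levels of understanding.
# `simulate_student_answer` is a hypothetical helper that prompts a model
# to answer as a given student profile would.

def select_questions(candidate_questions, student_profiles,
                     simulate_student_answer, check_answer, top_k=10):
    """Score each question by how informative simulated students make it,
    then keep the most discriminative ones."""
    scored = []
    for question, gold_answer in candidate_questions:
        correct = [
            check_answer(simulate_student_answer(profile, question), gold_answer)
            for profile in student_profiles
        ]
        # A question that every simulated student gets right (or wrong)
        # tells us little; mid-range accuracy is most informative here.
        accuracy = sum(correct) / len(correct)
        scored.append((abs(accuracy - 0.5), question))
    scored.sort(key=lambda pair: pair[0])  # closest to 0.5 first
    return [question for _, question in scored[:top_k]]
```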
And this was something that I found really cool. And as we moved toward the current capabilities frontier, it became even more obvious that we have these incredibly smart models that are capable of so much, but they're not used for anywhere near what they're capable of.
The role that they play in people's lives is a lot less deep, a lot less positive than it could be.
And I spent a lot of time thinking about, okay, why is that?
Why are these models not, like I said, more deeply and positively integrated
into people's lives?
And it seemed like a really big part of it is that, fundamentally, these models don't
really understand people.
They don't understand people's goals.
Part of it, I would say, is the general training paradigm that
the field is in.
It's very, I would say, single-task focused, or task-centric.
It's ludicrous that all the benchmarks
are still oriented this way.
Yeah.
Yeah.
I mean, most of them.
Even the ones that are, well, there are very few benchmarks
out there that actually try to consider: oh, what if you actually have a person
that's interacting with this model?
At best you have some multi-turn benchmarks that
try to simulate how an environment would respond to different inputs.
But even that is still far from considering: hey, if you actually
have this model that interacts with a person for some amount of time,
how does it actually affect that person's life? It's really remarkable that the field is
so stuck in this task-centric regime. But it makes a lot of sense.
One thing that I was told by some folks at Google is that one of the reasons is
that it's actually very useful for credit assignment. Being able to have
these benchmarks that are very easy to quantify, and very easy to relate to some
immediate thing, means that you can say: oh yeah, this team
did 2% better than this team, so they deserve all of the resources. Or this
team improved the benchmark by 10% while this team improved it by 5%, so let's
allocate accordingly. And I think in general that's part of it. I think another part
is that it's more aligned with the easiest ways to train these models.
It's not easy to have these RL environments and everything.
You have lots of these companies popping up, obviously,
that are trying to sell, you know, environments to different people.
But...
The most popular are, of course, in coding and computer use.
Yeah.
Rather than anything that requires simulating people.
Yeah, it's not that surprising that we're kind of in this current regime.
So what do models need to know about people? Or like what capabilities are they either missing or have not been elicited from them?
The most fundamental thing is that the models don't understand the long-term implications of the things that they do and say.
When you treat every turn of a conversation as its own game, you basically think of it as: okay, you had this interaction, you're done, you need to make sure that this one response has all of the possible answers,
all the possible content. You don't ever ask questions. You don't ever try to
clarify things. You don't really tend to express uncertainty. You don't tend to be proactive.
You don't tend to think about the long term. You see a lot of even single-turn side
effects of this regime, and most of them are treated as their own problems
to solve. You see the issues people highlight around sycophancy. You see the issues,
you know, there was recent news around the psychosis stuff. There are a lot of
these harmful effects that you get if you think about things in this very single-task,
or task-centric, way. But consider a model that actually thinks through the long-term
implications: oh, hey, if I tell this person to start a company that
sells gloves for catching ice cream, if I tell them that that sounds like a good business
idea, they might actually go and build that business, and they might realize
that it was not actually a good business idea. So you want a model that can roll out the long-term
implications of the things that are said. And then they won't trust me anymore and then they won't pay
for my compute. Exactly. Exactly.
No, I'm kidding. I think that's really interesting. One of
the very core principles we have at Conviction for how we make
decisions is: well, what is the very long-term thing we want, right?
And whether that is for the customer, or the founder in this case, or an LP, or even for us,
it actually simplifies things quite a bit if you say we're optimizing for
a decade-plus versus this one interaction.
And so being single-turn versus multi-turn seems like a very different way to make decisions.
It seems very hard to collect data about multi-turn human interactions,
especially over long timescales. It's actually analogous to a problem in biology:
how do you study diseases that just take time to progress?
I think it's a really fundamental
question. I think there is actually some good academic work that has started to explore some
of this. There's some work recently around RL from human interaction.
There's a cool paper called CollabLLM that trains against
simulation. There's a lot of very cool work
starting to explore this in academia.
Because I would say for most labs,
and maybe this is a strong statement,
but I'd say for most labs, the human is kind of an intermediate
until you have this fully automated system.
And so spending a lot of time optimizing
for being really good at understanding people, really good at
interacting with them, and really good at collaborating with them
is treated as almost an intermediate thing you have to do until you get to that fully automated point.
Mm-hmm.
Can you paint a picture: if we have models that better understand human objectives over different timescales
and are good at interacting with humans,
how is that more integrated into your life five years from now?
Yeah.
I think you don't need to go that far out.
Two years.
But, yeah, I think you get a lot of behaviors that you currently don't really see in these models.
I think you have models that are much better at understanding how the things that you say and ask fit into the overall context of the stuff that you're doing.
For example, if the model knows that you're going to some wedding, and then you ask it about
booking hotels in Paris, it might
consider: oh, hey, around the time of this event, I know that this
user has all of these things that are true about them. A model that's generally
able to think about how everything you say fits into its understanding of
you would just be, I think, a very fundamentally different interaction.
Because right now, if you want to ask a question like that, you have to dump all
of this context in. You have to say: can you
find a hotel in Paris? This is because I'm going to a wedding. I have
these constraints. I have these people who need to be with me.
It needs to do this, it needs to be that. You basically
just dump all of the context that's relevant to yourself into the model.
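To make the contrast concrete, here is a hedged sketch of what that kind of interaction looks like with and without persistent memory. The memory store and retrieval call are hypothetical, not any particular product's API.

```python
# Hypothetical sketch: contrast dumping context manually with retrieving it
# from a persistent user-memory store before answering.

from typing import List

def build_prompt_without_memory(question: str, manual_context: List[str]) -> str:
    # Today: the user has to restate every relevant fact themselves.
    return "\n".join(manual_context + [question])

def build_prompt_with_memory(question: str, user_id: str, memory_store) -> str:
    # With long-term memory: retrieve what is already known about this user
    # that is relevant to the question (e.g. "attending a wedding in Paris",
    # "traveling with two kids", "prefers hotels near the venue").
    remembered_facts = memory_store.retrieve(user_id, query=question, k=5)
    return "\n".join(remembered_facts + [question])
```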
Which is also an expensive interaction. And something that most people won't do.
Imagine if you had a friend where you had to re-explain everything about yourself to them.
Every time you spoke, yeah.
Can you imagine if every time you interacted with someone, they remembered just your name and maybe what you do and the really high-level sketch of your life? That friendship probably would not last very long.
Yeah, I think that's kind of what the current models are.
So you'd argue that any investment in memory that today's models have is not that interesting or that
core to their capabilities today?
I would say that memory is definitely a feature that has been under-invested in by the field.
But I would say that it is difficult to invest in memory in this very task-centric regime.
Because if you have a bunch of these independent tasks, the amount of information that each of them
needs from other things that you've discussed is not all that high.
Because of the current paradigm, memory doesn't end up being super useful in training.
And so these models are not particularly good at it.
So one other thing I said to you, I think more out of a fear instinct than anything else,
but I feel like other people will have this reaction as well, is: I'm a unique snowflake.
You can't possibly simulate me and all of my self-consistency
issues, between: I want to learn this today, but I don't actually want to do the work.
I want to eat cake, but I want to be in shape as well.
You know, we operate on different timescales and change our minds.
I'm just constant distribution shift.
You can't possibly bring all of us in distribution.
How do you react to that?
I think to a certain extent, it's probably a little bit true.
It's not easy to build these really good models of people.
But I do think that the task for the model needs to be
that it should be trying to do that.
The model needs to actually be trying to learn all of this, trying to learn
about you, trying to learn about the things that you care about.
The actual objective of the model needs to be to understand you.
And it probably won't be perfect.
But, boy, you can be a lot better than the current models.
Like that seems totally reasonable actually.
Yeah.
I think it's something that,
as a field, we will probably get better at.
I'm not going to pretend that
I'm going to one-shot this problem. But I think even any serious effort gets you quite a long
way.
So there is this cult sci-fi series, the Culture, where you have
these superintelligent Minds. And essentially all of the human and human-like races live in a
society where the Minds make most of the decisions. And, I forget the total
humanoid population, but let's say there are 30 or 40 humans who are still relevant,
in terms of perhaps being out of distribution or providing reasoning that the Minds cannot. And
everybody else just lives in a world of abundance where they rock climb and hang out
or whatever, and they do not produce. How is your view of abundance different?
Everyone has things that they're passionate about, and given the opportunity, I think
people can do really cool things.
I think the role of the model should be to allow people
to do those really cool things that everyone wants to do,
and accomplish those things that everyone wants to accomplish.
And I think, you know,
we shouldn't outsource all of the thinking and
everything to these AI overlords or whatever.
I think what we really want are models that are able to empower us.
Amazing.
Okay, super unique mission, amazing research work, you're hiring an early team, getting a lot of compute.
Who are you looking for on the recruiting side?
One thing that I think was actually probably a good thing that my previous company did is thinking of everyone, to some extent, as engineers.
I'm looking for really strong infra folks who can build stuff.
I'm looking for really strong researchers who can build stuff.
I'm looking for really strong product folks who can build stuff.
On the research side, I'm looking for people who have thought a lot about users and who have thought a lot about memory. On the infra side, I'm looking for people who have thought a lot about distributed systems and really fast inference, people who have been there to scale really big projects up.
On the product side, people who are really creative about new modes of interaction, people who really deeply care about building beautiful,
tasteful products.
Awesome. Thanks so much, Eric.
Thank you so much.
Congrats on the new company.
Thank you so much.
Find us on Twitter at NoPriorsPod.
Subscribe to our YouTube channel if you want to see our faces.
Follow the show on Apple Podcasts, Spotify, or wherever you listen.
That way you get a new episode every week.
And sign up for emails or find transcripts for every episode at no-priors.com.
