The AI Daily Brief: Artificial Intelligence News and Analysis - How AI Is Shifting with Nathan Labenz

Starting point is 00:00:00 Today on the AI Breakdown, we are talking to Nathan Labenz of the Cognitive Revolution podcast. The AI Breakdown is a daily podcast and video about the most important news and discussions in AI. Go to Breakdown.network for more information about our YouTube, our newsletter, and our Discord. Hello, friends. Welcome back to the AI Breakdown. Today we have on a multi-diverse threat in the AI space. He's a podcaster. He was a red teamer for GPT4. He works with a company called Waymark. He advises other AI companies. He really is all over the space and can move pretty effortlessly from big picture macro-type questions down to really interesting technical things.

Starting point is 00:00:47 Hello, friends, quick note before we get to the rest of the episode. You have probably heard me talk about the AI education beta over the past few months. We've had a ton of you participate, which has been amazing. And now we're almost ready to announce something big and something new. If you want to be one of the first to hear about our new approach to learning AI that is hyper-practical, hands-on, immediately relevant, continuously upgrading, and anchored by community, go to besuper.ai and sign up to be notified when the project goes live. We're getting there in just a few weeks, and I want all of you along for the journey.

Starting point is 00:01:21 Once again, that's besuper.ai. In this conversation, we kind of cover the full breadth of those issues with an eye to understanding what the biggest trends in AI are right now, and really what everyone needs to know about the space. It's a great conversation. I encourage you to go check out his Cognitive Revolution podcast. Without any further ado, let's get to it. All right, Nathan, welcome to the AI Breakdown, sir. How are you doing? Doing great. Excited to be here.

Starting point is 00:01:47 Yeah. So, you know, we were talking about this a little bit. I am gallivanting around Mexico right now, as the listener is hearing this, for my wife's 40th. And what I wanted to do is invite a couple people on who are in the space thinking about some of these same issues, have a bunch of different types of conversations. And the one that I thought would be fun to have with you is almost sort of like an AI state of the union sort of from a very broad level. Like you are, you know, you have a bunch of different intersections with this space. You think about it holistically in the context of your own podcast. You do stuff in it as well, you know, in a variety of different ways. So I thought you'd be a great partner for this conversation. So that's the idea. And I think where I want to start is just like, what is your perception of the state of the labs and their big tech partners right now? And, you know, I think notably we're recording this on Tuesday right after two of the three co-founders of Inflection have announced that they're going off to join Microsoft to be, you know, in Mustafa Suleyman's case, the CEO of Microsoft AI, which is a new thing. So what's your sense of kind of, you know, labs and their competition right now?

Starting point is 00:02:56 Well, I guess for starters, I would say there are three clear leaders in the space today. The obvious number one position I think still has to go to OpenAI, though Anthropic with Claude 3 has arguably taken the lead in terms of having the best model in public for people to use, and I've always been a big believer that Google DeepMind has the strongest bench, you know, and just the most thorough research agenda. So even though they are a bit behind in terms of the polish of the products, as we've seen with some recent episodes, I absolutely think that they have it where it counts. I think, you know, those are really, and then you could say, well, who else, you know, might be next.

Starting point is 00:03:41 Meta would be an obvious candidate if you were going to expand the circle. Mistral might be a good candidate. Inflection I would have recently said was a candidate, but now I'm not really sure what to make of that. And beyond that, you could start to think about companies in China and maybe even, you know, go look for somebody in India. But I really do think the top three are a cut above the rest. I think where they are right now is, you know, it's a little bit hard to tell to what degree they are racing with the technology versus trying to proceed with caution. Certainly they're all saying that they're proceeding with caution, but looking back just two years, and it was only early 2022 when the very first instruct model came out, and now

Starting point is 00:04:26 all three of those companies are basically at a GPT4 class and promising to continue. So I think my best guess is that we're still in the steep part of the S curve. It doesn't seem like it will be probably that long at this point until OpenAI makes their next move. And we're already at the point where the frontier models are closing in on expert performance on most routine tasks, even tasks that are very cognitively demanding, like medical diagnosis and, you know, things that people go to school a long time for. So there's not too much room left before the AIs will start to really compete with humans, I think, in a very meaningful way. And overall, I think things are about to get weird. What did you make of, if you caught this, Sam Altman's sort of discourse around GPT5 on Lex Fridman recently? There are a couple things that were sort of notable that he said. Like, one, he certainly played it off. He said that they have a lot of important things to release before it.

Starting point is 00:05:32 Two, he sort of like was a little bit like almost dismissive of sort of where they are now. Like he certainly seemed unbothered by like competition. I mean, what was your perception of that? I haven't seen the whole interview yet, but I have seen those clips that you're referring to. For context, I did have an interesting opportunity to evaluate Sam Altman's public comments for a six-month period between when I participated in the GPT-4 Red Team program and when it was released finally in March of 2023. So I had this window where I was among a very small group of people that knew what GPT-4 was

Starting point is 00:06:11 and what was coming, and had then the opportunity to watch his public comments with that knowledge and kind of assess, like, is he being honest? You know, how should I interpret him? Basically, what I concluded in that window is that he is pretty honest and that his statements are a pretty good guide to what is to come. He's obviously leaving a lot to the imagination, but I found him to be pretty literally saying what was going to happen, albeit again in a very high-level abstracted way. So I would assume that that's probably still his style. When he says things like GPT-5 is going to be a similar leap compared to four as four was to three,

Starting point is 00:06:54 you know, exactly what that looks like is a little bit hard to say, obviously, but I would expect genuinely a very big leap to be coming. And in terms of, you know, things that they are going to launch in the meantime, yeah, again, hard to say. I mean, I think GPTs have been a major disappointment really in my experience. So I do think there's probably something coming there to make the agent side of kind of the current level of intelligence more useful. It seems like they have this general sense that they want to be ahead of the curve privately, but kind of dole out the power in a strategic way so that people have a chance to sort of adopt and adjust. And I do think there is opportunity to make the agent paradigm work a lot better

Starting point is 00:07:45 than it does today without necessarily getting to, you know, that super next level of model that, you know, whether that's Q-Star or some sort of planning or, you know, something that's capable of doing new science. I think they're going to want to see more action in the world before they maybe turn up that intelligence dial to, you know, the next order of magnitude of scaling. So I'm speculating, obviously, with those comments, but I do think he's generally sitting on something and speaking about something pretty concrete, even when he's making these sort of, you know, rough, very loose guide to the future sorts of statements. So I would not at all bet against GPT5 being a huge leap from what we've seen today. GPT5 being a huge leap would actually be

Starting point is 00:08:29 one of the only things that explains how utterly unbothered they seem by all of the, like, bluster in the press and the competition and Claude 3 finally being like, you know, if they're sitting there with Sam feeling as dismissive as he was, you know, another set of comments from that interview is he basically said that GPT4 sucked. And if GPT5 is so good that it makes it feel like GPT4 sucks, it would potentially be sort of, you know, explanatory for how they're acting. I mean, certainly to the extent that Sora's innovation, you know, how much better Sora is than everything else that we had seen in sort of the video generation space, is reflective at all of their sort of behind the scenes aheadness, I think that,

Starting point is 00:09:11 you know, they could really just be waiting for whenever they decide is appropriate to sort of reclaim the state of the art title. Yeah, I mean, they have had a year and a half since GPT4 training was complete to figure out what to do next. They said for a while that they weren't training GPT5 yet, but were working on the ideas that they would need for it. Certainly to scale up another order of magnitude beyond GPT4 would take, you know, a lot of compute and a lot of data. And so no doubt there's been a lot of preparatory work that has gone into that. But yeah, it does seem like they are probably using tools internally that significantly exceed what we have on the outside.

Starting point is 00:09:55 You know, even just the base model, you know, from GPT4, I think is a notable improvement on what is in ChatGPT today, for multiple reasons. You know, the RLHF process kind of hurts it a bit on performance in a number of ways. The training to like give short answers, you know, this sort of laziness thing. I'm quite confident that they are not using a lazy version internally at OpenAI. So I would, you know, strongly suspect that they do have a, even if it's not GPT5, that they do have a notably better set of tools for themselves than they make available to the public. I want to come to this, to the agent thing. It's just sort of another part of the conversation. You mentioned that GPTs have been sort of disappointing in their context as like baby proto

Starting point is 00:10:44 agents. And I think that that's true. So for me, I haven't been disappointed with GPTs only because it seemed very apparent to me right away that they were just advanced custom instructions. You know, like it was, it's, they're a great way to not have to enter the same sort of prompts that you use over and over and over again, right? So the way that I'd use GPTs is like things like creating images for presentations that are standardized and things like that, right? So it's really just a different way of prompting it.

Starting point is 00:11:10 But I think that you are right that they were sort of presented as step zero, let's call it, towards an agent future. And in that they've, you know, people haven't been that stoked on them. How much do you think agents and sort of this focus on agents is like the next big bet in terms of reigniting consumer imaginations for all these labs? I think it could be, it's a pretty good candidate. You know, I guess I kind of think of our modes of interaction with AI as being today, on the one hand, you've got your productivity assistant chatbot. And the interaction there is real time, I'm doing something I need help. You know, help me draft this. Give me some feedback on this.

Starting point is 00:11:50 Help me write some code. I've got a bug in my code. Whatever. But in that scenario, you are doing your thing, and you sort of have the AI alongside available to help and you occasionally loop it in to help you. That's probably the most common mode of interaction today. And then on the other end is what I sometimes call delegation mode. And that is where you're really setting up a workflow with the goal of achieving high enough performance that you don't have to supervise every single task or every bit of output that the AI gives you back.

Starting point is 00:12:24 And that also can work, you know, if you really dial it in, certainly at, you know, my company Waymark, like we're not, you know, it's real time experience and people often get very good stuff. And I've built a ton of these workflows. You can build them in Zapier. You can, you know, custom code them. You can turn it into a mature app. But, you know, you're dialing in sort of expected inputs. What are they going to be? What are the steps? You validate it pretty extensively while you're setting it up. But if you're successful, you can get to the point where you can trust it to at least a decent degree because you have kind of controlled the environment and, you know, dialed in the performance through prompt engineering, validation, evals, whatever. What's missing in between is the ability to delegate on the fly. It's like you don't really have the ability to say, I want you to go kind of do this for me and have any confidence that it's actually going to happen in a way that you would be pleased with. You can talk about it in real time or you can go set up a really structured, scaffolded system, but you don't really have anything in between.

Starting point is 00:13:23 And I think that that is really what people want. You know, when you think about, and I'm an AI advisor to an executive assistant business called Athena, and really what people want is very often just the ability to take something that's on their plate and put it on somebody else's plate. You know, and it's sort of, we have a voice memo app, and the idea is like you open the app. The mic is immediately recording. You say your thing and then, you know, it goes away. And the app actually self-closes. And this whole paradigm is around like, we want to make it so kind of quick and easy for you to get things off your plate.

Starting point is 00:13:56 You can go back to doing what you really want to do. I think if AIs can start to accept that kind of delegation, it will be a major moment because it takes you out of that sort of, I have to be real time with this. And it's kind of, you know, and that could be super useful. But it doesn't require you to go so far as to set up all this scaffolding and, you know, dial in the performance and so on and so forth. So I think for a lot of people that would just be super natural. I'm here. I'm doing this right now. Take this off my plate. Book this. Research that. Give me a report back on this. Integrate, you know, this API call into wherever. If you could actually just send those ad hoc tasks and get them done, I think it would be a really big deal. And the main

Starting point is 00:14:38 reason that hasn't happened yet is just that they just don't work well enough. The core models that we have just haven't been trained on probably enough of that sort of use case to really get there. So going back to, you know, what is, what is Sam going to launch? What is the OpenAI team going to launch before a GPT5? If you think of GPT5 as like a 10x or maybe even 100x compute scale up relative to GPT4, they could probably do a lot on the behavioral margin with just training on these sorts of tasks. So they don't need to be orders of magnitude bigger to really dial in like this is how you, you know, reliably make an API call or this is how you like reliably choose which button to click on a

Starting point is 00:15:19 website. I think the models are smart enough to do that, but they just need to be dialed in relative to where they are today. So that's why I sort of imagine something like that being a potential intermediate step. Yeah, it's interesting. I think that one of the challenges for the agents to come to fruition is that on the one hand, you have to sort of nudge up their capacity and get specific about how you train them. But that also inherently involves guessing at what people are going to find them most useful for, which is really, really difficult. I personally think that a lot of the things that people are building towards for agents as sort of like early use cases, like, I don't know, faster DoorDash ordering are just not going to be that useful in practice,

Starting point is 00:15:58 or at least not so much more useful than the ways that we do it now that it is going to cause sort of sufficient behavior change. And so it's just, it's very difficult because you sort of like, what would be ideal is a generalist agent system. But to get there, you kind of have to have specific agents. And to get to specific agents, you got to guess at what those specific agent use cases are going to be. So it's just sort of like, it's a little bit chicken or egg. Certainly I would say that from the evidence of just YouTube views and download numbers, agents are very, very top of the heap in terms of people's anticipation and excitement and sort of what they imagine AI turning into in the future.

Starting point is 00:16:30 I find them fascinating, you know, just, and I find them kind of entertaining to watch, even. I like just experimenting to see if I can get MultiOn to do stuff for me. And, you know, more often it doesn't. But it is getting close, you know, and I'm starting to see the, you know, there's still a little fog between here and there, but starting to see how you could spin up really all sorts of workflows. You know, I always ask people, this is still more for delegation mode today where you are going to get structured about it, but like, is there a

Starting point is 00:17:02 task in your business that either you can't keep up with or that, you know, you would love to scale 10 or 100x beyond what you can do today, but you just, you know, always kind of have assumed that would not be possible. The most common things that people come back with are like lead generation, you know, if I could make really high quality personalized outreach to the right targets, that would be extremely valuable. And similarly for recruiting, you know, if I could identify the right profiles and send them a really good message, every startup founder, you know, they would love to do more of those kinds of things. But they're just time limited. And it's also kind of hard to delegate. And it's also kind of hard to set up. Like,

Starting point is 00:17:46 You got to first go get the bulk stuff from LinkedIn and then you put it into a spreadsheet and then you got to go through the Zapier or whatever. But the dream future is like go find 50 people on LinkedIn for this job description, draft each one a personalized note, come back to me when that's ready and I'll review it. And eventually maybe even send it directly. But for now I would definitely advise a human review before you send that stuff. But we're not too far from it being able, you know, from these systems being able to take on a task like that. And that number could be 50. It could be 100. It could be

Starting point is 00:18:21 a thousand. You could delegate. You know, it often takes somebody quite competent to do that. And the executive assistants that we work with, they can do it. But, you know, a lot of times they lack context and their judgment isn't necessarily awesome as to exactly which profiles are the best. And, you know, their writing isn't necessarily, you know, what the CEO would want to send, certainly in their own name. So I do think there is an opportunity for AIs to be on some of these tasks just a better option, you know, than what people have at their disposal. And so, again, if they can just kind of fire it off with a two-line note and say, hey, go do this and come back to me when it's done, that will be a real kind of phase change for how a lot of people work.

Starting point is 00:19:02 Yep, absolutely. I want to ask you sort of a little bit more sort of technical speculation around where emphasis is going to be this year. So swyx and the folks at Latent Space did this sort of like four AI wars defining the AI space right now. And one of them was sort of like generalized models versus specialized models, right? Like how much is sort of the West going to be won by, you know, Gemini Ultra or GPT5 or Claude 4 or whatever versus lots of different functions having highly specialized models, right? Like, you know, is it going to be DALL-E 3 inside GPT or Midjourney that wins? So I'm interested in your kind of take on that. And maybe as a subsection of that, just where you think we're going to see, you know, sort of emphasis from an experimentation

Starting point is 00:19:45 and innovation standpoint this year, you know, are we going to just keep trying to plumb kind of ever larger data sets and, you know, make our large language models larger and larger and larger? Or are we going to see more and more emphasis on, you know, sort of smaller models that can do more with less, you know, or it's both. But I'm interested in kind of your take on some of those sort of, you know, technical prioritization questions that, you know, individual companies are facing right now. One of my mantras is everything, everywhere all at once. So I do think, I think it's definitely both. There's not a, you know, I'm not one for extreme positions on questions like this because I definitely see a lot of value in both. Right now, for me, Claude 3 has recently

Starting point is 00:20:27 really crossed an important threshold where it can write as me in a way that is compelling to me. And I had previously seen that a little bit with Gemini 1.5, which I was able to talk my way into the private preview on, but no other earlier model was really able to do that, pretty much at all. And I had even tried fine-tuning some advanced models and, you know, just couldn't get anything to really put a draft together that I felt was better than a blank page. You know, you could, with Claude 2, I would do it and I would sometimes convince myself it was helpful, but really I ended up rewriting everything. So with Claude 3, though, it is really a notable difference.

Starting point is 00:21:07 And one of the things that is notable about that is I'm dumping in huge amounts of context. So I think, you know, the sort of RAG versus long context window debate long term, you know, probably, again, it's both because there's a lot more context out there than, you know, even the long context that I'm dumping into Claude 3 right now. But I recently compiled all the intro essays that I've done for my podcast. I do one for every episode, you know, a three to five minute little opening monologue, and now with a bunch of those dumped into Claude 3 and then the transcript of the new episode, I'm able to get a pretty good first draft. And the line of like, what did I write and what did Claude 3 write is actually starting to blur. The same thing with putting stuff on Twitter.

Starting point is 00:21:52 I recently exported all my Twitter data for the express purpose of being able to put all of my tweets into Claude 3 so it could help me write in the style that I write in Twitter. And it really is very compelling. I'm still editing, but I feel like, you know, it's, I'm entering into this sort of cyborg author mode and that seems to only be possible with the biggest models. You know, so there are these kind of qualitatively different things that open up with further scaling. And I don't think we've seen the end of that.

Starting point is 00:22:23 And I certainly don't see anybody making that miniature, you know, in the immediate term. Like, that's probably going to be the territory for big models for a long time. Maybe not a long time, a long time in AI time, which might not be that long in other frames of time. At the same time, though, Claude Haiku is also awesome. So, for example, we have had this task forever around identifying which of a user's images are the right ones to feature in the video that they're making. And we've approached it in many different ways, many different models, blah, blah, blah, blah, blah, blah. GPT-4V basically solved that problem for us with the one caveat that it's kind of slow and a bit expensive, although honestly, it's still pretty cheap,

Starting point is 00:23:09 but it can add up to like, you know, a dollar or even a couple dollars for a user that has a lot of images, so that's not nothing. Haiku can do the same task and takes that price down like an order of magnitude, 90% plus savings. So that's exciting, and it's faster. I would say Haiku is definitely going to be a major unlock because it's really good and extremely cheap at just a quarter per million tokens. It's the kind of thing where

Starting point is 00:23:36 one of the things that agents have really struggled with over time is like what's relevant to this situation. If you're looking through my email, I've got an unbelievable amount of email. What is actually relevant? You know, you can do a keyword search and you pull stuff up, but then you've kind of got to

Starting point is 00:23:50 scan down the page and figure out what's relevant, and that's either too expensive, too slow, or just like not effective with most of the models. But with Haiku I can really start to see how you could just plow your way through a lot of that stuff, you know, and would I pay a nickel for an actually really good search, you know, underlying search process that would find all the context to maybe then put into a Claude 3 to

Starting point is 00:24:16 then write a draft as me for another 10 cents? Like, yeah, you know, I mean, that stuff is extremely, extremely valuable if it actually works. So I think orchestration is probably where a lot of stuff goes, and this kind of gets back to agents, you know, as well. This is essentially shaping up to be something like an email agent, right? Go find all this stuff, run all the way through it, figure out what's relevant, then come back, then draft. I mean, that's sort of the agent cycle. And breaking down the key parts of that into what can only be accomplished with the frontier model, aka write as me, and what can be accomplished with the cheapest model now that there is one that is super fast, super cheap, and long context. I think that's where a lot of

Starting point is 00:25:00 the sort of application development tinkering is going to happen that will ultimately make these things really useful for people. Yeah. Super interesting. I tend to be on the same sort of, it's going to be everything, you know, kind of tip.

Starting point is 00:25:15 I think there's just so many reasons for there to be innovation and emphasis at the smaller scale, but to your point, that's not going to change the desire to have sort of state of the art continue to improve. So it just feels like we're inevitably in kind of both and territory. I want to talk a little bit about sort of broader societal level stuff. I'm interested in your perception of sort of like where we are in terms of public opinion,

Starting point is 00:25:41 particularly around the safety discourse and maybe how it relates to sort of some other parts of the conversation like, you know, open source and things like that. But, you know, for me, I think Sora was a really interesting moment and sort of inflection point in that conversation a little bit. But I'd love to hear, you know, kind of, you know, what your perception of how people are thinking and feeling about AI is right now. I think there's a massive capabilities overhang, for one thing. You know, most people could be using it a lot more than they are. And I sort of am confused a lot of times by why people are not more eager to adopt, just given the incredible daily value that I get from it all the time.

Starting point is 00:26:17 But it does seem like there's a need for education, introduction, you know, better form factors, and just new habits. You know, one person I know that's in the AI kind of training and education space said the most common cause of failure is the failure to form new habits. It's usually not that the AI can't do what they want help with, but rather just that they maybe struggle a little bit and give up and never come back or, you know, do it once and, again, failure to form a habit. So I think there is a massive capability overhang. If development stopped now, then we would have a long road in front of us to go, you know, plumb GPT-4 and Claude 3 into kind of every context. And I think people are mostly kind of still sleeping on

Starting point is 00:27:04 just how transformative the current technology can be once it's really properly implemented. You know, I just did an episode with Katja Grace, who is the founder of AI Impacts, and they're the ones that did the survey of the 2,700 machine learning researchers, all of whom had published in one of the top six conferences in just like the last two years. So this is very sort of current PhDs, you know, current active publishing professors, people out of the big labs, whatever, but you had to publish in these leading conferences to be eligible for the survey. And the results of that survey, I think, are pretty striking.

Starting point is 00:27:42 There is not a consensus view in the field as to what is going to happen. Something like, you know, the big middle of the respondents, as individuals, expressed very high uncertainty. They give people like, you know, five buckets, one through five. You can weight what is the probability that it's going to be like very bad all the way up to very good. And like two thirds of the people have like a very sort of even distribution where they're expressing like I really don't know what to expect. And then you have kind of, you know, maybe a sixth on each side that are confident that it's going to be a very good or a very bad future for us. So I would take from that, you know, that the field itself does not have a consensus and basically has radical uncertainty about where we're headed,

Starting point is 00:28:21 and then interpret the public's sort of skepticism in light of that as essentially a fairly rational concern that, you know, it's like, wait a second, okay, so you're telling me the people who are building this expect that it's going to be more powerful than humans, expect that it, you know, the stated goal of the leading developer right now is to make something that is able to do all tasks better than humans. They themselves don't have any agreement that it's going to be a good or bad thing. As many people think it's going to be terrible as think it's going to be good, and most of them in the middle are just wildly uncertain. And I'm supposed to be okay with this as somebody who, you know, doesn't even understand how it works and has no say in how it's going to be developed, on what time scale it's going to be developed, how it's going to be deployed. So I'm actually pretty sympathetic to the, you know, even if it is a relatively ignorant perspective and obviously in many cases it is, I am pretty sympathetic to the just sort of general vibe that like, you know, the Snoop Dogg clip is like the very best of this, right? He's like, what is going on? You're telling me this thing can do this. And these things got their own minds and they don't understand how they work.

Starting point is 00:29:27 So I think that is ultimately kind of a pretty reasonable outlook. And unfortunately, I don't think that more education about the actual state of the technology is really going to reassure people all that much. We don't understand really how they work. Notably from that survey, too, there is not a mechanistic interpretability breakthrough expected. That was one of the few questions where people like broadly agreed that they don't expect a sudden advance in our ability to really understand the internal workings of these systems. So yeah, I don't know. With all that, I kind of think I'm not one who really worries about jobs.

Starting point is 00:30:07 You know, I do think we should be getting to work on a new social contract and have that ready, you know, as potentially economically transformative AI really hits, and potentially very quickly, but I'm pretty confident, you know, that I can continue to find meaning in life without necessarily having to work for money. And I think most people, you know, probably can. So I'm not, like, concerned about lack of employment, meaning, you know, everything's going to go bad. I'm not so sympathetic to that concern. But I am sympathetic to the big picture idea that, yikes, like, this seems like it's going really fast. It seems like it's getting really powerful. And you're telling us that you really don't know how it works or what the outcome is going to be.

Starting point is 00:30:47 And we also really haven't even heard a vision of that. I mean, that's sometimes I say the thing that's in shortest supply right now is a positive vision for the future. It's like, what is daily life supposed to look like? You know, we hear these sort of, oh, we'll cure all the diseases. Well, that would definitely be nice. And I certainly hope that happens. But like, what am I supposed to envision, you know, for the future? Like, we don't hear much of that at all from really anyone, including the leaders of the frontier developers.

Starting point is 00:31:13 So I think in that gap, it does make a lot of sense to me that people are broadly pretty skeptical. Yeah, I completely agree with the vision deficit. You don't have any of the labs, literally any of them, articulating the beneficent vision. You have them sort of clarifying constantly about all the things that could go wrong and why they maybe think it's going to be a little bit better than that. But in the absence of the vision, all you have is a sort of extremely attractive-to-media set of headlines about what's terrifying and what's scary.

Starting point is 00:31:43 I think it's one of the things that I'm watching right now is the, you know, bifurcation we're starting to see between developed countries and developing countries, where the developing world is so much more enthusiastic about this stuff as a way to sort of catch up and leapfrog than people are in Western countries. And I think that that's very, very telling, although that's the subject of an entire, you know, series of shows, much less one. I guess as we wrap, I would love to hear kind of just like, if you could sort of, you know, have a wish for how the next few months play out or, you know, what you'd like to see. It could be a technical thing, it could be something around sort of, you know, a new innovation.

Starting point is 00:32:19 Like what's on your sort of wish list for, you know, call it the late spring, early summer in AI 2024? I guess in the big picture, I sometimes describe myself as an adoption accelerationist and hyper-scaling pauser, which is to say, I think that the tools that we have are extremely useful. And almost everybody stands to get a ton of benefit from learning how to use them, using them well, deploying some of these workflows. I do think there is substantially more value to be unlocked with kind of another half turn on getting the agent workflows to actually succeed at,

Starting point is 00:32:56 you know, multi-step tasks. So I would definitely be excited to see that. I think we are starting to see that. But at the same time, I would be pretty pleased to see us kind of collectively say, wait, do we really need to 100x, you know, the compute that goes into the next model as quickly as we possibly can? I'm not so sure that we do. And I wonder if we, you know, it's clear to me that we're playing with fire. And I don't know that we have the collective wisdom to manage that, especially given how poorly we understand a lot of the inner workings of the things that we're developing. So that's sort of a weird position that puts me in kind of a lonely corner sometimes. But it's very easy for me to kind of channel both of those vibes at the same time.

Starting point is 00:33:47 I do, I feel like I do three times as much work as I used to. And I'm looking forward to making that five times as much. And Claude 3, you know, might actually be the thing that takes me from three to five. So it is super exciting. But at the same time, you know, it seems like the people who understand it best often have the healthiest fear of just how crazy things could be. So I would like to see us get a lot more serious about figuring out how we make sure that we do set ourselves up for a positive long-term future. I think that there is radically more space than it seems from Twitter between the poles of,

Starting point is 00:34:24 you know, e/acc on the one end and EA on the other. And I think you just articulated a version of that that I think a lot of people could get on board with. Nathan, super awesome to talk to you about all this stuff. I could go for much longer, but really appreciate you taking some time today. Where can people find you if they want to hear from you more often? I'm on Twitter at Labenz, L-A-B-E-N-Z, and the podcast is The Cognitive Revolution, which is at cognitiverevolution.ai. Awesome, man. Appreciate it. Thank you. This has been fun.
