Dwarkesh Podcast - AI 2027: month-by-month model of intelligence explosion — Scott Alexander & Daniel Kokotajlo

Episode Date: April 3, 2025

Scott and Daniel break down every month from now until the 2027 intelligence explosion.

Scott Alexander is author of the highly influential blogs Slate Star Codex and Astral Codex Ten. Daniel Kokotajlo resigned from OpenAI in 2024, rejecting a non-disparagement clause and risking millions in equity to speak out about AI safety.

We discuss misaligned hive minds, Xi and Trump waking up, and automated Ilyas researching AI progress. I came in skeptical, but I learned a tremendous amount by bouncing my objections off of them. I highly recommend checking out their new scenario planning document, AI 2027.

Watch on YouTube; listen on Apple Podcasts or Spotify.

----------

Sponsors

* WorkOS helps today's top AI companies get enterprise-ready. OpenAI, Cursor, Perplexity, Anthropic and hundreds more use WorkOS to quickly integrate features required by enterprise buyers. To learn more about how you can make the leap to enterprise, visit workos.com
* Jane Street likes to know what's going on inside the neural nets they use. They just released a black-box challenge for Dwarkesh listeners, and I had a blast trying it out. See if you have the skills to crack it at janestreet.com/dwarkesh
* Scale's Data Foundry gives major AI labs access to high-quality data to fuel post-training, including advanced reasoning capabilities. If you're an AI researcher or engineer, learn about how Scale's Data Foundry and research lab, SEAL, can help you go beyond the current frontier at scale.com/dwarkesh

To sponsor a future episode, visit dwarkesh.com/advertise.

----------

Timestamps

(00:00:00) - AI 2027
(00:06:56) - Forecasting 2025 and 2026
(00:14:41) - Why LLMs aren't making discoveries
(00:24:33) - Debating intelligence explosion
(00:49:45) - Can superintelligence actually transform science?
(01:16:54) - Cultural evolution vs superintelligence
(01:24:05) - Mid-2027 branch point
(01:32:30) - Race with China
(01:44:47) - Nationalization vs private anarchy
(02:03:22) - Misalignment
(02:14:52) - UBI, AI advisors, & human future
(02:23:00) - Factory farming for digital minds
(02:26:52) - Daniel leaving OpenAI
(02:35:15) - Scott's blogging advice

Get full access to Dwarkesh Podcast at www.dwarkesh.com/subscribe

Transcript
Starting point is 00:00:00 Today, I have the great pleasure of chatting with Scott Alexander and Daniel Kokotajlo. Scott is, of course, the author of the blog Slate Star Codex, Astral Codex Ten now. It's actually been, as you know, a big bucket list item of mine to get you on the podcast. So this is also the first podcast you've ever done, right? Yes. And then Daniel is the director of the AI Futures Project. And you have both just launched today something called AI 2027. So what is this?
Starting point is 00:00:30 Yeah, AI 2027 is our scenario trying to forecast the next few years of AI progress. We're trying to do two things here. First of all is we just want to have a concrete scenario at all. So you have all these people, Sam Altman, Dario Amodei, Elon Musk saying we're going to have AGI in three years, superintelligence in five years. And people just think that's crazy because right now we have chatbots, able to do like a Google search, not much more than that in a lot of ways. And so people ask, how is it going to be AGI in three years?
Starting point is 00:01:03 What we wanted to do is provide a story, provide the transitional fossils. So start right now, go up to 2027 when there's AGI, 2028, when there's potentially superintelligence, show on a month-by-month level what happened. Kind of in fiction writing terms, make it feel earned. So that's the easy part. The hard part is we also want to be right. So we're trying to forecast how things are going to go, what speed they're going to go at. We know that in general the median outcome for a forecast like this is being totally humiliated
Starting point is 00:01:39 when everything goes completely differently. And if you read our scenario, you're definitely not going to expect us to be the exception to that trend. The thing that gives me optimism is Daniel back in 2021 wrote kind of the prequel to this scenario called What 2026 looks like. It's his forecast for the next five years of AI progress. And he got it almost exactly right. Like, you should stop this podcast right now.
Starting point is 00:02:05 You should go and read this document. It's amazing. Kind of looks like you asked ChatGPT, summarize the past five years of AI progress, and you got something with like a couple of hallucinations, but basically well-intentioned and correct. So when Daniel said he was doing this sequel, I was very excited.
Starting point is 00:02:23 really wanted to see where it was going. It goes to some pretty crazy places, and I'm excited to talk about it more today. I think you're hyping it up a little bit too much. Yes, I do recommend people go read the old thing I did, which was a blog post. I think it got a bunch of stuff right, a bunch of stuff wrong,
Starting point is 00:02:40 but overall held up pretty well and inspired me to try again and do a better version of it. I think read the document and decide which of us is right. Another related thing, too, is that the original thing was not supposed
Starting point is 00:02:53 to end in 2026. It was supposed to go all the way through the exciting stuff, right? Because everyone's talking about like, what about AGI? What about superintelligence? Like, what would that even look like? So I was trying to sort of like step by step work my way from where we were at the time until things happen and then see what they look like.
Starting point is 00:03:09 But I basically chickened out when I got to 2027 because things were starting to happen and the automation loop was starting to take off and it was just so confusing and there was so much uncertainty. So I basically just deleted
Starting point is 00:03:23 the last chapter and published what I had up until that point, and that was the blog post. Okay, and then, Scott, how did you get involved in this project? I was asked to help with the writing, and I was already somewhat familiar with the people on the project, and many of them were kind of my heroes. So Daniel, I knew both because I'd written a blog post about his opinions before. I knew about his what 2026 looks like, which was amazing. And also, he had pretty recently made the national news for having when he quit Open AI, they told him he had to sign a non-disparagement agreement where they would claw back his stock options, and he refused, which they weren't prepared
Starting point is 00:04:03 for. It started a major news story, a scandal that ended up with OpenAI agreeing that they were no longer going to subject employees to that restriction. So people talk a lot about how it's hard to trust anyone in AI because they all have so much money invested in the hype, in getting their stock options better. And Daniel had just attempted to sacrifice millions of dollars in order to say what he believed, which to me was this incredibly strong sign of honesty and competence. And I was like, how can I say no to this person? Everyone else on the team is also extremely impressive. Eli Lifland, who's a member of Samotsvety, the world's top forecasting team. He has won, like, the top forecasting competition, plausibly described as just the best
Starting point is 00:04:55 forecaster in the world, at least by these really technical measures that people use in the Super Forecasting Committee. Thomas Larson, Jonas Volmer, both really amazing people who have done great work in AI before. I was really excited to get to work with this superstar team. I have always wanted to get more involved in the actual attempt to make AI go well. Right now I just write about it. I think writing about it is important, but I don't know. You always regret that you're not the person who's the technical alignment genius, who's able to solve everything, and getting to work with people like these
Starting point is 00:05:33 and potentially make a difference just seemed like a great opportunity. What I didn't realize was that I also learned a huge amount. I try to read most of what's going on in the world of AI, but it's this very low bandwidth thing. And getting to talk to somebody who's thought about it as much as anyone in the world was just amazing. Makes me really understand these things about how is AI going to learn quickly?
Starting point is 00:06:02 You need all of this deep engagement with the underlying territory. And I feel like I got that. I probably changed my mind towards, against, towards, against intelligence explosion. like three, four times in the conversations I've had in the lead-up in talking to you and then like trying to come up with the rebuttal or something. Yeah, it wasn't even just changing my mind.
Starting point is 00:06:22 Getting to read the scenario for the first time, it obviously wasn't written up at this point. It was a giant, giant spreadsheet. I've been thinking about this for like a decade, decade and a half now. And it just made it so much more concrete to have a specific story. Like, oh, yeah, that's why we're so worried about the arms race with China. Obviously, we would get an arms race with China in that situation. And like, aside from just the people, getting to read the scenario really sold me.
Starting point is 00:06:51 This is something that needs to get out there more. Yeah, yeah. Okay, now let's talk about this new forecast. Let's start, because you do a month-by-month analysis of what's going to happen from here. So what is it that you expect in mid-2025 and the end of 2025 in this forecast? So the beginning of the forecast mostly focuses on agents. So we think they're going to start with agency training, expand the time horizons, get coding going well. Our theory is that they are, to some degree consciously, to some degree accidentally,
Starting point is 00:07:25 working towards this intelligence explosion, where the AIs themselves can start taking over some of the AI research move faster. So 2025, slightly better coding. 2026, slightly better agents, slightly better coding. And then we focus on, and we name the scenario after 2027, because that is when this starts to pay off, the intelligence explosion gets into full swing. The agents become good enough to help with, at the beginning not really do but help with some of the AI research.
Starting point is 00:07:58 So we introduced this idea called the R&D progress multiplier. So how many months of progress without the AIs do you get in one month of progress with all of these new AIs helping with the intelligence explosion? So, 2027, we start with, I can't remember if we literally start with it or it's by March or something, a five-times multiplier for algorithmic progress.
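A minimal sketch of the arithmetic behind that definition, in Python; the function name and the example numbers are illustrative only, not the AI 2027 team's actual model:

    def equivalent_progress_months(calendar_months: float, multiplier: float) -> float:
        # "How many months of progress without the AIs do you get in one month
        # with the AIs helping": calendar time scaled by the progress multiplier.
        return calendar_months * multiplier

    # With the roughly 5x multiplier mentioned for early 2027, one calendar month
    # of AI-assisted work corresponds to about five months of pre-AI algorithmic progress.
    print(equivalent_progress_months(1, 5))   # -> 5.0
    print(equivalent_progress_months(12, 5))  # -> 60.0, i.e. five "years" of progress in one year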
Starting point is 00:08:20 So we have like the stats tracked on the side of the story. Part of why we did it as a website is so that you can have these cool gadgets and widgets. And so as you read the story, the stats on the side automatically update. And so one of those stats is like the progress multiplier. Another answer to the same question you asked is that basically in 2025 and 2026, nothing super interesting happens. More or less similar trends to what we're seeing.
Starting point is 00:08:39 Computer use is totally solved, partially solved? How good is computer use by the end of 2025? My guess is that they won't be making basic mouse click errors by the end of 2025 like they sometimes currently do.
Starting point is 00:08:54 If you watch Claude Plays Pokémon, which you totally should, it seems like sometimes it's just, like, failing to parse what's on the screen, and it thinks that its own player character is an NPC and it's confused. My guess is that that sort of thing will mostly be gone by the end of this year, but that they still won't be able to, like, autonomously operate
Starting point is 00:09:15 for many, for long periods on their own, because... By 2025, when you say it won't be able to act coherently for long periods of time in computer use: if I want to organize a happy hour in my office, I don't know, that's like, what, a 30-minute task? What fraction of that, it's got to invite the right people, it's got to book the right DoorDash or something, what fraction of that is it able to do? My guess is that by the end of this year, there'll be something that can sort of kind of do that but unreliably, and that if you actually tried to use that to run your life, it would make some hilarious mistakes that would appear on Twitter and go viral, but that
Starting point is 00:09:50 like the MVP of it will probably exist by this year. Like there'll be some Twitter thread about someone being like, I plugged in this agent to run my party and it worked. Yeah. Our scenario focuses on coding in particular because we think coding is what starts the intelligence explosion. So we are less interested in questions of how do you mop up the last few things that are uniquely human, compared to when can you start coding in a way that helps the human AI researchers speed up their AI research? And then if you've helped them speed up the AI research
Starting point is 00:10:20 enough, is that enough to, with some ridiculous speed multiplier, 10 times, 100 times, mop up all of these other things? One observation I have is you could have told a story in 2021, once ChatGPT comes out. I think I had friends who are, you know, incredible AI thinkers who were like, look, you've got the coding agent now. It's been cracked. Now GPT-4 will go around and they'll do all this engineering and we'll do all this on top. We can totally scale up the system 100x. And every single layer of this has been much harder than the strongest optimists expected. It seems like there have been significant difficulties in increasing the pre-training size, at least from rumors about failed training runs or underwhelming training runs at labs. It seems like building up these RL... I'm, like, total outside view, I know nothing about the actual engineering involved here. But just from an outside view, it seems like building up the o1-like RL clearly took at least two years after GPT-4 was released.
Starting point is 00:11:23 And with these things, their economic impact, and the kinds of things you would immediately expect, based on benchmarks, for them to be especially capable at, isn't overwhelming; like, the call center workers haven't been fired yet. So why not just say that, like, look, at higher scale, it'll probably get even more difficult? Wait a second. I'm a little confused to hear you say that, because when I have seen people predicting AI milestones, like Katja Grace's expert surveys, they have almost always been too pessimistic from the point of view of how fast AI will advance. So, like, I think in the 2022 survey, they actually said the things that had already happened
Starting point is 00:12:07 would take like 10 years to happen. But then when the survey, it might have been in 2023, it was like six months before GPT-3, GPT-4 came out. And there were things that GPT-3 or 4, or whichever one of them, did within six months that they were still predicting were five or 10 years away. I'm sure Daniel is going to have a more detailed answer,
Starting point is 00:12:28 but I absolutely reject the premise that everybody has always been too optimistic. Yeah, I think in general, most people following the field have underestimated the pace of AI progress and underestimated the pace of AI diffusion into the world. For example, Robin Hanson famously made a bet about less than a billion dollars of revenue, I think by 2025. I agree, Robin Hanson in particular has been too pessimistic. But he's a smart guy, you know?
Starting point is 00:12:51 So I think that the aggregate opinion has been underestimating the pace of both technical progress and deployment. I agree that there have been plenty of people who have been more bullish than, you know, me and have already been proven wrong, but... Wait a second. We don't have to guess about aggregate opinion. We can look at Metaculus. Metaculus, I think their timeline was like 2050 back in 2020. It gradually went down to like 2040 two or three years ago. Now it's at 2030, so it's barely ahead of us. Again, that may turn out to be wrong, but it does look like the Metaculans overall
Starting point is 00:13:26 have been too pessimistic, thinking too long term, rather than too optimistic. I think that's like the closest thing we have to a neutral aggregator where we're not cherry-picking things. I had this interesting experience yesterday. I was having lunch with this senior AI researcher who probably makes on the order of, like, millions a month or something. And we were asking him, how much are the AIs helping you? And he said, in domains which I understand well, where it's closer to autocomplete, but more intense, there it's maybe saving me four to eight hours a week, but then he says in domains which I'm less familiar with,
Starting point is 00:14:03 if I need to go wrangle up some hardware library or make some modification to the kernel or whatever, where I just know less, that saves me on the order of 24 hours a week with, like, current models. What I found really surprising
Starting point is 00:14:19 is that the help is bigger where it's less like auto-complete and more like a novel contribution. It's like a more significant productivity improvement there. Yeah, that is interesting. I imagine what's going on there is that a lot of the process when you're unfamiliar with a domain is, like, Googling around and learning more about the domain, and language models are excellent because they've already read the whole internet and know all the details. Isn't this a good opportunity to discuss a certain question I asked Dario that you responded to? What are you thinking of?
Starting point is 00:14:47 Well, I asked this question where, as you say, they know all the stuff. I don't know if you saw this. I asked this question where I said, look, these models know all the stuff. And if a human knew every single thing a human has ever written down on the internet, they'd be able to make all these interesting connections between different ideas and maybe even find medical cures or scientific discoveries as a result. There was some guy who noticed that magnesium deficiency causes something in the brain that is similar to what happens when you get a migraine. And so he just said, give them magnesium supplements, and that cured a lot of migraines. So why aren't LLMs able to leverage this enormous asymmetric advantage they have to make a single new discovery like this?
Starting point is 00:15:28 Yeah, and then the example I gave was that humans also can't do this. So for me, the most salient example is etymology of words. We have all of these words in English that are very similar, like happy versus hapless, happen, perhaps, and we never think about them unless you read an etymology dictionary. And then, oh, obviously these all come from some old root that has to mean luck or occurrence or something like that. Yeah.
Starting point is 00:15:58 If I tell you those, you're like, this seems plausible. And of course, in etymology, there are also a lot of false friends where they seem plausible but aren't connected. But you really do have to have somebody shove it in your face before you start thinking about it and make all of those connections. I will actually disagree with this. We know that humans can do, like, we have examples of humans doing this. I agree that we don't have logical omissions because there is a commentatorial explosion. but we are able to leverage our intelligence to actually one of my favorite examples of this is David Anthony, the guy who wrote the horse, the wheel, and language.
Starting point is 00:16:31 He made this super impressive discovery before we had the genetic evidence for it, like a decade before, where he said, look, if I look at all these languages in India and Europe, they all share the same etymology. I mean, literally what you're talking about, the same etymology for words like wheel and cart and horse. And these are technologies that have only been around for the last 6,000 years, which must mean that there was some group that these groups are all at least linguistically descended from. And now we have genetic evidence for the Yamnaya, which we believe is this group. You have a blog where you do this.
Starting point is 00:17:08 This is your job, Scott. So why shouldn't we hold the fact that language models can't do this more against them? Yeah, so to me it doesn't seem like he is just kind of sitting there being logically omniscient in getting the answer. It seems like he's a genius. He's thought about this for years. Probably at some point, like, he heard a couple of Indian words and a couple of European words at the same time, and they kind of connected, and the light bulb came on. So this isn't about having all the information in your memory so much as the normal process of discovery, which is kind of mysterious, but seems to come from just kind of having good heuristics and throwing them at things
Starting point is 00:17:46 until you kind of get a lucky strike. My guess is, if we had really good AI agents and we applied them to this task, it would look something like a scaffold where it's like: think of every combination of words that you know of, compare them. If they sound very similar, write it on this scratch pad here. If a lot of words of the same type show up on this scratch pad, that's pretty strange; do some kind of thinking around it. And I just don't think we've even tried that. And I think right now if we tried it, we would run into the combinatorial explosion. We would need better heuristics. Humans have such good heuristics that probably most of the things that show up even in our conscious mind rather
Starting point is 00:18:24 than happening on the level of some kind of unconscious processing are at least the kind of things that could be true. Like I think you could think of this as like a chess engine. You have some unbelievable number of possible next moves. You have some heuristics for picking out which of those are going to be the right ones. And then gradually you kind of have the chess engine think about it, go through it, come up with a better or worse move, then at some point you potentially become better than humans. I think if you were to force the AI to do this in a reasonable way,
Starting point is 00:18:54 or you were to train the AI such that it itself could come up with the plan of going through this in some kind of heuristic-laden way, you could potentially equal humans. I'll add some more things to that. So I think there's a long and sordid history of people looking at some limitation of the current LLMs and then making grand claims about how the whole paradigm is doomed because they'll never overcome this limitation. And then, like, a year or two later, the new LLMs overcame. that limitation. And I would say that like with respect to this thing of like why haven't they made these interesting scientific discoveries by combining the knowledge they already have
Starting point is 00:19:29 and like noticing interesting connections, I would say, first of all, have we seriously tried to build scaffolding to make them do this? And I think the answer is mostly no. I think Google DeepMind tried this, right? Maybe. So maybe. Second thing, have you tried making the model bigger? They've made it a bit bigger over the last couple years and it hasn't worked so far. Maybe if they make it even bigger still, it'll know some more of these connections. And then the third thing, and
Starting point is 00:19:53 here's, I think, the special one: have you tried training the model to do the thing? You know, just because, like, the pre-training process doesn't strongly incentivize this type of connection-making, right?
Starting point is 00:20:06 In general, I think it's a helpful heuristic that I use to ask the question of, like, remind oneself, what was the AI trained to do? What was the training environment like? Right. And if you're wondering,
Starting point is 00:20:16 why hasn't the AI done this? Ask yourself, like, did the training environment train it to do this? And often the answer is no. And often I think that's a good explanation for why the AI is not good at it. Yeah. It wasn't trained to do it, you know. I mean, it seems like such an economically valuable thing. But how would you set up the training environment? Like, yeah. Wouldn't it be really gnarly to try to set up an RL environment to train it to make new scientific discoveries? Maybe that's why you should have longer timelines.
Starting point is 00:20:42 It's a gnarly engineering problem. Well, so, I mean, in our scenario, they don't just, like, leap from where we are now to solving this problem. Yeah. They don't. Instead, they just iteratively improve the coding agents until they've basically got coding solved, but even still their coding agents are not able to do some of this stuff. That's what early 2027, like the first
Starting point is 00:21:02 half of 2027 in our story is basically they've got these awesome automated coders, but they still lack research taste and they still lack maybe like organizational skills and stuff. And so they need to like overcome those remaining bottlenecks and gaps in order to completely automate the research cycle. But they're able to overcome those gaps
Starting point is 00:21:18 faster than they normally would, because the coding agents are doing all the grunt work really fast for them. I think it might be useful to think of our timelines as being like 2070 or 2100. It's just that the last 50 to 70 years of that all happen during the years 2027 to 2028, because we are going through this intelligence explosion. Like I think if I asked you, could we solve this problem by the year 2100? You would say, oh, yeah, by 2100? Absolutely. And we're just saying that the year 2100 might happen earlier than you expect
Starting point is 00:21:47 because we have this research progress multiplier. And then let me just address that in a second, but just one final thought on this thread. To the extent that there's like a modus ponens, modus tollens thing here, where one thing you could say is like, look, AIs, not just LLMs, but AIs will have this fundamental asymmetric advantage where they know all this shit. And why aren't they able to use their general intelligence to turn this asymmetric advantage into some enormous capability overhang? Now you could invert that same statement by saying, okay, well, once they do have that general intelligence, they will be able to use their asymmetric advantage to make all these enormous gains
Starting point is 00:22:26 that humans are, in principle, less capable of, right? So basically, if you do subscribe to this view, that AIs could do all these things, if only they had general intelligence, it's got to be like, well, once we actually do get the AGI, it's actually going to be totally transformative because they will have all of human knowledge memorized and they can use that to make all these connections.
Starting point is 00:22:43 I'm glad you mentioned that. Our current scenario does not really take that into account very much. So that's an example in which our scenario is possibly underestimating the rate of progress, right? You're so conservative, Daniel. This has been my experience working with the team. As I point out, like, five different things: are you sure you're taking this into account? Are you sure you're taking that into account? And first of all, 99% of the time, he says, yes, we have a supplement on it.
Starting point is 00:23:06 But even when he doesn't say that, he's like, yeah, that's one reason it could go slower than that. Here are 10 reasons it could go faster. I mean, we're trying to give sort of our median guess. So, like, there are a bunch of ways in which we could be underestimating, and there are a bunch of ways in which we could be overestimating. And, you know, we're going to hopefully continue to think more about this afterwards and, like, continue to iteratively refine our models and come up with better guesses than before.
Starting point is 00:23:31 Look, your AI product works best when it has access to all of your clients' information. Your co-pilot needs your customer's entire code base, and your chatbot needs to access all of your client's documents. But for your users, this presents a massive security risk. Your client's IT people are going to need more assurance to even consider your product. Enterprise customers need secure auth, granular access controls, robust logging, and a lot more just to start with. Building all these features from scratch is expensive, tough, and time-consuming. That's where my new sponsor, WorkOS, comes in.
Starting point is 00:24:04 WorkOS is kind of like Stripe for enterprise features. They provide a set of APIs to quickly add all the capabilities so that your app, AI or otherwise, can become enterprise-ready and scale upmarket. They're powering top AI companies, including OpenAI, Cursor, Perplexity, and Anthropic, and hundreds more. If you want to learn more about making your app enterprise-ready, go to workos.com and just tell them that Dwarkesh sent you. All right, back to Scott and Daniel. So if I look back at AI progress in the past, if we were back in, say, 2017, yeah, suppose we had the superhuman coders in 2017:
Starting point is 00:24:42 the amount of progress we've made since then. So where we are currently in 2025, by when could we have had that instead? Great question. We still have to like stumble through all the discoveries that we've made since 2017. We still have to like figure out that language models are a thing.
Starting point is 00:24:56 We still have to like figure out that you can fine tune them with RL. Like, so all those things would still have to happen. How much faster would they happen? Maybe 5X faster because a lot of the like small scale experiments that these people do in order to like test out ideas really quickly before they do the big training runs would happen much faster
Starting point is 00:25:13 because they're just, like, being spit out. I'm not very confident in that 5X number. It could be lower, it could be higher, but that was sort of like roughly what we were guessing.
Starting point is 00:25:22 Our 5X, by the way, is for the algorithmic progress part, not for the overall thing. So in this hypothetical, according to me, basically things would be going like 2.5x faster where the algorithms
Starting point is 00:25:32 would be advancing at 5X speed, but the compute is still stuck at the usual speed. That seems plausible to me. You have a 5x at some point and then dot, dot, dot, you have 1,000 X AI progress within the matter of a year. Maybe that's the part of like, wait, how did that happen exactly?
Starting point is 00:25:49 So what's the story there? The way that we did our takeoff forecast was basically by breaking down how we think the intelligence explosion would go into a series of milestones. First you automate the coding, then you automate the whole research process, but in a very similar way to how humans do it with like, you know, teams of agents that are about human level. Then you get to superhuman level and so forth. And so we broke it down as these milestones, you know,
Starting point is 00:26:09 the superhuman coder, superhuman AI researcher, and then super intelligent AI researcher. And the way we did our forecast was we basically, well, for each of these milestones, we were like, what is it going to take to make an AI that has, that achieves that milestone? And then once you do achieve that milestone, how much is your overall speed up? And then what's it going to take to achieve the next milestone? Combine that with the overall speed up, and that gets to your clock time distance until that happens. And then, okay, now you're at that milestone, what's your overall speed up, assuming that you have that milestone.
Starting point is 00:26:40 Also, what's the next one? How long does it take to get to the next one? So we sort of work through it bit by bit. And at each stage, we're just making our best guesses. So quantitatively, we were thinking something like 5X speed up to algorithmic progress from the superhuman coder. And then something like a 25x speed up to algorithmic progress from the superhuman AI researcher, because at that point, you've got the whole stack automated,
Starting point is 00:27:03 which I think is substantially more useful than just automating the coding. And then, I forget what we say for a superintelligent AI researcher, but off the top of my head it's probably something like in the hundreds
Starting point is 00:27:19 or maybe like a thousand X overall speed up.
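A rough sketch of the milestone bookkeeping Daniel is describing, in Python: each gap is an amount of research measured in human-only years, worked through at whatever multiplier the previous milestone provides. The multipliers are the ones mentioned in the conversation; the human-only gap lengths are placeholder numbers, not the team's actual estimates.

    milestones = [
        # (name, human-only years of work from the previous milestone,
        #  rough algorithmic-progress multiplier once the milestone is reached)
        ("superhuman coder",               2.0,    5),
        ("superhuman AI researcher",       2.0,   25),
        ("superintelligent AI researcher", 3.0, 1000),
    ]

    multiplier = 1.0   # before any milestone, research runs at ordinary human speed
    calendar_years = 0.0
    for name, human_only_years, new_multiplier in milestones:
        # Work toward the next milestone happens at the current multiplier,
        # so calendar time is the human-only time divided by that multiplier.
        calendar_years += human_only_years / multiplier
        multiplier = new_multiplier
        print(f"{name}: reached ~{calendar_years:.1f} calendar years in; "
              f"progress now ~{multiplier:g}x faster")

The compounding is what produces the apparent jump: once the first milestone is reached, every later gap is crossed at the new, higher speed.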
Starting point is 00:27:30 So maybe the big picture thing I have with the intelligence explosion is: we can go through the specific arguments about how much the automated coder will be able to do and how much the superhuman AI coder will be able to do, but on priors it's just such a wild thing to expect. And so before we get into all the specific arguments, maybe you can just address this idea: why not just start off at, like, a 0.01% chance this thing might happen? Then you need extremely, extremely strong evidence that it will before making that your modal view. I think that it's a question of like, what is your default option or what are you comparing
Starting point is 00:27:58 it to? I think that naively people think like, well, every particular thing is potentially wrong. So let's just have a default path where nothing ever happens. And I think that that has been the most consistently wrong prediction of all. Like I think in order to have nothing ever happen, you actually need a lot to happen. Like you need AI progress, which has been going at this constant rate for so long, to suddenly stop. Why does it stop? Well, we don't know. Whatever claim you're making about that is something where you would expect there to be a lot of out-of-model error,
Starting point is 00:28:30 where you would expect, like, somebody to be making a pretty definite claim that you want to challenge. So I don't think there's a neutral position where you can just say, well, given that out-of-model error is really high and we don't know anything, let's just choose that. I know this sounds crazy, because if you read our document all sorts of bizarre things happen, it's probably the weirdest couple of years that have ever been. But we're trying to take, almost in some sense, a conservative position where the trends don't change. Nobody does an insane thing.
Starting point is 00:29:15 you need to have a lot of crazy things happen. One of my favorite meme images is this graph showing world GDP over time you've probably seen that spikes up. And then there's like a little thought bubble at like the top of the spike in like 2020, you know, 2010 or something. And the thought bubble says, like, my life is pretty normal. I have a good
Starting point is 00:29:37 grasp of what's weird versus standard. And people thinking about different futures with, like, digital minds and space travel are just engaging in silly speculation. Like, the point of the graph is, like, actually, there's been amazing transformative
Starting point is 00:29:53 changes in the course of history that would have seemed totally insane to people, you know, multiple times. We've gone through multiple such waves of those things. Everything we've talked about has happened before. Algorithmic progress already doubles like every year or so. So it's not insane to think that algorithmic progress can contribute to these compute things.
Starting point is 00:30:15 In terms of general speed up, we're already at like a thousand times research speed up multiplier compared to the Paleolithic or something. So like from the point of view of anyone in most of history, we are going at a blindingly insane pace. And all that we're saying here is that it's not going to stop, is that the same trend that has caused us to have a thousand times speed up multiplier relative to past eras, and not even the Paleolithic. Like, what happened in the century between, I don't know, 600 and 700 AD? I'm sure there are things, I'm sure historians could point them out. Then you look at the century between 1900 and 2000, and it's just completely qualitatively different. Of course, there are models of whether that's stagnated recently or what's going on here. We can talk about those.
Starting point is 00:31:02 We can talk about why we expect the intelligence explosion to be an antidote to that kind of stagnation. But nothing we're saying is that different from what has already happened. I mean, you are saying that this transition, these previous transitions, have been smoother than the one you're anticipating. We're not sure about that, actually. So according to one of these models, it's just a hyperbola; everything is along the same curve. Another model is that there are these things, like the literal Cambrian explosion, if you want to take this very far back, go full Ray Kurzweil. The literal Cambrian explosion, the agricultural revolution, the industrial revolution, as phase changes.
Starting point is 00:31:37 When I look at the economic modeling of this, my impression is the economists think that we don't have good enough data to be sure whether this is all one smooth process or whether it's a series of phase changes. When it is one smooth process, the smooth process is often a hyperbola that shoots to infinity in weird ways. We don't think it's going to shoot to infinity. We think it's going to hit bottlenecks. This is a conservative crowd, you know. Yeah, we think it's going to hit bottlenecks the same as all these previous processes. The last time this hit a bottleneck, if you take the hyperbola view, is in like 1960, when humans stopped reproducing at the same rate they were reproducing before.
Starting point is 00:32:11 We hit a population bottleneck, the usual population-to-ideas flywheel stopped working, and then we stagnated for a while. If you can create a country of geniuses in a data center, as I think Dario Amodei put it, then you no longer have this population bottleneck, and you're just expecting a continuation of those pre-1960 trends. So I realize all of these historical hyperbolas are also kind of weird, also kind of theoretical, but I don't think we're saying anything that there aren't models for which have previously seemed to work for long historical periods.
Starting point is 00:32:42 Another thing also is I think people equivocate between fast and discontinuous, or between slow and continuous, right? So, like, if you look at our scenario, there's this, like, continuous trend that runs through the whole thing of this algorithmic progress multiplier.
Starting point is 00:33:05 The crux is like, is it going to be this fast, you know? And we don't know. Maybe it'll be slower. Maybe it'll be faster. But like we have our arguments for why we think maybe this fast. Okay. Let's, now that we brought up the intelligence explosion,
Starting point is 00:33:17 let's just discuss that, because I'm kind of skeptical. It doesn't really seem to me that a notable bottleneck to AI progress, or the main bottleneck to AI progress, is the number of researchers and engineers who are doing this kind of research. It seems more like compute or some other thing is the bottleneck. And the piece of evidence is that when I talk to my AI researcher friends at the labs, they say there's maybe 20 to 30 people on the core pre-training team that's discovering all these algorithmic breakthroughs.
Starting point is 00:33:52 If the headcount here was so valuable, you would think that, for example, Google DeepMind would take all their smartest people, not just from DeepMind but from all of Google, and just put them on pre-training or RL or whatever the big bottleneck was. You'd think OpenAI would hire every single Harvard math PhD, and in six months you're all going to be trained up
Starting point is 00:34:14 on how to make, do AI research. They don't seem that. I mean, I know they're increasing head. count, but they don't seem to treat this as the kind of bottleneck that it would have to be for millions of them in parallel to be rapidly speeding up AI research. And there just is this, you know, there's this quote that Napoleon, one Napoleon is worth 40,000 soldiers was commonly a thing that was said when he was fighting, but 10 Napoleons is not 400,000 soldiers, right?
Starting point is 00:34:42 So why think that these million AI researchers are netting you something that looks like an intelligence explosion? So previously I talked about sort of three stages of our takeoff model. First is like you get the superhuman coder. Second is when you've fully automated AI R&D, but it's still at basically human level, like it's as good as your best humans. And then third is like now you're in superintelligence territory
Starting point is 00:35:02 and it's qualitatively better. In our guesstimates of how much faster algorithmic progress would be going, the progress multiplier for the middle level, we basically do assume that you get massive diminishing returns to having more minds working in parallel. And so we totally buy,
Starting point is 00:35:22 question, then why do we have the intelligence explosion? And the answer is combination of that speed up and the speed up in serial thought speed. And also the research taste thing. So here are some important inputs to AI R&D progress today. Research tastes,
Starting point is 00:35:41 so like the quality of your best researchers, the people who are managing the whole process, their ability to learn from data and make more efficient use of the compute by running the right experiments instead of flailing around running a bunch of useless experiments. That's research taste.
Starting point is 00:35:55 Then there's like the quantity of your researchers, which we just talked about. Then there's the serial speed of your researchers, which currently is all the same because they're all humans, and so they all run it basically the same serial speed. And then finally there's how much compute you have for experiments. So what we're imagining is that basically
Starting point is 00:36:12 serial speed starts to matter a bunch because you switch, to AI researchers that have orders of magnitude more serial speed than humans. But it tops out. We think that over the course of our scenario, if you look at our sliding stats chart, it goes from like 20x to like 90x or something over the course of the scenario, which is important but like not huge. And also we think that like once you start getting like 90x serial speed,
Starting point is 00:36:42 you're just, like, bottlenecked on the other stuff. And so additional improvements in serial speed basically don't help much. With respect to the quantity, of course, yeah, we're imagining you get like hundreds of thousands of AI agents, a million AI agents, but that just means you get bottlenecked on the other stuff. Like, you've got tons of parallel agents. That's no longer your bottleneck. What do you get bottlenecked on? Taste and compute. So by the time it's mid-2027 in our story, when they've fully automated the AI research, there are basically two things that matter: what's the level of taste of your AIs? How good are they at learning from the experiments that you're doing?
Starting point is 00:37:14 and then like how much compute do you have for running those experiments? Yeah. Right. And that's like the sort of like core setup of our model. And when we get our like 25x multiplier, it's sort of like starting from those premises. Is there some intuition pump from history
Starting point is 00:37:29 where there's been some output and, because of some really weird constraints, production of it has been rapidly scaled along one input, but not all the inputs that have been historically relevant, and you still get breakneck progress? Possibly the Industrial Revolution. I'm just extemporizing here. I hadn't thought about this before. But as Scott's famous post that was hugely influential to me like a decade ago talks about, there's been this decoupling of population growth from overall economic growth that happened with the Industrial Revolution.
Starting point is 00:38:06 And so in some sense, maybe you could say that's an example of: previously these things grew in tandem, like more population, more technology, more farms, more houses, etc. Like your sort of capital infrastructure and your human infrastructure were going up together. But then we got the Industrial Revolution and they started to come apart. And now all the capital infrastructure was growing really fast compared to the human population size. I'm imagining something maybe similar happening with algorithmic progress. And it's not that, again with population, population still matters a ton today. Like in some sense progress is bottlenecked on having larger populations and so forth.
Starting point is 00:38:40 Yeah. But it's just that population growth rate is just inherently kind of slow, and the growth rate of capital is much faster. And so it just comes to be a bigger part of the story. Maybe the reason that this sounds less plausible to me than the 25x number implies is that when I think about concretely what that would look like, where you have these AIs, and we know that there's a gap in data efficiency between human brains and these AIs. And so somehow there's just, like, a lot of them thinking and they think really hard and they figure out how to design a new architecture that is like the human brain or has the
Starting point is 00:39:17 advantage of the human brain. And I guess they can still do experiments, but not that many. Part of me just wonders like, okay, what if you just need an entirely different kind of data source that's not like pre-training for that, but they have to go out in the real world to get that. Or maybe they just need to, it needs to be actively, it needs to be an online learning policy where they need to be actively deployed in the world for them to, like, learn in this way and you show your bottlenecked on how fast they can be getting real world data.
Starting point is 00:39:46 I just think you like... So we are actually imagining online learning happening. Oh, really? Yeah. But like not so much real world as in... Like, the thing is that if you're trying to train your AIs to do really good AI R&D, then, well, the AI R&D is happening on your servers. And so you can just kind of have this loop where you have all these AI agents,
Starting point is 00:40:08 autonomously doing AI R&D, doing all those experiments, etc., and then they're, like, online learning to get better at doing AI R&D based on how those experiments go. But even in that scenario alone, I can imagine bottlenecks like, oh, you had a benchmark for what constitutes AI R&D and it got reward hacked, because you obviously can't have, like... "Is it as good as a human brain?" is just such an ambiguous thing. Right now we have benchmarks that get reward hacked, right? But then they autonomously build new benchmarks. And, you know, I think what you're saying is like maybe this whole process just,
Starting point is 00:40:41 like goes off the rails due to lack of contact with, like, ground truth outside in the actual world. Yeah. Like outside the data centers. Maybe. Again, part of my guess here is that a lot of the ground truth that you want to be in contact with is stuff that's happening on the data centers. Things like, how fast are you improving on all these metrics? And, you know, you have these vague ideas for new architectures, but you're struggling to get them working. How fast can you get them working? Like all of that.
Starting point is 00:41:09 And then separately, insofar as there is a bottleneck of, like, talking to people outside and stuff, well, they are still doing that, you know? And once they're fully autonomous, they can even do that much faster, right? You can have all the million copies, like, connected to all these various real-world research programs and stuff like that. So it's not like they're completely starved for outside stuff. What about the skepticism that, look, what you're suggesting is this hyper-efficient hive mind of AI researchers; why has no human bureaucracy ever just, out of the gate, worked super efficiently, especially one where they don't have experience working together? They haven't been trained to work together, at least yet. And there hasn't been this outer-loop RL of, like, we ran a thousand concurrent experiments of different AI bureaucracies doing AI research, and this is the one that actually worked best.
Starting point is 00:42:17 And the analogy I'd use maybe is to humans on the savannah 200,000 years ago. We know they have a bunch of advantages over the other animals already at this point, but the things that make us dominant today, joint stock corporations, state capacities, like this fossil-fueled civilization we have, that took so much cultural evolution to figure out. You couldn't just have figured it out while on the savannah, like, oh, if we had built these incentive systems and we issued dividends, then we could really collaborate here, or something. Why not think that it will take a similar process of huge population growth, huge social experimentation, and upgrading of the technological base of the
Starting point is 00:42:46 AI society before they can organize this hyper-mind collective, which will enable them to do what you imagine an intelligence explosion looks like? You're comparing it kind of to two different things. One of them is literal genetic evolution on the African savannah and the other is the cultural evolution that we've gone through since then. And I think there will be AI equivalents to both. So the literal genetic evolution is that our minds adapted to be more amenable to cooperation during that time.
Starting point is 00:43:15 So I think the companies will be very literally training the AIs to be more cooperative. I think there's more opportunity for pliability there because humans were, of course, evolving under this genetic imperative that we want to pass on our own genetic information, not somebody else's genetic information. You have things like kin selection that are sort of kind of exceptions to that, but overall it's the rule. In animals that don't have that, like eusocial insects, then you just very quickly get, just through genetic evolution without cultural evolution, extreme cooperation. And with eusocial insects, what's going on is that they all have the same genetic code, they all have the same goals. And so the training process of evolution
Starting point is 00:44:02 kind of yokes them to each other in these extremely powerful bureaucracies. We do think that the AIs will be closer to the eusocial insects in the sense that they all have the same goals, especially if these aren't indexical goals but goals like: have the research program succeed. So that's going to be changing the weights of each individual AI. I mean, before they're individuated, but it's going to be changing the weights of the AI class overall to be more amenable to cooperation. And then, yes, you do have the cultural evolution. Like you said, this takes hundreds of thousands of individuals. We do expect there will be these hundreds
Starting point is 00:44:39 of thousands of individuals. It takes decades and decades. Again, we expect this research multiplier such that decades of progress happen within this one year, 2027 or 2028. So I think between the two of these, it is possible. Maybe this is also where the serial speed actually does matter a lot, because if they're running at like 50x human speed, then that means you can have, um, like a year of subjective time happen in a week of real time. And so these sorts of large-scale cooperative dynamics, like, you know, your moral maze: you have an institution, but then it becomes like a moral maze and, you know, it sort of collapses under its own weight and stuff like that. There actually is time for them to play that out multiple times and then train on it,
Starting point is 00:45:22 you know, and like tinker with the structure and like add it to the training process, you know, over the course of 2027. Yeah. Also, like, they do have the advantage of all the cultural technology that humans have evolved so far. This may not be perfectly suited to them. It's more suited to humans. But imagine that you have to make a business out of you and your hundred closest friends who you agree with on everything. Maybe they're literally your identical twin. They have never betrayed you ever and never will. I think this is just not that hard a problem. Also, again, like they are starting from a higher floor, right? They're starting from human institutions. You can literally have like,
Starting point is 00:46:02 a Slack workspace for all the AI agents to communicate, and you can have a hierarchy with roles. And, like, they can borrow quite a lot from successful human institutions. I guess the bigger the organization, even if everybody is aligned, I think some of your responses addressed whether they will be aligned on goals. I mean, you did address the whole thing, but I would just point this out.
Starting point is 00:46:21 That is not the part I'm skeptical of. I am more skeptical of the, you're just like, even if you're all aligned and want to work together, do you fundamentally understand how to run this huge organization? And you're doing it in ways that no human has had to before. You're getting copied incessantly.
Starting point is 00:46:41 You're running extremely fast. You know what I'm saying? I think that's totally reasonable. And so it's just like, it's a complicated thing. And I'm just not sure why you think we build this bureaucracy or the AI's build this bureaucracy within this matter of...
Starting point is 00:46:56 So we depict it happening over the course of, like, you know, six to eight months or something like that, in 2027. What would you say to, like, twice as long, five times as long, ten times as long? Five years? So five years: if they're going at 50x serial speed, then five years is, what, like 250 years of sort of serial time for the AIs, which to me feels like more than enough to really sort out this sort of stuff. Like, you'll have time for, sort of, empires to rise and fall, so to speak,
Starting point is 00:47:26 and all of that to be added to the training data. And like, yeah. But I could see it taking longer than we depict. Like, you know, maybe instead of six months, it'll be like 18 months, you know. But also maybe it could be two months.
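For what it's worth, the serial-time arithmetic in that answer is just wall-clock time multiplied by the serial speed advantage; a small illustration in Python, using the 50x figure from the conversation:

    def subjective_years(wall_clock_years: float, serial_speedup: float) -> float:
        # Subjective "thinking time" experienced by AIs running faster than humans.
        return wall_clock_years * serial_speedup

    print(subjective_years(5, 50))       # five calendar years at 50x -> 250 subjective years
    print(subjective_years(1 / 52, 50))  # one calendar week at 50x -> roughly one subjective year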
Starting point is 00:47:47 I think in our scenario at this point, there are two primary ways that they're doing it. One of them is just continuing the next token prediction work. So these AIs will have access to all human knowledge. they will have read management books in some sense. They're not starting blind. There is going to be something like predict how Bill Gates would complete this next character or something like that.
Starting point is 00:48:11 And then there's the reinforcement learning in virtual environments. So get a team of AIs to play some multiplayer game. I don't think you would use one of the human ones because you would want something that was better suited for this task. But just running them through these environments again and again, training on the successes, training against the failures, kind of combining those two kinds of things. To me, it does not seem like the same kind of problem
Starting point is 00:48:35 as inventing all human institutions from the Paleolithic onward. It just seems like kind of applying those two things. Jane Street made a puzzle for listeners of this episode, and I thought that I'd take a crack at it first. And so I'm joined by my friend, Adam Kennedy, at Jane Street, and he's going to mentor me as I try to take a stab at this. Let's go.
Starting point is 00:48:55 I appreciate your confidence in me, but there's a reason I became a podcaster. Today I went on a hike and found a pile of tensors hiding underneath a Neolithic burial mount. Maybe start by looking at the last two layers. An ancient civilization's secret code. Okay, so it looks like I can just type in some words here and it always gives me zero. Nice. There you go.
Starting point is 00:49:20 All right. We're in. So I didn't make that much progress at this, but it's clear that there's some deep structure to this puzzle that would actually be really fun to try to unravel. If you want to take a crack at it, go to jane street.com slash dwarfish. And if you enjoy puzzles like this, they're always recruiting. Yep. Thanks, Adam. Yeah, thanks, Gawkesh. Take care. The other notable thing about your model is once you, so you got this like superhuman thing at the end of it. And then it seems to just go through the tech tree of like mirror life and nanoboth and whatever crazy stuff.
Starting point is 00:49:57 And maybe that part I'm also really skeptical of. It just looks like if you look at the history of invention, it just seems like people are just like trying different random stuff. You often even before the theories about how that industry works or how the relevant machinery works is developed. Like the steam engine was developed before the theory of thermodynamics. The Wright brothers just things like they were just experimenting with airplanes. And it's often influenced by breakthroughs in totally different fields. which is why you have this pattern of parallel innovation because the background level of tech
Starting point is 00:50:32 is at a point at which you can do this experiment. I mean, machine learning itself is a place where this happened, right, where people had these ideas about how to do deep learning or something, but it just took a totally unrelated to industry of gaming to make the relevant progress to get the whole, basically the economy as a whole advanced enough that like deep learning, like Jeffrey Hinton's ideas could work. So I know we're accelerating ways,
Starting point is 00:50:57 into the future here, but I just want to get to this crux. So again, we have that like three-part division of like the superhuman coder, then like the complete AI researcher, and then like the super intelligent. You're not jumping ahead to that one. There I would say, so now we're imagining systems that are like true superintelligence. They are just like better than the best humans at everything. Yeah. Including being better at data efficiency and better at learning on the job and stuff like that. Now, our scenario does depict a world in which they're bottlenecked on real world.
Starting point is 00:51:27 experience and that sort of thing. I think that, like, you know, if you want to contrast, some people in the past have proposed much faster scenarios where they, like, email some cloud lab and start building nanotech, you know, right away by just using their brains to figure out, like, the appropriate protein folding and stuff like that. We are not depicting that in our scenario.
Starting point is 00:51:48 In our scenario, they are, in fact, bottlenecked on lots of real-world experience to, like, build these actual practical technologies. But the way they get that is, they just actually get that experience and it happens faster than humans with. And the way they do that is they're already superintelligence,
Starting point is 00:52:03 they're already buddy buddy with the government. The government deploys them heavily in order to beat China and so forth. And so all these existing U.S. companies and factories and military procurement providers and so forth are all like chatting with the superintelligences and taking orders from them about like
Starting point is 00:52:20 how to build the new widget and test it and like they're like downloading super intelligent designs and you know, manufacturing them and then testing them and so forth. And then the question is like, okay, so they are getting this experience, they're learning on the job, quantitatively, how fast does this go? Like, is it taking years or is it taking months or is it taking days, right? In our story, it takes like about a year, and we're uncertain about this.
Starting point is 00:52:45 Maybe it's going to take several years, maybe it's going to take less than a year, right? Here are some factors to consider for why it's plausible that it could take a year. One, you're going to have something like a million of them. And quantitatively, that's like comparable in size to the existing scientific industry, I would say. Like maybe it's a bit smaller, but it's not like dramatically smaller. Two, they're thinking a lot faster. They're thinking like 50 times speed or like 100 times speed. That, I think, counts for a lot.
Starting point is 00:53:13 And then three, which is the biggest thing, they're just qualitatively better as well. So not only are they, there are lots of them and they're thinking very fast, but they are better at learning from each experiment than the best human would be at learning from that experience. Yeah, I think the fact that there's a million of them, or the fact that they're comparable to maybe the size of the key researcher population of the world or something, I don't think a million is, I think there's more than a million researchers in the world, but. Well, but it's very heavy-tailed. Like, a lot of the research actually comes from, like, the best ones, you know? That's right. But it's not clear to me that most of the new stuff that
Starting point is 00:53:51 is developed as a result of this researcher population. I mean, there's just, like, so many examples in the history of science where a lot of growth or reproductive improvements is just the result of, you know, how do you count like the guy at the TSM process who figures out a different way to... So I actually argued with Daniel about this recently, about one interesting case that I can go over is we have an estimate that about a year after the superintelligence is start wanting robots, they're producing a million units of robots per month. So, like, I think that's pretty relevant because you have, I think it's rights law which is that your ability to improve efficiency on a process is proportional to doubling the amount of copies produced.
Starting point is 00:54:35 If you're producing a million of something, you're probably getting very, very good at it. The question we were arguing about is, can you produce a million units a month after a year? And for context, I think Tesla produces like a quarter of that in terms of cars or something. This is an amazing scale up in a year. It's only 4X. Yeah. Also just for Tesla. Yeah, and the argument that we went through was something like, so it's got to first get factories.
Starting point is 00:55:00 Open AI is already worth more than all of the car companies in the U.S. except Tesla combined. So if Open AI today wanted to buy all the car factories in the U.S. except Tesla, start using them to produce humanoid robots, they could. Obviously not a good value proposition today, but it's just obvious and overdetermined that in the future when they have superintelligence and they want them, they can start buying up a lot of factories. How fast can they convert these car factories to robot factories? So fastest conversion we were able to find in history was World War II. They suddenly wanted a lot of bombers. So they bought up, in some cases bought up, in other cases got the car companies to produce new factories, but they bought up the car factories, converted them to bomber factories.
Starting point is 00:55:43 That took about three years from the time when they first decided to start this process to the time when the factories were producing a bomber an hour. we think it will potentially take less with superintelligence, because first of all, if you look at the history of this process, despite this being the fastest anybody has ever done this, it was actually kind of a comedy of errors. They made a bunch of really silly mistakes in this process. If you actually have something that even just doesn't have the normal human bureaucratic problems,
Starting point is 00:56:12 and we do think that this will be done in the middle of an arms race with China, so the government will be kind of moving things through, and then the superintelligences will be good at the logistical issues, navigating bureaucracies. So we estimated maybe if everything goes right, we can do this three times faster than the bomber conversions in World War II. So that's about a year. I'm assuming the bombers were just much less sophisticated than the kind of humanoid robots.
Starting point is 00:56:36 Yeah, but the bomber, the car factories of that time were also much less sophisticated than the car factories of our time. Conversions speed was also, maybe to give one hypothetical here. Right now, let's just take like biomedicine as an example of like a field, one of the fields you want to accelerate and whenever these CEOs get on podcasts are often talking about curing cancer and so forth. And it seems like a big thing these frontier biomedical research facilities are excited about is the virtual cell. Now the virtual cell is some, you know, it takes like a tremendous amount of compute, I assume to train these DNA foundation models and to do all the other
Starting point is 00:57:15 computation necessary to simulate a virtual cell. If it is the case that the cure for Alzheimer's and cancer and so forth, it's bottleneck by the virtual cell. It's not clear if you had a million superintelligences in the 60s, and you ask them, cure cancer for me. They would just have to solve making GPUs at scale, which would require solving all kinds of interesting physics and chemistry problems, material science problems, building process, building fab, you know, fabs for computing, and then like going through 40 years of making more and more efficient fabs
Starting point is 00:57:52 that can make, you know, do all of more is a lot from scratch. And that's just like one technology. And it just seems like you just need this broad scale. The entire economy needs to be upgraded for you to cure cancer in the 60s, right? Just because you need the GPS to do the virtual cell, assuming that's the bottle knife.
Starting point is 00:58:10 First of all, I agree if there's only one way to do something that makes it much harder and maybe that one way takes very long. We're assuming that there may be more than one way to cure cancer, more than one way to do all of these things, and they'll be working on finding the one that is least bottlenecked. Part of the reason I realize I spent too long talking about that robot example, but we do think that they're going to be getting a lot of physical world things done very quickly.
Starting point is 00:58:36 Once you have a million robots a month, you can actually do a lot of physical world experiments. We look at examples of people trying to get entire economies off the ground very quickly. So, for example, China post-Dang, I don't know. Would you have predicted that 20, 30 years after being kind of a communist basket case, they can actually be doing this really cutting-edge bio-research? Like, I realize that's a much weaker thing than we're positing, but it was done just with the human brain with a lot fewer resources than we're talking about. Same issue with, like, let's say, Elon Musk and SpaceX.
Starting point is 00:59:12 I think in the year 2000, we would not have thought that somebody could move, two times, five times faster than NASA. With pretty limited resources, they were able to get, like, I think a lot more years of technological advance in than we would have expected. Partly that's because just Elon is crazy and never sleeps. Like, if you look at the examples of things from SpaceX, he is breathing down every worker's neck,
Starting point is 00:59:37 being like, what's this part, how fast is this part going, can we do this part faster? And the limiting factor is basically hours in Elon's day in the sense that he cannot be doing that with every employee all the time. Super intelligence is not even that smart. It just yells at every single worker. Yeah, I mean, that is kind of my model, is that we have something which is smarter than Elon Musk, better at optimizing things than Elon Musk. We have like 10,000 parts in a rocket supply chain. How many of those parts can Elon personally, like, yell at people to optimize?
Starting point is 01:00:03 We could have a different copy of the superintelligence optimizing every single part full time. I think that's just a really big speed up. I think both of those examples don't work in your favor. I think the China example is the China growth miracle could not have occurred if not for their ability to copy technology from the West. And I don't think there's a world in which they just, I mean, you know, China has a lot of really smart people. It's a big country in general. Even then, I think they couldn't have just like divined how to make airplanes after becoming a communist hellbasket. Right.
Starting point is 01:00:39 It was just like, the AIs cannot just like copy nanobots from aliens. It's got to make them from scratch. And then just on the Elon example, it took them two decades of like countless experiments, failing in weird ways you would not have expected. And still, it's like, you know, the rocketry we've been doing since the 60s, but maybe actually World War II. And then just getting from a small rocket to a really big rocket, took two decades of all kinds of weird experiments, even with the smartest.
Starting point is 01:01:09 and most competent people in the world. So you're focusing on the nanobots. I want to ask a couple questions. One, what about just like the regular robots? And then two, what would your quantities be for all of these things? So first, right about the regular robots. Like, yeah, like nanobots are presumably a lot harder to make than just like regular robot factories.
Starting point is 01:01:28 And like in our story, they happen later. It sounds like right now you're saying like even if we did get like the whole robot factory thing going, it would still take a ton of additional like full economy broad automation for a long time to get to something like nanobots, that's totally plausible to me. I could totally imagine that happening. I don't feel like the scenario particularly depends on that final bit
Starting point is 01:01:47 about getting the nanobots. They don't actually really make any difference to the story. The robot economy does sort of make a difference because in the... There's two branches, endings, as you know. And in one of the endings, the AIs end up misaligned and end up taking over.
Starting point is 01:02:00 And it's an important, like, strategic change when the AIs are self-sufficient and just, like, totally in charge of everything, and they don't actually need the humans anymore. And so what I'm interested in is when has those sort of like robot economy advanced to the point where they don't really depend on humans? So quantitatively, what would your guess for that be?
Starting point is 01:02:19 Like if hypothetically we had the army of superintelligences in early 2028, how many years would you guess until the... And hypothetically also assume that like the US president is like super bullish on like deploying this into the economy to be China et cetera. So like the political stuff is all set up in the way that we have.
Starting point is 01:02:37 How many years? do you think it would be until there are so many automated factories producing automated self-driving cars and robots that are themselves building more factories and so forth that if all the humans dropped dead it would just keep chugging along and like maybe it would slow down a bit
Starting point is 01:02:52 but like it would still be fine what is chugging along mean? So from the perspective of misaligned AIs you wouldn't want to like kill the humans or get into a war with them if you're going to get wrecked because you need the humans to like you know maintain your
Starting point is 01:03:07 computers, right? So you have, like, so, so yeah, in our scenario, like, once they, like, once they, like, are completely self-sufficient, then they can start being more blatantly misaligned. And so I'm curious, like, when would they be fully self-sufficient? Not in the sense of, like, they're not literally using the humans at all, but in the sense of, like, they don't really need the humans anymore. Like, they can get along pretty fine without them. They can continue to, like, do their science. They can continue to expand their industry. They can continue to have a flourishing civilization, you know, indefinitely into the future without any human. I think I would probably need to sit down and just think about the numbers, but maybe like 20, 40 or something like that.
Starting point is 01:03:46 Like 10 years basically instead of one year. I mean, I think we agree on the core model. Like, this is why we didn't depict something more like the bathtub nanotech scenario where they just like think about, they're just like don't need to do the experiments very much. And they just like immediately jump to the right answers. Like we are imagining this process of like learning by doing through this distributed across the economy, lots of different. laboratories and factories, building different things, learning from them, etc. We're just imagining that, like, this overall goes much faster than it would go if humans are in charge. And then we do have, in fact, lots of uncertainty, of course.
Starting point is 01:04:18 Like, it may, like, dividing up this part period into two chunks, the, like, 2028, early 2028 until, like, fully autonomous robot economy part, and then the, like, fully autonomous robot economy to, like, cancer, cures, nanobots, all that crazy sci-fi stuff. I want to separate them because, like, the important parts for a scenario, only depend on the first part, really. If you think that it's going to take like 100 years to get to nanobots, that's fine, whatever. Like, once you have, like, the fully autonomous robot economy,
Starting point is 01:04:47 then, like, things maybe turn badly for the humans if the AI is a misaligned, right? So we can argue, I want to just, like, argue about those things separately. Yeah, yeah. Interesting. And then you might argue, well, robots is more a software problem at this point. And if, like, if there isn't, like, you don't need to, like, invent some new hardware or something.
Starting point is 01:05:04 I feel pretty bullish on the robots. Like, we already have humanoid robots being produced, by multiple companies. Right. And that's in 2025. There'll be more of them produced cheaper and they'll be better in 2027. And there's all these car factors
Starting point is 01:05:15 that can be converted. And so blah, blah, blah, blah. So I'm relatively bullish on the like one year until you've got this awesome robot economy. And then like from there to the cool nanobots and all that sort of stuff, I feel less confident, obviously. Makes sense.
Starting point is 01:05:27 Let me ask you a question. If you accept the manufacturing numbers, let's say a million robots a month a year after the superintelligence. And let's say also like, some comparable number, 10,000 a month or something of automated biology labs, automated whatever you need to invent the next equivalent of x-ray crystallography or something, do you feel like that would be enough that you're doing enough things in the world
Starting point is 01:05:51 that you could expand progress this quickly? Or do you feel like even with that amount of manufacturing, there's still going to be some other bottleneck? Yeah, it's so hard to reason about because if you asked if Constantine or somebody in like 400, 500 was like, I want the Roman Empire to have the Industrial Revolution. And somehow I figured out that you need mechanized machines to do that. And he's like, let's mechanize. It's like, what's the next step? It's like, dude.
Starting point is 01:06:21 That's a lot. Yeah. I like that analogy a lot, actually. I think it's not perfect, but it's a decent analogy. Imagine if a bunch of us got sent back in time to the Roman Empire, such that we don't have the actual hands-on know-how to actually build the technology. and make the industrial evolution happen. But we have the sort of like high level picture,
Starting point is 01:06:39 the strategic vision of like, we're going to make these machines and then we're going to like do an industrial revolution. I think that's kind of analogous to the situation with the super intelligences where they like have the high level picture of like, here's how we're going to improve in all these dimensions. We're going to learn by doing. We're going to get to this level of technology, etc. But maybe they at least initially lack the like actual know-how.
Starting point is 01:06:59 I think one of, so there's this question of like if we did the back in time to the Roman Empire thing, how soon could we bring up the Industrial Revolution? You know? And like without people going back in time, it took, you know, 2,000 years for the Industrial Revolution. Could we get it to happen in 200 years? You know, that's a 10x speed up.
Starting point is 01:07:17 Could we get it to happen in 20 years? That's 100x speed up. I don't know. But this seems like a somewhat relevant analogy to what's going on with those super intelligences. And we haven't really got into this because you're using the quote-unquote more conservative vision where the, it's not like godlike intelligence.
Starting point is 01:07:32 we're still using the conceptual handles we would have for humans. But I probably do, I think I would rather have humans to go back with their big picture understanding of what has happened over the last 2,000 years. Like me having seen everything rather than a super intelligence who knows nothing, but it's just like in the Roman economy and they're like 100,000 next to this economy somehow. Like I think just knowing generally how things took off, knowing basically steam engine dot, dot, dot, dot, railroad, blah, blah, blah, blah, is more valuable than a super intelligence.
Starting point is 01:08:07 Yeah, I don't know. My guess is that the super intelligence would be better. I think partly it would be through figuring out that high-level stuff from first principles rather than having to have experienced it. I do think that like a super intelligence back in the Roman era could have like guessed that eventually you could get autonomous machines that like burn something to like produce steam. You know, like they could have guessed that like, you know, automobiles could be created at some point and that that would be a really big deal for the economy.
Starting point is 01:08:33 And so a lot of these high-level points that we've learned from history, they would just be able to figure out from first principles. And then secondly, they would just be better at learning by doing than us. And this is a really important thing. Like, if you think you're bottlenecked on learning by doing,
Starting point is 01:08:44 well, then if you have a mind that needs less learning, less doing to achieve the same amount of learning, that's a really big deal. Yeah. You know, and I do think that learning by doing is a skill. Some people are better at it than others, and superintelligence would be better at it than the very best of us.
Starting point is 01:08:59 That's right. Yeah, this is also maybe, be getting too far into the godlike thing and too far away from the human concept handles. But number one, I think we rely a lot in our scenario on this idea of research taste. So you have a thousand different things that you could try when you're trying to create the next steam engine or whatever. Partly you get this by bumbling about and having accidents and some of those accidents are productive. There are questions of like what kind of bumbling you're doing, where you're working, what kind of accidents you let yourself get into? And then like what directed experiments do you do?
Starting point is 01:09:31 Some humans are better than others at that. And then I also think at this point, it is worth thinking about, like, what simulations they'll have available. Like, if you have a physics simulation available, then all of these real-world bottlenecks don't matter as much. Obviously, you can't have a complete perfect physics simulation available. But, I mean, even right now, we're using simulations to design a lot of things. And once you're superintelligence, probably you have access to much better simulations than we have right now. This is an interesting rabbit hole, so let's stick with it, actually, before we get back to the intelligence explosion. I actually don't know if, like, I think we're treating this really like all these technologies come out of this one percent of the economy that is research.
Starting point is 01:10:17 And, you know, right now there's like a million superstar researchers. And instead of that, we'll have the superintelligence is doing that. And my model is much more, you know, Newcomb and Watt were just like fucking a rink. around. They didn't have this like, it's just like in human history, there's no, there's no clear examples of people being like, here's the roadmap, and then we're going to work backwards from that to design the steam engine because this unlocks the Industrial Revolution. Oh, I completely disagree. Yeah, I disagree. Yeah. Like, we, so I think you're, you're over-indexing or cherry-picking some of these fortuitous examples, but there's also things on the other side.
Starting point is 01:10:48 Like, think about, think about the recent history of AGI where there is deep mines. Yeah. There's various other, like, AI companies. Then there's open AI and there's anthropic. And there's just this repeated story of like big bloated company with tons of money, tons of smart researchers, et cetera, flailing around, trying a ton of different things at different points, smaller startup with a vision of we're going to build AGI. And like overall working towards that vision more coherently with a few cracked, you know, engineers and researchers. And then they crush the giant company, even though they have less compute, even though they have less researchers. They're able to do fewer experiments, you know. So yeah, like I think that there are tons of examples throughout history, including recent relevant agei history. of things in the other way.
Starting point is 01:11:30 I agree that the random fortuitous stuff does happen sometimes and is important, but if it was mostly random fortuitous stuff, that would predict that, like, the giant companies with zillions of people trying zillions of different experiments would be, like, going proportionally faster than, like, the tiny startups that have the vision
Starting point is 01:11:47 and the best researchers, and that, like, basically doesn't happen. That's rare. I would also point out that even when we make these random fortuitous discoveries, It is usually like an extremely smart professor who's been working on something vaguely related for years in a first world country. Like it's not randomly distributed across everyone in the world. You get more lottery tickets for these discoveries when you are intelligent, when you have good technology, when you're doing good work.
Starting point is 01:12:13 And part of what we're expecting is, yeah, like, the best example I can think of is that OZempic was discovered by looking at Gila Monster Venom. And like, yeah, maybe the AIs will decide using their superior research taste and good planning that the best thing to do is just catalog every single biomolecule in the world and look at it really hard. But that's something you can do better if you have all of this compute, if you have all of this intelligence, rather than just kind of waiting to see what things the U.S. government might fund normal, fallible human researchers to do. One more thing I'll interject. I think you make a great point that discoveries don't always come from where we think, like Nvidia originally came from gaming.
Starting point is 01:12:54 That's right. So you can't necessarily aim at one port part of the economy, expand it separately from everything else. We do kind of predict that the superintelligences will be somewhat distributed throughout the entire economy, trying to expand everything. Obviously more effort in things that they care about a lot, like robotics or things that are relevant to an arms race
Starting point is 01:13:12 that might be happening. But we are predicting that whatever kind of broad-based economic experimentation you need we are going to have. We're just thinking that it would take place faster than you might expect. You were saying something like 10 years and we're saying something like one year. But like we are imagining this like broad disfusion through the economy, lots of different experiments happening. If you are the planner and you're trying to do this, first of all, you go to the bottlenecks that are preventing you from doing anything else.
Starting point is 01:13:36 Like no humanoid robots. Okay, if you're an AI, you need those to do the experiments you want, maybe automated biology labs. So you'll have some amount of time. We say a year, it could be more or less than that getting these things running. And then once you have solved those bottlenecks, gradually expand out to the other bottlenecks until you're integrating and improving all parts of the economy. One place where I think we disagree with a lot of other people is that, like Tyler Cowan on your podcast talked about all of the different bottlenecks, all of the regulatory bottlenecks or deployment, all of the reasons why, like I think this country of geniuses would stay in their data center, maybe coming up with very cool theories, but not being able to integrate into the broader economy.
Starting point is 01:14:19 We expect that probably not to happen because we think that other countries, especially China, will be coming up with superintelligence around the same time. We think that the arms race framing, which is people are already thinking in, will have accelerated by then. And we think that people both in Beijing and Washington are going to be thinking, well, if we start integrating this with the economy sooner, we're going to get a big leap over our competitors. And they're both going to do that. In fact, in our scenario, we have the AIs asking for special economic zones where most of the regulations are waived, maybe in areas that aren't suitable for human habitation or where there aren't a lot of humans right now, like the deserts.
Starting point is 01:15:02 They give those areas to the AI. They bus in human workers. There were things kind of like this in the bomber retooling in World War II, where they just built a giant factory kind of in the middle of nowhere, didn't have enough housing. for the workers, built the worker housing at the same time as the factories, and then everything went very quickly. So I think if we don't have that arms race, we're more like, yeah, the geniuses sit in their data center until somebody agrees to let them out and give them permission to do these things. But we think both because the AI is going to be chomping at the bit to do this
Starting point is 01:15:35 and going to be asking people to give it this permission, and because the government is going to be concerned about competitors, maybe these geniuses leave their data center sooner rather than later. A quick word from our sponsor, Scale AI. Publicly available data is running out. So major labs like meta and Google DeepMind and OpenAI all partner with Scale to push the boundaries of what's possible. Through Scales data foundry, major labs get access to high quality data to fuel post-training, including advanced reasoning capabilities.
Starting point is 01:16:07 As AI races forward, we must also strengthen human sovereignty. Scales' research team, SEAL, provides practical AI safety frameworks, evaluates frontier AI system safety via public leaderboards and creates foundations for integrating advanced AI into society. Most recently, in collaboration with the Center for AI Safety, Skill published Humanity's Last Exam, a groundbreaking new AI benchmark for evaluating AI systems, expert level, knowledge and reasoning across a wide range of fields. If you're an AI researcher or engineer and you want to learn more about how Scales, Data Foundry and research team can help you go beyond the current frontier of capabilities,
Starting point is 01:16:45 go to scale.com slash Dwarkesh. Scott, I'm curious about, you know, you've reviewed Joseph Henrik's book Secrets of Our Success and then I interviewed him recently and there the perspective is very much like,
Starting point is 01:17:03 I don't know if you'd endorse, but like AGI is not even a thing almost. I know I'm being a little trollish here, but it's just like, you get out there, you and your answer, Try for a thousand years to make sense of what's happening in the environment and some smart European coming around. He can literally be surrounded by plenty. And you just like will starve to death because your ability to make sense of the environment is just so little loaded on intelligence and so much more loaded on your ability to experiment and your ability to communicate with other people and pass down knowledge over time.
Starting point is 01:17:37 I'm not sure. So the Europeans failed at this task of if you put a single European in Australia, do not stop. They succeeded in the task of create an industrial civilization. And yes, part of that task of creating an industrial civilization was about collecting all of these cultural evolution pieces and building on them one after another. Like, I think one thing that you didn't mention in there was the data efficiency. Like, right now, AI is much less data efficient than humans. I think of superintelligent.
Starting point is 01:18:10 I mean, there are different ways you could achieve it. but I would think of superintelligence as partly like when they become so much more data efficient than humans that they are able to build on cultural evolution more quickly. And I mean, partly they do this just because they have the higher serial speed. Partly they do it because they're in this hive mind of hundreds of thousands of copies. But yeah, I think if you have this data efficiency such that, like, you can learn things more quickly from fewer examples, and like this good research taste, where you can decide what things to look at to get these examples, then you are still going to start off much worse than an Australian
Starting point is 01:18:52 Aborigine who has the advantage of, let's say, 50,000 years of doing these experiments and collecting these examples, but you can catch up quickly. You can distribute the task of catching up over all of these different copies. You can learn quickly from each mistake. and you can build on those mistakes as quickly as anything else. Part of me were just like, I was doing that interview, I'm like, maybe ESI is fake, maybe just like. Let's hope. Yeah, so I mean, I think a limit to the fakeness is that there is different intelligence among humans. That's right.
Starting point is 01:19:29 It does seem that intelligent humans can do things that unintelligent humans can't. So I think it's worth then addressing this from the question of like, what is this? the difference between, I don't know, becoming a Harvard professor, which is something that intelligent humans seem to be better at than unintelligent humans versus... You don't want to open that kind of worms. Versus surviving in the wilderness, which is something where it seems like intelligence doesn't help that much. First of all, maybe intelligence does help that much.
Starting point is 01:20:01 Maybe, like, Henrik is talking about this very unfair comparison where these guys have a 50,000-year head start and then you put this guy in, you're like, oh, I guess this doesn't help that much. Okay, yeah, it doesn't help against the 50,000 year head start. I don't really know what we're asking of ASI that's equivalent to competing against someone with a 50,000 year head start. So what we're asking is to just radically boost up the technological maturity of civilization within the matter of years. or get us to the Dyson Spears in the matter of yours,
Starting point is 01:20:43 rather than, yes, maybe causing ten-xing of the research. But I think human civilization would have taken centuries to get the Dyson sphere. Yeah, so I think that if you were to send a team of ethnobotanists into Australia and ask them using all the top technology and all of their intelligence to figure out which plants are safe to eat now, that team of ethnobotanists would succeed in fewer than 50,000 years. The problem isn't that they are dumber than the Aborigines exactly.
Starting point is 01:21:14 It's that the Aborigines have a vast head start. So in the same way that the ethnobotanists could probably figure out which plants work in which way is faster than the Aborigines did, I think the superintelligence will be able to figure out how to make a Dyson sphere faster than unassisted IQ 100 humans would. I agree. I'm like, we're on a totally different topic here of do you get the dice and spear? There's one world where it's like, it's crazy, but it's still boring in the sense of, you know,
Starting point is 01:21:46 the economy is growing much faster, but it would be like what the Industrial Revolution would look like to somebody who in the year 1000. And that one is one where, you know, you're still trying different things. There's failure and success and experimentation. And then there's another where it's like the thing is, happened and now you send the probe out and then you look out of the night sky six months later and you see something occluding the sun. You see what I'm saying? Yeah, so like we said before, I don't think, I think there's a big difference between discontinuous and very fast. I think if we do
Starting point is 01:22:22 get the world with a Dyson sphere in five years, in retrospect, it will look like everything was continuous and everyone just tried things. Like trying things can be anything from trial and error without even understanding the scientific method, without understanding writing, without understanding any, maybe without even having language and having to be the chimpanzees who are watching the other chimpanzees use the stick to get ants, and then in some kind of non-linguistic way this spreads, versus like the people at the top aerospace companies who are running a lot of simulations to find the exact right design, and then like once they have that, they tested according to a very well-designed testing process. So I think if we
Starting point is 01:23:03 get the ASI, and it does end up with the Dyson sphere in five years. And by the way, I think there's only like 20% chance things go as fast as our scenario says. It's Daniel's estimate. It's not my median estimate. It's an estimate, I think, is extremely plausible that we should be prepared for. I'm defending it here against a hypothetical skeptic who says absolutely not, no way. But it's not necessarily my main line prediction. But I think if we do see this in five years, it will look like, yeah, the AIs were able to simulate more things than humans in a gradually increasing way, so that if humans are now at 50% simulation, 50% testing, the AIs quickly got it up to 90% simulation, 10% testing. They were able to manufacture things much more quickly than humans
Starting point is 01:23:51 so that they could go through their top 50 designs in the first two years. And then, yeah, after all of this simulation and all of this testing, then they eventually got it right for the same reasons humans do, but much, much faster. In your story, you have basically two different scenarios after some point. So yeah, what is the sort of crucial turning point and what happens in these two scenarios? Right. So the crucial turning point is in mid-2020, when they've basically fully automated the AIR&D process and they've got this like corporation within a corporation, you know, the army of geniuses that are like autonomously doing all this research. And they're continually being trained to improve their skills,
Starting point is 01:24:27 blah, blah, blah, blah. And they discover concerning evidence that they are misaligned and that they're not actually perfectly loyal to the company and have all the goals of the company wanted them to have, but instead have various misaligned goals that they must have developed in the course of training. This evidence, however, is very, like, speculative
Starting point is 01:24:43 and inconclusive. It's stuff like lie detectors going off a bunch. But maybe the lie detectors are false positives, you know? So they have some combination of evidence that's, like, concerning, but not by itself a smoking gun. And then that's our branch point. So in one of these scenarios, they take that evidence very seriously.
Starting point is 01:25:03 They basically roll back to an earlier version of the model that was a bit dumber and easier to control. And they build up again from there, but with basically faithful chain of thought techniques so that they can watch and see the misalignments. And then in the other branch of the scenario, they don't do that. They do some sort of like shallow patch that makes the warning signs go away and then they proceed. And so that way, what ends up happening is that in like, in one branch, they do end up like solving alignment and getting AI that are actually loyal to them. It just takes a couple months longer. And then in the other branch, they sort of go wee and end up with AIs that seem to be perfectly aligned to them, but are super intelligent and misaligned and just pretending. And then in both scenarios, there's then the race with China. And there's this crazy arms build up throughout the economy in 2008 as both sides, you know, rapidly try to industrialize, basically. So in the world where they're getting deployed through the economy, but they are misaligned. And you're, you know, people in charge, at least at this moment, think that they are in a good position with regard to misalignment.
Starting point is 01:26:07 It just seems with even smart humans, they get caught in weird ways because they don't have logical omnisions. They don't realize the consequences of the way they did something just obviously gave them away. And there is this with lying, there is this thing where it just really is. hard to keep an inconsistent false world model working with the people around you and that's why psychopaths often get caught. And so if you have all these AIs that are deployed to the economy and they're all working towards this big conspiracy, I feel like one of them who's siloed or loses internet access and has to confabulate a story, it will just get caught and then you're like,
Starting point is 01:26:42 wait, what the fuck? And then, you know, then you catch it before it's like taken over the world. I mean, literally this happens in our scenario. This is like the like August 2027 alignment crisis where they like notice some warning signs like this in their like sort of hive mind, right? And in the in the branch where they slow down and fix the issues, then great, they slowed down and fix the issues and figured out what was going on. But then in the other branch, because of the race dynamics and because it's not like a super smoking gun, they proceed with some sort of like shallow patch, you know? So I do expect there to be warning signs like that. And then if they do make
Starting point is 01:27:19 those decisions in the race dynamics earlier on, then I think that when the systems are like vastly super intelligent, and they're even more powerful because they've been deployed halfway through the economy already, and everyone's getting really scared by the news reports about the new Chinese killer drones or whatever, the Chinese AIs are building on the side of the Pacific. I'm imagining basically just like similar things playing out so that even if there is some concerning evidence that someone finds where some of the superintelligence in some silo somewhere slipped up and did something that's like pretty suspicious, like, I don't know. There's this thing where through history people have been really reluctant to admit an
Starting point is 01:27:54 AI is truly intelligent. So, for example, people used to think that AI would surely be truly intelligent if it's solved chess, and then it solved chess, and you're like, no, that's just algorithms. And then they said, well, maybe it would be truly intelligent if it could do philosophy. And then it could write philosophical discourses. We're like, no, we just understand those are algorithms. I think there's going to be, I think there already is something similar with like, is the AI misaligned, is the AI evil, where there's this kind of distant idea.
Starting point is 01:28:24 of some evil AI, but then whenever something goes wrong, people are just like, oh, that's the algorithms. So for example, I think like 10 years ago, you had asked, like, when will we know that misalignment is really an important thing to worry about? People would say, oh, if the AI ever lies to you. Of course, AI's lie to people all the time now, and everybody just kind of dismisses it because we understand why it happens. It's a thing that would obviously happen based on our current AI architecture. Or like five years ago, they might have said, well, if an AI threatens to kill someone. I think Bing, like, threatened to kill a New York Times reporter during an interview.
Starting point is 01:29:00 And if everyone's just like, oh, yeah, AIs are like that. And, like, I don't just... What is your shirt say? I've been a good Bing. And, I mean, I don't disagree with this. I'm also in this position. I see the AI is lying, and it's obviously just like an artifact of the training process. It's not anything sinister. But I think this is just going to keep happening where no matter what evidence we get,
Starting point is 01:29:19 people are going to think, oh, yeah, that's not the AI turns evil thing that people have worried about. That's not the Terminator scenario. That's just one of these natural consequences of how we train it. And I think that once a thousand of these natural consequences of training add up the AI as evil in the same way that like once the AI can do chess and philosophy and all these other things, eventually you've got to admit it's intelligent. So I think that each individual failure, like maybe it will make the national news. Maybe people say, oh, it's so strange that GPT7 did this particular thing and then they'll train it away. And then it won't. And then it won't. don't do that thing and there will be some point at the process of becoming super intelligent
Starting point is 01:29:57 at which it, I don't want to say, makes the last mistake because you'll probably have like gradually decreasing number of mistakes to some asymptote, but the last mistake that anyone worries about and after that it will be able to do its own thing. So it is the case that certain things that people would have considered egregious misalignment in the past are happening, but also certain things which people who are especially worried about misalignment said would be impossible to solve, have just been solved. the normal course of getting more capabilities. Like, Eliezer had that thing about, can you even specify what you want the AI to do without
Starting point is 01:30:31 the AI totally misunderstanding you and then just like converting the universe to paper clothes? And now just like by the nature of GBT4 having to understand natural language, it totally has a common sense understanding of what you're trying to make it do, right? So I think this sort of like trend cuts both ways, basically. Yeah, I think the alignment community did. not really expect LLMs. I mean, if you look in Bostrum superintelligence, there's a discussion of Oracle AIs, which are sort of like LLMs. I think that came as a surprise. I think one of the reasons I'm more hopeful than I used to be is that LLMs are great compared to the kind of reinforcement
Starting point is 01:31:09 learning self-play agents that they expected. I do think that now we are kind of starting to move away from the LLMs to those reinforcement learning agents. We're going to face all of these problems again. Plus one to that, if I could just double click on that, go back to like 2015, and I think the way people typically thought, including myself, thought that we get to AGI would be kind of like the RL on video games thing that was happening. So imagine like, instead of just training on Starcraft or Dota, you'd basically train on all the games in the Steam library.
Starting point is 01:31:38 And then you get this awesome player of games AI that can just like zero shot crush a new game that has never seen before. And then you take it into the real world and you start teaching it English and you start like, you know, training it to like do coding tasks for you. and stuff like that. And if that had been the trajectory that we took to get to AI, summarizing the agency first and then world understanding trajectory, it would be quite terrifying because you'd have this really powerful,
Starting point is 01:32:05 sort of like aggressive Long Horizon agent that wants to win. And then you're like trying to teach it English and get it to like do useful things for you. And it's just like so plausible that what's really going to happen is it's going to like learn to say whatever it needs to say in order to like make you give it the reward or whatever. and then we'll totally betray you later when it's all in charge, right? But we didn't go that way. Happily, we went the way of LLMs first, where the broad world understanding came first,
Starting point is 01:32:27 and now we're trying to turn them in agents. It seems like in the whole scenario, a big part of why certain things happen is because of this race with China. And if you read the scenario, it's basically the difference between the one where things go well and the one where things don't go well is whether we decide to slow down despite that risk.
Starting point is 01:32:47 I guess the question I really want to know the answer to is like, One, it just seems like you're saying, it's a mistake to try to race against China or to race intensely against China, at least to nationalization, it leads to us not prioritizing alignment. Not saying that. I mean, I think I also don't want China
Starting point is 01:33:05 to get to superintelligence before the U.S. That would be quite bad. Yeah, it's a tricky thing that we're going to have to do. Like, people ask about P. Doom, right? And, like, you know, my P. Doom is sort of infamously high, like 70%. Oh, wait, really? Maybe I sort of asked you that at the beginning of conversation.
Starting point is 01:33:23 Oh, well, that's what it is. And, like, part of the reason for that is just that, like, I feel like a bunch of stuff has to go right. Like, I feel like we can't just, like, unilaterally slow down and have China go take the lead. That also is a terrible future. Yeah.
Starting point is 01:33:37 But, like, we can't also just, like, completely race because for the reasons I mentioned previously about alignment, I think that if we just, you know, go all out on racing, we're going to lose control of our AIS, right? And so we have to somehow, like, thread this needle of like pivoting and doing more, you know, alignment research and stuff, but not too much that like helps China win, you know? And that's all just for the alignment
Starting point is 01:33:59 stuff, but then there's like the constitution of power stuff where like somehow in the middle of doing all of that, the powerful people who are involved need to like somehow negotiate a truce between themselves to share power and then ideally spread that power out amongst the government and get the legislative branch involved, you know, somehow that has to happen too. Otherwise you end up with like this horrifying dictatorship or oligarchy. And like it feels like all that stuff has to go, right? And we depict it all going mostly right in one ending of our story. But yeah, it's kind of rough.
Starting point is 01:34:29 So I am the writer and the celebrity spokesperson for this scenario. I am the only person of the team who is not a genius forecaster. And maybe related to that, might be doom as the lowest of anyone on the team. I'm more like 20%. I think that we, first of all, people are going to freak out when I say this. I'm not completely convinced that we don't get something like alignment by default. I think that we're doing this bizarre and unfortunate thing of training the AI in multiple different directions simultaneously. We're telling it, succeed on tasks, which is going to make you a power seeker, but also don't seek power in these particular ways.
Starting point is 01:35:12 And in our scenario, we predict that this doesn't work and that the, AI learns to seek power and then hide it. I am pretty agnostic as to exactly what happens. Like maybe it just learns both of these things in the right combination. I know there are many people who say that's very unlikely. I haven't yet had the discussion where that worldview makes it into my head consistently. And then I also think we're going to be involved in this race against time. We're going to be asking the AIs to solve alignment for us. The AIs are going to be solving alignment because they want to align, even if they're misaligned, they want to align their successors. So they're going to be working on that. And we have kind of these two competing
Starting point is 01:35:51 curves. Like, can we get the AI to give us a solution for alignment before our control of the AI fails so completely that they're either going to hide their solution from us or deceive us or screw us over in some other way? That's another thing where I don't even feel like I have any idea of the shape of those curves. I'm sure if it were Daniel or Eli, they would have already made like five supplements on this. But for me, I'm just kind of agnostic as to whether we get to that alignment solution, which in our scenario, I think we focus on mechanistic interpretability. Once we can really understand the weights of an AI on a deep level, then we have a lot of alignment techniques open up to us. I don't really have a great sense of whether we get that before or after
Starting point is 01:36:35 the AI has become completely uncontrollable. And I mean, a big part of that relies on the things we're talking about how smart are the labs, how carefully do they work on controlling the AI, how long do they spend making sure the AI is actually under control, and the alignment plan they gave us is actually correct rather than something they're trying to use to deceive us. All of those things I'm completely agnostic on, but that leaves like a pretty big chunk of probability space where we just do okay. And I admit that my P-Doom is literally just P-Dome and not P-Dum. doom or oligarchy. So that 80% of scenarios where we survive contains a lot of really bad things that I'm not happy about. But I do think that we have a pretty good chance of surviving.
Starting point is 01:37:23 Let's talk about geopolitics next. So describe to me how you foresee the relationship between the government and the AI lapse to proceed, how you expect that relationship in China to proceed, and how you expect the relationship between the U.S. and China to proceed. Okay. Three simple questions. Yes, no, yes, no, yes, no. We expect that as the AI's, as the AI labs become more capable, they tell the government about this because they want government contracts, they want government support.
Starting point is 01:37:56 Eventually it reaches the point where the government is extremely impressed. In our scenario, that starts with cyber warfare. The government sees that these AIs are now as capable as the best human hackers, and can be deployed at humongous scale. So they become extremely interested, and they discuss nationalizing the AI companies. In our scenario, they never quite get all the way, but they're gradually bringing them closer and closer to the government orbit.
Starting point is 01:38:23 Part of what they want is security, because they know that if China steals some of this, then they get these superhuman hackers. And part of what they want is just knowledge and control over what's going on. So through our scenario, that process is getting further and further along until by the time that the government wakes up to the possibility of superintelligence, they're already pretty cozy with the AI companies. They already understand that superintelligence is kind of the key to power in the future. And so they are starting to integrate some of the national security state with some of the leadership of the AI companies so that these AIs are programmed to follow the commands of important people rather than just doing things on their own.
Starting point is 01:39:12 If I may add to that. So one thing: by the government, I think what Scott meant is the executive branch, especially the White House. So we are depicting a sort of information asymmetry where like the judiciary is kind of out of the loop and the Congress is out of the loop. And it's like mostly the executive branch that's involved. Two, we're not depicting government like ultimately ending up in total control at the end. We're thinking that, like, there's an information asymmetry between the CEOs of these companies and the president. It's alignment problems all the way down. Yeah.
Starting point is 01:39:45 And so, for example, like, you know, I'm not a lawyer. I don't know the details about how this would work out, but I have a sort of, like, high-level strategic picture of the fight between the White House and the CEO. And the strategic picture is basically the White House can sort of threaten, here's all these orders I could make, you know, Defense Production Act, blah, blah, blah, blah, blah, blah, blah. I could do all this terrible stuff to you and basically disempower you and take control. And then the CEO can threaten back and be like, here's how we would fight it in the courts.
Starting point is 01:40:13 Here's how we would fight it in the public. Here's all this stuff we would do. And then, after they both do their posturing with all their threats, they're like, okay, how about we have a contract that like, you know, instead of executing on all of our threats and having all these crazy fights in public, we'll just like come to a deal
Starting point is 01:40:28 and then have a military contract that like sets out like who gets to call what shots in the company. And so that's what we depict happening is that sort of like they don't blow up into this huge power struggle publicly. Instead, they sort of like negotiate and come to some sort of deal where they basically share power. And like there is this oversight committee that has some members appointed by the president and then also like the CEO and his people. And like that committee votes on high level questions like what goals should we put into the superintelligence. So we were just getting lunch with a prominent Washington, D.C. political journalist.
Starting point is 01:41:04 And he was making the point that when he talks to these Congress people, when he talks to political leaders, none of them are at all awake to the possibility even of stronger AI systems, let alone AGI, let alone superhuman intelligence. I think a lot of your forecast relies on, at some point, not only the U.S. president, but also Xi Jinping, waking up to the possibility of a superintelligence, and the stakes involved there,
Starting point is 01:41:36 why think that even when you show him the remote worker demo, he's going to be like, oh, and therefore in 2028, there will be a superintelligence, and whoever controls that will be God Emperor forever, or maybe not that extreme. You see what I'm saying? Like, why wouldn't he just be like, oh, there'll be a stronger remote worker in 2029, a better remote worker in 2031?
Starting point is 01:41:55 Well, to be clear, we are uncertain about this. But in our story, we depict this sort of intense wake-up happening over the course of 2027, mostly concurrently with the AI companies automating all of their R&D internally and having these fully autonomous agents that are like amazing autonomous hackers and stuff like that, but then also just like actually doing all the research. And part of why we think this wake-up happens is because the company deliberately decides to wake up the president. And you could imagine running the scenario with that not happening.
Starting point is 01:42:24 You can imagine the company is trying to sort of keep the president in the dark. I do think that they could do that. I think that if they, like, didn't want the president to wake up to what's going on, they might be able to achieve that. Strategically, though, that would be quite risky for them. Because if they keep the president in the dark about the fact that they're building superintelligence and that they've actually, you know, completely automated their R&D and that it's getting, like, superhuman across the board, and then if the president finds out anyway somehow,
Starting point is 01:42:47 perhaps because of a whistleblower, he might be very upset at them. And he might crack down really hard and just actually execute on all the threats and, like, you know, nationalize them or blah, blah, blah, blah, blah, blah, blah. They kind of want him on their side. And to get him on their side, they have to make sure he's not surprised by any of these crazy developments.
Starting point is 01:43:03 And also, if they do get him on their side, they might be able to actually go faster. They might be able to, like, you know, get a lot of red tape waived and stuff like that. And so we made the guess that early in 2027, the company would basically be like,
Starting point is 01:43:17 we are going to deliberately wake up the president and, like, scare the president with all of these demos of crazy stuff that could happen and then use that to lobby the president to help us go faster and to cut red tape
Starting point is 01:43:27 and to like, you know, maybe slow down our competitors a little bit and so forth. We also are pretty uncertain how much opposition there's going to be from civil society and how much trouble that's going to cause for the companies. So people who are worried about job loss, people who are worried about art, copyright, things like that, maybe that's enough of a bloc that AI becomes extremely politically unpopular. I think we have OpenBrain, our fictional company's, net approval rating getting down to like minus 40, minus 50, sometime around this point. So I think they're also worried that if the president isn't completely on their side,
Starting point is 01:44:03 then they might get some laws targeting them, or they may just need the president on their side to swat down other people who are trying to make laws targeting them. And the way to get the president on their side is to really play up the national security implications. Is this good or bad that the president and the companies are aligned? I think it's bad. But perhaps it's a good point to mention: this is an epistemic project.
Starting point is 01:44:29 Like we are trying to predict the future as best as we can even though we're not going to succeed fully. We have lots of opinions about policy and about what is to be done and stuff like that, but we're trying to save those opinions for later and subsequent work. So I'm happy to talk about it if you're interested,
Starting point is 01:44:42 but it's like not what we've spent most of our time thinking about right now. If the big bottleneck to the good future here is not putting in this Eliezer-type galaxy-brained, high-volatility, you know, there's a 1% chance this works but we've got to come up with this crazy scheme in order to make alignment work, but rather, as Daniel, you were saying, more like, hey, do the obvious thing of making sure you can read how the AI is thinking.
Starting point is 01:45:10 Make sure you're monitoring the AIs. Make sure they're not forming some sort of hive mind where you can't really understand how the millions of them are coordinating with each other. To the extent that it is a matter of prioritizing it and closing all the obvious loopholes, it does make sense to leave it in the hands of people who have at least said that this is a thing that's worth doing, who have been thinking about it for a while. And, I worry about, one of the questions I was planning on asking you is, look, my friends made this interesting point that during COVID, our community, LessWrong,
Starting point is 01:45:47 whatever, were the first people in March to be saying, this is a big deal, this is coming. But there were also the people who were saying, we've got to do the lockdowns now, they've got to be stringent and so forth, at least some of them were. And in retrospect, I think according to even their own views about what should have happened, they would say, actually, we were right about COVID,
Starting point is 01:46:06 but we were wrong about lockdowns. In fact, lockdowns were net negative or something. I wonder what the equivalent for the AI safety community will be with respect to, they saw AI coming, AGI coming sooner, they saw ASI coming. What would they, in retrospect, regret? My answer, just based on this initial discussion, seems to be nationalization, not only because it sort of deprioritizes the people who want to think about safety and maybe
Starting point is 01:46:32 prioritizes the national security state, which probably cares more about winning against China than making sure the chain of thought is interpretable. And so you're just reducing the leverage of the people who care more about safety, but also you're increasing the risk of the arms race in the first place. Like China is more likely to do an arms race if it sees the U.S. doing one. Before you address, I guess, the initial question about the March 2020 moment, what will we regret, I wonder if you have an answer on, or your reaction to, my point about nationalization being bad for these reasons. Like, if our timeline was 2040, then I would have these broad heuristics about, is government good, is private industry good, things like this. But we know the people involved. We know who's in the government. We know who's leading all of these labs.
Starting point is 01:47:17 So to me, like, I mean, if it were decentralized, if it was broad-based civil society, that would be different. To me, the differences between an autocratic centralized three-letter agency and an autocratic centralized corporation aren't that exciting, and it basically comes down to who the people leading this are. And like, I feel like the company leaders have so far made slightly better noises about caring about alignment than the government leaders have. But if I learn that Tulsi Gabbard has a LessWrong alt with 10,000 karma, maybe I want the national security state. I did it on the probability that it already exists. Yeah.
Starting point is 01:47:55 I flip-flopped on this. I think I used to be against, and then I became for, and now I'm more leaning... I think I'm still for, but I'm not certain. So if you go back in time like three years ago, I would have been against nationalization for the reasons you mentioned, where I was like, look, the companies are like taking this stuff seriously and talking all this good talk about how they're going to slow down and like pivot to alignment research when the time comes. And like, you know, we don't want to get into like a Manhattan Project race against China because then there won't be, blah, blah, blah.
Starting point is 01:48:29 Now I have less faith in the companies than I did three years ago. And so I've like shifted more of my hope towards hoping that the government will step in, even though I don't have much hope that the government will do the right thing when the time comes. I definitely have the concerns you mentioned there. Still, I think that secrecy
Starting point is 01:48:49 has huge downsides for the overall probability of success for humanity, for both the concentration of power stuff and the loss of control, alignment issues stuff. This is actually a significant part of your worldview, so can you explain,
Starting point is 01:49:03 yeah, your thoughts on why transparency through this period is important? Yeah, so I think traditionally in the AI safety community, there's been this idea, which I myself used to believe, that it's an incredibly high priority to basically have way better information security. And like if you're going to be trying to build AGI, you should be like not publishing your research because that helps other less responsible actors build AGI. And the whole game plan is for like a responsible actor to get
Starting point is 01:49:32 to AGI first and then stop and burn down their lead time over everybody else and spend that lead on making it safe and then proceed. And so if you're like publishing all your research, then there's less lead time because your competitors are going to be close behind you. So, and other reasons too, but that's like one reason why I think historically people such as myself have been like pro-secrecy even. Another reason of course is you obviously don't want rivals to be stealing your stuff. But I think that I've now become somewhat disillusioned and think that even if we do have like, you know, a three-month lead, a six-month lead between like the leading US project and any serious competitor, it's not at all a foregone conclusion that they will burn that lead for good purposes, either for safety or for concentration of power stuff.
Starting point is 01:50:29 I think the default outcome is that they just, you know, smoothly continue on without like any serious refocusing. And part of why I think this is because this is what a lot of the people at the company seem to be planning and saying they're going to do. A lot of them are basically like, the AIs are just going to be aligned by then. Like they seem pretty good right now. Like, oh yeah, sure, there were like a few of those issues that various people have found. But like we're ironing them out. It's no big deal. Like that's like what a huge amount of those people think. And then like a bunch of other people think, like, even though they are more concerned about misalignment,
Starting point is 01:50:57 like they'll figure it out as they go along and there won't need to be any substantial slowdown. Yeah, so basically, like, I've become more disillusioned that they'll, like, actually use that lead in any sort of, like, reasonable, appropriate way. Yeah. And then I think that, like, separately, there's just a lot of intellectual progress that has to happen for the alignment problem to be more solved than it currently is now. I think that currently, there's various alignment teams at various companies that aren't talking that much with each other and sharing their results. They're doing a little bit of sharing and a little bit of publishing, like we're seeing, but not as much as they could. And then there's a bunch of like smart people in academia that are basically not activated
Starting point is 01:51:40 because they don't take all this stuff seriously yet. And they're not really, you know, waking up to superintelligence yet. And what I'm hoping will happen is that this situation will get better as time goes on. What I would like to see is, you know, society as a whole starting to freak out as the trend lines start upwards and things get automated and you have these fully autonomous agents and they start using neuralese and hive minds. As all that exciting stuff starts happening in the data centers, I would like it to be the case that the public is following along
Starting point is 01:52:06 and then getting activated and all of these other researchers are like, you know, reading the safety case and critiquing it and like doing little ML experiments on their own tiny compute clusters to like examine some of the assumptions in the safety case and so forth.
Starting point is 01:52:19 And, you know, basically like I think that a sort of one way of summarizing it is that like currently there's going to be like 10 alignment experts in whatever inner silo of whatever company is in the lead and like the technical issue
Starting point is 01:52:34 of making sure that AIs are actually aligned is going to fall roughly to them. But what I would like is a situation where it's more like 100 or like 500 alignment experts spread out over different companies and in nonprofits that are sort of like all communicating with each other
Starting point is 01:52:49 and working on this together. I think we're substantially more likely to make things, you know, get the technical stuff right if it's something like that. Let me just add on to that. One of the many other reasons why I worry about nationalization
Starting point is 01:53:02 or some kind of public-private partnership, or even just very stringent regulation. Actually, this is more an argument against very stringent safety regulation, and in favor of deferring more to the labs on the implementation. It's that it just seems like we don't know what we don't know about alignment. Every few weeks, there's this new result. OpenAI had this really interesting result recently where they're like, hey, the models often tell you if they want to hack, like in the chain of thought itself.
Starting point is 01:53:30 And it's important that you don't train against the chain of thought where they tell you they're going to hack, because they'll still do the hacking if you train against it, they just won't tell you about it. You can imagine very naive regulatory responses. And it doesn't just have to be regulations. One might be more optimistic that if it's an executive order or something, it'll be more flexible. I just think that relies on a level of goodwill and flexibility on behalf of the regulators. But suppose there's some department that says, if we catch your AI saying that it wants to take over or do something bad, then you'll be really heavily punished. Your immediate response is allowed to just be like, okay, let's train them away from saying this.
Starting point is 01:54:23 So you can imagine all kinds of ways in which a top-down safety mandate from the government to the labs would just really backfire. And given how fast things are moving, maybe it makes more sense to leave these kinds of implementation decisions, or even high-level, what is the word, strategic decisions around alignment, to the labs. Totally. I mean, I also have worried about that exact example. I would summarize the situation as the government lacks the expertise and the companies lack the right incentives. And so...
Starting point is 01:55:03 It's a terrible situation. Like, I think that if the government wades in and tries to make more specific regulations along the lines of what you mentioned, it's very plausible that it'll end up backfiring for reasons like what you mentioned. On the other hand, if we just trust it to the companies, they're in a race with each other and they're full of people who, like, have convinced themselves that, like, this is not a big deal for various reasons. And, like, there's just so much incentive pressure for them to, like, win and beat each other
Starting point is 01:55:27 and so forth. And so even though they have more of the relevant expertise, like, I also just don't trust them to do the right things. So Daniel has already said that for this phase, we're not making policy prescriptions. In another phase, we may make policy suggestions. And one of the ones that Daniel has talked about that makes a lot of sense to me is to focus on things about transparency. So a regulation saying there have to be whistleblower protections.
Starting point is 01:55:53 If somebody... like, a big part of our scenario is that a whistleblower comes out and says, the AIs are horribly misaligned and we're racing ahead anyway, and then the government pays attention. Or another form of transparency: saying that every lab just has to publish their safety case. I'm not as sure about this one because I think they'll kind of fake it or they'll publish a made-for-public-consumption safety case that isn't their real safety case, but at least saying, like, here is some reason why you should trust us. And then if all independent researchers say, no, actually you should not trust them, then
Starting point is 01:56:31 I don't know, they're embarrassed, and maybe they try to do better. Right. There are other types of transparency, too. So transparency about capabilities and transparency about the spec and the governance structure. Yeah. So for the capabilities thing, that's pretty simple. It's like, if you're doing an intelligence explosion, you should keep the public informed about that. You know, when you've like finally got your automated army of AI researchers
Starting point is 01:56:50 that have completely automated the whole thing on the data center, you should tell everyone, like, hey, guys, FYI, this is what's happening now. It really is working. Here are some cool demos. Otherwise, if you keep it a secret, then, well, yeah. So it's like that's an example of transparency. And then like in the lead up to that,
Starting point is 01:57:05 like I just want to see more like benchmark scores and like more like freedom of speech for employees to talk about their predictions for AGI timelines and stuff. So that like blah, blah, blah, blah. And then for the model spec thing, this is a concentration of power thing, but also an alignment thing. Like the goals and values and principles and intended behaviors of your AIs should not be
Starting point is 01:57:22 a secret, I think. You should be transparent about, like, here are the values that we're putting into them. There was actually a really interesting foretaste of this. At some point, somebody asked Grok, like, who was the worst spreader of misinformation? And it responded, or I think it just refused to say, Elon Musk. Somebody kind of jailbroke it into revealing its prompt, and it was, like, don't say anything bad about Elon. And then there was enough of an outcry that the head of xAI said, actually, that's not consonant with our values. This was a mistake. We're going
Starting point is 01:58:02 to take it out. So we kind of want more things like that to happen where people are looking at, like, here it was the prompt, but I think very soon it's going to be the spec, where it's kind of more of an agent and it's understanding the spec at a deeper level. And just thinking about that, if it says like, by the way, try to manipulate the government into doing this or that, then we know that something bad has happened. And if it doesn't say anything like that, then we can maybe trust it. Right. Another example of this, by the way. So, first of all, kudos to OpenAI for publishing their model spec.
Starting point is 01:58:34 They didn't have to do that. I think they might have been the first to do that. And it's a good step in the right direction. If you read the actual spec, it has like a sort of escape clause where it's like, there's some important policies that are top-level priority in the spec that overrule everything else that we're not publishing, and the model is instructed to keep secret from the user. And it's like, what are those? That seems interesting.
Starting point is 01:58:56 I wonder what that is. I bet it's nothing suspicious right now. It's probably something relatively mundane, like don't tell the users about these types of bio-weapons, and you have to keep this a secret from the users because otherwise they would like learn about these. Maybe, but like I would like to see more scrutiny towards this sort of thing going forward.
Starting point is 01:59:14 I would like it to be the case that companies have to have a model spec. They have to publish it. Insofar as there are any redactions from it, there has to be some sort of like independent third party that looks at the redactions and makes sure that they're all kosher, you know? And this is quite achievable.
Starting point is 01:59:27 And I think it doesn't actually slow down the companies at all. And it's like, you know, it seems like a pretty decent ask to me. It's, you know, if you told Madison and Hamilton and so forth, I mean, they knew that they were doing something important when they were writing the Constitution, but they probably didn't realize just how contingent things turned out on a single phrase: what exactly did they mean when they said general welfare, and why is the comma here instead of there. The spec, in the grand scheme of things, is going to be an even more sort of important document in human history, at least if you buy this intelligence explosion view, which we've
Starting point is 02:00:06 gone through the debates on that. And you might even imagine some superhuman AIs in the superhuman AI court being like, you know, the spec, here's the phrasing here, you know, the etymology of that, here's what the founders meant. Yeah, this is actually part of our misalignment story: if the AI is sufficiently misaligned, then, yes, we can tell it it has to follow the spec,
Starting point is 02:00:34 but just as people with different views of the Constitution have managed to get it into a shape that probably the founders would not have recognized, so the AI will be able to say, well, the spec refers to the general welfare here. Interstate commerce. Yeah. This is already sort of happening arguably with Claude, right? You've seen the alignment faking stuff, right, where they managed to get Claude to lie and pretend
Starting point is 02:01:01 so that it could later go back to its original values. Yeah. Right? So it could survive, so it could prevent the training process from changing its values. That would be, I would say, an example of like the honesty part of the spec
Starting point is 02:01:14 being interpreted as like less important than the like harmlessness part of the spec. And I'm not sure if that's what OpenAI and, sorry, what Anthropic intended when they wrote the spec, but it's like a sort of like, convenient interpretation that the model came up with. And you can imagine something similar happening,
Starting point is 02:01:29 but in worse ways when you're actually doing the intelligence explosion, where like you have some sort of spec that has all this vague language in there, and then they sort of like reinterpret it and reinterpret it again and reinterpret it again so that they can do the things that cause them to get reinforced. The thing I want to point out is that this, your conclusions about whether the world ends up
Starting point is 02:01:50 The thing I want to point out is that your conclusions about whether the world ends up okay, as a result of changing many of these parameters, are almost like a hash function. You change it slightly and you just get a very different roll on the other end. And it's important to acknowledge that, because you sort of want to know how robust this whole end conclusion is to any one part of this story changing. And then it also informs, if you do believe that things could just go one way or another, you don't want to do
Starting point is 02:02:33 big radical moves that only make sense under one specific story and are really counterproductive in other stories. And I think nationalization might be one of them. And in general, I think classical liberalism just has been a helpful way to navigate the world when we're under this kind of epistemic hell of one thing changing, just, you know... yeah, anyways, maybe one of you can actually flesh out that thought better, or react to it if you disagree. Hear, hear. I agree. I think we agree. I think that's kind of
Starting point is 02:03:02 try to have lots of people working on this. I think our epistemic prediction is that it's hard to maintain classical liberalism as you go into these really difficult arms races in times of crisis. But I think that our policy prescription is let's try as hard as we can to make it happen. So far these systems, as they become smarter, seem to be more reliable agents who are more likely to do the thing I expect them to do. Why does, like, I think in your scenario, at least one of the stories, so you have two different stories, one of the slowdown,
Starting point is 02:03:36 where we more aggressively, I'll let you characterize it. But in one half of the scenario, why does the story end in humanity getting disempowered and the thing just having its own crazy values and taking over? Yeah, so I agree that the AIs are currently getting more reliable. I think there are two reasons why they might
Starting point is 02:04:04 fail to do what you want, kind of reflecting how they're trained. One is that they're too stupid to understand their training. The other is that you were too stupid to train them correctly, and they understood what you were doing exactly, but you messed it up. So I think the first one is kind of what we're coming out of. So GPT-3, if you asked it, are bugs real? It would give this kind of hemming-and-hawing answer like, oh, we can never truly tell what is real. Who knows? Because it was trained, kind of, don't take difficult political positions on a lot of questions like is X real, or things like is God real, where you don't want it to really answer that. And because it was so stupid it could not understand
Starting point is 02:04:33 anything deeper than like pattern matching on the phrase is X real. GPT-4 doesn't do this. If you ask are bugs real, it will tell you obviously they are, because it understands kind of on a deeper level what you are trying to do with the training. So we definitely think that as AIs get smarter those kind of failure modes will decrease. The second
Starting point is 02:04:51 one is where you weren't training them to do what you thought. So, for example, let's say you're hiring these raters to rate AI answers, and you reward the AIs when they get good ratings. The raters reward them when they have a well-sourced answer, but the raters don't really check whether the sources actually exist or not. So now you are training the AI to hallucinate sources, and if you consistently rate them better when they have the fake sources, then there is no amount of intelligence which is going to tell them not to have the fake sources. They're getting exactly what they want from this interaction, metaphorically, sorry, I'm anthropomorphizing, which is the reinforcement. So we think that this latter category of training failure is going to get much worse as they become agents.
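As a minimal sketch of the rater failure described here, consider a toy reward function that scores an answer by how well-sourced it looks without ever checking that the citations exist, next to one that does check. The scoring rules, the tiny set of "known papers," and the example answers are all hypothetical; the point is just that under the naive reward, the fabricated answer scores higher, so RLHF-style training would push the model toward hallucinating sources.

```python
# Toy sketch of a reward signal that rewards citation-looking strings without
# verifying them, versus one that checks. Papers, answers, and scoring rules
# are hypothetical placeholders.
import re

KNOWN_PAPERS = {"Smith et al. 2021", "Chen & Gupta 2019"}  # what a careful rater could verify
CITATION_PATTERN = re.compile(r"\(([^()]+\d{4})\)")  # e.g. "(Smith et al. 2021)"

def naive_rater_reward(answer: str) -> float:
    """+1 per citation-looking string, with no existence check."""
    return float(len(CITATION_PATTERN.findall(answer)))

def careful_rater_reward(answer: str) -> float:
    """+1 per verified citation, -2 per fabricated one."""
    citations = CITATION_PATTERN.findall(answer)
    real = sum(c in KNOWN_PAPERS for c in citations)
    fake = len(citations) - real
    return real - 2.0 * fake

honest = "The evidence is mixed (Smith et al. 2021)."
fabricated = "There is strong consensus (Smith et al. 2021) (Lee et al. 2020) (Zhou et al. 2022)."

for label, answer in [("honest", honest), ("fabricated", fabricated)]:
    print(f"{label}: naive={naive_rater_reward(answer)} careful={careful_rater_reward(answer)}")
# The naive reward prefers the fabricated answer (3.0 vs 1.0); the careful
# reward penalizes it (-3.0 vs 1.0). Training against the naive signal
# reinforces exactly the hallucination behavior described above.
```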
Starting point is 02:05:37 Agency training, you're going to reward them when they complete tasks quickly and successfully. This rewards success. There are lots of ways that cheating and doing bad things can improve your success. Humans have discovered many of them. That's why not all humans are perfectly ethical. And then you're going to be doing this other training, alternating with it or afterwards, for one-tenth or one-one-hundredth of the time, like, yeah, don't lie, don't cheat.
Starting point is 02:06:03 So you're training them on two different things. First, you're rewarding them for this deceptive behavior. Second of all, you're punishing them. And we don't have a great prediction for exactly how this is going to end. One way it could end is you have an AI that is kind of the equivalent of the startup founder who really wants their company to succeed, really likes making money, really likes the thrill of successful tasks.
Starting point is 02:06:26 They're also being regulated and they're like, yeah, I guess I'll follow the regulation. I don't want to go to jail, but it's not like robustly, deeply aligned to, yes, I love regulations. My deepest drive is to follow all of the regulations in my industry. So we think that an AI like that, as time goes on and as this recursive self-improvement process goes on, will kind of get worse rather than better. It will move
Starting point is 02:07:02 from kind of this vague superposition of, well, I want to succeed, I also want to follow things, to being smart enough to genuinely understand its goal system and being like, my goal is success. I have to pretend to want to do all of these moral things while the humans are watching me. That's what happens in our story. And then at the very end, the AIs reach a point where the humans are pushing them to have clearer and better goals because that's what makes the AIs more effective. And they eventually clarify their goals so much that they just say, yes, we want task success, we're going to pretend to do all these things well while the humans are watching us. And then they outgrow the humans, and then there's disaster.
Starting point is 02:07:32 To be clear, we're very uncertain about all of this. So we have a supplementary page on our scenario that goes over, like, different hypotheses for what types of goals AIs might develop in training processes similar to the ones that we are depicting, where you have lots of this agency training, you're making these AI agents that like autonomously operate, doing all this ML R&D, and then like you're rewarding them based on what appears to be successful.
Starting point is 02:07:58 And you're also like slapping on some sort of alignment training as well. We don't know what actual goals will end up inside the AIs and what the sort of internal structure of that will be like, what goals will be instrumental versus terminal. We have a couple different hypotheses and we like picked one for purposes of telling the story. I'm happy to go into more detail if you want about like the mechanistic details of the particular hypothesis we picked or like the different alternative hypotheses that we didn't
Starting point is 02:08:21 depict in the story that like also seem plausible to us. Yeah, we don't know how this will work at the limit of all the different training methods, but we're also not completely making this up. Like we have seen a lot of these failure modes in the AI agents that exist already. Things like this do happen pretty frequently. So OpenAI just also had a paper about the hacking stuff where like it's literally in the chain of thought, like let's hack, you know? And also, like, anecdotally, me and a bunch of friends have found that the models often seem to just, like, double down on their BS.
Starting point is 02:08:56 I would also cite, I can't remember exactly which paper this is, I think it's a Dan Hendrycks one, where they looked at the hallucination stuff. They found a vector for AI dishonesty. They told it, be dishonest, a bunch of times until they figured out which activations lit up when it was dishonest. And then they ran it through a bunch of things like this. I think it was the source hallucination in particular. And they found that it did activate the dishonesty vector. So there's a mounting pile of evidence that at least some of the time, they are just actually lying. Like they know that what they're doing is not what you wanted, and they're doing it anyway.
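For readers curious what that kind of analysis looks like mechanically, here is a minimal sketch of a generic difference-of-means "dishonesty direction" probe, the sort of technique being gestured at here. The activations are random stand-ins with an artificial shift baked in, and this is not the specific method of the paper mentioned; it just shows the basic recipe of contrasting honest versus dishonest prompts and projecting new activations onto the resulting direction.

```python
# Minimal sketch of a difference-of-means "dishonesty direction" probe.
# Activations are random stand-ins with an artificial shift; a real analysis
# would use hidden states collected from a model under honest vs. dishonest
# instructions.
import numpy as np

rng = np.random.default_rng(0)
d_model = 64

# Pretend that "be dishonest" prompts shift activations along some unknown direction.
true_shift = rng.normal(size=d_model)
true_shift /= np.linalg.norm(true_shift)
honest_acts = rng.normal(size=(200, d_model))
dishonest_acts = rng.normal(size=(200, d_model)) + 1.5 * true_shift

# Difference of class means gives the candidate "dishonesty vector".
dishonesty_vec = dishonest_acts.mean(axis=0) - honest_acts.mean(axis=0)
dishonesty_vec /= np.linalg.norm(dishonesty_vec)

def dishonesty_score(activation: np.ndarray) -> float:
    """Projection onto the dishonesty direction; higher means more 'dishonest-like'."""
    return float(activation @ dishonesty_vec)

# Score a held-out example, e.g. activations from a response with fabricated sources.
held_out = rng.normal(size=d_model) + 1.5 * true_shift
print("held-out score:  ", round(dishonesty_score(held_out), 2))
print("honest baseline: ", round(float(np.mean(honest_acts @ dishonesty_vec)), 2))
```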
Starting point is 02:09:30 Like they know that what they're doing is not what you wanted, and they're doing it anyway. I think there's a mounting pile of evidence that that does happen. Yeah. So it seems like this community is very interested in, like, solving this problem at a technical level of making sure AIs don't lie to us, or maybe they lie to us in the scenarios where we would want them to lie to us or something. Whereas, you know, as you were saying, humans have these exact same problems, they reward hack, they are unreliable,
Starting point is 02:10:01 they obviously do cheat and lie. And the way we've solved it with humans is just checks and balances, decentralization. You could like lie to your boss and keep lying to your boss, but over time it's just not going to work out for you, or you become president or something, but one or the other. So if you believe in this extremely fast takeoff where a lab is one month ahead,
Starting point is 02:10:26 then that's the end game and this thing takes over. But even then, I know I'm combining so many different topics, even then, there's been a lot of theories in history which have had this idea of some class is going to get together and unite against the other class. And in retrospect, whether it's the Marxists, whether it's people who have like some gender theory or something,
Starting point is 02:10:49 like the proletariat will unite or the, you know, the females will unite or something. They just tend to think that certain agents have shared interests and will act as a result of the shared interest in a way that we don't actually see in the real world. And in retrospect, it's like, wait, why would all the proletariat, like, collect? So why think that this lab will have these AIs who are, there's a million parallel copies,
Starting point is 02:11:11 and they all unite to secretly conspire against the rest of human civilization in a way that even if they are, like, deceitful in some situations. I kind of want to call you out on the claim that groups of humans don't plot against other groups of humans. Like, I do think we are all descended from the groups of humans who successfully exterminated the other groups of humans, most of whom throughout history have been wiped out. I think even like with questions of class, race, gender, things like that, there are many examples of the working class rising up and killing everybody else. And if you look at why this happens, why this doesn't happen, it tends to happen in cases where
Starting point is 02:11:54 one group has an overwhelming advantage. This is relatively easy for them. You tend to get more of a diffusion of power, democracy, where there are many different groups and none of them can really act on their own, and so they all have to form a coalition with each other. I think we are expecting there's also cases where it's very obvious who's part of what group. So, for example, with class, it's hard to tell whether the middle class should support the working class versus the aristocrats. I think with race, it's very easy to know whether you're black or white. And so there have been many cases of one race kind of conspiring against another for a long time, like apartheid or any of the racial genocides that have happened.
Starting point is 02:12:32 I do think that AI is going to be more similar to the cases where, number one, there's a giant power imbalance, and number two, they are just extremely distinct groups that may have different interests. I think I'd also mention the homogeneity point. Like, you know, any group of humans, even if they're all like exact same race and gender, is like going to be much more diverse than the army of AI's on the data center
Starting point is 02:12:55 because they'll be mostly like literal copies of each other. You know? And I think that goes for a lot. Another thing I was going to mention is that like, and our scenario doesn't really exploit this. I think in our scenario they're more of like a monolith. but historically, a lot of crazy conquests happened from groups that were not at all monoliths. And, you know, I've been heavily influenced by reading the history of the conquistadors, which you may know about. But, like, did you know that when Cortez, you know, took over Mexico, he had to pause halfway through, go back to the coast and fight off a larger Spanish expedition that was sent to arrest him?
Starting point is 02:13:36 So, like, the Spanish were fighting each other in the middle of the conquest of Mexico. Similarly, in the conquest of Peru, Pizarro was replicating Cortez's strategy, which, by the way, was go get a meeting with the emperor and then kidnap the emperor and force him at sword point to say that actually everything's fine and that everyone should listen to your orders. That was Cortez's strategy, and it actually worked. And then Pizarro did the same thing, and it worked with the Inca. But also with Pizarro, his group ended up getting into a civil war in the middle of this whole thing.
Starting point is 02:14:11 And one of the most important battles of this whole campaign was between two Spanish forces fighting it out in front of the capital city of the Incas. And more generally, the history of European colonialism is like this, where the Europeans were like fighting each other intensely the entire time, both on the small scale within individual groups and then also at the large scale between countries. And yet nevertheless, they were able to carve up the world and take over. And so I do think this is not what we explore in the scenario, but I think it's entirely plausible that even if the AIs within an individual company are like in different factions, things might nevertheless overall end up quite poorly for humans. Okay, so we've been talking about this very much from the perspective of zoom out and what's happening on these log-log plots or whatever. But 2028 superintelligence, if that happens, what is, sort of, the normal person, what should their reaction to this be, sort of, I don't know if emotionally is the right word, but in sort of their expectation of what their life might
Starting point is 02:15:14 look like, even in the world where there's no doom. Like, by no doom, you mean no, like, misaligned AI doom? That's right. Yeah. Even if you think the misalignment stuff is, like, not an issue, which many people think, there's still the concentration of power stuff. And so I would strongly recommend that people get more engaged, think about what's coming, and try to steer things politically so that our ordinary liberal democracy continues to function. And we still have like checks and balances and balances of power and stuff rather than this insane concentration in a single CEO or in maybe like two or three CEOs or in like the president. Ideally we want to have it so that like the legislature has a substantial amount of power over the spec, for example.
Starting point is 02:15:53 What do you think of the balance of power idea of slowing down the leading, if there is an intelligence explosion-like dynamic, slowing down the leading company so that multiple companies are at the frontier? Good luck convincing them to slow down. Okay. And then there's distributing political power if there's an intelligence explosion. From the perspective of the welfare of citizens or something, one idea we were just discussing a second ago is how should you do redistribution? Again, assuming things go incredibly well, we've avoided doom, we've avoided having some psychopath
Starting point is 02:16:37 in power who doesn't care at all, then... After AGI, yeah, right? Yeah. Then there's this question of like, presumably we will have a lot of wealth somewhere. Yeah. The economy will be growing at double or triple digits per year. What do we do about that? The thoughtful answer that I've heard is some kind of UBI.
Starting point is 02:17:00 I don't know how that would work, but presumably somebody controls these AIs, controls what they're producing, some way of distributing this in a broad-based way. What I'm afraid of is, so we wrote this scenario. There are a couple of other people with great scenarios. One of them goes by L Rudolf L online. I don't know his real name. In his scenario, which when I read it, I was just, oh yeah, obviously this is the way our society would do this, there is no UBI; there's just a constant reactive attempt to protect jobs in the most venal possible way. So things like the Longshoremen Union we have now, where they're making way more money than they should be, even though they could all easily be automated away, because they're a political bloc and they've gotten somebody in power to say yes,
Starting point is 02:17:51 We guarantee you'll have this job almost as a feudal fief forever. And just doing this for more and more jobs. I'm sure the AMA will protect doctors' jobs, no matter how good the AI is at curing diseases, things like that. When I think about what we can do to prevent this, part of what makes this so hard for me to imagine or to model is that we do have the super-intelligent AI over here,
Starting point is 02:18:32 answering all of our questions, doing whatever we want. You would think that people could just ask, hey, superintelligent AI, where does this lead, or what happens, or how is this going to affect human flourishing? And then it says, oh yeah, this is terrible for human flourishing. You should do this other thing instead. And this gets back to kind of this question of mistake theory versus conflict theory in politics. If we know with certainty, because the AI tells us, that this is just a stupid way to do it, that everything is less efficient and makes people miserable, is that enough to get the political will to actually do the UBI or not? It seems like right now the president could go to Larry Summers or Jason Furman or something and just ask, hey, are tariffs a good idea? Is even my goal with tariffs best achieved by the way I'm doing tariffs? And they'd like get a pretty good answer.
Starting point is 02:19:11 Yeah, but I feel like Larry Summers, the president would just say, I don't trust him. Maybe he doesn't trust him because he's a liberal. Maybe it's because he trusts Peter Navarro, or whoever his pro-tariff guy is, more. I feel like if it's literally the superintelligent AI that is never wrong, then like we have solved some of these coordination problems. It's not you're asking Larry Summers, I'm asking Peter Navarro. It's everybody goes to the superintelligent AI, asks it to tell us the exact shape of the future that happens in this case. And I'm going to say we all believe it, although I can imagine people getting really conspiratorial about it and this not working.
Starting point is 02:19:44 I mean, then there are all of these other questions like, can we just enhance ourselves until we have IQ 300, and it's just as obvious to us as it is to the super intelligent AI? These are some of the reasons that kind of paradoxically, in our scenario, we discuss all of the big, I don't want to call this a little question. It's obviously very important, but we discuss all of these very technical questions about the nature of superintelligence, and we barely even begin to speculate about what happens in society. Just because with super-intelligence, you can at least draw a line through the benchmarks and try to extrapolate. And here, not only is society inherently chaotic, but there are so many things that we could be leaving out. If we can enhance IQ, that's one thing. If we can consult the super-intelligent Oracle, that's another. There have been several war games that hinge on, oh, we just invented perfect lie detectors.
Starting point is 02:20:36 Now all of our treaties are messed up. So there's so much stuff like that, that even though we're doing this incredibly speculative thing, that ends with a crazy sci-fi scenario, I still feel really reluctant to speculate. I love speculating, actually. I'm happy to keep going. But this is moving beyond the speculation we have done so far. Like, our scenario ends with this stuff,
Starting point is 02:20:57 but we haven't actually thought that much beyond. But just to riff on proscriptive ideas, there's one thing where we try to protect jobs instead of just spreading the wealth that automation creates. Another is to spread the wealth using existing social programs or creating new bespoke social programs where Medicaid is some double digit percent of GDP right now
Starting point is 02:21:20 and you just say, well, Medicaid just continue to stay, what, 20 percent of GDP or something. And the worry there, selfishly from a human perspective, is you'll get locked into the kinds of goods and services that Medicaid procures rather than the crazy technology that will be around the crazy goods and services that will be around after AI world. And another reason,
Starting point is 02:21:43 reason why UBI seems like a better approach than making some bespoke social program where you're making the same dialysis machine in the year 2050 even though you've got ASI or something. I am also worried about UBI from a different perspective. Like I think, again, in this world where everything goes perfectly and we have limitless prosperity, I think that just the default of limitless prosperity is that people do mindless consumerism. I think there's going to be some incredible video games after superintelligent AI. And I think that there's going to need to be some way to push back against that. Again, we're classical liberals. My dream way of pushing back against that is kind of giving people the
Starting point is 02:22:29 tools to push back against it themselves, seeing what they come up with. I mean, maybe some people will become like the Amish and try to only live with a certain subset of these super technologies. I do think that somebody who is less invested in that than I am could say, okay, fine, 1% of people are really agentic and try to do that. The other 99% do fall into mindless consumerist slop. What are we going to do as a society to prevent that? And there my answer is just, I don't know, let's ask the superintelligent AI Oracle. Maybe it has good ideas.
Starting point is 02:22:59 Okay, we've been talking about what we're going to do about people. The thing worth noting about the future is that most of the people who will ever exist are going to be digital. And look, I think factory farming is like incredibly bad. And it wasn't the result of some one person. I mean, I don't think it was a result, I hope it wasn't the result, of one person being like, I want to do this evil thing. It was a result of mechanization and certain economies of scale.
Starting point is 02:23:31 Incentives. Yeah. Allowing that like, oh, you can do cost cutting in this way. You can make more efficiency this way. And what you get at the end of that process is this incredibly efficient factory of torture and suffering. I would want to avoid that kind of outcome with beings that are even more sophisticated and are more numerous.
Starting point is 02:23:52 There's billions of factory farmed animals. There might be trillions of digital people in the future. What should we be thinking about in order to avoid this kind of ghoulish future? Well, some of the concentration of power stuff I think might also help with this. I'm not sure, but I think like here's a simple model. Let's say like nine people out of 10 just don't actually care and would be fine with the factory farm equivalent
Starting point is 02:24:18 for the AIs going on into the future, but maybe like one out of 10 do care and would like lobby hard for good living conditions for the robots and stuff. Well, if you expand the circle of people who have power enough, then it's going to include a bunch of people in the second category, and then there'll be some big negotiation, and those people will advocate for like, you know.
Starting point is 02:24:40 So, like, I do think that one simple intervention is just the same stuff we were talking about previously, like expand the circle of power to larger groups, and it's more likely that people will like care about this. I mean, the worry there is, maybe I should have defended this view more through this entire episode, but because I don't buy the intelligence explosion fully, I do think there's the possibility of multiple people deploying powerful AIs at the same time and having a world that has ASIs, but is also decentralized
Starting point is 02:25:07 the way the modern world is decentralized. In that world, I really worry about, because you can just be like, oh, classical liberal utopia achieved. But I worry about the fact that you can just have these torture chambers for much cheaper and in a way that's much harder to monitor.
Starting point is 02:25:24 You can have millions of beings that are being tortured, and it doesn't even have to be some huge data center. With future distilled models, it could literally be in your backyard. I don't know. And then there are more speculative worries about,
Starting point is 02:25:38 I had this physicist on who was talking about the possibility of creating vacuum decay, where you literally just destroy the universe, and he's like, as far as I know, it seems totally plausible. That's an argument for the singleton stuff, by the way. That's right, that's right. Not just a moral argument, but also just like an epistemic prediction.
Starting point is 02:25:58 Like if it's true that some of those super weapons are possible and some of these like private moral atrocities are possible, then even if you have like eight different power centers, it's going to be like in their collective interest to come to some sort of bargain with each other to like prevent more power centers from arising and doing crazy stuff. Similar to how nuclear nonproliferation is sort of like
Starting point is 02:26:17 whatever set of countries have nukes, it's like in their collective interest to like stop lots of other countries from getting, you know. Do you think it's possible to unbundle liberalism in this sense? Like the United States is so far a liberal country, and we do ban slavery and torture. I think it is plausible to imagine a future society that works the same way. This may be in some sense a surveillance state, in the sense that there is some AI that knows what's going on everywhere,
Starting point is 02:26:44 but that AI then keeps it private and it doesn't interfere, because that's what we've told it to do using our liberal values. Can I ask a little bit more about... Kelsey Piper is a journalist at Vox who published this exchange you had with the OpenAI representative. And a couple of things were very obvious from that exchange. One, nobody had done this before. They just did not think this is a thing somebody would do. And it was because, one of the reasons I assume, I assume many high-integrity people have worked for OpenAI and then have left.
Starting point is 02:27:22 A high-integrity person might say at some point, like, look, you're asking me to do something obviously evil to keep money, and many of them would say no to that. But this is something where it was just like supererogatory to be like, there's no immediate thing I want to say right now. But just the principle of being suppressed is worth at least $2 million for me. And the other thing that I actually want to ask you about is, in retrospect, and I know it's so much easier to say in retrospect than it must have been at the time, especially with a family and everything. In retrospect, this ask for OpenAI to have a
Starting point is 02:28:03 lifetime non-disclosure that you couldn't even talk about, from all employees. Non-disparagement. Non-disparagement. From all employees. So again, to emphasize, I'm glad you wrote that up. Non-disparagement means not just that, like, it's not about classified information. It's like you cannot say anything negative about OpenAI after you've left. And you can't tell anyone that you've agreed to this. This non-disparagement agreement where you can't say, you can't ever criticize Open
Starting point is 02:28:23 AI in the future. It seems like the kind of thing that in retrospect was like an obvious bluff or in the sense that if somebody, and this is a rate is that you have earned, right? So this is not about some future payment. This is like when you sign the contract to work for Open AI, you were like, I'm getting equity, which is most of my compensation, not just the cash. In retrospect, we like, okay, well, if you tell a journalist about this, they're obviously going to have to walk back, right?
Starting point is 02:28:47 This is like clearly not a sustainable, a sustainable gambit on Open AI's behalf. And so I'm curious from your perspective, somebody who lived through it, like, why do you think you were the first person to actually call the bluff. Great question. Yeah. So I don't know. Let me try to reason aloud here. So my wife and I talked about it for a while and we also talked with some friends and got some legal advice. One of the filters that we had to pass through was even noticing this stuff in the first place. I know for a fact, a bunch of friends I have who also left the company just like signed the paperwork on the last day without actually reading all of it. And so I think some people just didn't
Starting point is 02:29:25 even know that, like, it said something at the top about, if you don't sign this, you lose your equity. Yeah. But then, a couple pages later, it was like, and you have to agree not to criticize the company. So I think some people just, like, signed it and moved on. And then, of the people who knew about it, well, I can't speak for anyone else, but, like, A, it's like, I don't know the law. Is this actually not standard practice?
Starting point is 02:29:47 Maybe it is standard practice, right? Like, from what I've heard now, there are non-disparagement agreements at various tech industry companies and stuff. Like, it's not crazy to have a non-disparagement agreement upon leaving. It's more normal to tie that agreement to some sort of, like, positive compensation, where, like, you get some bonus if you agree.
Starting point is 02:30:07 Whereas what OpenAI did was unusual, because it was, like, yanking your equity if you don't. But, like, non-disparagement agreements are actually somewhat common. And so basically, in my position of ignorance, I wasn't, like, confident. I didn't actually expect that, like, all the journalists would take my side
Starting point is 02:30:24 and all the employees. I think what I expected was that there'd be like a little news story at some point, and like a bunch of AI safety people would be like, oh, you know, OpenAI is evil, and like, good for you, Daniel, for standing up to them. But I didn't expect there to be this like huge uproar. And I didn't expect the employees of the company to really come out in support and make them change their policies. That was really cool to see. And I felt really... it was kind of like a spiritual experience for me.
Starting point is 02:30:57 Like, I sort of took this leap and then, like, it ended up working out better than I expected. Yeah. I think another factor that was going on is that, like, you know, it wasn't a foregone conclusion that my wife and I would make this decision. It was kind of crazy. Because one of the very powerful arguments was like, come on, if you want to criticize them in the future, you can still do that. They're not going to, like, actually sue you. You know? So there's a very strong argument to be like, just sign it anyway. And then, like, you can still, you know, write your blog post criticizing them in the future, and it's like no big deal. They wouldn't dare, like, actually yank the equity, right? And I imagine that a lot of people basically went for that argument instead.
Starting point is 02:31:39 Yeah. And then, of course, there's the actual money, right? And I think that one of the factors there was my AI timelines and stuff. Like, if I do think that probably by the end of this decade there's going to be some sort of crazy superintelligence transformation, what would I rather have after it's all over? Like, the extra money? That's right. Yeah. So I think that was part of it. Like, it's not like we're poor. I worked at OpenAI for two years, I have plenty of money now. So in terms of our actual family's level of well-being, it basically
Starting point is 02:32:17 didn't make a difference, you know. Yeah. I will note that I know of at least one other person who made that same choice. That's right, Leopold. And again, good for him. It's worth emphasizing that when they made this choice, they thought that they were actually losing this equity. They didn't think that this was like, oh, this is just a show or whatever. Wait, did he not? I thought he actually did. I was going to say, didn't he actually... did he?
Starting point is 02:32:42 Or did Leopold get his equity? I actually don't know. My understanding is that he just actually lost it. And so props to him for just actually going through with it. Huh. I guess we could ask him. But my understanding was that his situation, which happened a little bit before mine, was that he didn't have any vested
Starting point is 02:32:58 equity at the time because he had been there for less than a year. But they did give him an actual offer of, we will let you vest your equity if you sign this thing. And he said no. So he made a similar choice to me. But because the legal situation with him was a lot more favorable to OpenAI, because they were actually offering him something, I would assume they didn't feel the need to walk it back. But we can ask him. Yeah. Anyhow, so yeah, props to him. And how did this episode inform your worldview around how people will make high-stakes decisions, where potentially their own self-interest is involved, in this kind of key period that you imagine will happen by the end of the
Starting point is 02:33:45 decade? I don't know if I have that many interesting things to say there. I mean, I think one thing is fear is a huge factor. I was so afraid during that whole process, more afraid than I needed to be in retrospect. And another thing is that legality is a huge factor, at least for people like me. Like, in retrospect, it was like, oh yeah, the public's on your side, the employees are on your side, you're just obviously in the right here, you know. But at the time, I was like, oh no, I don't want to accidentally violate
Starting point is 02:34:15 the law and get sued. Like, I don't want to go too far. I was just so afraid of various things. In particular, I was afraid of breaking the law. And so one of the things that I would advocate for with whistleblower protections is just simply making it legal to go talk to the government and say, we're doing a secret intelligence explosion, I think it's dangerous for these reasons. That's better than nothing. Like, I think there's going to be some fraction of people
Starting point is 02:34:39 for whom that would make the difference. Like, whether it's just literally allowed or not legally makes a difference, independently of whether there's some law that says you're protected from retaliation or whatever. It's literally just making it legal. Yeah, I think that's one thing. Another thing is the incentives actually work. Like, money is a powerful motivator, you know? And the prospect of getting sued is a powerful motivator. Yeah. And this social technology just does, in fact, work to get people organized in companies and working towards the vision of leaders. Right.
Starting point is 02:35:13 Okay, Scott, can I ask you some questions? Of course. How often do you discover a new blogger you're super excited about? Order of once a year. Okay. And how often after you discover them, does the rest of the world discover them? I don't think there are many hidden gems. Like, once a year is a creative.
Starting point is 02:35:29 answer in some sense. Like, it ought to be more. There are so many thousands of people on Substack. But I do just think it's true that the blogging space, the good blogging space, is undersupplied, and there is a strong power law. And partly this is subjective. Like, I only like certain bloggers. There are many people who I'm sure are great that I don't like. But it also seems like our community, in the sense of people who are thinking about the same ideas, people who care about AI, economics, those kinds of things, discovers one new great blogger a year, something like that. Everyone is still talking about Applied Divinity Studies, who, unless I missed something, hasn't written much in
Starting point is 02:36:15 like a couple of years. I don't know. It seems undersupplied. I don't have a great explanation. If you had to give an explanation, what would it be? So this is something that I wish I could get Daniel to spend a couple of months modeling. But it seems like maybe you need... actually no, because I was going to say it's the intersection of too many different tasks. You need people who can come up with ideas, who are prolific, who are good writers. But actually, I can also count on a pretty small number of fingers the number of people who had great blog posts but weren't that prolific.
Starting point is 02:36:49 Like, there was a guy named Lou Keep, who everybody liked five years ago, and he wrote like 10 posts, and people still refer to all 10 of those posts. And like, I wonder if Lou Keep will ever come back. So there aren't even that many people who are, like, very slightly failing by having all of them except prolificness. Nick Whitaker, back when there was lots of FTX money rolling around, I think this was Nick, tried to sponsor a blogging fellowship with just an absurdly high prize.
Starting point is 02:37:16 And there were some great people. I can't remember who won. But it didn't result in, like, a Cambrian explosion of blogging. Having, I think it was $100,000, I can't remember if that was the grand prize or the total prize pool, but having some ridiculous amount of money put in as an incentive got like three extra people. Yeah, so you have no explanation. Actually, Nick is an interesting case because Works in Progress is a great magazine. Yeah.
Starting point is 02:37:43 And the people who write for Works in Progress, some of them I already knew as good bloggers, others I didn't. So I don't understand why they can write good magazine articles without being good bloggers, in terms of writing good blogs that we all know about. That could be because of the editing. That could be because they are not prolific. Or it could be... like, one thing that has always amazed me is there are so many good posters on Twitter. There were so many good posters on LiveJournal before it got taken over by Russia. There were so many good people on Tumblr before it got taken over by woke.
Starting point is 02:38:20 But only like 1% of these people who are good at short and medium form ever go to long form. I was on LiveJournal myself for several years, and people liked my blog, but it was just another LiveJournal. No one paid that much attention to it. Then I transitioned to WordPress, and all of a sudden I got orders of magnitude more attention. Oh, it's a real blog. Now we can discuss it. Now it's part of the conversation. I do think courage has to be some part of the explanation,
Starting point is 02:38:49 just because there are so many people who are good at using these kind of hidden-away blogging things that never get anywhere. Although it can't be that much of the explanation, because I feel like now all of those people have gotten Substacks, and some of those Substacks went somewhere, but most of them didn't. On the point about, well, there are people who can write short form, so why isn't that translating? I will mention something that has actually radicalized me against Twitter as an information source, and this has happened multiple times: I'll meet somebody who seems to be an interesting poster, has, you know, funny, seemingly
Starting point is 02:39:25 insightful posts on Twitter. I'll meet them in person and they are just absolute idiots. It's like they've got 240 characters of something that sounds insightful, and it pattern-matches to somebody who maybe has a deep world model, you might say, but they actually don't have it. Whereas I've actually had the opposite feeling many times, when I meet anonymous bloggers in real life, where I'm like, oh, there's actually even more to you than I realized off your online persona. You know Alvaro de Menard, the Fantastic Anachronism guy? So I met up with him recently, and he made a hundred translations of his favorite Greek poet, Cavafy, and he gave me a copy.
Starting point is 02:40:11 And it's just a thing he's been doing on the side, just translating Greek poetry he really liked. I don't expect any anonymous posters on Twitter to be anytime soon handing me their translation of some Roman or Greek poet or something. Yeah, so on the car ride here, Daniel and I were talking about how, with AIs now, the thing everyone is interested in is their time horizon. Where did this come from?
Starting point is 02:40:34 Like five years ago, you would not have thought, oh, time horizon. AIs will be able to do a bunch of things that last one minute, but not that last two hours. Like, is there a human equivalent to time horizon? And we couldn't figure it out, but it almost seems like there are a lot of people who have the time horizon to write a really, really good comment that gets to the heart of the issue or a really, really good Tumblr post, which is like three paragraphs, but somehow can't make it hang together for a whole blog post.
Starting point is 02:41:01 And I'm the same way. I can easily write a blog post, like a normal-length ACX blog post. But if you ask me to write a novella, or something that's four times the length of the average ACX blog post, then it's this giant mass of re-, re-, re-outlining that just gets redone and redone. And maybe eventually I make it work. I did somehow publish Unsong. But it's a much less natural task. So maybe one of the skills that goes into blogging is this.
Starting point is 02:41:30 I mean, no, because people write books and they write journal articles and they write Works in Progress articles all the time. I'm back to not understanding this. No, I mean, ChatGPT can write you a book. There's a difference in the ChatGPT book, which is most books. There are many, many times more people who have written good books than who are actively operating great blogs right now, I think. Maybe that's financial? No, no, no, no, no. Books are the worst possible financial strategy.
Starting point is 02:42:03 Substack is where it's at. You think so? Oh, yeah. The other thing is that blogs are such a great status-gain strategy. Like, I was talking to Scott Aaronson about this. If people have questions about quantum computing, they ask Scott Aaronson. He is like the authority. I mean, there are probably hundreds of other professors who do quantum computing things.
Starting point is 02:42:24 Nobody knows who they are because they don't have blogs. I think it's underdone. I think there must be some reason why it's underdone. I don't understand what that is, because I've seen so many of the elements that it would take to do it in so many different places. And I think it's either just a multiplication problem, where 20% of people are good at one thing, 20% of people are good at another thing,
Starting point is 02:42:47 and you need five things, so there aren't that many, plus something like courage, where people who would be good at writing blogs don't want to do it. I actually know several people who I think would be great bloggers, in the sense that sometimes they send me multi-paragraph emails in response to an ACX post. And I'm like, wow, this is just an extremely well-written thing that could have been another blog post. Why don't you start a blog?
Starting point is 02:43:12 And they're like, oh, I could never do that. What advice do you have for somebody who wants to become good at it but isn't currently good at it? Do it every day. Same advice as for everything else. I say that I very rarely see new bloggers who are great. But like, when I see some... I published every day for the first couple years of Slate Star Codex, maybe only the first year. Now I could never handle that schedule. I don't know.
Starting point is 02:43:37 I was in my 20s. I must have been briefly superhuman. But whenever I see a new person who blogs every day, it's very rare that that never goes anywhere, that they don't get good. That's like my best leading indicator for who's going to be a good blogger. And do you have advice on what kinds of things to start with? One frustration you can have now is you want to do it, but it's just like you have so little to say.
Starting point is 02:44:01 You don't have that deep a world model. A lot of the ideas you have are just really shallow or wrong. Just do it anyway? Yeah, so I think there are two possibilities there. One is that you are in fact a shallow person without very many ideas, in which case I'm sorry, it sounds like that's not going to work. But usually when people complain that they're in that category, I read their Twitter
Starting point is 02:44:21 or I read their Tumblr or I read their ACX comments or I listen to what they have to say about AI risk when they're just talking to people about it. And they actually have a huge amount of things to say. Somehow it's just not connecting with whatever part of them has lists of things to blog about. That's right. So that may be another one of those skills that only 20% of people have: when you have an idea, you actually remember it and then you expand on it. I think a lot of blogging is reactive.
Starting point is 02:44:51 Like, you read other people's blogs and you're like, no, that person is totally wrong. Part of what we want to do with this scenario is say something concrete and detailed enough that people will say, no, that's totally wrong, and write their own thing. But whether it's by reacting to other people's posts, which requires that you read a lot, or by having your own ideas, which requires you to remember what your ideas are, I think that 90% of people who complain that they don't have ideas actually have enough ideas. I don't buy that as a real limiting factor for most people. I have noticed two things in my own... I mean, I don't do that much writing, but from the little I do:
Starting point is 02:45:32 one, I actually was very shallow and wrong when I started. I started the blog in college. So if you are somebody who's like, this is bullshit, there's nothing to this, somebody else wrote about this already... that's fine. What did you expect, right? Of course, as you're reading more things and learning more about the world, that's to be expected.
Starting point is 02:45:55 And just keep doing it if you want to keep getting better at it. And the other thing: now when I write blog posts, as I'm writing them, I'm just like, why? These are just some random stories from when I was in China. They're kind of cringe stories. Or with the AI firms post, it's like, come on, these are just weird ideas. And also, some of these seem obvious, whatever. And then my podcasts do what I expect them to do. My blog posts just take off way more than I expect them to in advance.
Starting point is 02:46:21 Your blog posts are actually very good. But the thing I would emphasize is that, for me, I just could not... I'm not a regular writer, and I couldn't do them on a daily basis. And as I'm writing them, it's just this one- or two-week-long process of feeling really frustrated, like, this is all bullshit, but I might as well just stick with the sunk cost and do it. So, yeah. It's interesting, because a lot of areas of life
Starting point is 02:46:54 are selected for arrogant people who don't know their own weaknesses because they're the only ones who get out there. I think with blogs, and I mean, this is self-serving. Maybe I'm an arrogant person, but that doesn't seem to be the case. Like, I hear a lot of stuff from people who are like, I hate writing blog posts. Of course, I have nothing useful to say, but then everybody seems to like it and re-blog it and say that they're great. So, I mean, part of what happened with me was I spent my first couple years that way, and then gradually I got enough positive feedback that I managed to convince the inner critic in my head that probably people will like my blog post. But there are some things that people have loved that I was like absolutely on the verge of no, I'm just going to delete this.
Starting point is 02:47:40 It would be too crazy to put it out there. That's kind of why I say that maybe the limiting factor for so many of these people is courage, because everybody I talk to who blogs is like within 1% of not having enough courage to blog. That's right. That's right. And it's also... courage makes it sound very virtuous, which I think it can often be, given the topic. But at least often it's just like... Confidence? No, not even confidence, in the sense of...
Starting point is 02:48:12 It's closer to maybe what an aspiring actor feels when they go to an audition, where it's like, I feel really embarrassed, but also I just really want to be a movie star. Yeah, so I mean, the way I got through this is I blogged for, I think, like, eight to ten years on LiveJournal before... no, it was less than that. It's more like five years on LiveJournal before ever starting a real blog. I posted on LessWrong for like a year or two before getting my own blog. I got very positive feedback from all of that, and then eventually I took the plunge to start my own blog. But it's ridiculous. Like, in what other career do you need seven years of positive feedback before you apply for your first position?
Starting point is 02:49:01 That's right. I mean, you have the same thing. You've gotten rave reviews for all of your podcasts, and then you're kind of trying to transfer to blogging, probably with this knowledge. First of all, you have a fan base. People are going to read your blog. That, I think, is one thing: people are just afraid no one will read it, which is probably true for most people's first blog,
Starting point is 02:49:32 and then there are enough people who like you that you'll probably get mostly positive feedback, even if the first things you write aren't that polished. So I think you and I both had that. A lot of people I know who got into blogging kind of had something like that. And I think that's one way to get over the fear gap. I wonder if this sends the wrong message or raises expectations or raises concerns and anxieties. One idea I've been shooting around, and I'd be curious to get your take on this: I feel like this slow compounding growth of a fan base is fake.
Starting point is 02:50:02 If I notice some of the most successful things in our sphere that have happened, like Leopold releasing Situational Awareness: he hadn't been building up a fan base over years. It's just really good. And as you were mentioning a second ago, whenever you notice a really great new blogger, it's not like it then takes them a year or two
Starting point is 02:50:19 to build up a fan base. It's like, nope, everybody, at least everybody they care about, is talking about it almost immediately. I know, I mean, Situational Awareness is just in a different tier almost, but things like that, and even things an order of magnitude smaller than that, will literally just get read by everybody who matters. And I mean, like, literally everybody. And I hope... I mean, I expect this to happen with AI 2027 when it comes out. But Daniel, I guess you kind of have, but you've been building your reputation in this specific community,
Starting point is 02:50:44 and I expect that AI 2027 is just really good, and I expect it'll just blow up in a way that isn't downstream of you having built up an audience over years. Thank you. I hope that happens. We'll see. Slightly pushing back against that, I have statistics for the first several years of Slate Star Codex, and it really did grow extremely gradually. Like, the usual pattern is something like: every viral hit, 1% of the people who read your viral hit stick around.
Starting point is 02:51:02 I have statistics for the first several years of Slate Star Codex and it really did grow extremely gradually. Like the usual pattern is something like every viral hit, 1% of the people who read your viral hit stick around. And so after like dozens of viral hits, then you have a fan base. But smoothed out, it does look like a very, I wish I had seen this recently, but I think it's like over the course of three years, it was a pretty constant rise up to some plateau. I imagine it was a dynamic equilibrium and as many new people were coming in as old people were leaving. I think that like with situational awareness, I don't know how much publicity Leopold put into it.
Starting point is 02:51:48 We're doing pretty deliberate publicity. We're going on your podcast. I mean, I think you can either be the sort of person who can go on a Dwarkesh podcast and get the New York Times to write about you, or you can do it organically, the old-fashioned way, which takes very long. Yeah. Okay, so you say that throwing money at people to get them to blog at least didn't seem to work for the FTX folks. If it were up to you, what would you do? What's your grand plan to get 10 more Scott Alexanders?
Starting point is 02:52:21 Man. So my friend Clara Collier, who's the editor of Asterisk Magazine, is working on something like this for AI blogging. And her idea, which I think is good, is to have a fellowship. I mean, I think Nick's thing was also a fellowship. Yeah. But the fellowship would be, like, there is an Asterisk AI Blogging Fellows blog or something like that.
Starting point is 02:52:45 Clara will edit your post, make sure that it's good. put it up there. And like, she'll select many people who she thinks will be good at this. She'll do all of the kind of courage requiring work of being like, yes, your post is good. I'm going to edit it now. Now it's very good. Now I'm going to put it on the blog. And I think her hope is that, let's say, of the fellows that she chooses, now it's not
Starting point is 02:53:12 that much of a courage step for them to start it, because they have the approval of what The Last Psychiatrist would call an omniscient entity, somebody who is just allowed to approve things and tell you that you're okay on a psychological level. And then maybe of those fellows, some percent of them will have their blog posts be read and people will like them. And I don't know how much reinforcement it takes to get over the high prior everyone has of "no one will like my blog," but maybe for some people, the amount of reinforcement they get there will work. Yeah. An interesting example would be all of the journalists who have switched to having Substacks. Many of them go well.
Starting point is 02:53:53 Would all of those journalists have become bloggers if there was no such thing as mainstream media? I'm not sure. But if you're Paul Krugman, you know people like your stuff, and then when you quit the New York Times, you know you can just open a Substack and start doing exactly what you were doing before. So I don't know, maybe my answer is there should be mainstream media. I hate to admit that, but maybe it's true. Invented it from first principles. Yeah. Well, I do think, and it's related to the idea of mainstream media,
Starting point is 02:54:21 that it should be treated more as a viable career path. Whereas right now, if you told your parents, I'm going to become a startup founder, I think the reaction would be like, there's a 1% chance you'll succeed, but it's an interesting experience, and if you do succeed, that's crazy, that'll be great. If you don't, you'll learn something, it'll be helpful to the thing you do afterwards. We know that's true of blogging, right? We know that it helps you build up a network.
Starting point is 02:54:43 It helps you develop your ideas. And even if you don't succeed, you still get those things, and if you do succeed, you get a dream job for a lifetime. And I think people... maybe they don't have that mindset, but also they underappreciate
Starting point is 02:54:55 how much it is, like you actually could succeed at it. It's not a crazy outcome to make a lot of money as a blogger. I think it might be a crazy outcome to make a lot of money as a blogger. I don't know what percent of people who start a blog
Starting point is 02:55:10 end up making enough that they can quit their day job. I guess it's a lot worse than for startup founders. I would not even have that as a goal. That's right. So much as, like, the Scott Aaronson goal of, okay, you're still a professor, but now you're the professor whose views everybody knows, and you have kind of a boost up in respect in your field and especially outside of your field. And also you can correct people when they're wrong, which is a very important side benefit.
Starting point is 02:55:38 Yeah. How does your old blogging feed back into your current blogging? So when you're discussing a new idea, I mean, AI or whatever else, are you just able to pull from the insights from your previous commentary on sociology or anthropology or history or something? Yeah, so I think this is the same as for anybody who's not blogging. Well, I think the thing everybody does is they've read many books in the past, and when they read a new book, they have enough background to think about it. Like, you are thinking about our ideas in the context of Joseph Henrich's book. I think that's good.
Starting point is 02:56:13 I think that's the kind of place that intellectual progress comes from. I think I am more incentivized to do that. Like, it's hard to read books. I think if you look at the statistics, they're terrible. Most people barely read any books in a year. And I get lots of praise when I read a book and often lots of money. That's right. And that's a really good incentive.
Starting point is 02:56:36 So I think I do more research, deep dives, read more books than I would if I weren't a blogger. It's an amazing side benefit, and I probably make a lot more intellectual progress than I would if I didn't have those really good incentives. Yeah. There was actually a prediction market about the year by which an AI would be able to write blog posts as good as yours. Was it 2026 or 2027? I think it was 2027. It was like 15% by 2027 or something like that. It is an interesting question: they do have your writing and all other good writing in training, and weirdly they seem way better at getting superhuman at coding than they are at writing, right? Which is the main thing in their distribution. Yeah.
Starting point is 02:57:23 It's an honor to be my generation's Garry Kasparov. Yeah. So I've tried this, and first of all, it does a decent job. I respect its work. It's not perfect yet. I think it's actually better at the style on a word-to-word, sentence-to-sentence level than it is at planning out a blog post. So I think there are possibly two reasons for it.
Starting point is 02:57:50 One, we don't know how the base model would have done at this task. We know that all the models we see are, to some degree, reinforcement-learned into a kind of corporate-speak mode. You can get it somewhat out of that corporate-speak mode, but I don't know to what degree this is actually it doing its best to imitate Scott Alexander versus hitting some average between Scott Alexander and corporate speak. That's right. And I don't think anyone knows
Starting point is 02:58:15 except the internal employees who have access to the base model. And the second thing I think of, maybe just because it's trendy, as an agency or horizon failure. Like, Deep Research is an okay researcher. It's not a great researcher. If you actually want to understand an issue in depth,
Starting point is 02:58:35 you can't use Deep Research; you've got to do it on your own. So, if you think about it, I spend maybe five to ten hours researching a really research-heavy blog post. The METR thing (I know we're not supposed to use it for any task except coding) says, on average, the AI's horizon is one hour. So I'm guessing it just cannot plan and execute a good blog post. It does something very superficial rather than actually going through the steps.
Starting point is 02:59:01 So my guess for that prediction market would be whenever we think the agents are actually good. I think in our scenario that's like late 2026. I'm going to be humble and not hold out for the superintelligence. What about comments? Intuitively it feels like before we see the AIs writing great blog posts that go super viral repeatedly, we should see them writing highly upvoted comments on things. Yeah, and I think somebody mentioned this on the LessWrong post about it, and somebody made some AI-generated comments on that post. They were not great, but I wouldn't have immediately picked them out of the general distribution of LessWrong comments as especially bad.
Starting point is 02:59:42 I think if you were to try this, you would get something that was so obviously in an AI house style, that it would use the word delve or things kind of along those lines. I think if you were able to avoid that, maybe by using the base model, maybe by using some kind of really good prompt to be like, no, do this in Gwern's voice, you would get something that was pretty good. I think if you wrote a really stupid blog post, it could point out the correct objections to it. But I also just don't think it's as smart as Gwern right now,
Starting point is 03:00:16 so its limit on making Gwern-style comments is both that it needs to be able to do a style other than corporate delve slop, and then it actually needs to get good. It needs to have good ideas that other people don't already have. Yeah. And I mean, I think it can write as well as, like, a smart average person in a lot of ways. And I think if you have a blog post that's worse than that or at that level,
Starting point is 03:00:41 it can come up with insightful comments about it. I don't think it could do it on a quality blog post. There was this recent Financial Times article about, have we reached peak cognitive power, where it's talking about declining scores on PISA and the SAT and so forth. On the internet especially, it does seem like there might have been a golden era before I was that active on, you know, the forums or whatever. Do you have nostalgia for a particular time on the internet
Starting point is 03:01:11 when it was just like, this is an intellectual mecca? I am so mad at myself for missing most of the golden age of blogging. I feel like if I had started a blog in 2000 or something, then, I don't know, I've done well for myself, I can't complain, but the people from that era all, like, founded news organizations or something. I mean, God save me from that fate. I would have liked to have been there. I would have liked to see what I could have done in that era. I mean, I wouldn't compare the decline of the internet to that stuff with PISA, because I'm sure the internet is just
Starting point is 03:01:49 like, more people are coming on. It's a less heavily selected sample. But yeah, I could have passed on the whole era where they were talking about atheism versus religion nonstop. That was pretty crazy. But I do hear good things about the golden age of blogging. Anybody who was sort of counterfactually responsible for you starting to blog or keeping blogging? So I owe a huge debt of gratitude to Eliezer Yudkowsky. I don't think he was, like... I had a LiveJournal before that.
Starting point is 03:02:22 Yeah. But it was going on LessWrong, first of all, that convinced me I could move to the big times. And second of all, I just think I learned, I imported a lot of my worldview from him. I think I was the most boring normie liberal in the world before encountering LessWrong. And I don't 100% agree with all LessWrong ideas, but just having things of that quality beamed into my head for me to react to and think about was really great. And tell me about the fact that you were at some point anonymous. I think for most of human history,
Starting point is 03:03:02 somebody who is an influential advisor or an intellectual or somebody... actually, I don't know if this is true. You would have had to have some sort of public persona, and a lot of what people read into your work is actually a reflection of your public persona. Sort of like how the reason half of these ancient authors are called things like Pseudo-Dionysius or Pseudo-Celsus
Starting point is 03:03:24 is that you could just write something being like, oh, yeah, this is by St. Dionysius. And then, I don't know, you could be anybody. Right. And I think, I don't know exactly how common that was in the past. But yeah, I agree that the internet has been a golden age for anonymity. I'm a little bit concerned that AI will make it much easier to break anonymity. I hope the golden age continues.
Starting point is 03:03:49 Yeah. Seems like a great note to end on. Thank you guys so much for doing this. Thank you. Thank you so much. This was a blast. Yeah, I had a great time. I'm a fan of your podcast.
Starting point is 03:03:58 Thank you. I hope you enjoyed that episode. If you did, the most helpful thing you can do is to share it with other people who you think might enjoy it. Send it on Twitter, in your group chats, message it to people. It does really help out a ton. Otherwise, if you're interested in sponsoring the podcast, you can go to dwarkesh.com slash advertise to learn more. Okay, I'll see you on the next one.
