Deep Questions with Cal Newport - Is AI About to Automate Every Office Job? | AI Reality Check
Episode Date: April 30, 2026
Cal Newport takes a critical look at recent AI News. Video from today's episode also at: youtube.com/calnewportmedia
0:00 Is AI About to Automate Every Office Job?
3:02 Other Tech Leaders Don't Agree
5:49 -We Aren't Seeing Enough Progress
14:09 -LLMs Are Limited
24:19 Conclusion
Links:
Buy Cal's latest book, "Slow Productivity," at www.calnewport.com/slow
https://www.reddit.com/r/AI4tech/comments/1r4tukp/microsoft_ai_ceo_mustafa_suleyman_says_most_if/
https://x.com/TheChiefNerd/status/2045947286518157395
https://www.reddit.com/r/ClaudeAI/comments/1snhfzd/claude_opus_47_is_a_serious_regression_not_an/
https://x.com/mattshumer_/status/2047376179414421957
https://www.newyorker.com/culture/2025-in-review/why-ai-didnt-transform-our-lives-in-2025
https://www.youtube.com/watch?v=VDOmNa9inA4
https://www.ft.com/video/2c428045-bf4f-45bd-ada2-8ba53983cd81
Thanks to Jesse Miller for production and mastering and Nate Mechler for research and newsletter.
Learn more about your ad choices. Visit podcastchoices.com/adchoices
Transcript
Back in February, Microsoft AI chief executive Mustafa Suleyman sat down for an interview with the Financial Times.
During it, he made the following extraordinary claim.
I think that we're going to have a human-level performance on most, if not all, professional tasks.
So white-collar work where you're sitting down at a computer, either being a lawyer or an accountant or a project manager or a marketing person.
And most of those tasks will be fully automated by an AI within the next 12 to 18 months.
Now, if this prediction is true, then we're just a year away from one of the most sudden and calamitous shifts in modern economic history.
I mean, worldwide, the knowledge- and technology-intensive industries produce over $10 trillion of value per year and make up more than a third of economic activity here in the U.S.
So if basically all of this could be replaced by compute, and this is going to happen by next spring,
it would make the Industrial Revolution seem glacial by comparison.
It would be the economic equivalent of the asteroid that killed much of life on Earth,
including the dinosaurs.
So is it possible that Suleyman is right?
And if he's not, what's a more realistic understanding of what AI will and will not be able to do in the workplace in the near future?
Well, it's Thursday, which means it's time for another AI reality check episode.
So this is a great opportunity to dive deeper into Suleyman's claims.
Now, I have a lot to say on this topic, including if you make it all the way to the end of this episode,
a little conspiracy that I uncovered when I was doing research on this topic.
So stay tuned for that.
But we have a lot to get to.
So let's get started.
As always, I'm Cal Newport, and this is Deep Questions.
The show for people seeking depth in a distracted world.
All right, as you may have guessed, I'm going to argue here today that Mustafa Suleyman's claim is not accurate.
Now, I have three major reasons why I think his claim is inaccurate.
I'm going to present these three reasons in order from least technical to most technical.
Now, to be clear, I'm not trying to be like Dr. AI Skeptic Guy here, right?
I mean, obviously people are finding uses for LLMs in the workplace,
even if they are nowhere near ready to replace all knowledge work jobs.
So I'm hoping that as I get to the end of these three reasons,
and before I get to the conspiracy I promised you at the end,
I'll be able to review the ways that these tools actually are being useful.
What are the actual parameters of where LLMs are making,
or will continue to make, a difference in knowledge work in the near future?
So we're going to give some positive information as well towards the end of this episode.
All right, but let's dive in.
I want to start with my first reason.
why Suleyman is likely wrong in his claim.
That reason is that other tech leaders don't agree.
So before we get into the details of what LLMs can or cannot do,
I want to emphasize Suleyman's timeline: 12 to 18 months,
and given that he said this in February,
that means we're basically a year away from all knowledge work jobs being,
as he put it, fully automated by AI.
That claim is an outlier compared to what other leaders of major AI companies have been saying.
So for example, before Suleyman grabbed the crown of biggest doomer about AI job impacts this past February,
Dario Amodei had been the most pessimistic among the CEOs.
On multiple occasions, the claim he has made is that AI will replace up to 50% of entry-level knowledge work jobs within five years.
That's the claim Amodei has made several times now. Look, that's not great news either,
but it's actually much less dramatic than what Suleyman was predicting. Think about it. It's a longer
timeline: Amodei is talking about up to five years. It's only affecting entry-level jobs, right? So he's
not talking about all knowledge work jobs, just entry-level jobs. And he's not talking about
most jobs. He's talking about 50%. So on every scale we might measure, Amodei's claim
is much less drastic than what Suleyman was saying.
If we turn our attention to Nvidia's Jensen Huang,
he's much more aggressively against the type of claim
that Suleyman or even Amodei is making right now.
He doesn't really see AI fully automating many jobs at all.
And in fact, at a recent event at Stanford,
Huang argued that making these predictions
about large swaths of the economy being automated
is actually counterproductive, if not just straight-up false.
So I want to play a clip of Jensen Huang at that recent event making the opposite claim of Suleyman.
First of all, I think the narratives of AI destroying jobs is not going to help America.
First of all, it's just, it's false.
Huang goes on at this event to argue that what we're more likely to see is AI tools being integrated into work, like we saw with computer tools in the 90s and early 2000s.
It will change what jobs look like.
It'll change the day-to-day, maybe what tools you're touching,
but is not going to wholesale replace large swaths of the economy.
He notes in that Stanford appearance that the engineering teams at Nvidia use a lot of AI tools,
and they're busier than ever before, and they're hiring more engineers than ever before.
So he doesn't see AI tools as a job destroyer, but instead as a job changer.
All right, so let's move on now to the second reason why Suleyman is likely wrong in his prediction.
We are not seeing enough progress.
All right.
So Suleyman says we're basically a year away from most knowledge work jobs being fully automated.
If that were true, we would need to be seeing major, rapid advances in LLM technology to stay on such an ambitious trajectory, but that's simply not what is happening.
Now look, this might sound crazy at first, because you're bombarded with news constantly about AI: AI is this, AI is that, we should be scared of this.
And it gives you this general impression that things are just moving really fast in the AI world.
But if you cut through the PR and the hype, here's what you will actually discover if you follow this technology closely.
Since roughly late 2024, the newly released LLMs from the frontier AI companies have been making steady but not particularly
fast progress.
And instead of seeing the type of immediately obvious functional improvements that we got
used to in the era when we went from GPT-2 to GPT-4, most of the improvements we're seeing
from model to model are largely being captured in benchmarks:
charts of numbers from tests, invented in many cases by the AI companies themselves, with
obscure acronym names.
So we just see these charts where they'll say, look, we've got a 20% boost on this
benchmark, we're now moving to be comparable to this other model on this benchmark.
So we're not seeing major revolutionary leaps in functionality anymore. Instead, we're seeing
slow and steady progress. To give you an example of what this is like, I'm going to load up
here on the screen a Reddit thread that's talking about the most recent release from
Anthropic, which is Claude Opus 4.7. There's a summary here of a vigorous conversation that
happened on this Reddit thread about the new Opus 4.7. Here's the summary. The verdict is in,
and it's not pretty. The overwhelming consensus in this thread is that Opus 4.7 is a massive regression
and a serious downgrade from 4.6. Users across the board are reporting a dumber, lazier,
and less reliable model that feels like a step back to early ChatGPT. All right? That's
their newest model. So we're not seeing revolutionary leaps from model to
model. This is much more of a jagged type of frontier: make progress here, take steps back
elsewhere. Another model that was released recently was OpenAI's GPT-5.5. And people seem to like this
better than Opus 4.7. But what is the magnitude of these improvements?
Well, I loaded up a review, which I'll put on the screen here. Matt Shumer had a long review he posted
online where he said: a big upgrade that doesn't always feel like one. I'll read you a couple of
things from this review. Shumer is excited about continued improvement in the LLM playing nicely
with coding harnesses, so producing long-term plans for coding. Here's what he wrote. For serious
software work, it is exceptional: thoughtful, careful, able to make many of the same decisions
I would make, and very good at iterating against a goal until a thing actually works. But as he
also goes on to point out, these models have been getting slowly but surely better at this type of thing,
steadily now for about a year,
so you might not actually notice much of an improvement;
they're already pretty good at that.
Here's a summary of what Matt
called the biggest story about
GPT-5.5: it rounds out the weaker parts of the GPT
line, design from existing context,
iOS and native Mac apps, security, etc.
Right. So what are we hearing here? What
we're hearing about these latest,
newest models
is slow and steady.
It's like normal software updates.
We improved the native Mac and iOS thing.
We tweaked this functionality; now we're getting better scores on this particular benchmark.
Sometimes they swing and miss, like Opus 4.7.
Oh, the tweaks we made actually made this worse,
and so people are going to go back to the previous version.
There's nothing bad about that.
This is just the normal pace of software improvements.
But here's the problem: to get from where we are now, where almost no knowledge work task is fully automatable,
to a place where all knowledge work tasks are fully automatable,
we're not going to get there in another year
with these slow and steady improvements.
One step forward, one step back;
two steps forward, one step back.
We tweak this, we improve that.
That's nowhere near a fast enough pace
to get us where Suleyman said we're going to get.
Now, at this point, you might be saying,
yeah, but what about coding agents?
Coding agents feel like an example
of a major knowledge work task,
software engineering,
where we weren't using AI heavily
except in more autocomplete ways,
and then suddenly, it seemed, from last fall into the new year,
everyone was using AI coding tools within software engineering firms.
It's not everyone, but it's massive percentages.
Now, the way the general public thinks about this is:
well, "AI," quote-unquote, keeps getting better and better,
and it got good enough to automate that all of a sudden.
And as it continues to get better,
maybe it will all of a sudden unlock other types of tasks that we can fully automate.
But that doesn't really describe what happened
with the rise of coding agents within enterprise software development teams.
What you have to understand is that that leap was actually as much about,
if not more about, the quote-unquote coding harnesses as it was the underlying models.
A coding harness is a software program written by people.
It's not machine learning.
It's not AI.
It's the thing that actually calls the LLM for ideas and then executes the LLM's ideas on its behalf.
What really happened is there was a multi-year process of multiple companies working on these coding harnesses to make them more and more relevant.
They were trying to figure out: how can we get LLMs' ability to produce computer code, which we've known about since 2022,
better integrated into how actual professional software development teams operate on really large code bases?
And so most of the innovation actually happened in that coding harness.
They figured out all these different rules and approaches.
We're going to use skill files.
We're going to have a sort of simulation of memory over these skill files.
There's a ton of just old-fashioned, 1950s-style AI in these coding harnesses,
like regex pattern matching, and using existing software tools to verify code.
So there's a lot of glued-together stuff.
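To make this concrete, here's a heavily simplified sketch of the kind of loop a coding harness runs. Everything here is hypothetical: the ask_llm function stands in for whatever model API a real harness calls, and real harnesses are far more elaborate. But notice how much of it is plain, old-fashioned programming: regex extraction, an existing test runner, a retry loop.

```python
import re
import subprocess

FENCE = chr(96) * 3  # a triple-backtick code fence marker, built indirectly

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for whatever model API the harness calls."""
    raise NotImplementedError("wire up a real model API here")

def extract_code(reply: str) -> str:
    # Plain pattern matching, not AI: pull the code block out of the reply.
    match = re.search(FENCE + r"(?:python)?\n(.*?)" + FENCE, reply, re.DOTALL)
    return match.group(1) if match else reply

def tests_pass(test_file: str) -> tuple[bool, str]:
    # Verification via an existing non-AI tool: a test runner.
    result = subprocess.run(["pytest", test_file], capture_output=True, text=True)
    return result.returncode == 0, result.stdout

def harness(task: str, test_file: str, max_attempts: int = 3) -> str | None:
    feedback = ""
    for _ in range(max_attempts):
        reply = ask_llm(f"Write Python code for this task:\n{task}\n{feedback}")
        code = extract_code(reply)
        with open("candidate.py", "w") as f:
            f.write(code)
        ok, output = tests_pass(test_file)
        if ok:
            return code  # accepted because the tests passed, not because the LLM said so
        feedback = f"Your last attempt failed these tests:\n{output}\nTry again."
    return None
```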
You can read, I think, a good piece Gary Marcus wrote about this a few weeks ago,
because the code for the Claude Code coding harness leaked,
so we know what's in there.
So there was just slow and steady work on building the coding harness until they
could finally figure out how to make it play nicely enough with enterprise coding that we can now bring LLMs into that process.
Now, they also tune the models to play better with the harnesses as well, but I think it's in the harness development that we saw.
That's what made AI coding relevant to big software development teams.
So what this tells us is that if you want a similar jump in another type of major knowledge work task,
somewhere you have to have a lot of people iterating for maybe a year or two to figure out the right harness to connect properly into that particular type of job.
And if we want Suleyman's claim to be true, that all major knowledge work tasks will be fully automatable, you would need like a thousand of these teams,
each of them focused on another major knowledge work task, trying to build a custom harness that works just for that task.
Well, they're not doing that.
You don't have enough people to do that.
The market isn't there.
And it takes experts.
The one thing that people working at AI companies are experts at is software development because
that's what they do.
So this was like the ideal place to do this.
So I think the real lesson of the quote unquote sudden emergence of coding agents is that
it's actually really hard and takes a lot of focused work to try to integrate AI into individual
types of workflows.
It's not something that just happens as the models get smarter.
So again, unless these companies are hiding thousands of teams working on all these different
areas of knowledge work trying to find ways to subtly integrate AI into them, I do not see
how we're going to suddenly have coding agent style automation in many other tasks within a
roughly 12-month period.
All right.
My third and final point for why Suleyman is likely wrong: the functionality of LLMs is limited.
So we're moving down the technical stack here.
Let's briefly open the black box on LLMs to better understand what they
can, and more importantly cannot, likely do, no matter how big we scale them.
So let's start from the bottom.
We've done this before, so I'll go quick.
What does an LLM actually do?
It predicts tokens.
You give it text, and it outputs what token should come next.
It has been trained to assume the text it's given as input is a real text written by a human that actually exists.
And so there is a right answer for what the next token is, and it's trying to get the right answer.
That's at the base what an LLM does.
Okay, so then how do we get long text out of an LLM?
Well, you have to actually put a program on top of the LLM
to keep calling it again and again, in what's known as autoregressive decoding.
So you have some text, you give it as input, it outputs a token.
You put that token at the end of the input text,
and now you take that slightly longer input text and put it through the LLM,
and you get another token.
Now you take that even longer input text, and you put it through the LLM,
and you keep doing that until it's grown out a whole answer,
and then you can return that answer to whoever made the original prompt.
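If that loop sounds abstract, here's a minimal sketch of it in code. The predict_next_token function is a hypothetical stand-in for a single pass through the model; everything around it is just the plain program I described.

```python
def predict_next_token(tokens: list[int]) -> int:
    """Hypothetical stand-in for one pass through the LLM: given the
    input text so far (as tokens), return the single next token."""
    raise NotImplementedError("this is where the actual model would go")

def generate(prompt_tokens: list[int], end_token: int, max_len: int = 500) -> list[int]:
    tokens = list(prompt_tokens)
    while len(tokens) < max_len:
        next_tok = predict_next_token(tokens)  # guess one token...
        tokens.append(next_tok)                # ...append it to the input...
        if next_tok == end_token:              # ...and stop once the model
            break                              # signals the answer is done
    return tokens  # the grown-out answer, returned to whoever prompted
```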
So what this autoregressive token guessing gives you is basically a story completer.
Here's some text, and the LLM is implicitly trying to finish the story it was given as input,
as accurately as possible, based on all the types of text it's seen so far.
Now, the original big LLMs like GPT-3 would create reasonable stories, but they would be all over the place.
So then they figured out what's known as post-training, or tuning: of all the possible ways these LLMs could go, you tune them toward certain categories of text and away from others.
So GPT-3.5, which was the version that powered the original ChatGPT, was tuned to treat the stories it's completing as answers to questions.
So given your prompt, the input's a question, and it's trying to answer the question.
And that was much easier for average users to deal with.
All right, so they're story completers.
Now, we don't want to downgrade that, right?
This isn't a dismissal, like, ah, it's autocomplete.
Because what we discovered, particularly with GPT-4, is that as we scaled these things up,
in order to successfully complete stories in reasonable ways,
these LLMs actually encoded a lot of really interesting rules and logic.
If my story involves a math problem, to complete that story in the right way,
the LLM is going to have to have some math logic built in there,
because otherwise the story's not going to seem reasonable.
And this was really the big discovery of GPT-4: wow, there are all of these
abilities that were implicitly encoded into this LLM during its training.
It's just trying to complete the stories, but we trained it on so much stuff for so long
that it learned all of these types of abilities that allow it to complete stories well.
It understands humor, it understands math, it understands computer code,
it understands basic logic.
If it's seen a game enough times, it has some sense of
generally what the game rules are, what a valid move in that game looks like.
It's pretty incredible what GPT-4 could actually do.
And there was this idea that if we kept scaling these things bigger and bigger:
yeah, sure, they're story completers, but they'll eventually have so many rules and so much logic programmed into them to win this game that they will actually have something like human-level intelligence, and then we can just use the LLM as the brain to automate everything.
That was the vision.
Okay.
Then what happened: by the summer of 2024,
we learned that this type of pure scaling was hitting a wall.
Just trying to make these models bigger and train them longer
was not appreciably leading to new functionality
being encoded into them like we had seen before.
And this kicked off a whole era,
one that really started with the alphabet-soup models
out of OpenAI in the fall of 2024,
of post-training and tuning.
Okay, how do we get further improvements in these models
if we can't just scale them bigger?
We can do things called post-training,
where, if we have very specific data sets
with questions and exact right answers,
we can tune an already pre-trained model
to be better at using its already-wired intelligence
to answer this type of problem.
And that's why we saw, starting in late 2024,
a big focus on the types of things
for which we had data to do this tuning:
reasoning, math, and computer coding.
That's why they weren't talking about other types
of things you would want an LLM to do, even though those would be economically very valuable: because all we can do now is tune these models and get sort of steady improvements in areas where we happen to have a lot of highly structured data to do the tuning on. And that's really where we've been since 2024. That's why we get more of this slow but steady improvement with LLMs on particular benchmarks, and no longer those big general leaps like we had during the pure-scaling era, when we went from GPT-3 to 3.5 to 4.
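To give a feel for what this tuning step looks like mechanically, here's a toy sketch in PyTorch. The model and data here are stand-ins, nothing like a real frontier model or a real post-training pipeline, but the core idea is the same: prompts paired with exact right answers, and gradient updates that nudge the model's next-token predictions toward those answers.

```python
import torch
import torch.nn as nn

# Toy stand-in for an already pre-trained model; a real LLM is vastly bigger.
class TinyLM(nn.Module):
    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens):                 # tokens: (batch, seq)
        return self.head(self.embed(tokens))   # logits: (batch, seq, vocab)

model = TinyLM()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Stand-in "structured" data: inputs paired with exact right next tokens.
# Math and code give you this kind of question/answer pairing in abundance.
inputs = torch.randint(0, 1000, (8, 16))
targets = torch.randint(0, 1000, (8, 16))

for step in range(100):
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, 1000), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()  # nudge predictions toward the known right answers
```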
Okay. So this is a problem for automating knowledge work, because we know we can't just scale these LLMs into having a human level of reasoning on things.
And for most things we do in knowledge work, most of the skilled things we do, we don't have
highly structured data sets like we have for computer programming or math.
So we can't even tune the pre-trained LLMs to be better at the things a lot of people
do in their jobs.
So this is sort of a wall that's hard to get past from where we are right now
with LLMs.
Now you might say, yeah, but what about workplace agents?
A lot of what people do in knowledge work is actually not that skilled.
It's individual things that an LLM can do.
Send an email.
Get information about an upcoming conference.
Move information to a spreadsheet.
Build slides and send it to the team.
Actually, a lot of things we do in knowledge work is what I call shallow work.
It's things that don't require a particularly high level of skill.
So why can't we have, at the very least, something like a knowledge workplace administrative assistant
that can automate a lot of the boring stuff we do in knowledge work,
like we have in computer programming,
where coding agents can do multiple steps of work on behalf of the programmer.
So at the very least, this wouldn't be Suleyman's prediction,
but it would get us closer.
Why don't we have those either?
Well, it turns out that building these types of tools is difficult.
I wrote an article about this back in January for the New Yorker.
It was called "Why A.I. Didn't Transform Our Lives in 2025."
And in that article, I looked at the question of why we don't have more workplace agents.
We were told at the beginning of 2025 that this would be the year of the agent: not just coding agents,
but that for all the things you do at work, more people would have these AI assistants.
Why did that largely not happen?
But when you actually study how these agents work, the problem is that the plans, the multi-step plans that agents execute, are the result of LLM prompts.
You have a harness, which is a computer program written by a
person, that prompts an LLM and says: give me a multi-step plan for doing X.
The harness then takes that text and goes step by step, executing those things
on behalf of the user.
For computer programming, they've learned how to be pretty good at making those multi-step
plans, because the option space of what you do when you're building a computer program
is very narrow.
And what they're doing is very verifiable.
And so the harnesses for coding agents will actually call non-AI tools.
They can verify things:
this step has a clear success criterion, like the code compiles and passes these tests.
We're going to run another program
that checks the code, makes sure it compiles and passes those tests.
And if it doesn't, we can go back to the LLM and say: that plan didn't work, give us another one.
It's very well set up for doing this.
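As a sketch, that structure looks something like this. Every function here is hypothetical; the point is that the harness, not the LLM, owns the verification step, and that only works when a step has a checkable success criterion.

```python
def ask_llm(prompt: str) -> list[str]:
    """Hypothetical: ask the model for a multi-step plan, one step per item."""
    raise NotImplementedError

def execute(step: str) -> str:
    """Hypothetical: carry out one step (run a command, call an API, etc.)."""
    raise NotImplementedError

def verify(step: str, result: str) -> bool:
    """The crucial part. For code this can be real: does it compile, do the
    tests pass? For ambiguous office tasks there is often no such check."""
    raise NotImplementedError

def run_agent(goal: str, max_replans: int = 3) -> bool:
    feedback = ""
    for _ in range(max_replans):
        plan = ask_llm(f"Give me a step-by-step plan to: {goal}\n{feedback}")
        for step in plan:
            result = execute(step)
            if not verify(step, result):
                # The plan didn't work; go back to the LLM for another one.
                feedback = f"This step failed: {step}. Give me a revised plan."
                break
        else:
            return True  # every step passed its check
    return False
```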
But once you're in the more general world of ambiguous knowledge work tasks, as I wrote about in my New Yorker article,
what you're going to get out of the LLM is a reasonable-sounding plan. Reasonable-sounding, because that's what it does:
it's a story completer.
Reasonable sounding plans
get you in trouble.
You need correct plans.
And the way that humans
actually make plans
is we do a few things.
One, we test a bunch of possibilities
internally to see what makes the most sense.
And two, we have some sort of notion
of correctness and a world model
we can use to evaluate plans
to see, does this actually do
what I wanted to do?
Are there any mistakes along the way?
That's not how LLMs operate.
Again, we're just autoregressively producing tokens.
So you get a reasonable sounding
plan, but it's not stepping back.
It doesn't have a world model to test it against.
It can't have hard-coded rules that it consistently applies.
It has no ability to do future simulation of possible outcomes.
And so we just get a story that sounds like a good plan, but often has issues along the
way.
And if an agent is just going to automatically execute these things, we get into trouble.
All right.
So making agents, even just administrative assistants, in non-computer-programming
areas is also a very hard problem, and not one that we have a good solution to.
We also underestimate the degree to which if you watch a programmer using a coding agent,
they're constantly tweaking and re-asking and adjusting.
There's a huge amount of work to try to get consistently useful output out of these agents.
And most knowledge workers just simply aren't going to do that or have the technical
chops to pull that off.
Earlier this year, actually in the fall, OpenAI even mentioned they were slowing down or
reducing their non-coding agent projects because they weren't working, and they wanted to focus
on ChatGPT and their coding agents.
All right, so if we put all these things together, these three reasons,
I think it's clear that Mustafa Suleyman's claim that AI will fully automate most knowledge
work jobs by next spring really is an outlier opinion.
It's an opinion contradicted by the statements of other CEOs; by the reality of the rate
of progress for LLMs over the last couple of years, which is not nearly fast enough to get
where he says we're going to get;
and by the technical limitations of these models.
An LLM with a harness on it
is just not a great setup
for automating a lot of different types
of knowledge work jobs.
So why then might Suleyman be pushing this story?
What's in it for him?
I'm going to play you a quick clip of one potential explanation.
It comes from the Absolutely Agentic channel.
Let's hear what they had to say there.
And it's worth noting that this interview
comes at a particularly interesting moment for Microsoft.
Their stock recently took a hit as markets questioned whether the enormous capital expenditure on AI infrastructure is going to pay off.
So Suleyman has a motive to sell the dream.
More generally, claims about AI massively disrupting the job market, like the one we just heard about, tend to be very useful for the major AI companies.
They know they can say wild things about economic disruptions, and they're not going to be challenged on these predictions, even if they don't come true, because it
matches a general vibe: there's disruption, and that's exciting or scary, we want people to
be worried, or whatever it is.
They're not going to be challenged, so why not?
And it makes them seem like they're working on the most important technology ever,
which is exactly the type of technology where,
if you're an investor, you would say: sure, Anthropic,
we've given you $60 billion and you've made $5 billion to date of revenue.
But, hey, I'm not going to do normal expectation accounting with you, because you're
building the most important technology ever.
And so none of these numbers matter.
You're going to automate the whole economy.
So it's super in the favor of the CEOs to say these things.
Now, to be clear, I don't want this to make it seem like LLM-based tools are irrelevant to the workplace.
They're not.
As I promised, I wanted to talk briefly about the ways in which LLMs and LLM-based tools are actually useful in non-coding knowledge work sectors.
There are five things I want to quickly mention where right now we're seeing, or will soon see, useful applications of LLM-based tools.
One: sifting through reasonable amounts of text to generate summaries or find useful examples.
Because of the way LLMs are built, with what are known as attention sublayers,
they're very good at selectively turning their attention to relevant parts of the input text when they do their processing.
So if the text you're feeding to an LLM is not too expansive,
it's really good at requests like: find examples of this; summarize the cases of this that show up;
look in this text for examples of X that should catch my attention.
They're really good at that, and that can be useful.
There are a lot of otherwise time-intensive things that humans would have to do that this makes much faster.
And you can do this with natural language prompting.
If the text gets too long, the accuracy is going to fall, and then you're going to have to do something more advanced, which I'll talk about in a second.
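Here's a minimal sketch of this kind of use, written against the OpenAI Python SDK as one example. The model name is a placeholder; any chat-completion API works the same way.

```python
from openai import OpenAI  # other chat-completion SDKs follow the same shape

client = OpenAI()  # assumes an API key is set in the environment

def sift(text: str, instruction: str) -> str:
    """Ask the model to pull out or summarize the relevant parts of a
    modest amount of text. Accuracy degrades as the text gets very long."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; substitute whatever model you use
        messages=[{"role": "user", "content": f"{instruction}\n\n---\n\n{text}"}],
    )
    return response.choices[0].message.content

# Example: sift(meeting_notes,
#     "Find every customer complaint about pricing and summarize each in one line.")
```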
The second major use we're seeing in the knowledge workplace is data formatting.
Again, it's very good at summarizing text and rewriting it in different formats, right?
So you can say: hey, take these 10 consumer comments, go through them, and make a bullet-point list, or put the five most common things they say into a slide.
It might not be perfectly precise, but it'll do a good job of reformatting text. Or: clean up this data so that I can put it inside of a spreadsheet.
Now, again, if the amount of data gets too big, or your required precision level gets too high (it needs to do this exactly right every time), just prompting an LLM is not going to be accurate enough.
But that brings us to the third use,
which is emerging more and more.
Highly technical users
can use coding agents to produce small programs
that do this type of processing
more precisely and on bigger amounts of data.
If you have, say, 10,000 rows of a spreadsheet
and want them cleaned up in a particular way,
you can't just feed that to an LLM.
The context is too big.
It's going to produce reasonable-looking output
with lots of mistakes.
But what you can do, if you're technical enough,
is have something like Claude Code
produce a quick Python script
that can read through the file
and very systematically and correctly
make the changes. And once you have
that program, you can feed it the biggest possible
file. And I think that's useful
for people who are more technical
within knowledge work.
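For a sense of what that looks like, here's the kind of small, boring script a coding agent might produce; the file names and cleanup rules are made up for illustration. Because it's a deterministic program, there's no context window to overflow.

```python
import csv

# Hypothetical cleanup rules: strip whitespace, lowercase emails, drop blank rows.
with open("contacts_raw.csv", newline="") as src, \
     open("contacts_clean.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row = {k: (v or "").strip() for k, v in row.items()}
        if not any(row.values()):
            continue  # skip fully blank rows
        if "email" in row:
            row["email"] = row["email"].lower()
        writer.writerow(row)  # row 10,000 gets the exact same rule as row 1
```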
A lot of people are using it as a better Google.
A lot of what happens with these chatbots
now is basically doing a Google search
and then feeding the results to an LLM, which
can then summarize them for you. And that's very useful
for a lot of people. It can summarize data
from Google searches effectively or put it into
useful formats. I think that is very useful. And I'm hoping soon we're going to have much
better calendar or appointment management. That's one agentic task that's very narrow and very well suited
to an LLM: find an appointment with these roughly described parameters on my calendar.
LLMs are very good at that. I wrote an article last year about email sorting with LLMs.
That's another good use. I mean, it's expensive from a token perspective. But, hey:
for each email, you can give it very natural-language rules for what you care about, what you don't, and how to filter things,
and an LLM can act on that, look at an email, and filter it.
So I think there are these focused sorts of filtering and agentic interactions in the workplace where LLMs would be very useful.
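Here's a sketch of that email pattern, with a hypothetical ask_llm call standing in for a real model API. The rules stay in plain English; the program just acts on the model's one-word verdict.

```python
# The filtering rules stay in natural language; nothing here is a real rule set.
RULES = """
Archive newsletters and automated receipts.
Flag anything from my boss or anything mentioning a deadline.
Everything else stays in the inbox.
"""

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to a real model API."""
    raise NotImplementedError

def sort_email(subject: str, body: str) -> str:
    verdict = ask_llm(
        f"Apply these rules to the email below.\nRules:\n{RULES}\n"
        f"Subject: {subject}\nBody: {body}\n"
        "Answer with exactly one word: ARCHIVE, FLAG, or INBOX."
    )
    return verdict.strip().upper()  # e.g. "FLAG"
```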
So these are all really cool tools.
There are a couple of other things people are doing right now in the workplace which I think they shouldn't.
I don't think they should be writing so much with LLMs.
If an LLM is producing your slide deck, or it's producing the emails you're sending, then you shouldn't be using that slide deck and you shouldn't be sending those emails.
If the information content is that low, you should have used something much simpler.
I'm also not a big fan of using LLMs to quote-unquote refine your thinking.
They're sycophantic, hallucinatory, and emotionally manipulative; that is not a good way to try to sharpen your thinking.
You need to read hard things.
You need to write to organize your thoughts.
You need to talk to other real people.
But there are real good uses of LLMs in the workplace that are expanding.
But none of this is a future in which by next spring all the jobs are automatable.
That is just not realistic.
All right, to conclude, I promised you earlier in this episode that I uncovered a quote-unquote conspiracy while I was working on this story, which I thought was fun, so I will tell you about it.
All right, so the Suleyman interview with the Financial Times went up in February.
I'll load it on the screen.
So here it is.
Mustafa Suleyman sets out Microsoft's AI goal for humanist superintelligence.
Okay, February 12th.
When that interview went out, there were dozens of articles about the part of it that I played at the beginning of the show, where he said that within 12 to 18 months, all of these jobs where people sit at a computer will be fully automatable by AI.
It was clipped dozens of times all over YouTube and social media.
You can see clips of that part of the interview.
It was written about in many publications, you know, where they link to the interview and quote the actual exact
words. But if you go to the official video today on the Financial Times website, they've edited it out.
So it's gone. He leads right up to the point where he said that quote, and then there's an awkward
cut, from a close shot to a wide shot, and he jumps to another topic. You notice it
when you first listen to it: that's a weird jump, why did he go from this to that?
It's been edited out of the original video. Now, it was too late, because, again,
dozens of people had already clipped it and spread it, and major publications wrote about it.
So I don't know why they thought they'd get away with editing it out.
But it is now actually gone from the official version of the video.
Now, I don't know why they did this.
Here would be my best guess.
I think Suleyman saw these other CEOs making big, bombastic claims and getting a lot of attention and hype for their products,
and thought: we've got to get in this game too.
And so he tried it, but he went too far.
That's what I think happened.
He figured: I can do an Amodei here.
What if all jobs are going to be gone in a year?
And he went too far.
And I think after the fact, whoever it was on the executive team, or the lawyers or whatever,
were like: we don't want to be saying these types of things.
This is way too drastic.
It makes us seem a little bit dotty.
All right, go edit it out.
So they went back and edited it out of the video.
That's what I... well, I don't know if that's what happened.
That's what I think happened.
And it was just too late, because it was already spread out there.
But if that's true, then here's what I'm going to say. Look, actually, I don't know if that's true or not,
but I'm going to take a page out of other AI commentators' books and say: you know what, that matches
my vibe about what's really going on. I think the idea that they edited it out because they got
it wrong is directionally true. So, in the spirit of all the other AI reporting going on out there,
I'm just going to claim it's true, because it feels right to me. All right, that's it for this
week. We'll be back next Thursday with another AI reality check episode. Also, stay tuned on Monday:
on this feed, we have advice episodes,
which I think you'll like.
So you should check those out as well.
They provide practical advice
for seeking depth in a distracted world.
It's all good.
So stick around.
Until next time, remember,
you should care about AI,
but not everything that's said about it.
