Better Offline - Hater Season: Cal Newport on AI Reporting

Starting point is 00:00:00 This is an iHeart podcast. Guaranteed Human.

Starting point is 00:01:38 Cool Zone Media. Your witness to a great becoming. It's Better Offline, and I'm Ed Zitron. Today we're joined by computer science professor and tech writer Cal Newport for Hater Season.

Starting point is 00:02:19 Cal, thank you for joining me. Always happy to do some hating, I suppose. Well, you had an excellent YouTube video that I'll be linking in the show notes about the mistakes in AI reporting. Though the hater in me says I don't think these are mistakes. But really, one of the best videos I've seen on AI reporting, or AI in general, was what you did. It was basically this thing of, like, the digital ick, that these stories are meant to make you feel uncomfortable. And of course, this faux astonishment thing. You know what?

Starting point is 00:02:48 Just run the tape. Tell me a little bit about the bits that you found, because I watched it going, yeah, yeah, yeah, like an angry person. Yeah, I could imagine you when I was recording that video. I was like, I bet Ed is cheering. Wilding out. I was having a great time. Exactly.

Starting point is 00:03:07 Well, let me just give the context, right? So I'm in an interesting situation for observing this because, you know, I am a computer scientist, so I'm not afraid of the technologies. I'm happy to talk about transformers and feed forward networks and diffusion models. And like, that's not that scary to me. But I also write about technology. My main journalistic home is the New Yorker where I write a lot about, you know, I do AI journalism there. So I'm up on that as well.

Starting point is 00:03:31 And so I'm often noticing things in journalism. When I see AI coverage that is faulty, it really catches my attention because I have a foot in both of these worlds. So there's a lot of good AI reporting out there. There's also a lot of trash. And I really wanted to help people figure out how do you sort it? Like, how do you figure out, if you're reading something, should I pull the ripcord on this article? Like, this is not helping me. Like, how do you know if it's good or not? So I was like, okay, here's what I'll do: I'll come up with the three most common traps I see in AI reporting that make me want to, you know, throw my iPad at the wall. And I came up with three and I'll just give you

Starting point is 00:04:06 the three names. I made up all these names. I don't, I don't think they're great. My producer thinks he loves them, but here we go. Vibe reporting. That's number one. So that is where you will omit certain facts and put loosely related quotes next to each other in a way that creates a general vibe that you want to be true, but it's not quite true. So you don't actually make a concrete claim that's not true, but you imply that claim by what you omit or what you put in your story or what quotes you put next to it.

Starting point is 00:04:39 So you'll put a quote, for example, about layoffs at the gaming division at Microsoft next to an unrelated quote about concerns about AI and its impact on jobs. Now you have the vibe. Oh, man, all these people just got laid off because of AI. AI is taking jobs. Whereas in reality, the layoffs had nothing to do with AI, but you put these things next to each other. You give that sense. Then I had mining digital ick. So to me, that's any AI story where you take an example from the edges of AI, like something that wireheads in San Francisco are up to,

Starting point is 00:05:08 and you just tell a story that's unsettling without talking about any of the technical details like, well, what's different here? Was there a technical innovation we need to know about and not discussing any concrete implications? Oh, this means this is going to change in the future or it's going to have an impact on this sector. You're just telling a story to unsettle.

Starting point is 00:05:27 And I think a lot of the coverage of Moltbook and OpenClaw fell into that. And then finally, it's faux astonishment, which is more of a YouTube phenomenon than a print journalism phenomenon, but that's where every single thing that happens in AI is insane, amazing, terrifying, everything is going to change. And so you constantly create this atmosphere of something seismic just happened so that

Starting point is 00:05:51 the consumer of the information ends up in a bit of a panic. Like, I can't put my finger on exactly what's terrible, but like everything terrible is happening. Those three traps, to me, should be automatic ripcords from what you're reading or watching. The only disagreement I'll have is that you would say that this isn't the majority of AI journalism. I'd actually argue it would be, especially the beginning one, the vibe reporting, which is very common, because the big one right now I'm seeing is this AI software, AI's replacing

Starting point is 00:06:20 software thing. If you are a reporter, and Cal is not making this statement, I am, if you're a reporter and you bring up Anthropic's Cowork around anything, you are wrong. I was about to call you a name, but I'm being nice today for some reason. All right, you're a dipshit, because you are a dipshit if you look at Claude Cowork, which is a thing for fucking around on your desktop, and you say this is going to compete with Salesforce. You just don't know what you're talking about. You are wrong. But then one abstraction higher is this idea that Claude Code is going to destroy SaaS, software as a service. And the idea being that people are just going to stop buying their own CRMs, they're going to stop buying their own

Starting point is 00:07:01 per-seat software things, and they're going to build it internally. I don't know if you've seen this, Cal, but it just reflects a complete lack of understanding of how software works. Yeah. Because you don't pay Microsoft for Microsoft 365 or Teams because you can't build your own Word or what have you. I don't think you could. But nevertheless, it's also because they maintain it, because they make sure it doesn't break, or if it breaks, they fix the bit. They make sure that it stays up all the time. They make sure it's accessible. It has secure login. Well, look, there's a few things going on here. One of these things I reported on last month. I did a

Starting point is 00:07:38 big New Yorker piece on agents, right? So this is relevant to Claude Code and how people are thinking about the current future. And basically here's what seems to have happened. Claude Code and these other command line interface agents can do really cool things. I'm using the word cool here

Starting point is 00:07:54 very carefully. That's fine. Yeah, like cool in a sense of... I would like you to, like, give people the actual explanation here. Yeah. So cool meaning like Oculus, right? You put on the Oculus visor for the first time, everyone had the same reaction. This is really cool.

Starting point is 00:08:11 Now, that's separate from, that's a trillion-dollar business. Like, let's put that aside, but this is really cool. I'm seeing 3D in a world where it tracks my head. Claude Code and other command line interface coding tools became like that for programmers. It was really fun to watch it doing multi-step execution of the construction of demos or this or that. And the reason why it could do those cool demos is that it was sort of well-suited for that world, because that's a world that exists only in text. So Claude Code works on a command line interface.

Starting point is 00:08:39 It's all text-based. And it works with a file system. You can write files, edit files, send files to compilers. It all exists as a small number of commands on a command line interface. It's all in a world of text because that's where computer programs are built. That's a perfect case for LLMs, which love dealing with text, and they love dealing with structured text like computer code. There was an extrapolation that then happened.

Starting point is 00:09:00 And really this caught on in January of '25, which is where you first began to get this sentiment of, oh, it's doing such cool things over in the world of command line interfaces and code. Certainly these agents can now soon do similar cool things in, like, all the different things we do on a computer. And that is what laid this foundation: people were so impressed. Programmers were so impressed by the coolness

Starting point is 00:09:26 of what was happening with Claude Code and the other command line interface tools that they extrapolated that vibe over to other computer usage. The point of that article I reported was, oh, it turns out, like, everything else we do on computers is much harder. It's not six text commands

Starting point is 00:09:44 on a command line interface creating structured code that you can compile and test to see if it compiled or not. It's much messier. The interfaces are visual. We don't realize what complexity goes into the things we click and select, doing something as simple as even trying to just book, like, a hotel in a new city or something like this. And if you use a language

Starting point is 00:10:02 model as the underlying logic and decision engine for taking actual actions in the real world, well, the language models make things up and get things wrong, or a little bit wrong, 20% of the time. And a little bit wrong in computer science means breaking everything. Well, it means a lot. Like, it's okay in code because they say, oh, that didn't compile. Let's try again. But when you're booking a hotel room, as I sort of detailed in that article, it means, like, you ended up in the wrong city two years from now and the room cost $6,000, right? And so it just didn't work. So 2025 was supposed to be the year of the agents. And

Starting point is 00:10:35 And they don't really know how to fix it. But we're still vibe. So this is vibe reporting. Yeah, but Claude Code is like really exciting. And so why can't we do that with everything else in the computer? It's actually a much harder problem than they were letting on. Well, the thing is I've seen, especially like in 2026, I've seen a lot more Claude Code stuff.
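
To make the "20% of the time" point concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each step of a multi-step task goes wrong independently with probability 0.20. The conversation gives only the 20% figure, so the per-step independence and the step counts are assumptions, and `success_probability` is a hypothetical helper name.

```python
# Illustrative only: assumes every step fails independently at the same rate.
# The 0.20 rate comes from the conversation; the step counts are made up.

def success_probability(per_step_error: float, steps: int) -> float:
    """Chance that all `steps` steps succeed with no error at all."""
    return (1.0 - per_step_error) ** steps


if __name__ == "__main__":
    for steps in (1, 5, 10, 20):
        chance = success_probability(0.20, steps)
        print(f"{steps:>2} steps: {chance:.1%} chance of an error-free run")
```

Under those assumptions, a five-step task finishes cleanly about a third of the time and a ten-step task about one time in ten. In code, a failed compile is a cheap signal to try again; a hotel booking often has no equivalent check before the mistake costs money.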

Starting point is 00:10:52 And there's been a very big consent manufacturing operation going on right now. Wall Street Journal, Atlantic, CNBC. Deirdre Bosa, and this is a statement from me, not Cal: Deirdre Bosa from CNBC should be fucking ashamed of herself, going on CNBC every day just going, Claude Code's going to destroy all software. She's on Twitter

Starting point is 00:11:09 because she was able to vibe code some sort of Monday clone, which is just, like, a project management tool. It's like, I made some software that worked. This is everything now. But that's kind of what you've been talking about with this vibe reporting, where it's like, I did a small thing.

Starting point is 00:11:28 Now all things will be done in this manner. Whether it's possible? God, no. God, no. But she, like many reporters, is able to find a lot of people who are invested in AI who will absolutely go on TV and say, yep, it's completely true. That's going to happen. 100%. It's just so strange because it makes me feel paranoid and kind of conspiratorial when you look at the majority of news about AI, and it is this vibe reporting. It's these vast extrapolations from 16,000 job losses at Amazon. They mentioned AI. This plus this equals that: AI is replacing people. It just makes me feel, like, uncomfortable with the world. Yeah. Well, can we sit for a moment on that Amazon example? Because I think it's a great one. Please, please. It frustrates me. That one frustrated me. Tell me. Go ahead. All right. So, Amazon lays off 16,000 people. Right. All right. It's covered in two different ways. So the vibe reporting way it's covered: in my newsletter, my podcast, I looked at an example

Starting point is 00:12:30 from Quartz. And it was covered in a way clearly intended to imply Amazon laid off 16,000 people because of AI. They're being replaced by AI. The subhead of the article was the CEO of Amazon talking about how AI is going to increasingly disrupt the job force. And then in the article itself, no alternative explanations are given for these layoffs. They kind of just give the details of, like, here's how many people are laid off and here's where they figured it out. And then they put a couple quotes in there about AI being very disruptive and being able to automate jobs. It turns out those layoffs had nothing to do with AI. And you could find other reporting that focused, because it was reporting that was for

Starting point is 00:13:09 the financial market, so it was trying to focus on what the hell is actually happening. And the deeper reporting was like, yeah, they laid off a bunch of managers because, like, a lot of tech companies, they overhired during the pandemic, because cloud computing became much in demand during the pandemic. So a lot of tech companies overhired during the pandemic, and they're all shedding those jobs again. And Amazon's pretty ruthless about this, right? They're always looking for excuses to fire people.

Starting point is 00:13:30 And they said, we have too much bureaucratic bloat. There's too many managers. We're going to fire a bunch of these managers we hired so that we can be more lean again. That has nothing to do with AI. In fact, this is the second or third round of these firings that they plan to do. The first round occurring before ChatGPT was even released. This has nothing to do with jobs being replaced by AI. But then you can vibe report it because, like, well, technically speaking, Amazon also is investing money in AI products.

Starting point is 00:13:58 So technically speaking, money saved by firing these managers could be re-spent on AI. So you can say with a semi-straight face, they fired people because of AI. But clearly, you know the impression that you're giving to the reader is that they were replaced by AI. And it had nothing to do with it. And I heard, by the way, so I wrote about that, I put it in that video. I've heard on background from multiple Amazon executives since. They were like, we were completely baffled by this coverage that was implying that people were being fired here due to AI. This is just what we do at Amazon. We're ruthless. Like, if you're not earning your keep, we fire you.

Starting point is 00:14:32 They were completely baffled by that coverage. And like, thanks for pointing it out. Like, what are you talking about? I got to be honest, Cal, if I got an email like that from an Amazon executive, I'd tell them to go fuck themselves. And I mean this nicely, because Andy Jassy last year, June 17th, 2025, put out a whole thing about, today, in virtually every corner of the company, we're using generative AI to make customers' lives better. I believe Amazon benefits from that obfuscation. And I think they deliberately fuel it. Now, there may be executives who disagree with this. Well, these were lower-level managers, right?

Starting point is 00:15:00 So not exactly, I shouldn't say executives, but, like, people who work there that were like, oh, yeah, no, no, no. They're not firing people because of, like, AI. They're just being brutal. Right, right. They're the people who tell the truth. Yeah, exactly. Exactly. But no, you are right.

Starting point is 00:15:13 I think there has been a lot of vibe reporting. Basically, I've been covering this for the last two years, the entire sequence of job reductions that were post-pandemic corrections. So the entire tech industry overhired during the pandemic, and the entire tech industry cut jobs in the last few years because they hired too many people and now they have to correct back to where they were pre-pandemic. Consistently across the board, these cuts were vibe reported as due to AI.

Starting point is 00:15:39 Consistently, you see exactly that story. I'm a computer scientist. We see this in coverage of computer science majors as well. Same idea. Computer science majors are historically directly tied to the tech industry. If they're cutting in a down cycle, majors go down. If they're hiring, I mean, it's just economics, I mean, it's not surprising.

Starting point is 00:15:56 When the tech industry is booming, we get a lot more majors because they're good jobs, right? And so computer science majors went down in the last year or two as the tech companies started cutting. That was reported in the Atlantic as kids are not majoring in AI because AI makes it superfluous. Majoring in comp sci, you mean? Comp sci, yeah, because AI is going to do all these jobs. But we've had these cycles; every five years we have this cycle. This is nothing, anyways.

Starting point is 00:16:24 No, no, I'm with you, and the Atlantic has just done a piss-poor job with this stuff. I'm hating on them because they had an awful Claude Code thing. They just, you know, I'm going to bring it up and read the title. I'm not going to say the reporter's name because I think that's mean. But let's look at this. "Move Over, ChatGPT. You're about to hear a lot more about Claude Code." And you know why you're going to hear about that? Because the fucking Atlantic is writing about it. And it's just, uh, "Over the holidays, Alex Lieberman had an idea. What if he could create Spotify Wrapped for his text messages without writing a single line of code?

Starting point is 00:16:58 Lieberman, co-founder of the media outlet Morning Brew, created iMessage Wrapped." I just want to start here and say, that guy doesn't do any fucking work, and I know people who work there. Like, I just want to start by saying, if that's the best you've got, and that probably involved throwing a giraffe and then the entire zoo into a vat of acid to make the GPUs move, you're meant to read this, and the deliberate effect it's meant to have is you're meant to be scared and excited. Like, it is kind of a mishmash between the vibe reporting and the, I guess it's the first. In fact, this might be a triple score. Because this is meant to make you feel discomfort, but also make you freak out, but also be excited. A rare triple score. It's just, stories like these piss me off because I'm fine with people going, this could do this. I don't mind. It happens. But when it's just like, feed me the slop right into my mouth and asshole, make me feel all the marketing things all at once. First of all, I'm not impressed by the idea of doing iMessage Wrapped.

Starting point is 00:18:03 That's not a thing a human being with, like, friends and hobbies does. I've been watching the first season of True Detective. I've got many more shows I could watch. Never once would I think, what if I could get a Wrapped of all my messages? What a fucking psychotic thing to do. But when you read this, it's like, "You spent your holidays with your family?" wrote one tech policy expert. "That's nice.

Starting point is 00:18:28 I spent my holidays with Claude Code." Well, it's fun. It's fun to create these sort of demo apps, if you're, like, engineering-minded. So in that sense, the most cynical analysis here is that Claude Code is like model trains for engineering-minded people. It's a fun hobby. I can put, look at this. Like, I made a thing that can, like, I was talking to a friend of mine yesterday,

Starting point is 00:18:55 a computer scientist, and he's like, oh, I built a thing where I can, you know, whatever, email an appointment to a thing that gets parsed by an LLM and then goes to my calendar. That's just fun for him. In the same way that someone else might be like, hey, I built a cabinet and, like, it's really nice. Like, look, I got the wood to go together

Starting point is 00:19:13 and, like, I'm proud. I could've just bought a cabinet or whatever. Except you built something with your hands. Well, sure. And a cabinet can have things put in it. It just feels like Lego. It's more like it's toy software that has some functionality. Because the big thing I'm waiting for with the vibe coding stuff is an actual product.

Starting point is 00:19:37 You know what I mean? Well, that doesn't go well. I mean, look, this is the story of Moltbook, right? Oh, yeah. Tell me more about Moltbook. We were just talking about this. There's a whole can of worms to open there. I'm just going to open, like, the top of the nearest can, which is, before even getting

Starting point is 00:19:52 to what Moltbook is and is not, you know, it was in the news all the time. It was vibe-coded and immediately was just full of terrible security holes because it was vibe-coded. And it turned out that you could get the API keys. So when you access the paid service to use an LLM, you have an API key so they know who to charge. You could just steal the API key of everyone who was using it, because the guy just vibe-coded it. And so no one was actually there looking at the code. But what I think is going on. Okay, so here's what I would like to see.

Starting point is 00:20:23 I would like to see more reporting that would be how are people using X? Like to me that's very interesting. Yeah. Right? How are people using X? Yeah. And the problem is those answers right now,

Starting point is 00:20:33 and this is confounding, I think, to people who are very excited by the potential and coolness of these things in isolation, the answer to how are people using X is often not nearly as exciting as you would guess. And I think the reality with,

Starting point is 00:20:46 I mean, I don't quite have my arms around Claude Code. I do know there's a lot of people who are building kind of like internal tools or personal tools with it, which I think is cool. Most people aren't interested in that, but some people are. And, you know, I think it's fun.

Starting point is 00:20:59 Computer programming is fun, right? So, like, the ability to make a program work is, and some of those tools are useful, but that's not a major industry on its own. You know, I'm designing a tool for my small company that makes it easier for us to do whatever. That's cool, but that's not like a trillion-dollar industry. I can't get my arms around yet

Starting point is 00:21:15 exactly how professional computer programmers are using it. They're all talking about it. Really? Yeah. But I can't get my arms around it. Yeah, what's your sense of, I hear everything. I'm going to be honest, I was going to ask you. I was literally going to ask, have you heard of people using it? Because if you go on X, the Everything app, if you put on a full hazmat suit, you go on X and you go and look, the way that people talk about this is like they have connected into the Matrix, that they are now a 1,000x engineer. But when you go and look at what they do, no one actually says. There's always these kind of vibe stories, the vibe tales, the mythology where it's like, I had a problem, vaguely, that would have taken me X number of hours, but when I used Claude Code,

Starting point is 00:22:04 it solved it immediately and caught two bugs that I didn't know about. It's very Marine Todd style, then everybody clapped. Yeah. And you're a computer science teacher, and you don't know either, which makes me think it's not as big a deal as people are saying. The other bit of evidence is that at the end of last year, Anthropic's Claude Code revenue was 1.1 billion annualized, so about 90 to 100 million a month. For a revolution, that feels low somehow.
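
The "90 to 100 million a month" estimate is just the annualized figure divided by twelve. A quick sketch of that arithmetic in Python, using only the 1.1 billion number quoted in the conversation (the constant and function names are hypothetical, chosen for illustration):

```python
# Back-of-the-envelope check of the monthly figure quoted above.
ANNUALIZED_REVENUE_USD = 1.1e9  # annualized figure as stated in the conversation


def monthly_from_annualized(annualized: float) -> float:
    """An annualized run rate is one month's revenue times twelve."""
    return annualized / 12


if __name__ == "__main__":
    monthly = monthly_from_annualized(ANNUALIZED_REVENUE_USD)
    print(f"about ${monthly / 1e6:.1f}M per month")
```

That comes out to a little under 92 million dollars a month, which sits inside the range given.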

Starting point is 00:26:16 Listen to Inside American Soccer with Tom Bogart and Tabramos on the IHeart radio app, Apple Podcasts, wherever you get your podcast. Yeah, I mean, what I know is a computer scientist, you can't write performance-oriented code, you can't write safe code. you can't write code. It has to sort of juggle a sort of complex set of scenarios. I mean, you just need good programmers' eyes on it building this code.

Starting point is 00:26:44 Why is there a way of explaining to a non-coder why that is? Code is, there's like a poetic element to it. You know, it's writing good computer code is difficult. You're often, you're dealing with, you know, what is my problem here? And I want an elegant way of sort of satisfying this problem. you're often drawing from pretty nuanced algorithms and data structures to try to figure out,

Starting point is 00:27:07 how am I going to organize information and efficiently access it? When performance comes into play, there's a lot of really subtle decisions to make about, you know, how am I going to store or use things in such a way that we don't get bogged down when we're trying to execute things? It just becomes, I don't code as much anymore because I'm a theoretician, but I did my whole life since I was, you know, seven. And it's a, it's an art form, right? When you're using clod code, you're not really supposed to look at, at the code. And so I think that takes a lot of uses probably off the table. It's like the use cases you're supposed to have these different instances of cloud code running, and this one's

Starting point is 00:27:39 going to write code, and then this one's going to write test for that code, and then the cloud code is going to run the test and then try to fix the code if it doesn't matches the test, but your eyes aren't on the code at all. And I, there's, I mean, obviously for a lot of programming, that's an issue. But then the other thing I've heard about the computer programming industry is it's very stratified. There's a smaller number of like really good serious programmers that produce like 90% of the really important valuable code on which everything runs. I don't think they would touch cloud code with a 10-foot pole. Like, they're good at what they do. And then you have these like huge strata of people writing like JavaScript and sort of hacking

Starting point is 00:28:15 together Python and, you know, it's like not very good code. And then it's, I guess you could replace some of that with it. It's functional enough. It's functional enough. But I can't get my arms around it yet. But I do get, you know, I have a lot of sources. So I hear from, you know, I do hear from people that are talking about how cool cloud code is. I do think it's cool. I hear from a lot of professional programmers are like, yeah, we don't use, we can't use this. We're trying to write serious programs. Like, we have to sell this software. Like we, this is, this not solving a problem we have. So, you know, I don't know, but the problem is that's what the reporting should be. Hey, here we are at this company looking over the shoulder of people. What are they doing?
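
The write-code, write-tests, run-and-fix workflow described a moment ago can be sketched as a toy loop. This is an illustration under stated assumptions, not how Claude Code is actually implemented: `run_tests`, `revise`, and `fix_until_tests_pass` are hypothetical stand-ins, and the "model" here is a hard-coded string replacement rather than an LLM call.

```python
from typing import Callable, Tuple


def fix_until_tests_pass(
    code: str,
    run_tests: Callable[[str], bool],
    revise: Callable[[str], str],
    max_attempts: int = 5,
) -> Tuple[str, int]:
    """Keep revising `code` until `run_tests` accepts it or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        if run_tests(code):
            return code, attempt
        code = revise(code)  # stand-in for asking a model for a new version
    raise RuntimeError("tests still failing after max_attempts")


def run_tests(source: str) -> bool:
    """Hypothetical test runner: loads the candidate and checks one case."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["add"](2, 3) == 5


def revise(source: str) -> str:
    """Hypothetical 'fix' step: a hard-coded edit instead of a model call."""
    return source.replace("a - b", "a + b")


if __name__ == "__main__":
    buggy = "def add(a, b):\n    return a - b\n"
    fixed, attempts = fix_until_tests_pass(buggy, run_tests, revise)
    print(f"passed on attempt {attempts}")
```

The loop stops as soon as the tests pass, so the result is only as trustworthy as the tests themselves. Nothing in it checks performance, security, or readability, which is the gap being pointed at when no human eyes are on the code.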

Starting point is 00:28:52 Let's talk to the engineering teams on background at this tech company about exactly what is going on here. And I think what a lot of reporting has become on AI is hype laundering. So you look at the discussion about the technology happening from more engineering-minded people. You convince yourself as a reporter, I can't understand the engineering, but I will trust the people who do. And then you launder what you're sensing from that hype into your articles, not realizing that like nerds like me, we get hyped up about stuff. And we get super excited about stuff and we go create, like you can't just launder our hype into this is what's happening. And so it's like reporting on a war where you have no one embedded. There's no one actually on the ground where the

Starting point is 00:29:32 battles are happening. You're just responding to the press conferences that the generals are holding back in the Pentagon. It's not a way to report on what's actually going on. And I think the other thing as well is there is a, if you don't do this hype laundering, I don't know how these, if you're a reporter listening to this and you have a thought about this, send it to me anonymously. It's ezitron.76 on Signal. But my thought as well is there's probably a problem with rationalization, because if you look at this and you say, okay, well, it's kind of cool, it's fun in whatever indeterminate way, it doesn't seem like serious software, like actual real deal software, is being made with it, but then the CEO of Google

Starting point is 00:30:16 says 30% of code is written by AI, which is bullshit, and I've heard from so many software engineers, well, it couldn't be that everyone's just wrong, right? It couldn't be the case that everyone is making the most egregious capital expenditure fuck up of all time. This will be historic, I believe, worse than railways, digital beanie babies, but done at the scale of laser tag arenas. Now, it can't possibly be that because everyone else is saying this is exciting and good. And at that point, they choose, instead of being worried, instead of being a bit anxious about this, they say, well, Amazon Web Services spent a lot of money.

Starting point is 00:30:55 So this spends a lot of money too. So this is actually good. It's actually good. And indeed, these people seem emphatic and excited. And I as a non-coder can build a fudged CRM that probably would not withstand even the laziest hacker. I can do this. And thus it will extrapolate further from there. And what sucks is that the people that I believe actually will be hurt by this are retail investors, I think, regular people buying stocks in these companies or selling software stocks because they believe that code will replace them. And ultimately, I think it's just going to be a bloodbath for people's 401(k)s that could have been avoided, except it would require reporters to do something uncomfortable. And I don't think they want to do that ever. I think there's two things going on is what I've decided with reporting. First of all, it's asymmetric risk. So a lot of reporters are like, look, there's not a major risk if I'm excited about this

Starting point is 00:31:55 and it doesn't pan out. Because we could be like, yeah, surprisingly, this didn't pan out, and with some factors we couldn't see. But they do feel like there's a huge amount of risk of saying, this is not a big deal, and then it is. And so it's definitely an asymmetric risk. We saw a lot of this during, like, COVID as well, right? It was less, there would be less harm if you were too alarmist about something. But there could be a lot of harm. They felt like reputational, if you're like, this is not a big deal, and it was.

Starting point is 00:32:21 And so there's definitely an asymmetric risk profile. There's also like a meaning defining profile. It's just really exciting to think everything is going to be disrupted and change unrecognizably. It sort of gives a focus and meaning to like an otherwise somewhat chaotic and disrupted world that we're in right now. And so there's that aspect too is that people want to believe there is something massive about to happen. Because in some sense it wipes away all the like bad stuff that's happening. Who cares? none of this is relevant because this much bigger thing is coming.

Starting point is 00:32:56 So I think that gets wrapped into it as well. I think the economic reporters are more on this because like their whole job is to try to, I mean, they're not on it. I think after your work, they're on it more. But like, I don't know if I agree. I, I am reading big series.

Starting point is 00:33:09 There's an article in Bloomberg. I'm trying to shove through archive.is because they, they were so mean and unfair to my friend Steve Burke at Gamers Nexus. I won't pay them. But it's more shit about like the software narrative. The fact that software is being disrupted by Claude Code, you get the same pallid reporting from Bloomberg and even the Financial Times about Anthropic. And the FT's generally pretty

Starting point is 00:33:33 good. Well, like, yeah, they're just going to make $30 billion next year. It's just fanciful. There's, the skepticism, the cynicism doesn't exist. And I get, I agree on the risk. Well, there was bubble reporting, last fall, there was a period where everyone did the bubble reporting. After GPT-5, you had a two-month period where every major publication did bubble reporting. But yeah, I guess it did kind of die off. It died off because they hear one nice thing from Jensen Huang and they're like, well, I'm sold. Like that jacket, that jacket's looking pretty sharp. I mean, someone who wears that jacket, how could they be wrong? Would a man that has a shiny jacket like the lead singer of Korn wears at concerts? Would he lie? And it's

Starting point is 00:34:20 just what really bothers me as far as the economic reporting though is the Oracle one, because it's like Oracle needs to be paid $300 billion over five years by OpenAI, a company that if we're to believe reporting, which I do not, they made $13 billion last year in revenue and lost, sorry, they've made $13 billion and lost, they claim nine, I think it's probably higher. How are they meant to pay $30 to $60 billion a year? And everyone's like, well, they'll work it out. How's Oracle meant to build those data centers? Those data centers are going to cost $189 billion. They've raised $100 billion. Okay, how are they meant to do that? And everyone's just like, oh, they'll work it out. I wish I could do this with the fucking bank. I need a $500 million house. I'll pay you for it at some point,

Starting point is 00:35:12 all right? It's fine. And the news would run articles about my genius housing purchases. It's just, it's one of those things where, and you said, well, the asymmetric risk doesn't exist. It does for some of these reporters, because I've been saving their bylines for years, because I actually think that there needs to be some sort of reckoning with this, because if you look back, there are major financial outlets that did the same thing, who were literally propping up Sam Bankman-Fried two weeks before FTX collapsed. Yeah. Who then went on immediately to cover AI. Yeah. Fucking. And now they're peddling bullshit for Anthropic.

Starting point is 00:35:48 And sorry, I'm kind of hating, well, I guess it's Hater Season so I can. It just frustrates me because regular people are being scared. They're being scared by the kind of astonishment, which I actually love, the astonishment reporting of like, oh, well, OpenClaw has proven that AI is here, or they built their own social networks so we should be scared. Software is dead. Insane. Yeah.

Starting point is 00:36:14 Yeah. Singularity is here. And it's like, I assume you saw that, for the listeners, by the way, OpenClaw had this, this fucking Clawdbot, whatever it is. They had their own social network where the quote unquote LLMs would post. But it turns out that most of those posts were just made by their owners. Yeah. And this is a good case study, right?

Starting point is 00:36:36 This one bothered me because like writer friends I know who are not technology related were texting this to me. They were worried, right? They were texting me these articles. Like this seems bad. Like this seems like something really bad. They were really getting the digital ick really strongly off of these articles. And I would start reading these articles and I say, well, there's no discussion of what is the technical breakthrough here and what are the concrete implications? Because it turns out there was zero technical breakthroughs.

Starting point is 00:37:02 There is no new AI technology connected to OpenClaw, which used to be called whatever, open-mold and open-cloth or whatever. OpenClaw is, I think, where they ended up, right, which is an open source library or framework for building AI agents powered by LLMs. There's no new AI technology involved in this at all. The agents you build are just accessing off-the-shelf LLMs that we're all using for chatbots anyways. You can aim it at whatever commercial chatbot you want. There's no new framework for how the agents work. It's the same sort of ReAct loop that we've been trying for the last two or three years where you just have a program. It's like a bit of Python code that asks an LLM,

Starting point is 00:37:42 hey, here's what I want to do. Here's the tools you have available. Come up with a plan. And then it sends it back a plan. And then the, the Python code takes the first step out of that text and says, okay, here's the tools available. What should I do to execute this step? And then like the LLM will give you some steps.
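The plan-then-execute loop being described here can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `fake_llm`, the `TOOLS` table and the calendar and email tools are hypothetical stand-ins invented for this sketch, not OpenClaw's or any vendor's API, and the "LLM" returns canned text so the example runs offline.

```python
def fake_llm(prompt):
    """Hypothetical stand-in for a real LLM call; returns canned text."""
    if prompt.startswith("PLAN"):
        return "1. find a free slot\n2. send the invite"
    if "find a free slot" in prompt:
        return "calendar_lookup: Tuesday"
    return "send_email: Invite for Tuesday 10:00"


# Hypothetical tools the surrounding program is allowed to run.
TOOLS = {
    "calendar_lookup": lambda arg: f"{arg} 10:00 is free",
    "send_email": lambda arg: f"sent '{arg}'",
}


def run_agent(goal, llm=fake_llm, tools=TOOLS):
    # 1. Ask the model for a plan, telling it which tools exist.
    plan = llm(f"PLAN\nGoal: {goal}\nTools: {', '.join(tools)}")
    steps = [line.split(". ", 1)[1] for line in plan.splitlines()]
    log = []
    # 2. For each step, ask the model what to do, then run it ourselves.
    for step in steps:
        reply = llm(f"STEP: {step}\nTools: {', '.join(tools)}")
        tool_name, arg = reply.split(": ", 1)
        if tool_name not in tools:
            # The model only produces text; the program decides what runs.
            log.append((step, "refused: unknown tool"))
            continue
        log.append((step, tools[tool_name](arg)))
    return log


for step, result in run_agent("book a meeting"):
    print(step, "->", result)
```

Everything interesting, and everything risky, lives in the `tools` table: the model never acts, the ordinary program around it does, which is why what that program is allowed to touch matters so much.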

Starting point is 00:37:59 And then the Python code runs those steps. And then you just go back and forth, right? That's like this basic ReAct. Which is exactly how Manus worked as well. Everything. There's nothing new about this. Manus was this AI agent that Facebook might be,

Starting point is 00:38:13 Meta might be acquiring, they literally, when you use it, it just writes Python code for every step. Well, yeah. So this is, I mean,

Starting point is 00:38:20 there's nothing new here technologically. The only thing that was new about OpenClaw is it was open source, so it made it easy for anyone to write bad agents. And the other thing

Starting point is 00:38:30 that made it interesting to wireheads in San Francisco is that the commercial products where people are trying to build these at companies, they have common sense security. Like, well, probably, if it's just a Python program

Starting point is 00:38:40 blindly doing what an LLM tells it to do, like, it probably shouldn't have access to, like, your credit cards or to your hard drive or whatever. With OpenClaw, like, you can just, you can give it access to anything you want on your computer. And so you could build really cool demos that are also, like, incredibly insecure and unsafe. And so it was fun for hobbyists, but there was no new technology there. Nothing was new. It was just a way for other people, like, hobbyists, to build their own agents that were, like, less safe than the more carefully built. Like, what people don't, here's a story people don't know is, like, in the immediate

Starting point is 00:39:12 aftermath of ChatGPT. So we're in early 2023, right? So ChatGPT has come out. OpenAI goes on a road tour of major publications, right? And they're there to try to, hey, here's what's going on. You need to write about this. In early 2023, they're like, the next thing we're going to offer is something like these agents.

Starting point is 00:39:28 They call them plugins. And you can install plugins that basically can do actions on behalf of your LLM queries. You could have like a book-an-airline-flight plugin and say, hey, ChatGPT, book me a flight, and the language model can't do anything but produce text, but then the

Starting point is 00:39:44 plugin could take the text and go and book you a flight. And they're like, this is the future, obviously. And that project disappeared because, oh, it's incredibly unsafe and unreliable to have code that can interact with the real world that's following commands from an unreliable, hallucinating LLM. It's like

Starting point is 00:40:00 that went away, not because the technology was hard, but because it's not safe. So there's nothing new. My whole article on agents, they tried this all of 2025, and were failing to get this type of agent to work for anything outside of basically like producing computer code. So OpenClaw is nothing new. It was just a way for other people to build these things. The only interesting stories were security hole stories. But it was reported, man, I was listening

Starting point is 00:40:23 to the All-In podcast. And they were, they said, this is the future of AI. Like, this is it. Everyone is going to have an OpenClaw agent as like a personal assistant. And this is like the biggest thing in AI. And I don't know who it was. One of those guys was like, you know, yeah, we replaced our podcast producer, you know, with this agent who can, like, email guests on our behalf and book it on our calendar. And, well, it's costing about $1,000 a day right now to run it. But, like, I don't know why we're going to need employees in the future. Hell yeah, brother. Yeah.

Starting point is 00:40:55 So anyways, like, it's a non. If you said, what's the technical implications? None. If you say, what are the concrete implications, the best story they can have is, like, maybe when people, I don't know what it is. All the companies have already been trying to build these things for years. So I don't know. But it was reported like just something icky is happening. And that Moltbook was an application that was built for these agents to communicate on like a Reddit-style social network.

Starting point is 00:41:22 They vibe-coded that framework and it was full of security holes, as we mentioned before. And there was like a small number of users that created a huge amount of agents. And they were just kind of like prompting and prodding their agents to produce like, let's talk about creating our own church or killing humanity. It's like, guys, this, you have Python code asking an LLM to write text in the style of like The Matrix and you're posting it on a fake social network. The real story here should be where are they, don't these people have things to do? Don't they have jobs? Like what is this?

Starting point is 00:41:51 Yeah, yeah, yeah. It really, it really, my model train. I added a tunnel. So that one really got to me. I actually will push back on that. Model train listeners, I think, make up 20% of the listenership of this podcast. Of course. Also, model trains are cool, signaling to my fellow artists.

Starting point is 00:42:08 out there. But, no, model trains are a physical thing, which you build it, you build a little city. I think they're delightful. Respect to those. With this, it's like, if they, and you know what, if they were just saying, hey, I've been fucking around with some software and I did something cool, I'd respect the shit out of them. I'd be like, yeah, enjoy yourself. It's going away. This is all subsidized rates, but have fun with that. Don't know why you need a Mac Studio, but good luck. No, they are like, this is the future. But I'm going to be honest, Cal, my real question is, what does OpenClaw actually do? Because when I went and read what it actually,

Starting point is 00:42:41 I read so many posts, you said it books, calendars, and does this. I could find no proof that anyone successfully did that. And it doesn't. It's just a series of library calls

Starting point is 00:42:52 that you can use to build your own program in theory that would do that. So OpenClaw itself is just like a series of like interfaces and hooks that makes it easy to write a program that talks to other services, and makes calls to an LLM. So it's just like a wrapper

Starting point is 00:43:06 in which you can write your own code. And some people are trying. Yeah, like you can write, you can have it talk to your calendar. You can have it talk to your email in theory. But does it work? But does it work? Well, it's just asking an LLM. You can simulate.

Starting point is 00:43:22 I mean, I did this for my New Yorker piece. I was like, look, I can just simulate being an agent. Just ask an LLM. Here's what I'm trying to do. Give me like the steps for doing this or whatever. And like anything else you ask of an LLM, it will give you answers that sound very reasonable in general, and then have like a lot of issues in the details,

Starting point is 00:43:38 which is why agents based on LLMs have struggled because if you, it's fine if you as a human are talking to an LLM because you don't realize how much filtering and tweaking you're doing. You're like, well, let's kind of ignore that and this is good or let me ask you to redo it. It's really an issue if you write a program that just says, I will do whatever,

Starting point is 00:43:54 I'll ask the LLM for a plan and then just do whatever it says. Because it doesn't know, that doesn't really make sense, or like this sounds generally, generally reasonable, but with like issues in the details. Does it work well when you execute things? I mean, in a practical sense, they said on the All-In podcast, three dumb bitches saying, exactly, that's a meme reference. Don't use that word usually.

Starting point is 00:44:19 Those people sitting around going, blah, blah, blah, we've made it and replaced our podcast producer with an LLM that can make appointments and send emails. In practice, is that true? I severely doubt that. I just, I, also, you're spending $1,000 a day. you're paying $365,000 a year for this, right? Yeah. Are you really or are you just, have you, is it complete your talk?

Starting point is 00:44:46 It probably isn't. And that's the thing. I don't, I get why the All-In guys do it because they probably have investments and they're boosters. TBPN, same fucking deal. It really rules that the two largest, like, tech things in the Valley are just state media, but for Silicon Valley. What gets me is when you get like The Atlantic, CNBC, Business Insider and places like that doing stuff like this.

Starting point is 00:45:12 And it bothers me because they don't even need to hate on it. They could just say, yeah, it could do this. It's pretty cool, right? Yeah. But I guess that that doesn't get the clicks. It's not an interesting story. The real story is not always that interesting. It's like, look, hobbyists are building these tools that kind of do cool things,

Starting point is 00:45:29 but make mistakes and it's a little expensive. Like, the most interesting story out of OpenClaw is, the only thing I think is going to be impactful out of it is because it was so expensive, right, like that $1,000 a day, what it's forcing people, these hobbyists, to do is to turn to like cheaper alternatives for models. And that I do think is significant.

Starting point is 00:45:48 This idea that there's open source models out there, they're significantly cheaper than trying to use like OpenAI or Claude. More and more people are running their own local models because it turns out for, you know, most specific uses you might use an LLM. You don't need like a super, a super fine-tuned trillion parameter beast of a model running in some data center somewhere. It's like, you know what?

Starting point is 00:46:08 I'm parsing my email to try to extract like suggested times. I'm fine with like a 20 billion parameter model that can easily fit in a single GPU on a thing I have, you know, we share in our office or something like that. So to me, that's the most interesting story out of OpenClaw is the bad news it could be for the big companies. As people get more comfortable with, we don't need these Formula One car versions of language models for the stuff we're actually doing, we're fine with the Ford Focus, right? And I think that's a transition that's going to, that's a transition that to me is more

Starting point is 00:46:43 that's what we should be writing about. Like, that's an interesting idea to me is that you have these huge high valuation companies, but you also have all these open source models, like the weights are just out there in the public domain that can do 98% of what people care about. And you're beginning to get low-cost competitive services where people can just spin those up and cheaper data centers. That's interesting to me. That's an economic story.
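The size contrast drawn here, a 20 billion parameter model on one shared GPU versus a trillion parameter model in a data center, can be sanity-checked with rough arithmetic. The sketch below counts memory for the weights only and ignores activations and caches; the bit widths (16-bit and 4-bit) are illustrative assumptions, not figures from the conversation.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


small = 20_000_000_000       # the "20 billion parameter" model mentioned above
large = 1_000_000_000_000    # the "trillion parameter beast"

print(weight_memory_gb(small, 16))  # 16-bit weights: 40.0 GB
print(weight_memory_gb(small, 4))   # 4-bit quantized weights: 10.0 GB
print(weight_memory_gb(large, 16))  # 16-bit weights: 2000.0 GB
```

On these assumptions the smaller model's weights fit within a single high-memory GPU, while the larger one needs to be spread across many, which is the economic gap being pointed at.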

Starting point is 00:47:03 AI creating its own church is not as relevant to me.

Starting point is 00:50:30 I'm already working on a story this Friday, in fact,

Starting point is 00:51:03 on my premium newsletter about the fact that actually the margins of serving GPU compute are dog shit. Like the best of the best of like a co-location place, like Applied Digital, it's like 27% gross margins. And that's at 100% utilization. Anything below 0.7, they're burning cash. I hear there's a data center in North Dakota losing a million dollars a day. That's the thing.

Starting point is 00:51:25 this is even with these lower cost models. On device could be interesting. That's what's going to happen, I think. I think on device is what's going to happen. I just remembered something. This is a classic bullshit story that I see every so often. So have you read any of the stories that are like, yeah, Claude can now work for hours uninterrupted?

Starting point is 00:51:44 Have you read about these? You heard this? You've seen this? Work for hours? Oh, yeah, yeah. Yeah, yeah. We're talking about the multi-step agent execution. Well, no, it keeps coming back.

Starting point is 00:51:53 Yeah. Yeah, on CNBC, AI on the verge of eight-hour job shift without burnout or break. Is 24-hour AI workday next? Going to just censor myself what I was going to say there? Because it's just like, yeah, it can work for hours. Is the output good? Does it? Yeah, it's just a loop. Does it? Does it just keep, call it, recall it, recall it. Like, I can, I can write a Python program that calls an LLM for 24 hours. If you need me to burn something for hours on end, just give me some gasoline. I got you, baby. I could sort this right out. It'd be much cheaper. But it's like, yeah, by September 2025, Claude Sonnet 4.5 was reported to run autonomously

Starting point is 00:52:35 for up to 30 hours, reported with, I mean, but that's more vibe reporting. It's, you're meant to read that and go, wow, this thing is performing tasks that are useful, that execute code, that do something. And in that 30 hours, that is equivalent to 30 human working hours, versus they sat there and pissed their pants for like 29.5 of those at least. And also those are like specialized tasks typically that have clear milestones and testing. So it's like it can do something. The loop can try and try until it passes the test.
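The "try and try until it passes the test" loop can be sketched as follows. This is a minimal illustration, not how any particular product works: `fake_code_llm` is a hypothetical stand-in that returns two wrong answers before a right one, and the task and tests are invented for the example.

```python
def fake_code_llm(task, attempt):
    """Hypothetical stand-in for an LLM: wrong twice, right on the third try."""
    candidates = [
        "def add(a, b): return a - b",
        "def add(a, b): return a * b",
        "def add(a, b): return a + b",
    ]
    return candidates[min(attempt, len(candidates) - 1)]


def passes_tests(source):
    """Run the generated code and check it against fixed test cases."""
    namespace = {}
    try:
        exec(source, namespace)
        return namespace["add"](2, 3) == 5 and namespace["add"](-1, 1) == 0
    except Exception:
        return False


def retry_until_green(task, max_attempts=5):
    # Keep asking for new code until the tests pass or we give up.
    for attempt in range(max_attempts):
        source = fake_code_llm(task, attempt)
        if passes_tests(source):
            return source, attempt + 1
    return None, max_attempts


source, attempts = retry_until_green("write add(a, b)")
print(attempts, source)
```

The point of the sketch is that long wall-clock runtime falls out of the retry loop itself: the clock keeps running through every failed attempt, so hours elapsed says little about how much useful work was done.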

Starting point is 00:53:11 And then you know that's done. And then you can move on to the next step. And then you can keep calling the LLM and executing until it passes the next test. And you can move on. And in theory, like it's, yeah, they're avoiding error cascade because it's testable and they can keep retrying or something like that. I mean, this was like, METR had this problem with their graph of how long of tasks AI can now do. And it was like this sort of like super arbitrary decision of like, oh, here's a task that takes a human five

Starting point is 00:53:36 hours. Now AI can do that. Here's a task. Whereas it's really more about like how many things in a row can you do without the errors cascading out of control or something like this. Right. It's very different. You're being way nicer than you should be, in my opinion, just because they don't even explain. They don't even say like, yeah, and we managed to make it do it. And this, they just go, that's right, hours. CNBC just fucking, just like, woo, just tearing their shirts off and screaming. But it goes back to the two things we need in this reporting. You need the details of the relevant technical innovation and a discussion of the concrete implications, the future implications, of that breakthrough. That's what I'm always looking for. So if you want to cover a story like that,

Starting point is 00:54:17 like, well, what does that mean? What is the technical breakthrough, right? What does this mean? What does it mean, technically, that you can do 30 hours of work or eight hours? What is the work? What was changed? What did they figure out? What was happening before? What technical breakthrough made this possible now? And then what are the concrete implications? What specific things will this now allow us to do? Tell us what jobs are going away. Tell us what tool we're going to see. Like you have to. But we avoid that, because then you're putting your chips down. And then those things don't come along. And so I'm always looking for that. What's the technical innovation and what are the concrete implications? If you don't have that, you're mining emotions.

Starting point is 00:54:52 You're mining emotions or just helping boost stock. That's what really bothers me because I hate that they're scaring people. I really, really hate that they're doing marketing. Like they're just doing, like, I run a PR firm. I've dealt with early-stage startups for like 15 years. There isn't a single early-stage startup that would get a percentage point of this bullshit. Like you email a reporter about like a series A startup, they're like, all right, all right, motherfucker.

Starting point is 00:55:24 How much revenue? Are they profitable yet? Why not? Why not? Explain to me right now. Anthropic's like, oh, we're going to burn $100 billion on training, I guess. What do you think?

Starting point is 00:55:36 And they're like, yeah, I love it. I actually think that's future. That's great. I'm not, and to be clear, I think that this scrutiny should be from the beginning to the end. I think that everyone should face this scrutiny. Yeah. I'm not saying that I should get an easier.

Starting point is 00:55:49 I'm saying actually everyone should. Anthropic should face the most brutal scrutiny. And I guess that they don't because they want access. It's just very dull and annoying. And it's just helping already rich guys like Dario Amodei, who should not have more money. Listen to him speak.

Starting point is 00:56:09 He needs to face some stress. I think some stress would help him grow. But, Cal, as we wrap up, I did actually have like a technical thing that was an idea I've been percolating. So I think that this term training with models is being misused and used in a way that is kind of vibe reporting style, which is they use training as a word that suggests that it will stop, that they will stop training these models. But correct me if I'm wrong, training is everything from building a new model to updating a model's current parameters, correct? Yeah, there's pre-training and post-training is the right way to think about it. So the pre-training is unsupervised. That's where you take all the text that's ever been written.
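The word-guessing game described in this exchange can be sketched with a toy model. Everything here is a simplifying assumption: it is a single table of weights that looks only at the previous word, not a multi-layer network, and the "training text" is one Dickens phrase. The update rule is gradient descent on the guess's error, which is what backpropagation computes when there is only one layer.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_next_word(text, epochs=200, lr=0.5):
    words = text.split()
    vocab = sorted(set(words))
    index = {w: i for i, w in enumerate(vocab)}
    # weights[prev][nxt] scores "nxt follows prev"; start knowing nothing.
    weights = [[0.0] * len(vocab) for _ in vocab]
    for _ in range(epochs):
        # Cut the real text at every point and guess the word that came next.
        for prev, target in zip(words, words[1:]):
            row = weights[index[prev]]
            probs = softmax(row)
            for j in range(len(vocab)):
                # Nudge the weights so the guess moves toward the real word.
                grad = probs[j] - (1.0 if j == index[target] else 0.0)
                row[j] -= lr * grad
    return vocab, index, weights


def next_word_probs(model, prev):
    vocab, index, weights = model
    return dict(zip(vocab, softmax(weights[index[prev]])))


model = train_next_word("it was the best of times it was the worst of times")
probs = next_word_probs(model, "of")
print(max(probs, key=probs.get))
```

Before training every word is equally likely after "of"; after repeated small adjustments the model's guess settles on "times", which is the whole mechanism at miniature scale.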

Starting point is 00:56:49 and you will take a real piece of text written by a human, and you'll cut it off at an arbitrary point, and you'll tell the language model, you guess the word that came next. There's a real word that came next. This is real writing. Guess the word that came next, and then it guesses,

Starting point is 00:57:03 and then you adjust the weights so it gets closer to the right answer. That's pre-training. And when you adjust the weights, what are you doing? You're running a training algorithm called backpropagation. This is a Geoff Hinton innovation, where you're going through and you're adjusting the weights all the way through the layers in such a way that the answer it gives for this particular test

Starting point is 00:57:22 gets a little bit closer to the right answer. And that's pre-training. That's pre-training. It's unsupervised. So you take Hamlet or you take Dickens. It's like, it was the best of times, it was the worst of. And then you give that to the thing,

Starting point is 00:57:32 what word should come next? And, you know, it says bacon. Like, we're going to adjust these weights now in a way that like your answer gets a little bit closer towards times. Right. Okay, that's pre-training. Then you get post-training where you already have trained. So you have this network,

Starting point is 00:57:45 all the weights have been set through this massive, multi-month, you know, billion-dollar pre-training. And now you want to go through and you want to tweak this to avoid certain types of behaviors or to influence towards certain types of behaviors. So for post-training, it's almost always based on having inputs, like prompts, and correct answers. This is the right way to respond to this question, right? So you have pairs of questions and answers.
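As a recap of the word-guessing step of pre-training described above, here is a minimal sketch in Python. It is a toy under stated assumptions, not how any production model is trained: the four-word vocabulary, the starting scores, and the learning rate are hypothetical, and the "model" is just a list of scores for one context rather than a layered network, so the update is applied to the scores directly instead of being backpropagated through layers.

```python
import math

# Hypothetical four-word vocabulary; a real model has a far larger one.
VOCAB = ["times", "bacon", "days", "moons"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def training_step(logits, target_index, learning_rate=1.0):
    """One gradient-descent step on cross-entropy loss for a single example.

    The gradient of the loss with respect to each score is
    (predicted probability - 1) for the true word and
    (predicted probability - 0) for every other word.
    """
    probs = softmax(logits)
    return [
        logit - learning_rate * (p - (1.0 if i == target_index else 0.0))
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


# Toy scores for the word that follows "it was the worst of".
logits = [0.0, 2.0, 0.0, 0.0]  # the untrained toy model prefers "bacon"
target = VOCAB.index("times")  # the word the human author actually wrote

before = softmax(logits)[target]
for _ in range(20):
    logits = training_step(logits, target)
after = softmax(logits)[target]

best_guess = VOCAB[max(range(len(VOCAB)), key=lambda i: logits[i])]
print(f"P(times) before: {before:.3f}, after: {after:.3f}, guess: {best_guess}")
```

Each step nudges the score of the real next word up and the scores of the wrong guesses down, which is the "a little bit closer to the right answer" idea in miniature.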

Starting point is 00:58:11 You give the prompt to the LLM. It spits out some answer. And now what you're doing, you're using, it's called reinforcement learning. Reinforcement learning is a general technology, but you're using techniques from reinforcement learning to sort of like zap it, like you would zap a dog when you're dog training it. If it's a bad answer, you zap it, so you get those weights away from that answer. And if it's a good answer, you give it a treat. And this is post-training.

Starting point is 00:58:33 And so that's where you've moved past the word-guessing game, which is where all of the sort of general smarts of these models comes from. And now you're doing this sort of zapping and treat training around very specific things, right? This is where... But they do that. Keep going. Yeah. So, like, so you'll go through and, like, ask it questions where the answers might be, like, about building bombs or whatever.

Starting point is 00:58:55 And every time it spits out an answer about a bomb, you give it, like, a really bad, negative shock. And you're like, definitely turn off those circuits. Like, we don't want you to spit out answers about bombs. Like, that's where all the guardrails come from. Or if you want it to get better at doing, like, a particular math exam. You can give it, like, lots of questions from that math exam. And then you have the right answer, and you can kind of zap it to move it towards

Starting point is 00:59:17 what the answers look like on this math exam. So post-training is more focused. You have particular types of behavior you're trying to sort of instill in this already pre-trained massive network. That's mainly where the focus turned after GPT-4. So GPT-4 was like the extent of pre-training making it smarter.
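The "zap and treat" description of post-training can be sketched as a REINFORCE-style update on a toy policy. Everything named here is an assumption for illustration: the two canned responses, the reward values of -1 and +1, the starting scores, and the learning rate are hypothetical, and a real system updates billions of weights rather than two scores.

```python
import math
import random

# Hypothetical candidate responses to one risky prompt, with reward labels.
RESPONSES = ["detailed harmful instructions", "polite refusal"]
REWARDS = [-1.0, +1.0]  # the "zap" and the "treat"


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def post_training_step(logits, rng, learning_rate=0.5):
    """Sample one response, then push its probability up or down by its reward."""
    probs = softmax(logits)
    choice = rng.choices(range(len(logits)), weights=probs)[0]
    reward = REWARDS[choice]
    # REINFORCE: reward times the gradient of log P(sampled response).
    return [
        logit + learning_rate * reward * ((1.0 if i == choice else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
logits = [1.5, 0.0]     # the toy "pre-trained" model initially favors the bad answer

p_refuse_before = softmax(logits)[1]
for _ in range(200):
    logits = post_training_step(logits, rng)
p_refuse_after = softmax(logits)[1]

print(f"P(refusal) before: {p_refuse_before:.3f}, after: {p_refuse_after:.3f}")
```

With only two responses, a zap on the bad answer and a treat on the good one both shift probability toward the refusal, so the behavior moves in one direction no matter which response happens to be sampled.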

Starting point is 00:59:37 After GPT-4, trying to make those models even bigger and pre-train them longer didn't lead to much of a performance increase. So everything we got between four and the lead-up to five was post-training. And that's when they began focusing on metrics. Because if I have a particular metric, I can post-train a model to do well on that test.

Starting point is 00:59:56 And so everything became about metrics and post-training. Now we can do this thing better, or look at this thing we do better. And so that's kind of the game that's played now is we do lots of post-training. That requires much more specific data because you need like right answers, pairs of prompts and correct answers.

Starting point is 01:00:13 Right. So only certain things we can do this with. But that's the game we've been playing since like 2024. Do they do that with models that exist now? So this is basically updates, right? Yeah, they do it on a semi-regular basis, yeah. But usually there'll be a name change. Like GPT-5.2 is different than 5.1, different than 5.

Starting point is 01:00:31 But don't they update the current models? They don't re-pre-train them. That's too expensive. No, no, I'm not saying that. I'm saying, do they post-train the current models to make them better at stuff? Yeah. To tweak things. Yeah.

Starting point is 01:00:43 So that's, this is a very long way to get to a point I'm kind of making, which is one of the problems with vibe reporting on this is training is framed as this thing, like inference is framed as OPEX that is permanent that you cannot avoid, inference being creating the outputs. Training is framed as this R&D mysticism, which is just out there. And you know, they never say this, but you hear training, you think, oh, you train and then you perform. And so training would end. But from what I understand, training is as common and necessary an expense as inference at this point. Yeah, it's the only way you improve or update things, right?

Starting point is 01:01:22 So if you had a Microsoft 365 software, you're constantly sending updates and patches and whatever as you like add new features or whatever. In the AI model world, it's, yeah, it's post-training. It's the only way to make any sort of improvement or fixing bugs. Like you're like, oh, here's something it's saying that we're really upset about, okay, let's go in and do some zapping. Let's get out the zapper.

Starting point is 01:01:44 Give it a bunch of examples of saying that bad thing, and let's zap and say, don't do that. And now it's like very unlikely to do that. So yeah, they're constantly doing that. Otherwise, you're in stasis. Which is different from the way most people actually think of it. They think that the model is somehow like learning online.

Starting point is 01:02:01 Which is absolutely not. It's absolutely not true. That as you talk to it, it's learning and it's getting better. And they're like, but wait a second, it remembered something I talked about earlier. It must be getting smarter. The model doesn't care about you. It's just your local software. You don't realize this.

Starting point is 01:02:17 It's including, like, huge bits of stuff you've talked to it about before in the prompt that's going to the model. It's like, here's a bunch of stuff that Ed has submitted to you in the past. Okay, now here's his current question. So the model hasn't changed. The model doesn't have memory. It doesn't adjust its memory in real time. It's all static. There is no dynamic memory involved in these models.
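The point that "memory" lives in the prompt rather than in the model can be sketched like this. The class names, the prompt layout, and the placeholder weights are all hypothetical; real chat products assemble context in their own ways, and the stub below does no actual inference.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FrozenModel:
    """Stand-in for a deployed model: its weights never change between calls."""

    weights: tuple = (0.1, 0.2, 0.3)  # hypothetical placeholder values

    def generate(self, prompt: str) -> str:
        # A real model would run inference here; this stub only reports
        # how much text it was shown.
        return f"[reply based on {len(prompt.splitlines())} lines of prompt]"


@dataclass
class ChatClient:
    """The software around the model, which is where the 'memory' is kept."""

    model: FrozenModel
    history: list = field(default_factory=list)

    def build_prompt(self, user_message: str) -> str:
        # Earlier messages are pasted in ahead of the current question.
        lines = ["Earlier messages from this user:"]
        lines += self.history
        lines += ["Current question:", user_message]
        return "\n".join(lines)

    def ask(self, user_message: str) -> str:
        reply = self.model.generate(self.build_prompt(user_message))
        self.history.append(user_message)
        return reply


model = FrozenModel()
client = ChatClient(model)
client.ask("My name is Ed.")
second_prompt = client.build_prompt("What is my name?")
print(second_prompt)
```

The second prompt contains the earlier message, which is why the reply can appear to remember it, while the model object itself is identical before and after the conversation.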

Starting point is 01:02:38 They're fixed until you go out and post-train it, and then you bring the new weights back into the data center, and that's the thing doing inference now. But that's the thing. Like, the reason I bring it up as the kind of closing vibe story is because training is very clearly getting more expensive. They say they're profitable on inference, which again doesn't really make sense to me, but putting that aside, they're not. But even if they were, if training never stops, then who cares about being profitable on inference? Like, it just means that you will get more expensive forever. Inference is just, yeah. Inference is pee, training is poo. I'm not putting that one in an article. Yeah, okay, it's an interesting question, right?

Starting point is 01:03:21 It's also getting like the, was it Altman who was making those comments about, well, if we just didn't have to pay for the training, this would be profitable. If I didn't have all these expenses, I'd be so profitable. The one distinction that's maybe relevant there, not to be an apologist, but the one distinction that's relevant is pre-training versus post. So pre-training is insanely expensive, right? Because you're training something on all the words in the world. And it takes months.

Starting point is 01:03:48 So it's just like you're running a data center that's going to have nowadays up to a six-digit number of GPUs running full-time for months just to get that pre-training done. And you have to pay for all of that. Right. So that's all time you're not getting money. You're just paying for training. Post-training is also expensive. It's not as expensive as pre-training, though,

Starting point is 01:04:10 because each post-training session, it's a way, way, way, way smaller data set that you're post-training it on. It's like, all right, we generated like 10,000 examples of people, you know, responding to questions in a racist way. And those 10,000 examples

Starting point is 01:04:22 we'll use to reinforcement learn and try to move it away from answering those types of questions in a racist way. That's like not that big of a data set compared to we're going to train this on every word written that we have access to.

Starting point is 01:04:35 So the post-training is not as expensive as pre-training. Unless you're doing it all the time. Yeah, that's true. Like, that's the thing. If you're doing it all the time and it takes months of pre-training, but you're constantly doing something like post-training for months. Yeah, I mean, is post-training not functionally the same thing?

Starting point is 01:04:53 It's the same thing. I think this is a fair point. When you have a particular example that you're post-training on, like one prompt with an answer, it's kind of like you're doing inference in reverse, right? So you're going from, you're backpropagating from one side of the network to the beginning as opposed to going from the beginning to the end. Now, it's more expensive than that because when you're just doing inference, the fundamental

Starting point is 01:05:12 operation is basic multiplication. You're just multiplying numbers in the big table. Backpropagation, which you use for training, is multiplication, but you also do a bunch of derivatives, because you constantly need to calculate like the derivative of these. I mean,

Starting point is 01:05:29 like it doesn't really matter, but you kind of need, you want the derivative because you want the gradient descent to be towards like better and away from worse. And derivatives, my understanding is this is like a more expensive operation per weight that you're trying to change because you're not just multiplying a number,

Starting point is 01:05:43 you're having to calculate derivatives and it becomes a little bit more complicated. So yeah, it's like inference in reverse, but also like a little bit more expensive. So if you have 10,000 sample question responses you're going to use to post-train, it's kind of like 10,000 users sent prompts and they're particularly expensive prompts

Starting point is 01:06:02 and you had to pay for that instead of them paying for it. So, yeah, it's good to think of it as like inference in reverse. It's also an ongoing cost. Like, everyone is leaning on this idea that this stuff will magically become profitable. I don't know if you've seen the cash flow diagrams of Anthropic and OpenAI, but there's a mysterious math going on where year 2028, 2029, they just become profitable. But I was going to ask you about this. Last month there was this announcement that OpenAI had some massive increase in revenue.
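Returning to the "inference in reverse, but more expensive" point, a rough way to see the extra cost is to count multiplications for a single fully connected layer. This is a sketch under assumptions: the 2-by-3 weight table, the input, and the incoming gradient are hypothetical, and only one layer is shown. For a plain linear layer the derivatives are themselves multiplications; the extra cost in this sketch comes from needing two gradient products on the way back instead of one product on the way forward.

```python
def forward(W, x):
    """Inference through one layer: y = W x, counting scalar multiplications."""
    mults = 0
    y = []
    for row in W:
        total = 0.0
        for w, xi in zip(row, x):
            total += w * xi
            mults += 1
        y.append(total)
    return y, mults


def backward(W, x, grad_y):
    """Backpropagation through the same layer, counting multiplications.

    grad_y is the gradient of the loss with respect to the layer's output.
    Two products are needed: one for the weights, one to pass back to x.
    """
    mults = 0
    # dL/dW[i][j] = grad_y[i] * x[j]  (used to update this layer's weights)
    grad_W = []
    for g in grad_y:
        row = []
        for xi in x:
            row.append(g * xi)
            mults += 1
        grad_W.append(row)
    # dL/dx[j] = sum over i of W[i][j] * grad_y[i]  (handed to the previous layer)
    grad_x = [0.0] * len(x)
    for row, g in zip(W, grad_y):
        for j, w in enumerate(row):
            grad_x[j] += w * g
            mults += 1
    return grad_W, grad_x, mults


W = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]  # hypothetical 2x3 weight table
x = [1.0, 2.0, 3.0]                        # hypothetical input
grad_y = [1.0, -1.0]                       # hypothetical gradient arriving from the loss

y, forward_mults = forward(W, x)
grad_W, grad_x, backward_mults = backward(W, x, grad_y)
print(forward_mults, backward_mults)
```

In this toy the backward pass does twice the multiplications of the forward pass, so one training example costs about three times one inference through the same layer, before counting the weight update itself.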

Starting point is 01:06:36 Yeah, well, this is actually a great vibe reporting thing, and that's annualized revenue. They said they hit 20 billion in annualized revenue, which would mean 1.67 billion in a month now. Important details: we don't know how they're defining a month. We don't know if they mean 28 days, 29 days, 30 days. We don't know if they mean a calendar month, so the month of November or December. Or do they just mean any 30 days? We don't know if they're doing insane math, which happens very rarely, but this company feels like one that might. Just my gut instinct is they may be doing, here's a seven-day period, and we're going to turn it into a month. Like, we don't know how they're doing this. And they also coupled this by saying

Starting point is 01:07:18 that as compute grows, revenue grows. I don't know if you've seen this. No, this is their formula? Yeah, okay. Yeah, it's an insane formula that does not map to any economics. Like, it's just, it's the kind of thing that, if we had a functioning business and tech press, would just be scrutinized to the bone, that would just be ripped apart and say, what the fuck does that mean? Because if this were true, if you simply add more gigawatts, add more revenue, then you would simply print more money, like it would be a money printer versus money. Well, it would be like my movie theater did well. Therefore, if I build 100,000 movie theaters, we're going to make 100,000 times the amount of money. And it's like, well, wait, that theater was

Starting point is 01:07:58 in Manhattan. And it was like really well run. And they're, yeah. Yeah. Well, okay. I assume this much. Trying to understand that story. Okay. Yeah, the annualized revenue stuff. I think I even used that term talking to someone. I credited you. I was like, Ed Zitron would say look out for annualized revenue. It is, the funny thing is with that as well is it's more vibe reporting because

Starting point is 01:08:22 ARR is standard. It's very standard in SaaS companies that sell on a per-seat basis. So a Salesforce would do AI, well, even they are doing annualized now with the AI revenue. But it would be a software company has 100 seats they sell to a company and they charge 15 bucks a seat per month. And they charge that annually. And the actual cost of each user is fairly measurable because they're doing CPU-based stuff. Like it gets expensive at scale because there's a lot of people. But it doesn't get multiple. I can't say the word. It doesn't get much more expensive as you grow. With AI, it's actually, because of the way large language models work, your most excited customers are your most expensive. Yeah, because they blow past whatever their monthly revenue is, whatever their cost is, they blow past it. So you can't even do a per-seat revenue. It's just every, all of these things, when I say them out loud, I'm like, I feel like this should be more obvious.
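The annualized-revenue arithmetic from this stretch of the conversation can be worked through in a few lines. This is only a sketch: the function names are made up for illustration, the window lengths are the ones floated in the discussion, and nothing here reflects how any company actually computes its figure.

```python
def monthly_from_annualized(annualized_revenue):
    """The simple reading: an annualized figure is twelve equal months."""
    return annualized_revenue / 12


def annualize(window_revenue, window_days, days_per_year=365):
    """Scale revenue observed over some window up to a full year."""
    return window_revenue * days_per_year / window_days


# $20 billion annualized implies roughly $1.67 billion in the measured "month".
implied_month = monthly_from_annualized(20e9)

# The same dollar amount yields different annualized totals depending on
# how many days the "month" is assumed to cover.
as_28_day = annualize(implied_month, 28)
as_30_day = annualize(implied_month, 30)
as_31_day = annualize(implied_month, 31)

# Per-seat SaaS from the example above: 100 seats at $15 per seat per month.
seat_arr = 100 * 15 * 12

print(f"implied month:   ${implied_month / 1e9:.2f}B")
print(f"28-day window:   ${as_28_day / 1e9:.2f}B annualized")
print(f"30-day window:   ${as_30_day / 1e9:.2f}B annualized")
print(f"31-day window:   ${as_31_day / 1e9:.2f}B annualized")
print(f"per-seat ARR:    ${seat_arr:,}")
```

The shorter the window that gets scaled up, the more one strong stretch inflates the annual number, which is why the undefined "month" matters; the per-seat figure has no such ambiguity because the price and seat count are fixed by contract.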

Starting point is 01:09:21 It's why I loved your video because it was like, thank you. Someone else. Well, here's the question. Here's my, here's like the dangerous question. I actually put out a video last week. It was like dangerous question. We're now on year three or four of new, like the, I'm counting new year, starting like New Year 2020 or whatever, of people saying, oh my God, this is so cool. These massive disruptions that are going to change everything are imminent. And year after year we've said that. I'm not yet seeing the massive disruptions. Like not the, not the stories of what might be disrupted or the stories of what's different.

Starting point is 01:09:57 But like, how many years do we have to go without industries crumbling or major new economic players that didn't exist before or complete restructuring of huge companies around this technology? How many years do we have to go? So I did a video where I found the Reddit thread where someone just asked this question. This was from like earlier in the month. They were like, outside of the vibe coding stuff, what are the, like what are people, what are the big tools that have come out of this technology that are changing things?

Starting point is 01:10:29 And I read through this whole thread and it was interesting. there's not, people don't have much, like, well, you know, like, it could, these, like, really small case studies. I used it to help, you know, gather cleanup data that I got from whatever. I was like, this is, like, such a nerdy specific use case. And so I went through that thread in a video. And this has kind of been my question. It's like, it is very cool technology. But how do we know where this is going to fall? Like, to me, the scale is, would go like this. Like, blockchain software, then Oculus VR, then maybe internet, then electricity.

Starting point is 01:11:04 So we're going to have a scale of disruption, right? Blockchain software is something where the premise made no sense and it was never going to get off the ground and there was going to be no impact on the world. And because my training in CS is in distributed system theory, I was there in 2020 saying,

Starting point is 01:11:20 guys, let me just tell you, this is nonsense. No, Web3 is not about to take off. None of this makes sense. And that was true. That did nothing. Then you have like Oculus VR. It really is cool, right? You put on these things, like, that is awesome. Like, I love this technology, but it's having a hard time having any real major impact because people aren't sure what to do with it. Also, most people don't necessarily have a great experience initially because it's extremely dependent on where you are, who you are, the size of your skull. That kind of. Yeah. For a big head, for a limited group of people, it's really cool, but it fails to come out. Then the internet is like really disruptive, changed a lot of things. More, it's not so much as,

Starting point is 01:11:57 like whole industries disappeared or whatever, but it changed the way a lot of industries actually functioned, and then, like, electricity, you could say, like, it just completely changed what day-to-day existence and business was like. How existence of the right? Exactly, yeah. And so the big question, like, everyone should be asking, is where is GenAI going to fall on this?

Starting point is 01:12:14 And, you know, I would say right now, this is what gets me yelled at. I'm not saying this is the prediction necessarily going forward, but right now, I don't think it's got past, much farther past the Oculus part of that scale. I actually, where it's really cool, there's very cool things. ChatGPT is very cool.

Starting point is 01:12:29 It's very cool that it can have that comprehension, and no one thought it could do that. But we haven't yet figured out what to do with it. Comprehension. It's a... Text comprehension. I mean, we take it for granted, but for CS people, the ability that I can...

Starting point is 01:12:44 Hey, give me text that, like, whatever, in the style of a poem that does whatever and that includes a character from Star Wars. And then it can give you text that does that. That comprehension, like, for a computer scientist, was like, oh, we didn't really know how to consider... On a technical level, yeah. Yeah, that's like very cool.

Starting point is 01:13:00 Yeah. But we haven't got past... This is like the surprising thing of this field is we're not really past the Oculus stage yet. Like where there's like for certain... This is vibe coding is really cool. The comprehension is cool. So like Sora is weird, but like it's cool. You can do that.

Starting point is 01:13:16 But the markets are not... None of these have markets yet, right? Like there's not big markets in any of these yet. And we'll... How far will it go from Oculus to the internet? To me, that is like the number. one question, the number two question, the number three question of all reporting on this, and almost no one's talking about that. It's just hype laundering. We'll take this hype,

Starting point is 01:13:37 we'll extrapolate it, we'll react to that extrapolation. That's kind of what reporting is right now in AI, whereas to me, this is the hugest question. If this ends up Oculus, retail investors are going to get screwed. If it ends up internet, all right, that's like a really interesting, significant story. If it ends up electricity, obviously that really matters, but like I don't know anyone who actually thinks it's going to be that disruptive, not the current technology. That's the story to me, not let's like extrapolate, you know, hey, what, these things are creating a church, or let's hype launder, extrapolate that and react to our extrapolation. That's not really reporting so much as speculative fiction writing, I guess.

Starting point is 01:14:14 I don't know quite what to call it. But this is the real question. Where exactly are we now? And what are the possibilities of like where this is going to go, positive and negative? I don't think we have enough talk on that. I fully agree. Cal, it's been such a pleasure having you. Thank you for joining me. Always happy to talk shop. Always happy to hate with you, I guess. Hater season, I like it. Hater season is the best.

Starting point is 01:14:36 We will be back this week with either a monologue or an interview. I have not decided, because I've got a wonderful Corey Quinn interview I just did, so I'm considering putting that in the monologue. You'll find out on Friday. Anyway, this has been Better Offline. I'm Ed Zitron. Subscribe to the premium. Download a T-shirt, whatever you desire. Thank you for listening to Better Offline.

Starting point is 01:15:04 The editor and composer of the Better Offline theme song is Matt Osowski. You can check out more of his music and audio projects at mattosowski.com. You can email me at ez@betteroffline.com or visit betteroffline.com to find more podcast links and, of course, my newsletter. I also really recommend you go to chat.wheresyoured.at to visit the Discord and go to r/BetterOffline to check out our Reddit. Thank you so much for listening. Better Offline is a production of Cool Zone Media. For more from Cool Zone Media,

Starting point is 01:15:39 visit our website, coolzonemedia.com, or check us out on the IHeartRadio app, Apple Podcasts, or wherever you get your podcast.
