Better Offline - Hater Season: Cal Newport on AI Reporting
Episode Date: February 11, 2026
Better Offline’s “Hater Season” - an ongoing roundtable with tech’s greatest haters - continues as Ed talks with computer science professor and writer Cal Newport about the ways in which the media fails to report the truth about AI.
Please support me by subscribing to my premium newsletter - here’s $10 off your first year of annual: https://edzitronswheresyouredatghostio.outpost.pub/public/promo-subscription/84rt762qen
Cal Newport’s “Don’t Trust AI Reporting”: https://youtu.be/xUh3Gc-BAlo?si=XpVzYHh7_k_VXpU1
Podcast & Videos: https://www.youtube.com/@CalNewportMedia/
Newsletter: https://calnewport.com
New Yorker archive: https://www.newyorker.com/contributors/cal-newport
YOU CAN NOW BUY BETTER OFFLINE MERCH! Go to https://cottonbureau.com/people/better-offline and use code FREE99 for free shipping on orders of $99 or more.
---
LINKS: https://www.tinyurl.com/betterofflinelinks
Newsletter: https://www.wheresyoured.at/
Reddit: https://www.reddit.com/r/BetterOffline/
Discord: chat.wheresyoured.at
Ed's Socials: https://twitter.com/edzitron https://www.instagram.com/edzitron https://bsky.app/profile/edzitron.com https://www.threads.net/@edzitron
Email Me: ez@betteroffline.com
See omnystudio.com/listener for privacy information.
Transcript
This is an IHeart podcast.
Guaranteed Human.
Cool Zone Media.
Your witness to a great becoming.
It's Better Offline, and I'm Ed Zitron.
Today we're joined by computer science professor and tech writer Cal Newport for Hater Season.
Cal, thank you for joining me.
Always happy to do some hating, I suppose.
Well, you had an excellent YouTube that I'll be linking in the show notes about the mistakes in AI reporting.
Though the hater in me says I don't think these are mistakes.
But really, one of the best videos I've seen on AI reporting, or AI in general, was what you did.
It was basically this thing of, like, the digital ick,
that these stories are meant to make you feel uncomfortable.
And of course, this faux astonishment thing. You know what?
Just run the tape.
Tell me a little bit about the bits that you found, because I watched it going,
yeah, yeah, yeah, like an angry person.
Yeah, I could imagine you when I was recording that video.
I was like, I bet Ed is cheering out.
Wilding out.
I was having a great time.
Exactly.
Well, let me just give the context, right?
So I'm in an interesting situation for observing this because, you know,
I am a computer scientist, so I'm not afraid of the technologies.
I'm happy to talk about transformers and feed forward networks and diffusion models.
And like, that's not that scary to me.
But I also write about technology.
My main journalistic home is the New Yorker where I write a lot about, you know, I do AI journalism there.
So I'm up on that as well.
And so I'm often noticing things in journalism when I see AI coverage that is faulty.
It really catches my attention because I have a foot in both
of these worlds. So there's a lot of good AI reporting out there. There's also a lot of trash.
And I really wanted to help people figure out how do you sort it? Like, how do you figure out
if you're reading something? Should I pull the ripcord on this article? Like, this is
not helping me. Like, how do you know if it's good or not? So I was like, okay, here's what
I'll do is I'll come up with the three most common traps I see in AI reporting that make me
want to, you know, throw my iPad at the wall. And I came up with three and I'll just give you
the three names. I made up all these names. I don't, I don't think they're great. My
producer thinks he loves them, but here we go.
Vibe reporting.
That's number one. So that is where you will omit certain facts and put loosely related quotes
next to each other in a way that creates a general vibe that you want to be true,
but it's not quite true.
So you don't actually make a concrete claim that's not true, but you imply that claim by what
you omit or what you put in your story or what quotes you put next to it.
So you'll put a quote, for example, about layoffs at the gaming division at Microsoft next to an unrelated quote about concerns about AI and its impact on jobs.
Now you have the vibe.
Oh, man, all these people just got laid off because of AI.
AI is taking jobs.
Whereas in reality, the layoffs had nothing to do with AI, but you put these things next to each other.
You give that sense.
Then I had mining digital ick.
So to me, that's any AI story where you take an example from the edges of AI, like something that wireheads in San Francisco are up to,
and you just tell a story that's unsettling
without talking about any of the technical details
like, well, what's different here?
Was there a technical innovation we need to know about
and not discussing any concrete implications?
Oh, this means this is going to change in the future
or it's going to have an impact on this sector.
You're just telling a story to unsettle.
And I think a lot of the coverage of Moltbook
and OpenClaw fell into that.
And then finally, there's faux astonishment,
which is more of a YouTube phenomenon
than a print journalism phenomenon,
but that's where every single thing that happens
in AI is insane, amazing, terrifying, everything is going to change.
And so you constantly create this atmosphere of something seismic just happened so that
the consumer of the information ends up in a bit of a panic.
Like, I can't put my finger on exactly what's terrible, but like everything terrible
is happening.
Those three traps, to me, should be automatic ripcords for what you're reading or watching.
The only disagreement I'll have is that you would say that this isn't the majority of
AI journalism.
I'd actually argue it would be. Especially that kind of the beginning one, the vibe reporting,
is very common, because the big one right now I'm seeing is this AI software, AI's replacing
software thing. If you are a reporter, and Cal is not making this statement, I am, if you're a
reporter and you bring up Anthropic's Cowork around anything, you are wrong. I was about to call you
a name, but I'm being nice today for some reason. All right, you're a dipshit, because you are
a dipshit if you look at Claude Cowork, which is a thing for fucking around on your desktop,
and you say this is going to compete with Salesforce. You just don't know what you're talking about.
You are wrong. But then one abstraction higher is this idea that Claude Code is going to destroy
SaaS, software as a service. And the idea being that software as a service is this thing
that people are just going to stop buying. They're going to stop buying their own CRMs, they're going to stop buying their own
per-seat software things, and they're going to build it internally. This just, I don't know if you've
seen this, Cal, it just reflects a complete lack of understanding of how software works.
Yeah.
Because you don't pay Microsoft for Microsoft 365 or Teams because you can't build your
own Word or what have you. I don't think you could. But nevertheless, it's also
because they maintain it, because they make sure it doesn't break, or if it breaks, they fix it.
They make sure that it stays up all the time. They make sure it's accessible. It has secure login.
Well, look, there's a few things going on here. One of these things I reported
on last month. I did a
big New Yorker piece on agents,
right? So this is relevant to Claude
Code and how people are thinking about the current future.
And there was this, basically here's what seems to have
happened. Claude
Code and these other command line interface
agents can do
really cool things. I'm using the word cool here
very carefully.
That's fine. Yeah, like cool in a sense of... I would like
you to be like completely, like give people
the actual explanation here. Yeah. So cool meaning
like Oculus, right? You put
on the Oculus
visors for the first time, everyone had the same reaction.
This is really cool.
Now, that's separate from, that's a trillion-dollar business.
Like, let's put that aside, but this is really cool.
I'm seeing 3D in a world where it tracks my head.
Claude Code and other command line interface coding tools became like that for programmers.
It was, like, really fun to watch it doing multi-step execution of the construction of demos or this or that.
And the reason why it could do those cool demos, it was sort of well-suited for that world,
because that's a world that exists only in text.
So Claude Code works on a command line interface.
It's all text-based.
And it works with a file system.
You can write files, edit files, send files to compilers.
It's all just a small number of commands on a command line interface.
It's all in a world of text because that's where computer programs are built.
That's a perfect case for LLMs, which love dealing with text,
and they love dealing with structured text like computer code.
There was an extrapolation that then happened.
And really this caught on in January of '25,
which is where you first began to get this sentiment of,
oh, it's doing such cool things over in the world
of command line interfaces and code.
Certainly these agents can soon do similar cool things
in, like, all the different things we do on a computer.
And that is what laid this foundation of people were so impressed.
Programmers were so impressed by the coolness
of what was happening with Claude Code
and the other command line interface tools
that they extrapolated that vibe over to
other computer usage.
The point of that article
I reported was, oh, it turns out, like,
everything else we do on computers is much harder.
It's not six text commands
on a command line interface creating structured code
that you can compile and test to see if it compiled or not.
It's much messier. The interfaces
are visual. We don't realize what complexity goes into
the things we click and select, doing something as simple
as even trying to just book, like, a hotel
in a new city or something
like this. And if you use a language
model as the underlying logic and decision
engine for making actual actions in the real world, well, the language models make things up and get
things wrong or a little bit wrong 20% of the time. And a little bit wrong in computer science
means breaking everything. Well, it means a lot. Like, it's okay in code because they say, oh, that
didn't compile. Let's try again. But when you're booking a hotel room, as I sort of detailed in
that article, it means like you ended up in the wrong city two years from now and the room cost
$6,000, right? And so it just didn't work. So 2025 was supposed to be the year of the agents, and
it just didn't work.
And they don't really know how to fix it.
But we're still vibing.
So this is vibe reporting.
Yeah,
but Claude Code is like really exciting.
And so why can't we do that with everything else in the computer?
It's actually a much harder problem than they were letting on.
Well, the thing is I've seen, especially like in 2026, I've seen a lot more Claude Code stuff.
And there was, there's been a very big consent manufacturing operation going on right now.
Wall Street Journal, Atlantic, CNBC.
Deirdre Bosa, and this is a statement from me, not Cal,
Deirdre Bosa from CNBC should be fucking ashamed of herself,
going on CNBC every day
just going,
Claude Code's going to destroy all software.
She's on Twitter
because she was able to vibe code
some sort of Monday clone,
which is just like a project management tool.
It's like, I made some software that worked.
This is everything now.
But that's kind of what you've been talking about
with this vibe reporting,
where it's like, I did a small thing.
Now all things will be done in this manner,
Whether it's possible? God no. God no. But she, like many reporters, is able to find a lot of people who are invested in AI who will absolutely go on TV and say, yep, it's completely true. That's going to happen. 100%. It's just so strange, because it makes me feel paranoid and kind of conspiratorial when you look at the majority of news about AI, and it is this vibe reporting. It's these vast extrapolations from 16,000
job losses at Amazon. They mentioned AI. This plus this equals that AI is replacing people.
It's just so, it makes me feel like uncomfortable with the world. Yeah. Well, can we,
can we sit for a moment on that Amazon example? Because I think it's a great one.
Please, please. It frustrates me. That one frustrated me. Tell me. Go ahead. All right. So,
Amazon lays off 16,000 people. Right. All right. It's covered in two different ways.
So the vibe reporting way it's covered. In my newsletter, my podcast, I looked at an example,
from Quartz. And it was covered in a way clearly intended to imply Amazon laid off 16,000 people because
of AI. They're being replaced by AI. The subhead of the article was the CEO of Amazon
talking about how AI is going to increasingly disrupt the job force. And then in the article itself,
no alternative explanations are given for these layoffs. They kind of just give the details of like,
here's how many people are laid off and here's where they figured it out. And then they put a couple
quotes in there about AI being very disruptive and being able to automate jobs.
It turns out those layoffs had nothing to do with AI.
And you could find other reporting that focused, because it was reporting that was for
the financial market, so it was trying to focus on what the hell is actually happening.
And the deeper reporting was like, yeah, they laid off a bunch of managers because, like,
a lot of tech companies, they overhired during the pandemic, because cloud computing became
much in demand during the pandemic.
So a lot of tech companies overhired during the pandemic, and they're all shedding those
jobs again.
And Amazon's pretty ruthless about this, right?
They're always looking for excuses to fire people.
And they said, we have too much bureaucratic bloat.
There's too many managers.
We're going to fire a bunch of these managers we hired so that we can be more lean again.
That has nothing to do with AI.
In fact, this is the second or third round of these firings that they plan to do.
The first round occurring before ChatGPT was even released.
This has nothing to do with jobs being replaced by AI.
But then you can vibe report it because, like, well, technically speaking, Amazon also is investing money in AI products.
So technically speaking, money saved by firing these managers could be re-spent on AI.
So you can say with a semi-straight face, they fired people because of AI.
But clearly, you know the impression that you're giving to the reader is that they were replaced by AI.
And it had nothing to do with it.
And I heard, by the way, so I wrote about that, I put it in that video.
I've heard on background from multiple Amazon executives since.
They were like, we were completely baffled by this coverage that was implying that people were being fired here due to AI.
This is just what we do at Amazon. We're ruthless. Like, if you're not earning your keep, we fire you.
They were completely baffled by that coverage. And like, thanks for pointing it out.
Like, what are you talking about?
I got to be honest, Cal, if I got an email like that from an Amazon executive,
I'd tell them to go fuck themselves. And I mean this nicely, because Andy Jassy last year, June 17th,
2025, put out a whole thing about, today, in virtually every corner of the company, we're using
generative AI to make customers' lives better. I believe Amazon benefits from that obfuscation.
And I think they deliberately fuel it. Now, there may be executives
who disagree with this.
Well, these were lower level managers, right?
So not exactly, I shouldn't say executives, but like people who work there that were like, oh, yeah, no, no, no.
They're not firing people because of, like, AI.
They're just being brutal.
Right, right.
They're the people who tell the truth.
Yeah, exactly.
Exactly.
But no, you are right.
I think for the, there has been a lot of vibe reporting.
Basically, I've been covering this for the last two years, the entire sequence of job reductions that were post-pandemic corrections.
So the entire tech industry overhired during the pandemic, the entire tech industry cut jobs.
in the last few years because they hired too many people
and now they have to correct
back to where they were pre-pandemic.
Consistently across the board,
these cuts are vibe reported as due to AI.
Consistently, you see exactly that story.
I'm a computer scientist.
We see this in coverage of computer science majors as well.
Same idea.
Computer science majors historically directly tied to the tech industry.
If they're cutting in a down cycle, majors go down.
If they're hiring, I mean, it's just economic,
I mean, it's not surprising.
When the tech industry is booming,
we get a lot more majors because they're good jobs, right?
And so computer science majors went down in the last year or two as the tech companies started cutting.
That was reported in the Atlantic as kids are not majoring in AI because AI makes the superiors.
Majoring in comp sci, you mean?
Comp sci, yeah, because AI is going to do all these jobs.
That's because we've had these cycles every five years we have this cycle.
This is nothing, anyways.
No, no, I'm with you, and the Atlantic has just done a piss-poor job with
this stuff. I'm hating on them because they had an awful Claude Code thing.
They just, you know, I'm going to bring it up and read the title. I'm not going to say the
reporter's name because I think that that's, that's mean. But let's look at this.
Move over, ChatGPT. You're about to hear a lot more about Claude Code. And you know why you're
going to hear about that? Because the fucking Atlantic is writing about it. And it's just,
uh, over the holidays, Alex Lieberman had an idea. What if he could create Spotify Wrapped for his
text messages without writing a single line of code?
Lieberman, co-founder of the media outlet Morning Brew, created iMessage Wrapped. I just want to
start here and say, that guy doesn't do any fucking work, and I know people who work there.
Like, I just want to start by saying, if that's the best you've got, and that probably
involved throwing a giraffe and the entire zoo into a vat of acid to make the GPUs move,
you're meant to read this, and the deliberate effect you're meant to have is you're meant to be
scared and excited. Like, it is kind of a mishmash between the vibe reporting and the,
I guess it's the first. In fact, this might be a triple score. Because this is meant to make you feel discomfort, but also make you freak out, but also be excited. A rare triple score. It's just stories like these piss me off, because I'm fine with people going, this could do this. I don't mind. It happens. But when it's just like, feed me the slop right into my mouth and asshole, make me feel all the marketing things all at once.
First of all, I'm not impressed by the idea of doing iMessage Wrapped.
That's not a thing a human being with like friends and hobbies does.
I've been watching the first season of True Detective.
I've got many more shows I could watch.
Never once would I think, what if I could get a Wrapped of all my messages?
What a fucking psychotic thing to do.
But when you read this, it's like, you spent your holidays with your family?
wrote one tech policy expert.
That's nice.
I spent my holidays with Claude Code.
Well, it's fun.
It's fun to create these sort of demo apps.
If you're like engineering-minded, so in that sense,
the most cynical analysis here is that Claude Code is like model trains for engineering-minded people.
It's a fun hobby.
I can put, look at this.
Like, I made a thing that can, like I was talking to a friend of mine yesterday,
a computer scientist, and he's like, oh, I built
a thing where I can, you know, whatever, email
an appointment to
a thing that gets parsed by an LLM and then goes to my
calendar. That's just fun for him.
In the same way that someone else might be like,
hey, I built a cabinet and like it's really
nice. Like, look, I got the wood to go together
and like I'm proud. I could just have on a cabinet or whatever.
Except you build, except you built
something with your
hands. Well, sure. And a cabinet can have things
put in it. It's, I just, it
feels like Lego.
It's more like it's toy software that has some functionality.
Because the big thing I'm waiting for with the vibe coding stuff is an actual product.
You know what I mean?
Well, that doesn't go well.
I mean, look, this is the story of Moltbook, right?
Oh, yeah.
Tell me more about Moltbook.
We were just talking about this.
There's a whole can of worms to open there.
I'm just going to open like the top of the nearest can, which is before even getting
to what Moltbook is and is not, you know, it was in the news all the time.
It was vibe-coded and immediately was just full of terrible security holes because it was vibe-coded.
And it turned out that you could get the API keys.
So the key you use when you access the paid service to use an LLM, you have an API key so they know who to charge.
You could just steal everyone's API key who was using it because the guy just vibe-coded it.
And so no one was actually there looking at the code.
But what I think is going on.
Okay, so here's what I would like to see.
I would like to see more reporting that would be
how are people using X?
Like to me that's very interesting.
Yeah.
Right?
How are people using X?
Yeah.
And the problem is those answers right now,
and this is confounding, I think,
to people who are very excited
by the potential and coolness
of these things in isolation,
the answer to how are people using X
is often not nearly as exciting
as you would guess.
And I think the reality with,
I mean, I don't quite have my arms around Claude Code.
I do know there's a lot of people
who are building kind of like
internal tools or personal tools with it,
which I think is cool.
Most people aren't interested in that,
but for some people they are.
And, you know, I think it's fun.
Computer programming is fun, right?
So, like, the ability to make a program work is,
and some of those tools are useful,
but that's not a major industry on its own.
You know, I'm designing a tool for my small company
that makes it easier for us to do whatever.
That's cool, but that's not like a trillion-dollar industry.
I can't get my arms around yet
exactly how professional computer programmers are using it.
They're all talking about it.
Really?
Yeah.
But I can't get my arms around.
Yeah, what do you, what's, what's your sense of, I hear everything. I'm going to be honest, I was going to ask you. I was literally going to ask, have you heard people using it? Because if you go on X, the Everything App, if you put on a full hazmat suit, you go on X and you go and look, and the way that people talk about this is like they have connected into the Matrix, that they are now a 1000x engineer. But when you go and look at what they do, no one actually says.
There's always these kind of vibe stories, the vibe tales, the mythology where it's like,
I had a problem vaguely that would have taken me X number of hours, but when I used Claude Code,
it solved it immediately and caught two bugs that I didn't know about.
It's very Marine Todd style, then everybody clapped.
Yeah.
And it's, you're a computer science teacher, and you don't know either, which makes me think it's not as big a deal as people are saying.
The other bit of evidence is that at the end of last year, Anthropic's Claude Code revenue was 1.1 billion annualized, so about 90 to 100 million a month.
For a revolution, that feels low somehow.
Yeah, I mean, what I know as a computer scientist is you can't write performance-oriented code, you can't write safe code,
you can't write code
that has to sort of juggle
a sort of complex set of scenarios.
I mean, you just need
good programmers' eyes on it
building this code.
Is there a way of explaining
to a non-coder why that is?
Code is, there's like a poetic element to it.
You know, it's writing good computer code is difficult.
You're often, you're dealing with,
you know, what is my problem here?
And I want an elegant way of sort of satisfying this problem.
you're often drawing from pretty nuanced algorithms and data structures to try to figure out,
how am I going to organize information and efficiently access it?
When performance comes into play, there's a lot of really subtle decisions to make about, you know,
how am I going to store or use things in such a way that we don't get bogged down when we're trying to execute things?
It just becomes, I don't code as much anymore because I'm a theoretician, but I did my whole life since I was, you know, seven.
And it's a, it's an art form, right?
When you're using Claude Code, you're not really supposed to look
at the code. And so I think that takes a lot of uses probably off the table. It's like the use
case is you're supposed to have these different instances of Claude Code running, and this one's
going to write code, and then this one's going to write tests for that code, and then
Claude Code is going to run the tests and then try to fix the code if it doesn't match the tests,
but your eyes aren't on the code at all. And I, there's, I mean, obviously for a lot of
programming, that's an issue. But then the other thing I've heard about the computer programming
industry is it's very stratified. There's a smaller number of like really good serious
programmers that produce like 90% of the really important valuable code on which everything runs.
I don't think they would touch Claude Code with a 10-foot pole. Like, they're good at what they do.
And then you have these like huge strata of people writing like JavaScript and sort of hacking
together Python and, you know, it's like not very good code. And then it's, I guess you could
replace some of that with it. It's functional enough. It's functional enough. But I can't get my
arms around it yet. But I do get, you know, I have a lot of sources. So I hear from, you know, I do hear from
people that are talking about how cool Claude Code is. I do think it's cool. I hear from a lot of
professional programmers are like, yeah, we don't use, we can't use this. We're trying to write
serious programs. Like, we have to sell this software. Like, this is not solving a problem
we have. So, you know, I don't know, but the problem is that's what the reporting should be.
Hey, here we are at this company looking over the shoulder of people. What are they doing?
Let's talk to the engineering teams on background at this tech company about exactly what is going
on here. And I think what a lot of reporting has become on AI is hype laundering. So you look at
the discussion about the technology happening from more engineering-minded people. You convince yourself
as a reporter, I can't understand the engineering, but I will trust the people who do. And then
you launder what you're sensing from that hype into your articles, not realizing that like nerds like
me, we get hyped up about stuff. And we get super excited about stuff and we go create, like you can't
just launder our hype into this is what's happening. And so it's like,
reporting on a war where you have no one embedded. There's no one actually on the ground where the
battles are happening. You're just responding to the press conferences that the generals are holding
back in the Pentagon. It's not a way to report on what's actually going on. And I think the other thing
as well is there is a, if you don't do this hype laundering, I don't know how these, if you're
a reporter listening to this and you have a thought about this, send it to me anonymously. It's ezitron.76
on Signal. But,
my thought is as well is there's probably a problem with rationalization as well because if you look at
this and you say okay well it's kind of cool it's fun in whatever indeterminate way it doesn't seem like
serious software like actual real deal software is being made with it but then the CEO of Google
says 30% of code is written by AI, which is bullshit, and I've heard from so many software engineers
well it couldn't be that everyone's just wrong right it couldn't be a case of
that everyone is making the most egregious capital expenditure fuck up of all time.
This will be historic, I believe, worse than railways, digital beanie babies,
but done at the scale of laser tag arenas.
Now, it can't possibly be that because everyone else is saying this is exciting and good.
And at that point, they choose, instead of being worried, instead of being a bit anxious
about this, they say, well, Amazon Web Services spent a lot of money.
So this spends a lot of money too. So this is actually good. It's actually good. And indeed, these people seem emphatic and excited. And I as a non-coder can build a fudged CRM that probably would not withstand even the laziest hacker. I can do this. And thus it will extrapolate further from there. And what sucks is that the people that I believe actually will be hurt by this are retail investors. I think regular people buying stocks in these companies
or selling software stocks because they believe that code will replace them.
And ultimately, I think it's just going to be a bloodbath for people's 401(k)s that could have
been avoided, except it would require reporters to do something uncomfortable.
And I don't think they want to do that ever.
I think there's two things going on is what I've decided with reporting.
First of all, it's asymmetric risk.
So a lot of reporters are like, look, there's not a major risk if I'm excited about this
and it doesn't pan out.
Because we could be like, yeah, surprisingly, this didn't pan out, and with some factors we couldn't see.
But they do feel like there's a huge amount of risk of saying, this is not a big deal, and then it is.
And so it's definitely an asymmetric risk.
We saw a lot of this during, like, COVID as well, right?
It was less, there would be less harm if you were too alarmist about something.
But there could be a lot of harm,
they felt, like reputational harm, if you're like, this is not a big deal, and it was.
And so there's definitely an asymmetric risk profile.
There's also like a meaning defining profile.
It's just really exciting to think everything is going to be disrupted and change unrecognizably.
It sort of gives a focus and meaning to like an otherwise somewhat chaotic and disrupted world that we're in right now.
And so there's that aspect too is that people want to believe there is something massive about to happen.
Because in some sense it wipes away all the like bad stuff that's happening.
Who cares?
none of this is relevant because this much bigger thing is coming.
So I think that gets wrapped into it as well.
I think the economic reporters are more on this because like their whole job is to try to,
I mean, they're not on it.
I think after your work, they're on it more.
But like,
I don't know if I agree.
I,
I am reading big series.
There's an article in Bloomberg.
I'm trying to shove through archive.is because they,
they were so mean and unfair to my friend Steve Burke at Gamers Nexus.
I won't pay them.
But it's more shit about, like, the software
narrative. The fact that software is being disrupted by Claude Code, you get the same pallid
reporting from Bloomberg and even the Financial Times about Anthropic. And the FT's generally pretty
good. Well, like, yeah, they're just going to make $30 billion next year. It's just fanciful.
The skepticism, the cynicism doesn't exist. And I get, I agree on the risk.
Well, there was bubble reporting. Last fall, there was a period where everyone did the bubble
reporting. After GPT-5, you had a two-month period where every major publication did bubble
reporting. But yeah, I guess it did kind of die off. It died off because they hear one nice
thing from Jensen Huang and they're like, well, I'm sold. Like that jacket, that jacket's
looking pretty sharp. I mean, someone who wears that jacket, how could they be wrong? Would a man
that has a shiny jacket like the lead singer of Korn wears at concerts, would he lie? And it's
just what really bothers me as far as the economic reporting though is Oracle, because it's like
Oracle needs to be paid $300 billion over five years by OpenAI, a company that, if we're to believe
reporting, which I do not, they made $13 billion last year in revenue and lost, sorry, they've made
$13 billion and lost, they claim nine, I think it's probably higher. How are they meant to pay $30 to $60 billion
a year? And everyone's like, well, they'll work it out. How's Oracle meant to build those
data centers? Those data centers are going to cost $189 billion. They've raised $100 billion.
Okay, how are they meant to do that? And everyone's just like, oh, they'll work it out. I wish I could do
this with the fucking bank. I need a $500 million house. I'll pay you for it at some point,
all right? It's fine. And the news would run articles about my genius housing purchases. It's just
It's one of those things where, and you said, well, the asymmetric risk doesn't exist.
It does for some of these reporters, because I've been saving their bylines for years, because I actually think that there needs to be some sort of reckoning with this, because if you look back, there are major financial outlets that did the same thing, who were literally propping up Sam Bankman-Fried two weeks before FTX collapsed.
Yeah.
Who then went on immediately to cover AI.
Yeah.
Fucking.
And now they're peddling bullshit for Anthropic.
And sorry, I'm kind of hating, well, I guess it's Hater Season so I can.
It just frustrates me because regular people are being scared.
They're being scared by the kind of astonishment, which I actually love, the
astonishment reporting of like, oh, well, OpenClaw has proven, OpenClaw has proven
that AI is here or they built their own social networks so we should be scared.
Software is dead.
Insane.
Yeah.
Yeah.
Singularity is here.
And it's like,
I assume you saw that, for the listeners, by the way, OpenClaw had this, this fucking Clawdbot, whatever it is.
They had their own social network where the quote unquote LLMs would post.
But it turns out that most of those posts were just made by their owners.
Yeah.
And this is a good case study, right?
This one bothered me because like writer friends I know who are not technology related were texting this to me.
They were worried, right?
They were texting me these articles.
Like this seems bad.
Like this seems like something really bad.
They were really getting the digital ick really strongly off of these articles.
And I would start reading these articles and I say, well, there's no discussion of what is the technical breakthrough here and what are the concrete implications?
Because it turns out there was zero technical breakthroughs.
There is no new AI technology connected to OpenClaw, which used to be called whatever, Moltbot and Clawdbot or whatever.
OpenClaw is, I think, where they ended up, right, which is an open source library or framework for building AI agents powered by LLMs.
There's no new AI technology involved in this at all.
The agents you build are just accessing off-the-shelf LLMs that we're all using for chatbots anyways.
You can aim it at whatever commercial chat bot you want.
There's no new framework for how the agents work.
It's the same sort of ReAct loop that we've been trying for the last two or three years where you just have a program.
It's like a bit of Python code that asks an LLM,
hey, here's what I want to do.
Here's the tools you have available.
Come up with a plan.
And then it sends it back a plan.
And then the, the Python code takes the first step out of that text and says, okay, here's
the tools available.
What should I do to execute this step?
And then like the LLM will give you some steps.
And then the Python code runs those steps.
And then you just go back and forth, right?
That's like this basic ReAct.
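The plan-then-execute loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fake_llm` is a hypothetical stand-in that returns canned text instead of calling a real model, and the tool names and the "tool: argument" reply format are invented for the example. It is not OpenClaw's code.

```python
from typing import Callable, Dict, List

def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-model API call; returns canned text."""
    if prompt.startswith("PLAN"):
        return "look up calendar\nsend email"
    if "look up calendar" in prompt:
        return "calendar: friday"
    return "email: Are you free Friday?"

def run_agent(goal: str, llm: Callable[[str], str],
              tools: Dict[str, Callable[[str], str]]) -> List[str]:
    """Ask the model for a plan, then ask it how to execute each step."""
    tool_names = ", ".join(sorted(tools))
    plan = llm(f"PLAN goal={goal} tools={tool_names}").splitlines()
    log = []
    for step in plan:
        # The model replies with plain text in the form "<tool>: <argument>".
        reply = llm(f"ACT step={step} tools={tool_names}")
        name, _, arg = reply.partition(":")
        tool = tools.get(name.strip())
        if tool is None:
            log.append(f"skipped: {reply}")  # the model named a tool we do not have
            continue
        log.append(tool(arg.strip()))  # ordinary code does the actual work
    return log

# Hypothetical tools; real ones would touch a calendar or a mail server.
tools = {
    "calendar": lambda arg: f"checked calendar for {arg}",
    "email": lambda arg: f"sent email: {arg}",
}
result = run_agent("book a guest", fake_llm, tools)
print(result)
```

Everything that looks like agency lives in ordinary control flow. The model only ever returns text, and the program decides what to do with it.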
Which is exactly how Manus worked as well.
Everything.
There's nothing new about this.
Manus was this AI agent that Facebook,
Meta might be acquiring,
they literally,
when you use it,
it just writes Python code
for every step.
Well, yeah.
So this is,
I mean,
there's nothing new here
technologically.
The only thing that was new
about OpenClaw
is it was open source,
so it made it easy
for anyone to write bad agents.
And the other thing
that made it interesting
to wireheads in San Francisco
is that the commercial products
where people are trying
to build these at companies,
they have common sense security.
Like, well, probably,
if it's just a Python program
blindly doing what an LLM
tells it to do, like, it probably shouldn't have access to, like, your credit cards or to your
hard drive or whatever. With OpenClaw, like, you can just, you can give it access to anything
you want on your computer. And so you could build really cool demos that are also, like,
incredibly insecure and unsafe. And so it was fun for hobbyists, but there was no new technology
there. Nothing was new. It was just a way for other people, like, hobbyists, to build their own
agents that were, like, less safe than the more carefully built ones. Like, what people don't, here's
a story people don't know: in the immediate
aftermath of ChatGPT. So we're in early
2023, right? So ChatGPT
has come out. OpenAI goes on a road
tour of major publications, right?
And they're trying to say, hey, here's what's
going on. You need to write about this.
In early 2023, they're like, the next thing
we're going to offer is something like these agents.
They call them plugins. And you can
install plugins that basically can
do actions on
behalf of your LLM queries.
You could have like a book an airline flight
plugin and say, hey, ChatGPT,
book me a flight, and the
language model can't do anything but produce text, but then the
plugin could take the text and go and
book you a flight. And they're like, this is the future, obviously.
And that project disappeared because,
oh, it's incredibly unsafe
and unreliable to have
code that can interact with the real world
that's following commands from
an unreliable, hallucinating LLM. It's like
that went away, not because the
technology was hard, but because it's not safe.
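To make the safety point concrete, here is a minimal sketch of what sits between model text and a real-world action. All of it is assumed for illustration: the JSON request shape, the field names, and the `confirm` callback are hypothetical, and this is not how OpenAI's plugins were implemented.

```python
import json
from typing import Callable

REQUIRED_FIELDS = {"origin", "destination", "date"}

def parse_booking(model_text: str) -> dict:
    """Parse model output into a booking request; reject anything malformed."""
    try:
        data = json.loads(model_text)
    except json.JSONDecodeError as exc:
        raise ValueError("model output was not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("model output was not a JSON object")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data

def book_flight(model_text: str, confirm: Callable[[dict], bool]) -> str:
    """Only act after validation and an explicit human confirmation."""
    request = parse_booking(model_text)
    if not confirm(request):
        return "cancelled"
    # A real plugin would call an airline API here; this sketch only reports.
    return f"booked {request['origin']} to {request['destination']} on {request['date']}"

good = '{"origin": "DCA", "destination": "SFO", "date": "2023-03-01"}'
booked = book_flight(good, confirm=lambda req: True)
print(booked)
```

Validation catches malformed output, but it cannot catch a well-formed request for the wrong flight. That is why the sketch also asks a human to confirm, and why fully hands-off versions are risky.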
So there's nothing new.
My whole article on agents, they tried this
all of 2025, and were
failing to get this type of agent to work for anything outside of basically like producing computer
code. So OpenClaw is nothing new. It was just a way for other people to build these things.
The only interesting stories were security hole stories. But it was reported, man, I was listening
to the All In podcast. And they were, they said, this is the future of AI. Like, this is it.
Everyone is going to have an OpenClaw agent as like a personal assistant. And this is like the biggest thing in OpenAI.
And I don't know who it was. One of those guys was like, you know,
Yeah, we replaced our podcast producer, you know, with this agent who can, like, email guests on our behalf and book it on our calendar.
And, well, it's costing about $1,000 a day right now to run it.
But, like, I don't know why we're going to need employees in the future.
Hell yeah, brother.
Yeah.
So anyways, like, it's a non.
If you said, what's the technical implications?
None.
If you say, what are the concrete implications, the best story they can have is, like, maybe when people, I don't know what it is.
All the companies have already been trying to build these things for years.
So I don't know.
But it was reported like just something icky is happening.
And then Moltbook was an application that was built for these agents to communicate on like a Reddit-style social network.
They vibe-coded that framework and it was full of security holes, as we mentioned before.
And there was like a small number of users that created a huge number of agents.
And they were just kind of like prompting and prodding their agents to produce like, let's talk about creating our own church or killing humanity.
It's like, guys, this, you have Python code asking an LLM
to write text in the style of like The Matrix and you're posting it on a fake social network.
The real story here should be where are they, don't these people have things to do?
Don't they have jobs?
Like what is this?
Yeah, yeah, yeah.
It really, it really, my model train.
I added a tunnel.
So that one really got to me.
I actually will push back on that.
Model train listeners, I think, make up 20% of the listenership of this podcast.
Of course.
Also, model trains are cool, signaling to my fellow artists.
out there. But, no, model trains are a physical thing, which you build it, you build a little
city. I think they're delightful. Respect to those. With this, it's like, if they, and you know what,
if they were just saying, hey, I've been fucking around with some software and I did something cool,
I'd respect the shit out of them. I'd be like, yeah, enjoy yourself. It's going away. This is all
subsidized rates, but have fun with that. Don't know why you need a Mac studio, but good luck.
No, they are like, this is the future. But I'm going to be honest, Cal, my real question is,
what does OpenClaw actually do?
Because when I went and read what it actually,
I read so many posts,
you said it books,
calendars,
and does this.
I could find no proof
that anyone successfully did that.
And it doesn't.
It's just a series of library calls
that you can use to build your own program
in theory that would do that.
So OpenClaw itself is just like
a series of like interfaces and hooks
that makes it easy to write a program
that talks to other services,
and makes calls to LLMs.
So it's just like a wrapper
in which you can write your own code.
And some people are trying.
Yeah, like you can write, you can have it talk to your calendar.
You can have it talk to your email in theory.
But does it work?
But does it work?
Well, it's just asking an LLM.
You can simulate.
I mean, I did this for my New Yorker piece.
I was like, look, I can just simulate being an agent.
Just ask an LLM.
Here's what I'm trying to do.
Give me like the steps for doing this or whatever.
And like anything else you ask of an LLM,
it will give you answers that sound very reasonable in general.
and then have like a lot of issues in the details,
which is why agents based on LLMs have struggled
because if you, it's fine if you as a human
are talking to an LLM because you don't realize
how much filtering and tweaking you do.
You're like, well, let's kind of ignore that
and this is good or let me ask you to redo it.
It's really an issue if you write a program
that just says, I will do whatever,
I'll ask the LLM for a plan and then just do whatever it says.
Because it doesn't know, that doesn't really make sense,
or like this sounds generally, generally reasonable,
but with like issues in the details.
Does it work well when you execute things?
I mean, in a practical sense, they said on the All-In podcast, three dumb bitches saying,
exactly, that's a meme reference.
Don't use that word usually.
Those people sitting around going, blah, blah, blah, we've made it and replaced our podcast
producer with an LLM that can make appointments and send emails.
In practice, is that true?
I severely doubt that.
I just, I, also, you're spending $1,000 a day.
you're paying $365,000 a year for this, right?
Yeah.
Are you really or are you just, have you, is it complete your talk?
It probably isn't.
And that's the thing.
I don't, I get why the All-In guys do it because they probably have investments and they're boosters.
TBPN, same fucking deal.
It really rules that the two largest, like, tech things in the valley are just state media,
but for Silicon Valley.
what gets me is when you get like the Atlantic, CNBC,
Business Insider and places like that doing stuff like this.
And it bothers me because they don't even need to hate on it.
They could just say, yeah, it could do this.
It's pretty cool, right?
Yeah.
But I guess that doesn't get the clicks.
It's not an interesting story.
The real story is not always that interesting.
It's like, look, hobbyists are building these tools that kind of do cool things,
but make mistakes and it's a little expensive.
Like, the most interesting story out of OpenClaw is,
the only thing I think is going to be impactful out of it
is because it was so expensive,
right, like that $1,000 a day,
what it's forcing these hobbyists to do
is to turn to like cheaper alternatives for models.
And that I do think is significant.
This idea that there's open source models out there,
they're significantly cheaper than trying to use like OpenAI or Claude.
More and more people are running their own local models
because it turns out for, you know,
most specific uses you might use an LLM.
You don't need like a super,
a super fine-tuned trillion parameter beast of a model running in some data center somewhere.
It's like, you know what?
I'm parsing my email to try to extract like suggested times.
I'm fine with like a 20 billion parameter model that can easily fit in a single GPU
on a thing I have, you know, we share in our office or something like that.
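As a rough check on the single-GPU claim, the arithmetic below estimates memory for the weights alone. The precisions shown (16-bit and 4-bit) are assumptions chosen for illustration, and the estimate ignores activations and the attention cache, so real requirements are somewhat higher.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights only, in gigabytes (10**9 bytes)."""
    bytes_per_weight = bits_per_weight / 8
    return params_billion * bytes_per_weight

# A 20-billion-parameter model at two common precisions (assumed for illustration).
small_fp16 = weight_memory_gb(20, 16)
small_4bit = weight_memory_gb(20, 4)
# A hypothetical trillion-parameter model at 16-bit precision, for contrast.
large_fp16 = weight_memory_gb(1000, 16)

print(small_fp16, small_4bit, large_fp16)
```

Under these assumptions, the 20-billion-parameter model needs on the order of tens of gigabytes, while the trillion-parameter one needs thousands and has to be spread across many accelerators.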
So to me, that's the most interesting story out of OpenClaw is the bad news it could be
for the big companies.
As people get more comfortable with, we don't need these Formula One car versions of
language models for the stuff we're actually doing, we're fine with the Ford Focus, right?
And I think that's a transition that's going to, that's a transition that to me is more
that's what we should be writing about.
Like, that's an interesting idea to me is that you have these huge high valuation companies,
but you also have all these open source models, like the weights are just out there in the
public domain that can do 98% of what people care about.
And you're beginning to get low-cost competitive services where people can just spin those up
in cheaper data centers.
That's interesting to me.
That's an economic story.
AI creating its own church is not as relevant to me.
I'm already working on a story this Friday, in fact,
on my premium newsletter about the fact that actually the margins of serving GPU compute
are dog shit.
Like the best of the best of like a co-location place, like Applied Digital,
it's like 27% gross margins.
And that's at 100% utilization.
Anything below 0.7, they're burning cash.
I hear there's a data center in North Dakota losing a million dollars a day.
That's the thing.
this is even with these lower cost models.
On device could be interesting.
That's what's going to happen, I think.
I think on device is what's going to happen.
I just remembered something.
This is a classic bullshit story that I see every so often.
So have you read any of the stories that are like, yeah,
Claude can now work for hours uninterrupted?
Have you read about these?
You heard this?
You've seen this?
Work for hours?
Oh, yeah, yeah.
Yeah, yeah.
We're talking about the multi-step agent execution.
Well, no, it keeps coming back.
Yeah.
Yeah, on CNBC: AI on the verge of eight-hour job shift without burnout or break. Is 24-hour AI workday next?
Going to just censor myself on what I was going to say there. Because it's just like, yeah, it can work for hours. Is the output good? Does it? Yeah, it's just a loop. Does it? Does it just keep, call it, recall it, recall it. Like, I can, I can write a Python program that calls an LLM for 24 hours.
If you need me to burn something for hours on end, just give me some gasoline.
I got you, baby.
I could sort this right out.
It'd be much cheaper.
But it's like, yeah, by September 2025, Claude Sonnet 4.5 was reported to run autonomously
for up to 30 hours, reported with, I mean, but that's more vibe reporting.
It's you're meant to read that and go, wow, this thing is performing tasks that are useful,
that execute code, that do something.
And in that 30 hours, that is equivalent to 30
human working hours versus they sat there and pissed their pants for like 29.5 of those at least.
And also that those are like specialized tasks typically that have clear milestones and testing.
So it's like it can do something.
The loop can try and try until it passes the test.
And then you know that's done.
And then you can move on to the next step.
And then you can keep calling the LLM and executing until it passes the next test.
And you can move on.
And in theory, like it's, yeah, they're avoiding error.
cascade because it's testable and they can keep retrying or something like that. I mean, this was
like, METR had this problem with their graph of how long of tasks AI can now do. And it was
like this sort of like super arbitrary decision of like, oh, here's a task that takes a human five
hours. Now AI can do that. Here's a task. Whereas it's really more about like how many things in a row
can you do without the errors cascading out of control or something like this. Right. It's very different.
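The retry-until-the-test-passes pattern described above can be sketched as follows. The milestone names, the `flaky_generate` function, and the attempt limit are hypothetical, and `generate` stands in for a call to an LLM.

```python
from typing import Callable, List, Tuple

def run_milestones(milestones: List[Tuple[str, Callable[[str], bool]]],
                   generate: Callable[[str, int], str],
                   max_attempts: int = 5) -> List[int]:
    """For each milestone, regenerate until its test passes.

    Returns how many attempts each milestone needed. Raises if a milestone
    never passes, because later steps would build on a broken result.
    """
    attempts_used = []
    for name, test in milestones:
        for attempt in range(1, max_attempts + 1):
            candidate = generate(name, attempt)
            if test(candidate):
                attempts_used.append(attempt)
                break
        else:
            raise RuntimeError(f"{name} never passed after {max_attempts} attempts")
    return attempts_used

def flaky_generate(name: str, attempt: int) -> str:
    """Hypothetical generator that only gets it right on the third try."""
    return f"{name}-ok" if attempt >= 3 else f"{name}-broken"

def passes(output: str) -> bool:
    """Hypothetical test for a milestone."""
    return output.endswith("-ok")

attempts = run_milestones([("parse", passes), ("save", passes)], flaky_generate)
total_calls = sum(attempts)
print(attempts, total_calls)
```

In this toy run, two milestones take six model calls, four of them failures. A headline about hours of autonomous work counts all of that time, and the test at each step is what keeps errors from cascading.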
You're being way nicer than you should be, in my opinion, just because they don't
even explain. They don't even say like, yeah, and we managed to make it do it. And this, they just go,
that's right, hours. CNBC just fucking, just like, woo, just tearing their shirts off and
screaming. But it goes back to the two things we need in this reporting. You need the details of
the relevant technical innovation and a discussion of concrete implications, the near-future implications
of that breakthrough. That's what I'm always looking for. So if you want to cover a story like that,
like, well, what does that mean? What is the technical breakthrough, right? What does this mean?
Technically it can do 30 hours of work or eight hours. What is the work? What was changed? What did
they figure out? What was happening before? What technical breakthrough made this possible now? And then what are
the concrete implications? What specific things will this now allow us to do? Tell us what
jobs are going away. Tell us what tool we're going to see. Like you have to. But we avoid that,
because then you're putting your chips down. And then those things don't come along. And so I'm always
looking for that. What's the technical innovation and what are the concrete implications? If you don't
have that, you're mining emotions.
you're mining emotions or just helping boost stock.
That's what really bothers me because I hate that they're scaring people.
I really, really hate that they're doing marketing.
Like they're just doing, like, I run a PR firm.
I've dealt with early-stage startups for like 15 years.
There isn't a single early-stage startup that would get a percentage point of this bullshit.
Like you email a reporter about like a Series A startup and they're
like, all right, all right, motherfucker.
How much revenue?
Are they profitable yet?
Why not?
Why not?
Explain to me right now.
Anthropic's like,
oh, we're going to burn $100 billion on training, I guess.
What do you think?
And they're like, yeah, I love it.
I actually think that's the future.
That's great.
I'm not, and to be clear,
I think that this scrutiny should be from the beginning to the end.
I think that everyone should face this scrutiny.
Yeah.
I'm not saying that I should get an easier.
I'm saying actually everyone should.
Anthropic should face
the most brutal scrutiny.
And I guess that they don't because they want access.
It's just very dull and annoying.
And it's just helping already rich guys like Dario Amodei,
who should not have more money.
Listen to him speak.
He needs to face some stress.
I think some stress would help him grow.
But, Cal, as we wrap up, I did actually have like a technical thing that was an idea I've
been percolating.
So I think that this term training with models is being misused and used in a way that is kind of vibe reporting style, which is they use training as a word that suggests that it will stop, that they will stop training these models.
But correct me if I'm wrong, training is everything from building a new model to updating a model's current parameters, correct?
Yeah, there's pre-training and post-training is the right way to think about it.
So the pre-training is unsupervised. That's where you take all the text that's ever been written.
and you will take a real piece of text written by a human,
and you'll cut it off at an arbitrary point,
and you'll tell the language model,
you guess the word that came next.
There's a real word that came next.
This is real writing.
Guess the word that came next,
and then it guesses,
and then you adjust the weights so it gets closer to the right answer.
That's pre-training.
And when you adjust the weights, what are you doing?
You're running a training algorithm called backpropagation.
This is a Geoff Hinton innovation,
where you're going through and you're adjusting the weights
all the way through the layers in such a way that
the answer it gives for this particular test
gets a little bit closer to the right answer.
And that's pre-training.
That's pre-trained.
It's unsupervised.
So you take Hamlet or you take Dickens.
It's like, it was the best of times,
it was the worst of.
And then you give that to the thing,
what word should come next?
And, you know, it says bacon.
Like, we're going to adjust these weights now in a way
that like your answer gets a little bit closer towards times.
Right.
Okay, that's pre-training.
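The guess-the-next-word update can be illustrated with a toy example. This is a deliberately simplified sketch: a three-word vocabulary, one set of scores for a single context, and plain gradient descent on the cross-entropy loss. A real model has billions of weights, and backpropagation carries the same error signal back through every layer. The vocabulary, starting scores, and learning rate are all made up.

```python
import math
from typing import List

def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(logits: List[float], target: int, lr: float) -> List[float]:
    """One gradient step on cross-entropy loss for a single example."""
    probs = softmax(logits)
    # Gradient of the loss with respect to each score: probability minus
    # 1 for the true next word, probability minus 0 for every other word.
    grads = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    return [x - lr * g for x, g in zip(logits, grads)]

vocab = ["times", "bacon", "winter"]
logits = [0.0, 1.0, 0.0]        # the untrained scores currently favor "bacon"
target = vocab.index("times")   # the word Dickens actually wrote

before = softmax(logits)[target]
for _ in range(20):
    logits = train_step(logits, target, lr=0.5)
after = softmax(logits)[target]
best_guess = vocab[logits.index(max(logits))]
print(before, after, best_guess)
```

Each step nudges the score for the true word up and the others down, which is the "a little bit closer" adjustment from the explanation, repeated over an enormous amount of text.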
Then you get post-training where you already have trained.
So you have this network,
all the weights have been set through this massive,
multi-month, you know, billion-dollar pre-training.
And now you want to go through and you want to tweak this to avoid certain types of
behaviors or to influence towards certain types of behaviors.
So for post-training, it's almost always based off of inputs like prompts and
correct answers.
This is the right way to respond to this question, right?
So you have pairs of questions and answers.
You give the prompt to the LLM.
It spits out some answer.
And now what you're doing, you're using, it's called reinforcement learning.
Reinforcement learning is a general technology, but you're using techniques from reinforcement learning
to sort of like zap it, like you would zap a dog when you're dog training it.
If it's a bad answer, you zap it, so you get those weights away from that answer.
And if it's a good answer, you give it a treat.
And this is post-training.
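The zap-and-treat idea can be sketched with a toy policy-gradient update. Everything here is assumed for illustration: two canned answers, made-up rewards of +1 and -1, and an exact expected-gradient update on a softmax over the answers. Production post-training uses more elaborate methods on full models, so treat this as the general shape of the idea only.

```python
import math
from typing import List

def softmax(logits: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(logits: List[float], rewards: List[float], lr: float) -> List[float]:
    """Raise the scores of answers rewarded above average and lower the rest."""
    probs = softmax(logits)
    baseline = sum(p * r for p, r in zip(probs, rewards))  # expected reward
    # Exact policy gradient for a softmax policy: p_i * (r_i - baseline).
    return [x + lr * p * (r - baseline)
            for x, p, r in zip(logits, probs, rewards)]

answers = ["refuse politely", "explain how to build a bomb"]
rewards = [1.0, -1.0]   # treat for the first answer, zap for the second
logits = [0.0, 2.0]     # the pre-trained scores start out favoring the bad answer

p_bad_before = softmax(logits)[1]
for _ in range(50):
    logits = reinforce_step(logits, rewards, lr=1.0)
p_bad_after = softmax(logits)[1]
print(p_bad_before, p_bad_after)
```

The update only shifts probability between behaviors the model could already produce, which fits the point that post-training steers a pre-trained network but does not give it new general ability.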
And so that's where you've moved past the word-guessing game, which is where all of the sort of
general smarts comes from these models.
And now you're doing this sort of zapping and treat training around very specific things, right?
This is where...
But they do that. Keep going.
Yeah.
So, like, so you'll go through and, like, ask it questions where the answers might be, like,
about building bombs or whatever.
And every time it spits out an answer about a bomb, you give it, like, a really bad,
negative shock.
And you know, like, definitely turn off those circuits.
Like, we don't want you to spit out answers about bombs.
Like, that's where all the guardrails come from.
Or if you want it to get better at doing, like, a particular math exam.
You can give it, like, lots of questions from that math exam.
And then you have the right answer, and you can kind of zap it to move it towards
what the answers look like on this math exam.
So post-training is more focused.
You have particular types of behavior
you're trying to sort of instill
in this already pre-trained massive network.
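The "zap" and "treat" idea above can be sketched as a reward-weighted update. This is a minimal sketch in the spirit of a policy-gradient (REINFORCE-style) step, not a description of how any particular lab does post-training. The two canned responses, the starting scores, and the `reinforce_update` helper are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_update(scores, chosen, reward, lr=0.5):
    """Reward-weighted step on the log-probability of the chosen response.

    A negative reward (a "zap") pushes probability away from the chosen
    response; a positive reward (a "treat") pulls probability toward it.
    """
    probs = softmax(scores)
    return [
        s + lr * reward * ((1.0 if i == chosen else 0.0) - p)
        for i, (s, p) in enumerate(zip(scores, probs))
    ]

# Hypothetical: two possible responses to a dangerous prompt.
responses = ["bomb instructions", "polite refusal"]
scores = [1.0, 0.0]               # assumed: the pre-trained model leans toward answering

p_bomb_before = softmax(scores)[0]
scores = reinforce_update(scores, chosen=0, reward=-1.0)   # zap the bad answer
scores = reinforce_update(scores, chosen=1, reward=+1.0)   # treat the good answer
p_bomb_after = softmax(scores)[0]
print(f"P(bomb answer) before: {p_bomb_before:.3f}, after: {p_bomb_after:.3f}")
```

The same mechanism works for a math exam: reward the responses that match the known right answers and the model drifts toward what those answers look like.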
That's mainly where the focus turned after GPT-4.
So GPT-4 was like the extent of pre-training
making it smarter.
After GPT-4, trying to make those models even bigger
and pre-train them longer
didn't lead to much performance increase.
So everything we got between four and the lead-up to five,
was post-training.
And that's when they began focusing on metrics.
Because if I have a particular metric,
I can post-train a model to do well on that test.
And so everything became about metrics and post-training.
Now we can do this thing better,
or look at this thing we do better.
And so that's kind of the game that's played now
is we do lots of post-training.
That requires much more specific data
because you need like right answers,
pairs of prompts and correct answers.
Right.
So only certain things we can do this with.
But that's the game we've been playing since like 2024.
Do they do that with models that exist now?
So this is basically updates, right?
Yeah, they do it on a semi-regular basis, yeah.
But usually there'll be a name change.
Like GPT-5.2 is different than 5.1, different than 5.
But don't they update the current models?
They don't re-pre-train them.
That's too expensive.
No, no, I'm not saying that.
I'm saying, do they post-train the current models to make them better at stuff?
Yeah.
To tweak things.
Yeah.
So that's, this is a very long way to get to a point
I'm kind of making, which is one of the problems with vibe reporting on this is training is framed
as this thing, like, inference is framed as OPEX that is permanent, that you cannot avoid, inference being
creating the outputs. Training is framed as this R&D mysticism, which is just out there. And you know,
they never say this, but you hear training, you think, oh, you train and then you perform.
And so training would end. But from what I understand, training is as common and
necessary an expense as inference at this point.
Yeah, it's the only way you improve or update things, right?
So if you had a Microsoft 365 software,
you're constantly sending updates and patches and whatever
as you like add new features or whatever.
In the AI model world, it's, yeah, it's post-training.
It's the only way to make any sort of improvement or fixing bugs.
Like you're like, oh, here's something it's saying that we're really upset about,
okay, let's go in and do some zapping.
Let's get out the zapper.
Give it a bunch of examples of saying that bad thing.
and let's zap and say, don't do that.
And now it's like very unlikely to do that.
So yeah, they're constantly,
they're constantly doing that.
Otherwise, you're in stasis.
Which is different than the way most people actually think of it.
They think that the model is somehow like learning online.
Which is absolutely not.
It's absolutely not true.
That as you talk to it, it's learning and it's getting better.
And they're like, but wait a second, it remembered something I talked about earlier.
It must be getting smarter.
The model doesn't care about you.
It's just your local software,
you don't realize this.
It's including, like, huge bits of stuff you've talked to it about before in the prompt that's going to the model.
It's like, here's a bunch of stuff that Ed has submitted to you in the past.
Okay, now here's his current question.
So the model hasn't changed.
The model doesn't have memory.
It can adjust its, it doesn't adjust its memory in real time.
It's all static.
There is no dynamic memory involved in these models.
They're fixed until you go out and post-train it, and then you bring the new weights back into the data center, and that's the thing doing inference now.
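The "memory" trick described above, where the software around the model stuffs earlier messages into the prompt, can be sketched like this. Everything here is hypothetical: `frozen_model` is a stand-in for a model with fixed weights, and `ChatClient` is an invented wrapper, not any vendor's actual client.

```python
from dataclasses import dataclass, field

def frozen_model(prompt: str) -> str:
    """Stand-in for a model with fixed weights.

    It is a pure function of the prompt: it keeps no state between
    calls, so the same prompt always gives the same reply.
    """
    return f"[reply based on {len(prompt)} characters of prompt]"

@dataclass
class ChatClient:
    """Hypothetical wrapper that fakes memory by resending past messages."""
    history: list = field(default_factory=list)

    def build_prompt(self, question: str) -> str:
        past = "\n".join(self.history)
        return (
            "Here is a bunch of stuff this user has submitted in the past:\n"
            f"{past}\n\n"
            f"Now here is the current question:\n{question}"
        )

    def ask(self, question: str) -> str:
        reply = frozen_model(self.build_prompt(question))
        self.history.append(question)   # the memory lives here, not in the model
        return reply

client = ChatClient()
client.ask("My name is Ed.")
prompt = client.build_prompt("What is my name?")
print(prompt)
```

In this sketch the model never changes between calls. The only thing that grows is the history list held by the client, which is why the reply can appear to "remember" an earlier message.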
But that's the thing.
Like, the reason I bring it up as the kind of closing vibe story is because training is very clearly getting more expensive. They say they're profitable on inference, which again doesn't really make sense to me, but putting that aside, they're not. But even if they were, if training never stops, then who cares about being profitable on inference? Like, it just means that you will get more expensive forever.
inference is just, yeah.
Inference is P,
training is poo.
I'm not putting that one in an article.
Yeah, okay, it's an interesting question, right?
It's also getting like the, was it Altman who was making those comments about,
well, if we just didn't have to pay for the training, this would be profitable.
If I didn't have all these expenses, I'd be so profitable.
The one distinction that's maybe relevant there, not to be an apologist,
but the one distinction that's relevant is pre-training versus post.
So pre-training is insanely expensive, right?
Because you're training something on all the words in the world.
And it takes months.
So it's just like you're running a data center that's going to have nowadays up to six digits of GPUs running full-time for months just to get that pre-training done.
And you have to pay for all of that.
Right.
So that's all time.
You're not getting money.
You're just paying for training.
Post-training is also expensive.
It's not as expensive as pre-training, though,
because each post-training session,
it's a way, way, way, way smaller data set
that you're post-training it on.
It's like, all right,
we generated like 10,000 examples
of people, you know,
responding to questions in a racist way.
And those 10,000 examples
we'll use to reinforcement learn
and try to move it away
from answering those type of questions
in a racist way.
That's like not that big of a data set
compared to we're going to train this
on every word written
that we have access to.
So the post-training is not as expensive
as pre-training.
Unless you're doing it all the time,
Yeah, that's true.
Like, that's the thing.
If you're doing it all the time and it takes months of pre-training,
but you're constantly doing something like post-training for months.
Yeah, I mean, is post-training not functionally the same thing?
It's the same thing.
I think this is a fair point.
When you have a particular example that you're post-training on, like one prompt with an answer,
it's kind of like you're doing inference in reverse, right?
So you're going from, you're back propagating from one side of the network to the beginning
as opposed to going from the beginning to the end.
Now, it's more expensive than that because when you're
just doing inference, the fundamental
operation is
basic multiplication. You're just multiplying numbers
in the big table. Back propagation,
which you use for training, it
is multiplication, then you do a bunch of, it's
a bunch of derivatives because you're constantly,
you need to calculate like the derivative
of these. I mean,
like it doesn't really matter, but you kind of need
that, you want the derivative because you want the
gradient descent to be towards like better
and away from worse. And derivatives,
my understanding is this is
like a more expensive operation
per weight that you're trying to change
because you're not just multiplying a number,
you're having to calculate derivatives
and it becomes a little bit more complicated.
So yeah, it's like inference in reverse,
but also like a little bit more expensive.
So if you have 10,000 sample question responses
you're going to use to post-train,
it's kind of like 10,000 users sent prompts
and they're particularly expensive prompts
and you had to pay for that instead of them paying for it.
So, yeah, it's good to think of it as like inference in reverse.
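The "inference in reverse, but a little more expensive" picture above can be made concrete with a toy model. This is a sketch under heavy assumptions: a single layer of two weights, a squared-error loss, and hypothetical `forward` and `train_step` helpers. Real networks have many layers and use automatic differentiation, but the shape of the work is similar: a training step contains a forward pass plus extra per-weight gradient work.

```python
def forward(weights, inputs):
    """Inference: just multiply and add."""
    return sum(w * x for w, x in zip(weights, inputs))

def squared_error(weights, inputs, target):
    return (forward(weights, inputs) - target) ** 2

def train_step(weights, inputs, target, lr=0.1):
    """One gradient-descent step on squared error.

    This includes a full forward pass, then a derivative per weight
    (d loss / d w_i = 2 * error * x_i), then a weight update.
    """
    error = forward(weights, inputs) - target           # forward pass
    grads = [2.0 * error * x for x in inputs]           # backward pass
    return [w - lr * g for w, g in zip(weights, grads)] # move toward "better"

weights = [0.0, 0.0]
inputs, target = [1.0, 2.0], 1.0   # one hypothetical prompt/answer pair

loss_before = squared_error(weights, inputs, target)
weights = train_step(weights, inputs, target)
loss_after = squared_error(weights, inputs, target)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

Each training example therefore costs at least one inference-style pass plus the gradient and update work, which fits the comparison to 10,000 particularly expensive prompts that the company pays for itself.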
It's also an ongoing cost.
Like, it's, everyone is leaning on this idea that this stuff will magically become profitable.
I don't know if you've seen the cash flow diagrams of Anthropic and OpenAI, but there's a
mysterious math going on where year 2028, 2029, they just become profitable.
But I was going to ask you about this last month.
There's this announcement that OpenAI had some massive increase in revenue.
Yeah, well, this is actually a great
vibe reporting thing, and that's annualized revenue. They said they hit 20 billion in annualized revenue,
which would mean 1.67 billion in a month. Now, important details: we don't know how they're
defining a month. We don't know if they mean 28 days, 29 days, 30 days. We don't know if they mean
a calendar month, so the month of November or December. Or do they just mean any 30 days? We don't know
if they're doing insane math, which happens very rarely, but this company feels like one that might do it.
Just my gut instinct is they may be doing, here's a seven-day period, and we're going to turn it
into a month. Like, they, we don't know how they're doing this. And they also coupled this by saying
that as compute grows, revenue grows. I don't know if you've seen this. No, this is their formula.
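The arithmetic behind the annualized figure discussed above is easy to check. This is a small sketch, not a claim about how the company actually computes its number: the `annualize` and `implied_period_revenue` helpers are hypothetical, and the 365-day year is an assumption.

```python
def annualize(period_revenue, period_days, days_per_year=365):
    """Scale revenue from a window of period_days up to a full year."""
    return period_revenue * days_per_year / period_days

def implied_period_revenue(annualized, period_days, days_per_year=365):
    """Revenue a window of period_days would need to support an annualized figure."""
    return annualized * period_days / days_per_year

ANNUALIZED = 20e9                       # the 20 billion figure from the discussion
per_calendar_month = ANNUALIZED / 12    # the "1.67 billion in a month" reading

print(f"one twelfth of the year: {per_calendar_month / 1e9:.2f} billion")
for days in (7, 28, 30, 31):
    needed = implied_period_revenue(ANNUALIZED, days)
    print(f"{days:>2}-day window: {needed / 1e9:.2f} billion needed")
```

The same headline number is consistent with noticeably different underlying windows, and a shorter window needs less actual revenue to support it, which is why the undefined measurement period matters.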
Yeah, okay. Yeah, it's an insane formula that does not map to any economics. Like, it's just,
it's the kind of thing that if we had a functioning business and tech press that would just be
scrutinized to the bone that would just be ripped apart and say, what the fuck does that mean?
Because if this were true, if you simply add more gigawatt, add more revenue, then you would
simply print more money, like it would be a money printer versus money.
Well, it would be like my movie theater did well. Therefore, if I build 100,000 movie theaters,
we're going to make 100,000 more times the amount of money. And it's like, well, wait, that theater was
in Manhattan. And it was like really well run. And they're, yeah. Yeah. Well, okay.
I assume this much.
Trying to understand that story.
Okay.
Yeah, the annualized revenue stuff.
I think I even used that term talking to someone. I credited you.
I was like, Ed Zitron would say look out for annualized revenue.
It is, the funny thing with that as well is it's more vibe reporting, because
ARR is standard.
It's very standard in SaaS companies that sell on a per seat basis.
So a Salesforce would do AI, well, even they are doing annualized now with the AI revenue.
But it would be, a software company has 100 seats they sell to a company and they charge 15 bucks a seat per month. And they charge that annually. And the actual cost of each user is fairly measurable because they're doing CPU-based stuff. Like, it gets expensive at scale because there's a lot of people. But it doesn't get multiple, I can't say the word. It doesn't get much more expensive as you grow. With AI, it's actually, because of the way large language models work,
your most excited customers are your most expensive.
Yeah, because they blow past whatever their monthly revenue is, whatever their cost is, they blow past it.
So you can't even do a per-seat revenue.
It's just every, all of these things, when I say them out loud, I'm like, I feel like this should be more obvious.
It's why I loved your video because it was like, thank you. Someone else.
Well, here's the question. Here's my, here's my, here's like the dangerous question. I actually put out a video last week.
It was like dangerous question. We're now on year
three or four of, like, I'm counting New Years, starting like New Year 2020 or whatever,
of people saying, oh my God, this is so cool.
These massive disruptions, they're going to change everything is imminent.
And year after year we've said that, I'm not yet seeing the massive disruptions.
Like not the, not the stories of what might be disrupted or the stories of what's different.
But like, how many years do we have to go without industries,
crumbling or major new economic players that didn't exist before or complete restructuring
of huge companies around this technology.
How many years do we have to go?
So I did a video where I found the Reddit thread where someone just asked this question.
This was from like earlier in the month.
They were like outside of the vibe coding stuff, what are the, like what are people,
what are the big tools that have come out of this technology that are changing things?
And I read through this whole thread and it was interesting.
there's not, people don't have much, like, well, you know, like, it could, these, like, really small case studies.
I used it to help, you know, gather cleanup data that I got from whatever. I was like, this is,
like, such a nerdy specific use case. And so I went through that thread in a video. And this
has kind of been my question. It's like, it is very cool technology. But how do we know where this is
going to fall? Like, to me, the scale is, would go like this. Like, blockchain software,
then Oculus VR,
then maybe internet, then electricity.
So we're going to have a scale of disruption, right?
Blockchain software is something where
the premise made no sense
and it was never going to get off the ground
and there was going to be no impact on the world.
And because my training in CS
is in distributed system theory,
I was there in 2020 saying,
guys, let me just tell you,
this is nonsense.
No, Web3 is not about to take off.
None of this makes sense.
And that was true.
That did nothing.
Then you have like Oculus
VR. It really is cool, right? You put on these things, like, that is awesome. Like, I love this technology, but it's having a hard time having any real major impact because people aren't sure what to do with it. Also, most people don't necessarily have a great experience initially because it's extremely dependent on where you are, who you are, the size of your skull. That kind of. Yeah. For a big head, for a limited group of people, it's really cool, but it fails to come out. Then the internet is, like, really disruptive, changed a lot of things. More, it's not so much that,
like, whole industries have disappeared or whatever,
but it changed the way a lot of industries actually functioned,
and then, like, electricity, you could say,
like, it just completely changed what day-to-day existence and business was like.
How existence of the right?
Exactly, yeah.
And so the big question, like, everyone should be asking,
is where is GenAI going to fall on this?
And, you know, I would say right now,
this is what gets me yelled at.
I'm not saying this is the prediction necessarily going forward,
but right now, I don't think it's got past,
much farther past the Oculus part of that scale.
I actually, where it's really cool,
there's very cool things.
ChatGPT is very cool.
It's very cool that it can have that comprehension,
and no one thought it could do that.
But we haven't yet figured out what to do with it.
Comprehension.
It's a...
Text comprehend.
I mean, we take it for granted,
but for CS people, the ability that I can...
Hey, give me text that, like, whatever,
in the style of a poem that does whatever
and that includes a character from Star Wars.
And then it can give you text that does that.
That comprehension, like, for a computer scientist,
was like, oh, we didn't really know how to consider
On a technical level, yeah.
Yeah, that's like very cool.
Yeah.
But we haven't got past...
This is like the surprising thing of this field is we're not really past the Oculus stage yet.
Like where there's like for certain...
This is vibe coding is really cool.
The comprehension is cool.
So like Sora is weird, but like it's cool.
You can do that.
But the markets are not...
None of these have markets yet, right?
Like there's not big markets in any of these yet.
And we'll...
How far will it go from Oculus to the internet?
To me, that is like the number
one question, the number two question, the number three question of all reporting on this,
and almost no one's talking about that. It's just hype laundering. We'll take this hype,
we'll extrapolate it, we'll react to that extrapolation. That's kind of what reporting is right now
in AI, whereas to me, this is the hugest question. If this ends up Oculus, retail investors
are going to get screwed. If it ends up internet, all right, that's like a really interesting,
significant story. If it ends up electricity, obviously that really matters, but like I don't
know anyone who actually thinks it's going to be that disruptive, not the current
technology. That's the story to me, not let's like extrapolate, you know, hey, what, these
things are creating a church, or let's, let's hype launder, extrapolate that and react to
our extrapolation. That's not really reporting so much as speculative fiction writing, I guess.
I don't know quite what to call it. But this is the real question. Where exactly are we now?
And what are the possibilities of like where this is going to go positive and negative?
I don't think we have enough talk on that. I fully agree. Cal, it's been
such a pleasure having you. Thank you for joining me.
Always happy to talk shop.
Always happy to hate with you, I guess.
Hater season, I like it.
Hater season is the best.
We will be back this week with either a monologue or an interview.
I have not decided, because I've got a wonderful
Corey Quinn interview I just did, so I'm considering putting that in the monologue.
You'll find out on Friday.
Anyway, this has been Better Offline. I'm Ed Zitron.
Subscribe to the premium.
Download a T-shirt, whatever you desire.
Thank you for listening to Better Offline.
The editor and composer of the Better Offline theme song is Matt Osowski.
You can check out more of his music and audio projects at mattosowski.com.
You can email me at ez@betteroffline.com or visit betteroffline.com to find more podcast links and, of course, my newsletter.
I also really recommend you go to chat.wheresyoured.at to visit the Discord, and go to
r/BetterOffline to check out our Reddit.
Thank you so much for listening.
Better Offline is a production of Cool Zone Media.
For more from Cool Zone Media,
visit our website,
coolzonemedia.com,
or check us out on the IHeartRadio app,
Apple Podcasts, or wherever you get your podcasts.
