TBPN Live - Elon Musk vs. Donald Trump, AI Day | Shaun Maguire, Mark Chen, Sholto Douglas, Jack Whitaker, Aarush Selvan, Michael Mignano, Oliver Cameron, Delian Asparouhov
Episode Date: June 5, 2025(02:28:34) - Skip to Elon Musk vs. Donald Trump Reactions (17:15) - Shaun Maguire. Shaun is a partner at Sequoia Capital and discusses the resilience and innovation at X and XAI, highlightin...g the successful integration of Grok into X despite initial skepticism about the platform's stability. He compares the evolution of foundation models to operating systems, predicting a diverse ecosystem with both proprietary and open-source models, where open-source models may have broader deployment but less value capture. Maguire emphasizes the importance of early market capture and anticipates significant moats for foundation model companies due to hardware investments and application layer advantages. He also notes the rapid revenue scaling of companies like Starlink, surpassing previous benchmarks set by AWS, and underscores the necessity of a diversified energy strategy, advocating for increased natural gas, oil, solar, and nuclear energy to meet future demands. (31:55) - Jack Whitaker. Jack is an AI expert and entrepreneur with a PhD from Cambridge University, specializing in generative AI, large language models, and multimodal systems. In the conversation, he discusses the current landscape of AI development, highlighting OpenAI's dominance in both product distribution and research, and noting Anthropic's strong position among developers. He also touches on the challenges of model naming conventions, the role of data in AI advancements, and the varying strategies of companies like Google, X.ai, and Meta in the evolving AI ecosystem. (50:58) - Aarush Selvan. Aarush is a Product Manager at Google, leads the Gemini Deep Research project, which enables Gemini to act as a personal research assistant. In the conversation, he discusses the development of Deep Research, highlighting its ability to generate comprehensive reports by leveraging long context windows and reasoning models, and emphasizes the importance of balancing efficiency with the depth of information provided to users. (01:04:45) - Oliver Cameron. Oliver is the co-founder and CEO of Odyssey. He discusses his transition from leading self-driving car initiatives to pioneering AI-driven storytelling. He introduces Odyssey's latest innovation, "interactive video," an AI-generated medium that allows real-time interaction without traditional game engines, envisioning it as a new form of entertainment. Cameron highlights the potential of this technology to revolutionize content creation by enabling models to generate film and game-like experiences instantly, reducing production costs and time. (01:19:18) - Michael Mignano. Michael is a Partner at Lightspeed Venture Partners and co-founder of Anchor, and discusses the evolving dynamics between AI foundation labs and application layer startups, highlighting the shift from a symbiotic relationship to direct competition. He emphasizes the growing importance of unique data contexts, noting that models are increasingly seeking novel information, which prompts labs to compete directly with startups possessing such data. Mignano also suggests that this trend may drive startups back to established incumbents like Google and Amazon, as they might be perceived as more reliable partners in the AI ecosystem. (01:31:44) - Mark Chen. Mark is OpenAI's Chief Research Officer. He discusses the evolving landscape of AI research, emphasizing the shift from large-scale pre-training to enhanced reasoning capabilities. He highlights the importance of reinforcement learning (RL) in developing autonomous agents and the challenges of scaling RL effectively. Chen also addresses the significance of interpretability in AI systems, advocating for models that transparently convey their reasoning processes to ensure reliability and user trust. (02:00:59) - Sholto Douglas. Sholto is a researcher at Anthropic. He discusses the challenges and advancements in scaling reinforcement learning (RL) within artificial intelligence. He highlights the significant gains achieved by increasing compute resources in RL, noting that a tenfold increase still yields linear improvements. Douglas also addresses the complexities of reward hacking, emphasizing the need for careful guidance to align AI behaviors with human values. (02:28:34) - Breaking News: Elon Musk vs. Donald Trump (02:35:00) - Delian Asparouhov. Delian is the co-founder and president of Varda Space Industries and a partner at Founders Fund. He discusses the recent policy shifts in NASA's budget, particularly the reallocation of funds to the Space Launch System (SLS) program, which had been advocated for cancellation by figures like Jared Isaacman and Elon Musk. He highlights the immediate consequences of this decision, including SpaceX's announcement to decommission its Dragon spacecraft, leading to a lack of vehicles capable of servicing the International Space Station. Asparouhov also reflects on the unprecedented nature of the current dynamics between influential private sector leaders and the U.S. government, noting the escalating tensions and their potential impact on the future of space exploration. TBPN.com is made possible by: Ramp - https://ramp.comFigma - https://figma.comVanta - https://vanta.comLinear - https://linear.appEight Sleep - https://eightsleep.com/tbpnWander - https://wander.com/tbpnPublic - https://public.comAdQuick - https://adquick.comBezel - https://getbezel.com Numeral - https://www.numeralhq.comPolymarket - https://polymarket.comAttio - https://attio.comFollow TBPN: https://TBPN.comhttps://x.com/tbpnhttps://open.spotify.com/show/2L6WMqY3GUPCGBD0dX6p00?si=674252d53acf4231https://podcasts.apple.com/us/podcast/technology-brothers/id1772360235https://youtube.com/@technologybrotherspod?si=lpk53xTE9WBEcIjV
Transcript
Discussion (0)
You're watching TVPN.
Today is Thursday, June 5, 2025.
We are live from the TVPN Ultra Dome, the Temple of Technology,
the Fortress of Finance, the capital of capital.
We got to work on that because we're
working on selling the naming rights, baby.
This place is going to be branded.
We're going to sell the windshield.
Exactly.
We're selling the windshield Selling the windshield
So we got it. We got to keep growing the intro
But we have a massive day today a little bit of an AI day today. We got folks from Google open AI and thropic
We got the former CEO of open AI
She're coming on Sequoia Stanford, Google X investors
I and we got pretty good coverage.
We hit almost everything.
Should go on a whirlwind tour of what's
going on in artificial intelligence.
I'm excited to dig into the state of affairs
in the foundation model race.
We're going to go through the tier list of what companies in AI
have the mandate of heaven.
We're also going to go through some of the deep research
products and hopefully get into some of the more
penny-edge use cases for AI.
So we have both some deep research folks coming on
and then we also have some folks that are working on
video generation and video game generation
and a lot of different applications.
We're gonna cover the granola story,
we're gonna cover what's going on with windsurf
and so it should be a great day. But let's run through some news just to keep everyone up to
speed before Sean McGuire joins in 13 minutes so first off ramp time is money
save both save balls easy use corporate cards bill payments accounting and a
whole lot more all in one place circle went public the CEOs coming on the show
tomorrow that's very exciting and Jordy you have the news I will read a little bit from Jeremy the CEO I'm
incredibly proud and share and thrilled to share that Circle is now a public
company listed on the New York Stock Exchange under Circle. Brian Armstrong
Post congrats to Jeremy and the entire Circle team on your IPO and reaching 30 trillion in lifetime USDC volume.
Let's hit that go.
The big T.
You got it.
Incredible.
You made it on for 30 trillion.
The stock is up massively.
Yeah.
It was priced at $31.
It is trading at around 85 as of now.
That's fantastic.
We love to see it.
Again, as many people would expect,
Bill Gurley is very unhappy with that.
Inefficient pricing.
He hates a stock.
He hates a pop after the IPO.
It's good for the IPO window,
which we always want to be open.
For sure, for sure.
And a role executive chair, Trey Stevens,
tells Ed Ludlow that the company has
closed a new funding round of 2.5 billion in a deal that more than doubles
the defense startups valuation to 30.5 billion. This is from Bloomberg TV.
Congratulations to the Anderol team on the massive up round. We got to hit the
gong for Anderol. Do it again.
around. You'll love to see it.
We gotta hit the gong for Andrew.
Let's do it again.
We have gong.
Good contact, good contact.
Very exciting.
Bunch of news from Kevin Weill over at OpenAI.
We will be digging into this with Mark Chan today
when he joins, but deep research can now share across
GitHub, Google Docs, Gmail, Google Calendar,
so you can integrate everything
and it can do research on your files.
That's going to be a lot of fun to talk about.
This is also potentially threatening for Granola.
And so we're talking to a Granola investor about what the reaction will be, where the
direction of that company might change or not.
But if you're designing a tool for artificial intelligence or one of these products or any
tool, really, go to figma.com.
Think bigger, build faster.
Figma helps design and development teams
build great products together.
Go to figma.com.
It is the backbone of the TBPN brand.
It is, it is.
And we would not be able to make the show without it.
Yes, and so we are going to be digging into,
today is obviously artificial intelligence day in some ways.
We're digging into artificial intelligence
today, we're also very interested in talking about VR
and content and augmented reality,
and then there's a story in the Wall Street Journal today
about meta is in talks, not advanced talks,
just regular talks, regular old talks.
But they're talking to Disney, they're talking to A24
about content for a new VR headset,
and this is my
Number one question about VR when the iPhone everyone's looking for the VR the iPhone moment of VR when the iPhone debuted
What was it? It was first and foremost a phone it replaced your phone
And so I've always thought that the path to true VR adoption was just saying
we're only going after your TV.
The next generation of 20, 22 year olds
when they get to college or post college
and they're in their first department,
they are just not going to buy a big flat screen TV
because we have solved that specifically with VR.
Exactly, or you know the guy with multiple monitors,
three monitors set up, the production team back there,
we can go to the production camp,
show you all of the different monitors that they have.
What if they could be wearing VR headsets?
They could have seven monitors one day.
10 monitors, let's give it up for the production team.
Let's give it up for them.
Let's give it up for them.
Love to see it.
So the idea of a drink camp.
We got the drink camp here, cheers.
Thank you to Matt, thank you to Andrew Huberman
for inventing drinking things
the whole team and over at Matina the other question it and so
This idea of doing one thing really really well before before going into
You know trying to do a little bit of everything the platform even exactly you don't get to be a platform if you solve one
Thing really really well the iPhone't get to be a platform if you solve one thing really, really well.
The iPhone wasn't a platform.
The first iPhone was not a platform.
Didn't have an app store, right?
It just had the ability to listen to music.
It was an iPod.
It was a phone.
And it was an internet communicator.
Just web browser.
And so the Meta is looking to Hollywood for exclusive immersive video for premium device.
Now, my kind of hot take here is that-
Belsky's cooking.
I know, I know.
All of our boys are coming together.
We're cooking up something amazing.
I'm really excited.
We're not gonna be able to get that much information
on this soon, but I don't even know
if they need that much immersive content.
I think a lot of it is just,
hey, every single meta headset should just ship
with the Matrix pre-installed for free.
It's like, how much would that possibly cost?
It's an extra $2 to rent or something.
You could have it pre-installed.
So it's just like, you can put it on
and there's like 10 movies that are pre-loaded.
You can just watch movies and the movies are great.
And you're in a really nice theater
and it just comes pre-installed and it's all great.
Because that was the, my Vision Pro experience
was very much
Film driven I want to get a step further And I think they would have to basically create an entire catalog
Yeah, I gotta know if the matrix by itself is gonna be enough of a draw totally say
I'm gonna spend hundreds of dollars for the average person. No, no, no, it's more about like the pre-installed apps
So like the iPhone was a really good phone. No, but I'm saying really good iPod
But it also came with like,
like a calculator app that was like decent.
And so you need a few of these things that are just like
really easy to access, really easy to pull off the shelf.
And ultimately I think Meta needs to catch up to Apple
in terms of the Apple TV movie store and making sure that
like all the streaming providers are really on there
in a valuable way.
Obviously, it's important to go to immersive eventually,
but I think the path to immersive might be just,
wow, I have a home theater in my studio apartment now.
Anyway, we'll dig into this more.
We'll talk to more people.
Maybe I'm wrong, who knows?
But the high points from this article
in the Wall Street Journal are as follows.
Meta is seeking exclusive content from Hollywood for
its upcoming premium VR headset Loma set to rival Apple's Vision Pro.
I'm super excited for this new VR headset.
I think it's going to look fantastic.
I think the resolution is going to be insane.
Obviously, there's a lot of focus on augmented reality and Orion and AI and
glasses, but there's still so much work just to do
just to bring VR into just a normal consumer experience.
And what's interesting is that I think Apple
really broke the seal on like, yeah,
like people are used to paying $1,000 for a phone
and $2,000 for a computer and maybe $4,000 for a headset.
Isn't that crazy?
So what can, M Meta's been hanging out
in like the three, four, 500 range.
If you take the reins off and say,
hey, yeah, yeah, it's fine to spend $1,000 on this thing,
you can get something really, really interesting.
And so Meta is offering millions for video
based on well-known IP,
aiming to attract users to its VR device
launching next year.
Now the big question is,
how long will these immersive videos
be because Apple did do a bunch of these deals.
They did license a bunch of interactive video products
but they were all like five minute experiences.
And so you get through them all
and then you'd wait a full quarter
and Apple would be like, we have another one, seven minutes.
Here's five new minutes of entertainment.
It's like, that's not how people experience entertainment.
I remember like-
Yeah, yeah, just think of it.
A lot of people are used at being entertained
by their iPhone for four hours a day.
Yes.
Often through video.
And it was, but that's not even an iPhone thing.
You go back a few decades to like the original PlayStation
had Final Fantasy VII on it.
It came on multiple discs and that game,
people would play it for a hundred hours.
Metal Gear Solid was a similar,
like dozens of hours of gaming
and no one's really been able to deliver that in VR
and have that moment.
Same thing with GTA, hundreds of hours of entertainment.
Anyway, very excited to dig into this new device
known as Loma. It's more powerful than the MetaQuest VR headsets now available
with higher fidelity video. Let's hear it. They got the screens done. They pulled
them forward off the benchtop. The design is similar to the large pair of
eyeglasses more like Meadow's Ray-Ban AI glasses than goggles that the Quest
and Vision Pro use connected to a puck that users can put in their pockets.
So maybe they're going puck, which is interesting
because that was very contrary.
And everyone was like,
this is Steve Jobs has never let this happen.
And Palmer came out and said, no, puck is great.
Keep the, you don't want heavy things on your face.
That's just not a good experience.
And so Meta is planning to charge less than a thousand
but more than 300.
And so I would say 999 is probably the right
price I want it to be sort of expensive so it can be a great product a meta
spokesman referred to the comments by meta chief technology officer Andrew
Bosworth about the company working on many prototypes not of not all of which
go into production meta is working with avatar director James Cameron's light
storm entertainment on exclusive VR content the two companies announced a partnership last year, so I
Think we got to get on this we got to have a VR stream. Yeah, three hours of content every day
We're gonna be the reason churn is low on the next to be our headset
Yeah, you just throw this thing on it. Just like you're sitting here on the drink cam. You can you can click through
I mean, yeah, just click all to the different angles. It's pretty it's pretty doable It's pretty doable. I mean you can click through. I mean. Yeah, just click all to the different angles.
It's pretty doable.
It's pretty doable.
I mean, you can film useable spatial video on just an iPhone
now, and then you can play that back in the Vision Pro.
And it does look 3D, which is cool.
In other news, Wall Street Journal's reporting
that Reddit is suing Anthropic alleging
unauthorized use of the site's data.
An online discussion forum says Anthropic
accessed the site more than 100,000 times,
after saying it had stopped.
Reddit is suing Anthropic.
And Anthropic debates this.
I'm sure we won't be able to get into this today,
because I'm sure it's caught up in the courts,
and there's a whole bunch of legal restrictions.
But we'll do our best to understand
how these deals come about.
It seems like most of the time,
it's not that the company that has much data
doesn't want the AI company to use their data.
They just want to have an equitable agreement
where everyone is getting the most value.
And I think Reddit's surged the stocks up, right?
Yeah, and then for more context,
OpenAI is already paying Reddit approximately
70 million per year in a content licensing agreement.
So they kind of got ahead of this issue
and decided to strike up an actual deal.
And I believe Google has a deal with them too.
And I think this might be one of the-
Google has a deal with Reddit?
Yeah, I'm pretty sure, because there's that meme
about the best way to search Google is search like whatever your
search term is and then space reddit because the user generated content was
better than the SEO stuff that was Google pays right at approximately 60
million per year so 60 and 70 so they're getting a hundred and thirty million
that's pretty serious revenue and and it's something that doesn't need to be
brokered via a bunch of individual programmatic ads that might not work or anything like that or subscale.
It's just one or two deals and boom,
you're up in the hundreds of millions of dollars in revenue.
What is Reddit's overall annual revenue?
$20 billion market cap.
Okay, not bad.
Let me see here.
How do they track into Conde?
They did 1.3 billion greenbacks in 2024.
1.3 billion, so they're getting like 20% of their revenue
or 15% of their revenue.
Yep.
I wonder how big.
They grew 60% over 2023.
Interesting.
So they might be bigger.
They might be bigger than Reddit
or they might be bigger than Conde Nast,
which at one point owned them.
It's kind of unclear how valuable Conde Nast is
because they're private.
Anyway, so Reddit said that the AI company unlawfully
used Reddit's data for commercial purposes
without paying for it and without abiding
by the company's user data policy.
Anthropic is in fact intentionally
trained on the personal data of Reddit users without ever requesting their consent, the
complaint said.
Ah, interesting saying that it's about the users.
Yeah.
It never opted in.
Bill itself is the white knight of the AI industry. Last year Reddit took steps to try
and limit unauthorized scraping of its website, creating a public content policy for its user
data that is publicly accessible, such as posts on subreddit and updating code on its back end.
The user policy includes protections for users,
such as ensuring that deleted posts and comments
aren't included in data licensing agreements.
And so, yeah, I don't think that there's a really strong
precedent for the agentic web.
Like, if I use Google Chrome to access a website,
Chrome doesn't need to pay any sort of license,
but if I go to Anthropic and say,
hey, get me up to speed on this topic,
and it goes out and it browses the web,
all of a sudden it feels like maybe they do have to pay,
whereas Chrome wouldn't,
because it's just rendering the webpage
and it's not transforming it at all.
What is transforming?
What's fair use?
And so these things will obviously play out
in the court of law.
And so hopefully they can resolve it quickly and move on.
Yeah, I'm actually surprised that they didn't already
have a deal in place, because it's very valuable data.
You want that data for your models.
And anyways. Well, we have Sean McGuire joining in just a minute. Because it's very valuable data. Don't want that data. Yep for your models and
Anyways, well we have Sean McGuire joining in just a minute and
The other news in the Wall Street Journal today is thrive holdings is betting that AI can change IT services the company established by venture capital firm Thrive Capital joined with ZBS
to invest $100 million into an entity that will integrate AI into IT firms.
This is from Josh Kushner, of course.
Shield Technology Partners has already acquired four IT service companies, Clearfuse Networks,
IronOrbit, Delvol Technology Solutions, and OneNet Global.
It said Thrive Holding is called Shield Technology Partners an AI enabled managed IT service platform.
IT service companies also called managed service providers
or MSPs typically provide IT support and manage tools
like software and cloud computing on behalf of businesses.
Founded by Josh Kushner about 15 years ago,
Thrive Capital is known for some of its high-flying
startup investments including OpenAI, Databricks, and Wizz, what a portfolio.
Investing in traditional services business,
particularly those that rely heavily
on administrative knowledge work and adding AI
to supercharge them is becoming a bit of a trend.
As part of its efforts, Shield Technology partners
will embed software engineers into each
of its IT portfolio businesses.
Oh, they're doing the forward deployed engineer. The engineers goal is to build an AI driven solution that
all of the portfolio companies will use. We've studied all the ways in which MSPs have perhaps
been on their back foot to date with customers and says that IT services work is incredibly
well suited to what AI can streamline. And so you can imagine a whole bunch of agentic workflows for all the different things that
you need to do when you're deploying cloud, your managing cloud.
Really quickly, before we have our next guest, let's tell you about Vanta, automate compliance,
manage risk, and prove trust continuously.
Vanta's trust management platform takes the manual work out of your security and compliance
process and replaces it with continuous automation, whether you're pursuing your first framework
or managing a complex program if you think you
Should be on Vanta you probably should probably correct. Well, we have Sean McGuire from Sequoia Capital in the studio
Welcome to the show Sean. How are you doing? Boom? What's up team? Never a boring day on the internet. That's for sure
Yeah, what is keeping you?
Well, man, what's keeping you up now?
Well, I think there's you got anyone on Twitter knows what I'm talking about
Yeah, I mean let's let's let's skip the politics because this is purely a technology and business show
Thank God. I love you guys
stick to the technology
What I mean we we've had an interesting experience with X
in that there's always been this narrative
that like the whole platform was gonna collapse.
We, you know, there's been rough days here and there,
but overall things have been growing.
What have you seen across the X, XAI merger?
What are the secrets to success?
How is, you know, talent tracking
is any of, is any of like the chaos and noise distracting? Because when I talk to XAI engineers,
they're like, we're too busy. We can't come on your show. But, but, but what's your experience
been with the X and X AI team recently?
Look in, if you go back in time, as you said,
everyone said it was going to fail.
The app would crash.
Nothing would happen.
And that didn't play out.
But there was a lot of tech debt and broken infrastructure.
And there was a couple of years of rebuilding
the basics and foundations.
I think we're starting to see real innovation happening.
I love the Grok integration directly in X.
It always scares me when someone,
when I have a tweet or whatever and then someone says like,
at Grok, is this correct?
Is this accurate?
You never know what's gonna come back.
Usually I agree with Grok, there's been once or twice where I think some of the subtleties
are a little off.
It's truth seeking.
It doesn't mean that it's fully truthful every time.
Yeah.
Yeah, it hasn't actually found that ground truth every single time.
That's funny.
What about the overall horse race?
I'm probably wrong.
Yeah.
What about the overall horse race between the foundation models?
It seems like every day it's going back
between an OpenAI launch, an Anthropic launch,
a Grok launch, a Google launch.
Do you think that continues?
Do you think there's maybe some fragmenting
and there's opportunity?
I mean, we're kind of already seeing this
with how much Anthropic's loved by developers
versus OpenAI has been really dominant on the consumer side.
And now every company is figuring out a different way to actually get to distribution.
What really matters here?
Is it pure scale?
Is it pure cracked engineering talent?
Is it distribution?
Is it a combination of those things?
How are you seeing it play out?
Great question. I, you know, honestly, my opinions have changed a lot over the last few years
and in many directions.
And so I don't have too much confidence in my assessment right now, but the, you
know, I always try to look at lessons from the past and my current thinking is
that the closest analogy are operating systems.
And if you, and I'll make a couple of points on this.
If you think about operating systems, first of all,
there's a bunch of different ecosystem.
There's the windows ecosystem, there's the Apple, you know,
OS ecosystem, then there's like on mobile, there's, you know,
Android, there's, you know,
the whole browser environment with Chrome,
and then there's open source to Linux.
One thing that I think is interesting about Linux,
there's more Linux servers in the world than there are Microsoft servers,
but the value capture of Microsoft is way greater than Linux.
I personally think we're going to see something very similar play out, of Microsoft is way greater than Linux.
I personally think we're gonna see something very similar play out,
where there'll be like a, you know,
OpenAI will be the Apple or someone,
and XAI I think will be very successful.
I think there's a good chance that Anthropic
is independent and successful.
I also think there'll be a big open source component,
which should be like Linux.
And I think there'll probably be 10 to a hundred times
as many open source models out there,
or like deployments of open source models in 10 years.
But I think that they won't be as valuable
and they won't be like as rich of ecosystems.
And then just to make two more points
on the open source analogy,
like for Microsoft, or Apple, by having the operating system,
they were able to actually win in quite a few ways on the application layer as well.
For Windows, they bundled in Word and Excel and then Outlook and all these other things.
I think it'd be very similar for the foundation model companies.
I think that the foundation models would be like table stakes.
That'll be their kind of win, but also a very sticky moat.
And even if they're not the most profitable businesses themselves, it will
give them big advantages kind of on the application layer.
And then one other thing that I think will happen, you know, the cloud companies
have giant moats just through the CapEx dynamics of cloud, like needing to buy all this hardware
and, you know, innovate with hardware and stay there is a big moat. I think these foundation model
companies are going to be, I think there's going to be way more value that occurs and there'll be way bigger moats than people realize.
I think they will all basically have hardware moats like cloud style hardware modes.
They will have the operating system style, very, very detailed research that's hard for
anyone to replicate.
And then I think they'll probably make a lot of their profit from applications on top of it. That's what that's my current thinking thinking can change.
And so, yeah, fingers. So I know, obviously, you weren't investing during the original operating system boom, but your firm Sequoia Capital was. discussions with the kind of the lineage of the firm or the history and seeing how is
the revenue ramp or the business scale different this time than say in the dot com era or in
the previous era.
It feels like it's ramping faster than ever.
It feels like we're seeing more companies that are hitting a billion in revenue or a
hundred million faster than ever. But is that real based or, or anyone that you've talked to that was investing in that
era? Does it feel different this time around? Do you think? Yeah.
I mean, one of the beautiful things about being at Sequoia is we do have this long history and we
get to tap into the kind of institutional knowledge. That said, sadly, Don Valentine died four or five years ago, like early
into my time. Rip, what an absolute legend, you know, and he led the original Apple investment.
But there's still a lot of Google institutional knowledge in the firm, which is, you know, not
directly operating in September, but they created an operating
system later.
I mean, first of all, the revenue of these companies is scaling insanely just faster
than any products in history before for Starlink.
So obviously not a foundation model company, but I basically made internally, I made an Excel spreadsheet
of AWS's revenue growth like in the first 20 years of AWS compared to Starlink. And
you know, Starlink has in five years got in to where what took AWS 10 years to get to.
And I and now like with these foundation model companies, we're seeing as fast or even faster revenue
growth.
That said, these are very, I think the business models, like the initial business model is
more clear and the profitability of these companies is, more the in profitability is
insanely high and so you got to discount the revenue growth But I would just say the biggest lesson I think from the past is you have to capture
Like territory early on and the doors will kind of close behind you because of these
CapEx dynamics and and just like lock in with users. Yeah, I mean you mentioned Starlink
Do you think there's obviously yes?
It's such a weird company because it's like a space
launch company that now is an internet company, ISP.
But there's actually a little bit I'm starting to hear of an AI narrative just that having
Starlink potentially unlocks edge compute or inference in areas that would typically
have kind of stranded energy resources.
So all of a sudden,
if there's some super remote area that has really cheap energy,
you can go in and set up a data center there and then do inference and stream
those tokens over star link. Do you think that's an underrated narrative?
Do you think that that's developing on course?
Do you think there's any bottlenecks that people should be thinking of within
that story? So when we first invested in SpaceX, the part of the core thesis was
internet everywhere. And I would say like it goes way beyond AI. But I think the internet everywhere
thesis is huge. And that will be, you know, everything from oil rigs to airplanes to boats
to edge AI devices.
But I think the bigger thing for Starlink is Starlink just has a 10x plus cost advantage
for moving data compared to building new transatlantic or trans-pacific fiber lines.
And in the world of AI, these models are going to be moving so much data around themselves.
And I think Starlink is incredibly well positioned to be the pipes to move all this data for AI. And so I actually, I care more about that
just because of the volume
than some of the kind of edge applications
for AI specifically, but those will be big.
And then one other thing,
I just got to give a plug to Bitcoin.
Plug to Bitcoin.
Basically, yeah, let's go.
Basically, three years ago,
I visited the biggest Bitcoin mine in the world,
Genesis Digital, their mine is near Midland, Texas.
It's actually backed by SPF, which is, you know.
He got both anthropic.
Oh, he had a bunch of good bets, you cannot deny that.
Exactly, he got both Anthropic and Genesis.
But these guys had a gigawatt scale Bitcoin mine operating three years ago.
And already for them, like having...
It taught me a lot.
And Bitcoin mining is the absolute tip of the spear where you need the least amount of data movement, like data in and out, to dollar generated or power consumed.
And so I actually think that was like Bitcoin mining is underrated in terms of how much it's pushed, like frontier power generation turned into compute.
And I don't think it's a coincidence that Crusoe,
you know, which is now powering Stargate,
started off as a Bitcoin mining company
or that CoreWeave, which is like $80 billion stock
as of yesterday is now, you know,
is now an AI data center company.
And I just, I think, I think that's honestly the bigger theme.
Yeah.
What's your updated thinking around nuclear?
We have these new executive orders
and it was announced this week that Metta announced
the partnership with Constellation
to power some of their AI power needs.
What's your kind of updated outlook
over the near term to medium term?
I'm an all of the above guy for energy.
Like we need all of it.
We need all of it as quickly as possible.
I, as an individual invested in a few nuclear companies
going back like nine, 10 years ago, way too early.
And to put a little bit more meat on these statements, nuclear is incredible,
but deploying large amounts of nuclear is slow. Even if you deregulated it to zero,
I think it would be more than a decade, well beyond a decade, to deploy like a terawatt of new nuclear.
Call it 10 years if you did it as fast as possible
for America starting now.
Solar is just a way faster way to deploy a lot of energy.
Nat gas is a way faster way to deploy a lot of energy.
We have been producing insane amounts of natural gas,
which we didn't have the pipelines to actually use.
So we were just flaring it a lot of times
because kind of like the dollar value per,
like when you have an oil well or you're fracking,
it's producing, it's emitting natural gas and oil.
And you just made so much more money from the oil emitting natural gas and oil.
And you just made so much more money from the oil
than the natural gas that we didn't really care about it.
And that started to flip.
And so anyways, I think we have to do all these things.
I think we need more natural gas, more oil,
way more solar, and then kind of have nuclear coming
as the reinforcement juggernaut coming online
like 10 to 15 years from now.
That's a good framework.
Fantastic, I mean we have to have you back
for an energy deep dive.
We know a fair amount of the nuclear and solar entrepreneurs
and there's a bunch of people doing really cool stuff.
So have a safe trip.
Personal plug, I had a seat in the New York Mercantile
Exchange when I was like 22 years old.
It was insane.
Good time.
Wow.
Cool.
Hey, good luck on the timeline today.
I know you're gonna go in there,
put on your hazmat suit and just get in there.
Good luck.
Good luck.
Peace guys.
Safe travels.
Cheers.
Fantastic.
Let me tell you about Linear.
Linear is a purpose-built tool for planning
and building products.
Meet the system for modern software development,
streamline issues, projects, and product roadmaps.
Go to linear.app.
Next up, we have Jack in the studio.
We have an in-person guest.
Let's bring him in.
Play some soundboard for me, Geordie.
Welcome to the stream.
How you doing, Jack?
There he is.
Second time on the show.
Good to have you here. What are you wearing doing, Jack? There he is. Second time on the show. Good to have you here.
What are you wearing today, Jack?
I'm wearing the jacket, the TDTN jacket
in the capital of capital.
Fantastic.
There you go.
Thanks for coming.
Thanks for hanging out.
Here, you can adjust your mic a little bit there as well.
Yeah, got that.
Cool.
I wanted to kick this off with a little bit of a rundown
on the different foundation labs.
We're talking to a lot of them today. And I noticed
that Jordan Schneider from China Talk and Dylan Patel ran through their AI mandate of heaven
tier list. And so I wanted to read through that and kind of get your reaction and then
just kind of do like a vibe check and let it and talk to you about what we should be
expecting from different labs over the next year. It's a little bit of a horse race.
So up first at S-tier, they have OpenAI.
It's the only foundation lab that made S-tier.
Does that feel right to you?
What are you watching from OpenAI?
Yeah, I think that's exactly right.
OpenAI executing both on the product level,
getting the distribution,
getting into hundreds of millions of people's phones.
But also on the research level,
you have people like Noam Brown, people like Aidan,
just doing this incredible frontier research.
03, I think, just as a model,
impresses me the most of any model that's come out so far.
You know, Brad Lightcap said in The Wall Street Journal
recently they had two million workplace users in February,
and they're at 3 million now.
Wow.
So it's just really exceptional growth.
I think.
I was thinking earlier, it'll be funny,
our kids in 20 years will be like, dad,
they're making me use OpenAI Teams at work.
It'll just be like the default, like the Microsoft Teams
default.
Yeah, I mean, there's a little bit of a narrative that, uh, that maybe,
and we can move on to Anthropics in the eight year alongside deep seeking
Google. Uh, there's a little bit of a meme that like Anthropic is crushing it
with developers. They're the default choice for windsurf cursor users,
but then open AI is more dominant with consumers. But I,
I feel like recently I've heard that it's maybe even more skewed than people
think. Like it's's it's maybe not like
The the vibe on X might be yeah, like, you know
70 30 open AI clawed for day-to-day grab a random person on the street, but it might be even more skewed
Does that feel right to you? Yeah, I think I think anthropics really solidified with developers
But it's like totally yeah given up on consumers. But I think open AI wants to take that on. I mean,
there there's rumors about some sort of windsurf acquisition.
They're releasing 4.1 and Codex. They're pushing hard on coding.
And I think that's something to watch from them this summer and going into 2026
is can they, can they secure that?
Do you understand the model names at this point? 4.1, I have access to 4.5.
Why would I want to go backwards?
Are the models fragmenting to where I'm going to have to learn a new taxonomy for, okay,
if I want to write code, I use this one. If I want to write poetry, I use this one. If
I want to do math or reasoning or build a chart, I use that one. Because it's putting
more work on me, I feel like.
I think Sam said that they're going to try to fix the model naming scheme this summer.
So that's the real thing to watch. Okay.
They're going to keep best years. Can they, can they get coherent model names?
Yeah. But yeah, 4.1 it's cheaper. It's specialized towards coding.
It's kind of their 3.7 type of driver at the same time.
I know you're not super up to speed on the Alibaba like Quinn models, but I,
I saw some, I saw some release where Alibaba, Quinn released like a hundred different models.
And Will Brown was kind of saying like,
this is awesome from a research perspective
because they have like one model that's just good at bio.
And it's kind of like this hyper fragmentation
at the opposite of going in the unification direction.
It's actually going more specialization
and then maybe you unify that at the end.
But I don't know, it seems like if you're a consumer company, you can't,
you don't really have that affordance, right?
Yeah. I think in terms of research, Ali Baba's a bit underrated. I mean,
compared to deep seat gets all this press, all this coverage,
but the Quen models are really good. People are doing,
you see from lots of people, these really cool RL experiments,
these really cool kinds of things that they're lagging behind the U S models.
They're not, they're not eight year, they're not beat here, you know,
but they're doing some interesting stuff. and I think that's super cool.
Yeah. I really wonder if they're, if they have a distribution advantage in China,
obviously we wouldn't feel it here, but, uh,
I really haven't gotten up to speed on what is the chat GPT of China in terms of
distribution. Obviously deep seek had that moment,
but have they actually executed properly on the, on the product side? I don't know.
I'm surprised that Google hasn't been able to turn their general distribution
advantage into an AI distribution advantage. They have these really good models.
The new Gemini came out today. It's, it's got really good benchmarks on a lot of
things, but they're yet to, I think they're yet to crack distribution.
We sometimes say.
Did you see that? Did you see that mock-up? That was just the Google search box
But a Gemini prompt. Yes, like if they wanted to go full send if they really but if they were really a GI pelled
They would just say hey, we're done with Google search. I mean it would destroy their economics
I commit to it. I'd commit to it
I think it looks like the model that they used to power those search prompts right now
It look it seems really lightweight to me. It gives a lot of wrong answers. When I ask 2.5 something, it's always right. Sure, sure. That's interesting. You think that's
just a cost issue? The AI overview box is like, we're just going to hallucinate. It's
a hallucination box. Well, they are launching the advanced AI search, but it's a toggle,
so you have to find it, which is always the problem with Google. Well, I mean, they still
wound up in the A tier, according to Dylan Patel and, uh, and Jordan Schneider over to China talk. Uh, uh, obviously via three was like a
huge one. And then they also have all those like priced performance things,
but, uh, I I've heard this narrative that like,
maybe some of the hyperscalers are super focused on benchmarking and,
and not even hacking the benchmarks necessarily,
but just like just thinking about them. And a lot of the frontier labs, the independent labs,
have just kind of moved on philosophically
from caring about benchmarks.
Is that the right move?
What's driving that?
Are we in the post benchmark era, essentially?
Yeah, when I think about models and benchmarks a lot,
I think which models outperform the benchmarks.
When you see O3's benchmarks, they they're good they're kind of what you
expect yeah then when you watch oh three think you see this model it's actually
reasoning sure when you watch Claude for Opus or Claude for sonnet think it's
like whoa this is really good same with GPT 4.5 I think the Gemini models are
good but they're exactly as good as the benchmarks let on you know and I think
they don't have the vibes yet what I want to see is Gemini 2.5 ultra
Okay, Google releases something with some big model smell something cool. Maybe that's them. What is big model smell?
I just don't like the idea of smell at all. Oh, yeah, it's just a weird. It's a weird sense
But but basically we're're in the intangible period.
Is that the idea?
That it's unquantifiable?
I think Anthropix give it up on really training
on the benchmarks, and I think it's
done really well for them.
You see that they're really good at Sweet Bench.
They're not crushing it on MMLU.
But you tried force on it.
It's great.
Other labs that are lower down on this tier list
seem to have not given up on doing really well
on the benchmarks.
Yes, yes, that makes sense.
I mean, it's possible that you must defeat the final boss
to play the end game.
And so maybe the end game is this vibe check,
this big model smell, but in the interim,
yes, you only earn the right to go into big model smell
if you can dominate
in all the benchmarks.
There was an interesting moment where 3.7 was beaten
on every benchmark.
So now for state of the art on stuff again,
3.7 was losing on everything.
There's a better model for everything hypothetically.
But then if you looked at what you might call
like revealed preferences bench, which is just like,
what do people use on cursor?
Sure.
What's going on, man? Yeah, revealed preferences bench.com is just like what do people use on coaster sure what's going on man?
Revealed preferences bench calm
Yeah, yeah 3.7 was pretty high up there
Yes, it seemed like they had something that wasn't captured there. What about cornered resources data is the new oil?
That seemed like a very silly concept in the moment when everyone had scraped the web entirely and there was it really felt like
Data was fully commoditized then we see vio3 and for the first time it feels like okay
There is at least one data set that is so large that you can't copy it onto a single hard drive or
Compress it and it's YouTube and Google owns it and and yes people might scrape it here and there
But Google has a durable
advantage there.
But is that the wrong way of thinking about it?
Yeah.
I mean, I'm not sure about the video models.
I think it's true that data is both super, super important, but also has just become
tremendously overrated because the first thing people learn about AI is, oh, it's a result
of the data that goes in.
But now that we're unlocking things like RL and better post training, it seems to me
like you can have some non-data solutions
to some of these problems.
Yeah, I mean, that was the original,
what, generative adversarial network for image generation
was like synthetic data generation
and then testing it.
And so like, it just, VO3 feels so, so much
like a beneficiary of YouTube.
But I don't know if that's just,
if we're just waiting and we'll see the next Sora
and we'll be like, oh, opening, I figured it out.
And like, yeah, maybe they found some like,
kind of work around to the data,
but really like the vast majority of the consistency
and the innovation there was algorithmic progress,
not just, you know, quarter resource and data.
Yeah, one thing about video models,
it's been so secondary, but they've become so impressive.
I think that if you showed them both to me
a couple of years ago, I would be more impressed by VO3
than even like Cloud Force on it or something, you know?
I agree, I agree.
It's just, it's not what I,
it's really, really just incredible.
Well, yeah, I mean, I think a lot of it just comes down
to like the cost of instantiating the thing.
And so if I go to,
if I go to deep research and I use Oh three and I have it pulled together, um,
some, you know, 20 minute research paper, it's like, that's a few hours of work.
Maybe it's a few thousand dollars of like a researcher's time.
Maybe we're getting up into like PhD level. I could do it on my own,
but, but if I actually want to crash a Ferrari through the Hollywood sign with champagne bottles
Flying a custom the Hollywood sign that's huge like unless I'm either doing yeah
I'm either I'm either renting all that shooting it practically and it's a multi-million dollar Michael Bay shoot or
I'm doing it all in CGI and even to do it in CGI millions of dollars of rendering and so even for an eight second
Clip it just looks like, wow,
I got something that normally would cost a million dollars to make happen.
And there's no, there's no real like textual asset that feels like, wow,
this is a million bucks worth of assets. Anyway, interesting. Uh, XAI, uh,
they are cooking. They've been, uh, obviously GPU rich scaling up.
People seem like they're in the B tier here, uh, according to this chart, but everyone's kind of excited
about what's coming next.
What is your take on Grok, XAI,
are they close to the big model smell?
That feels like a natural beneficiary of Elon's strategy
of just go big, but how are you thinking
about Grok generally?
Yeah, I'm not the most impressed yet.
I mean, Grok 3 is good.
It's a good model.
Sure.
It's like a funny thing, like Grok's whole thing,
or something that people who really like Grok often say,
it's like, oh, it's trained on this real-time X data,
this X-ray lily.
One thing I've tried a few times,
because I saw it in a tweet, is if you have a tweet you can
describe, maybe I say like John Coogan's tweet about bringing
media back to Hollywood.
Yes.
And you ask Grok to find it, it can't find Hollywood yes and you ask rock to find it it can't find it
really yeah three can find it that's so interesting because I feel like I feel
like X is pretty locked down at the at like just the WWW layer right it's
pretty hard to find in fact a lot of times I'll post in a post from X and it
will have to go to like thread reader unroll and find an
archive off of X because it clearly can't access it directly. That is fascinating.
So that feels that feels solvable. Adam ships TBPNguest.com. Yeah. Like last
week I had a friend find it Monday we hadn't announced it anywhere it's not
even visible on the Google search. Really? And O3 found it.
Wow.
It was like, how did you find this?
And he was just looking.
He asked O3, can you pull together
a list of all the guests that you've been able to find?
Wow.
And it found that link randomly.
And Google doesn't even find it.
Interesting.
O3 is really good at search.
And I think that might have been RL.
They mentioned RLing on Tool You in the blog.
Very, very interesting.
Also, like XAI, it's like not really much revenue,
nearly no revenue yet, you know,
at some point you need to start pulling that out.
I'm glad they're pushing on the distribution, you know,
but things come around.
Yeah, makes a lot of sense.
Last one we'll end with.
Probably the highest revenue multiple
of any company in history.
Yeah? Yeah.
Last one we'll end on Meta Llama, sitting in D tier,
but maybe not out of the game yet.
The two interesting bull cases I've been discussing have been one, is there a world where open
source American AI becomes geopolitically important for countries that are slight allies
and they're either choosing between deep seek or an open source American allies and they're either choosing between Deep Seek or an open source American model.
And opening eye would not be in the conversation.
And then also just, why would you ever bet against Zuck?
He has a capital cannon that can fire 10 billion
at random projects forever.
And so the question is, is that enough?
What are you looking for from Meta and Llama in the future?
Yeah, it seems like you hit some issues recently. But I I'm not betting against suck. He's got the capital. He's got some GPUs
They can get together some really great research. I would love to see better American open source models
I mean, I'm not betting on open source in the long term as maybe the cornerstone of AI but the fact that all of our American
Research groups a lot lots of really smart RL researchers are doing experiments on quen and on on lamba lambda maybe the cornerstone of AI, but the fact that all of our American research groups,
lots of really smart RL researchers,
are doing experiments on QUEN and not on LAMDA,
it's just not great, you know?
Yeah, yeah, yeah.
So should, there's one interesting twist there,
which is QUEN has so many different models,
LAMA has a few, they're still working on rolling out behemoth,
but would it be like almost more of an olive branch
to the developer community
to fragment the models and really focus on hitting researchers?
Is that kind of a potential path that they should take?
Yeah, I mean, I think it would be really cool if they did that.
It would be somewhat charitable.
Yeah, yeah, yeah, exactly.
Developers love a handout, but you know.
I don't know.
I think I'm curious about what they do on the product level and how they can build stuff
in better.
On the product level, people aren't incredibly sensitive
to whether O3 can search 50,000 websites like we are,
you know, they care more about just having something
that's really good, something that's really good to talk to,
maybe meta shifts focused there anymore.
I'm not feeling it right now in terms of like,
when will a meta model grab number one on El Amorino
or something, it seems like it's gonna be some time,
you know, but I'm not counting them out at all either.
Yeah, I mean, if they can just, yeah,
stay on the lagging edge, that could still be valuable
in a lot of their product rollouts.
I mean, we forgot Apple and the L tier.
We do have another guest hopping on in just a minute,
but Apple and the L tier, how do they dig themselves out?
Is it build, is it buy?
What do you think is gonna happen?
They could maybe, they have a lot of cash. They could maybe buy someone.
They could buy someone. Yeah.
You can get by by lab and then you gotta go and upgrade.
Um, I, I, there was a,
there was some report that they had some internal models. Um,
I wouldn't be surprised if they could train stuff. It's just, look,
we haven't seen anything at all. You know,
do you think they're really training on Apple silicon? Like you've seen those photos of like all the Mac minis wired together. Does that seem like something that's really just like
Okay. Yeah. Um, I think you're on GPU training ones are gonna be bigger next year. Really?
Well, I think the GPUs for Google sure. Um, yeah, so they already have a long time
It's easy
They could go do something like a training or an in French a chip from Amazon or TPU
Yeah, I mean with the TPUs, Google has by far
the most compute.
Yeah, I mean I guess Apple's pretty good
at chip development and design, so like,
they could do it. Yeah, their one chips
are pretty good. That would be their,
yeah, that would be their advantage
if they could build a really strong chip
and cut that cost. I wouldn't bet on it,
but maybe. Yeah, yeah.
I like the idea of just opening it up
and really partnering. The thing over the last
24 hours is one account sharing,
it's so over for Google and then immediately sharing, wow,
Google is going to destroy everyone in AI and just like seeing,
seeing how the posts rank.
Yeah.
Anyway, anyone else on here? They got Mistral and FT or Porsche for the French.
They're not trying Le Chat.
Yeah. I do wonder about Mr. All because you know, the,
the models are real, but none of like broken out and capability, but there,
there's this question of like,
if you want a national champion in your country,
it might not be enough to just have the foundation model layer.
You also might have to go and win in the free market in the application layer.
And so yeah, you could have, even if you had a comparable model,
if you're not, if that's not,
if people are going to do chat.com instead of laychat.com,
like you have not won
and you don't have your national champion.
Yeah, and there's a,
I think there's some truth to this,
but there's also the regulatory stuff in the EU.
I mean, a lot of releases,
I think VO3 is not in the EU. I mean, a lot of releases, I think Vio3 is not in the EU.
A lot of releases don't come there.
Maybe Mr. Orr just uses regulatory modes to monopolize.
Not a fun way to win,
but maybe that's the ball case at this point.
Yeah.
What was your reaction to the conversation back and forth
with Dwarkesh and Shalto all about this debate over,
to our cash and Shalto all about, um,
about, uh, the, the, the, the, this debate over, over, uh,
I forget it was like spiky intelligence and how you actually, uh,
train someone. There's so many different things. We see that the models are really good at one thing and then they fail.
Arc AGI. Um, uh, what's your overall timeline right now? How are you looking?
Yeah. Do cash raise the point that
you can't kind of do this continuous learning,
this like short run continuous learning,
like you can tell me, Jack, I want you to do
something different as a get intern.
I figure that out.
And context is a weaker tool than that.
And I think that's absolutely true
and that's an unsolved problem.
I don't know how much that moves my needles on timelines.
Like one thing that could be true is just that
OpenAI or Anthropic makes some like,
sweet agent and then it starts accelerating
their AI research and they just get like
really efficient algorithms really quickly.
Some architecture that just destroys the transformer.
But I do think it's a meaningful unlock
if that could be solved.
And I think that sort of like,
mid-level memory type of stuff is really interesting.
Or solutions around context around a wrapper.
Well, this was fantastic.
We have our next guest.
Thanks so much for hopping on.
Good to come on.
We love an in-person guest.
For sure.
Thank you so much.
Next up, we're heading over to Google World.
We have Arush from Google.
He worked on the Deep Research project
that dropped from Google in 2024.
It was a full year ago, it was in December, technically,
but very excited to talk to him about that product,
all the things that go into Deep Research.
So we'll welcome him to the studio if he's available.
How you doing?
Good to have you on the show.
Hey, what's up guys?
Thanks for having me. What's going on?
Not too much.
We're having a great day, we got a great lineup, and excited to dig into it.
Would you mind kicking us off with just an introduction
on yourself and a little bit of the history?
I want to hear about the history of the products
that you've built at Google, what
the interaction between research and product looks like,
and what you're excited about.
Yeah, for sure.
First off, A-Team, that's pretty good.
Pretty good, yeah.
There we go.
Let's hear it for A.
Let's go.
Let's go.
Let's hit it.
Yeah, John's going to hit the dog.
I don't know any fun.
Good work.
Cool.
Yeah, love to be here.
Yeah, it's been fun.
It's been a fun ride. I'm a product manager on the Gemini team cool
I've been here since a little while back when it was called Bard
Bard day and Bard days
Yeah, and so yeah about I don't know maybe like this time last year. We started kicking around this idea of deep research
where one of the things we noticed is a ton of people
come to the product and ask, like seeking to learn something or asking questions and kind of doing
researchy type things. But if you ask really hard questions, one thing we noticed is the model would
just give you like an outline of an answer. It wouldn't actually tell you something very comprehensive.
So we kind of just ran with a hypothesis of like,
let's take off the constraints of like,
it has to respond within a few seconds.
It has to use this much compute.
Like let's let it, let's just see how far we can push,
what the model can do.
And this was before thinking models or anything,
and then like kind of, and any of that, that good stuff.
And so we kind of worked on this idea for a bit
and then we launched in December back on Gemini 1.5 Pro
was the model that we were using back then.
We launched deep research as kind of as a bet
to just see like would people be into something
that makes you wait 15 minutes
but gives you something comprehensive.
I'm happy to wait, although I do want it to speed up.
Questions about context window size.
How important is that million token context window?
That feels like it's been a unique Google feature
for even longer than I expected.
The advantages in AI seem to last days, maybe weeks,
before another model comes out that, you know, meets or is
roughly around the same capability. How important is large token context
windows in deep research like products? Yeah, it's huge. It's like
really what enabled us and kind of gave us the confidence that this was even
worth trying. I'd say that the long context enabled us to do basically be
very recall forward and really cost a very wide net as we research the web and
try and find gems of information that we then stitch together. And so that
that was like I think our biggest differentiator and really allowed us to
build this product. The other thing that long context allows us is like once you
finish your research not just the report,
but everything it read along the way is in context.
So you can keep asking questions going deeper
with within Gemini.
And even if it's like a tidbit of a fact
that's not in your report,
if it's been read at some point,
it'll be able to retrieve that and give you that answer.
So it also helped sort of beyond that first turn,
keeping a good experience.
And then reasoning models was like the next big,
big step jump for us,
allowing it to then do more critical analysis.
So in terms of like actual product design,
I'm interested in the direction this goes here.
You could see one world where the models are baked down
into silicon, everything's running even faster, you're distilling the models are baked down into silicon
everything's running even faster you're distilling the models and all of a sudden
I'm getting a deep a 20-minute product in two minutes or even 20 seconds you
could also imagine a world where what's possible if the economics work such that
I could request a two-hour research report or a two-day research report.
How are you evaluating those?
What would you personally be more excited about?
And what do you think users actually want
because stated preferences and revealed preferences
are often different or do we wind up with both?
Yeah, so one of the things that we noticed,
one, when we launched this,
we had no idea
people would be willing to wait.
Every metric at Google from the day it started is reduce latency and all metrics go up.
So this was definitely a bet where we were, like a lot of people thought we were crazy,
where we were like, we're just going to take a ton of time and people will wait.
One thing we noticed is that after about a minute or something like that, people are
fine.
People will go away, do other things, come back.
We'll send them a notification when it's ready.
The big pleasant surprise for us is people don't mind waiting.
In terms of efficiencies gains, one of the things that we're more excited about is, okay,
if we can make models more efficient, instead of reducing down the research time, can I
give you just a way better output? Like, can I use, can
I bank that savings and give you something way more insightful, way higher quality? I'd
say the other thing is like, even if I could give you like a deep research answer in 15
seconds, it's going to take you 15 minutes to read. So there's also an aspect of just
like how much do you want to consume this? Right. So, so for us, we're not as stressed about like,
can we make this faster? Can we make this quicker?
I do think there are probably other points in the latency,
comprehensiveness spectrum that people might like. Right.
We picked like one extreme of like,
let's just go super hard and build the most comprehensive long
thing that takes a while. But there might be totally other points people are in. and build the most comprehensive long-knit thing
that takes a while.
But there might be totally other points people are interested.
Yeah, yeah, yeah.
Sometimes I notice I've generated so many
various deep research reports across all the different apps
that I'll follow it up with a prompt,
like, okay, yeah, boil that down for 10 bullet points
because I don't have time to read that.
And then I'm like, wait, maybe I should have just asked it
to give me 10 bullet points
and I just burned a bunch of GPU cycles
But I guess the question is back and forth between the two until you kind of understand
But but I guess the question is like is is there is there a product or is the natural evolution of just
general prompts that as
As algorithms get faster as these models run faster that there is a deep research amount of work
that happens within a few seconds between every response.
And basically the question is like,
how much can you port from the deep research product
and strategy and design back into just
your average LLM interaction?
design back into just your average LLM interaction?
Yeah, I think there's definitely a lot of learnings that we can kind of start upstreaming,
really around being able to form a plan,
follow that plan to do that sort of multi-hop steps
of search, iterating, finding insights,
changing your strategy possible
before going back to the user.
So you're starting to see this in 2.5 Pro and stuff like that.
I'd you can imagine that that will continue where you will
see more mini deep research or more planning
and iterative reasoning before giving you an answer.
That could just start getting faster and faster and faster.
Then you start just getting like way more insightful or, um, uh, uh,
comprehensive answers.
Are there any other interesting areas? I mean,
deep research feels like one of the first like really solid product market fit
experiences in, I guess, like agents broadly. Um,
are there any other areas that you're excited
to think about knocking down with either different products
or just maybe just like cool uses that you've developed
or as a user patterns that you're leveraging
that's maybe go beyond just the average,
like I need a research report.
Yeah, totally. So I think there's like a few different angles that like I think a
lot of people are exploring. One is you kind of point out like what does a two
hour deep research look like? What does an overnight deep research look like? If
you can have like a very well-defined problem where like you know we have early
experiments at Google like AI co-scientists and stuff right like you
could run that overnight and it can come up with like novel scientific hypotheses, right? So there
definitely is an angle of like, if you can define a problem and an outcome really well, applying
more compute can actually get you like better and better answers, right? So there's definitely an
angle of like, are there whole new classes of problems where you can even go even further with
deep research? There's a second aspect of like, we
had the chance to go meet a bunch of people who
are like researchers at the Fed.
And they were telling us how they use deep research.
And it's often a very different thing.
So I showed them this example where I was like,
hey, there's this funny law in the US called the Jones Act
where any two ships between like two US ports have
to be like built in America, accrued by Americans. And like drives up shipping prices, but only for
like Puerto Rico, Hawaii, and like Alaska. And so I was like, do an economic analysis of the Jones
Act on like the economy of Hawaii. Right. And it like did a first principle analysis, did some
really interesting things like looking at
well like how much is a three, three and a half thousand shipping route like say from like Mexico
to South America and then that's like a baseline price to compare against. And like I thought this
was amazing but then they were like that's not how we would do economic analysis. Like they would be
like first I'd explore like what other studies there are like then I'd explore like what kinds
of methodologies are out there.
Then I might ask a bunch of follow-up questions
about what data sources or data sets
did people use to do this research.
So there's definitely an aspect of another angle of,
if I really want to help people with research,
it's about nailing this synchronous, asynchronous
paradigm and helping people kind of do more
of that iterative process rather than just like
ask question get answer and move on and then in victory and I think that's that's kind of a
product challenge like figuring out the right the right interaction model for that and the third is
it's just like outputting the outputting an answer at the right like level of abstraction that you
work at right like a financial analyst doesn't think in terms of a report.
They think in terms of the spreadsheet
or the financial model.
And so if I want a DCF,
deep research can build a great DC,
discounted cash flow model for me.
But I don't want it in a report.
I want it in a spreadsheet or I want it in an app
where I can play with the variables
and see the different outcomes.
And so you'll also see the line between like reports and other kinds of
artifacts starting to blur. Um, or even just like, what, like,
what does it mean to like build and build an answer? Right. And,
and that, that could take like a much wider,
that's super exciting. Yeah. I mean, I've seen, obviously Gemini,
we probably can't talk about the roadmap too much,
but I've seen Gemini pop up in a bunch of different areas and, and I haven't seen
the deep research version of whatever that instantiation is. the roadmap too much, but I've seen Gemini pop up in a bunch of different areas, and I haven't seen
the deep research version of whatever that instantiation is.
Maybe my last question is how much time are you thinking
about working and making, as a product manager on Gemini,
how much time are you thinking about making Gemini better
versus sort of fighting for distribution outside of Gemini
and kind of across the Google ecosystem,
because part of unlocking the value of Gemini
is just making sure it's in the right places
and placed sort of contextually across everything
from Gemini.Google.
You've worked hard on this.
Just ask for the I'm feeling lucky button.
Just give us that.
We think you earned it.
You've earned it.
It's a great product.
Just click I'm feeling lucky or burn 40 GPU hours
on this new research award.
Yeah, that would instantly melt all of our servers everywhere.
This is the biggest hyper scape.
Yeah, we need more GPUs.
The TPUs can handle it. TPUs are cool. T need more GPUs. Let's get GPUs.
The TPU can handle it.
The TPUs.
TPUs, yeah.
We use TPUs.
Okay, so ASML, get cooking.
I believe in the TPU.
I believe you've earned the I'm feeling lucky.
I haven't hit the I'm feeling lucky button in years,
yet I use Gemini all the time.
Yeah, yeah, yeah.
This is what the users want.
Yeah, we just need 10 more TSMCs, I guess,
to start fabbing.
Anyway, sorry.
Serious answer. Yeah, sorry, serious answer.
Yeah, the serious answer is the Gemini app
is a great place for us to prototype,
see what really works with people.
A lot of the users, they're very intentional
when they're coming to the Gemini app.
They want to use an AI experience.
So it's a really great place for us to put stuff out there,
see what works, see what doesn't.
Some things we put out needs more time in the oven.
And then over time, you'd imagine
that then those insights or things that really start work,
you'll start seeing in other Google products
as they make sense.
You don't want to over-clutter a UI,
but you'll start seeing things like deep research.
Yeah, because it's a very different user, somebody
that's coming in saying, I want AI,
versus I just want to do certain things.
Yeah.
And yeah, they're totally different archetypes.
It's a fascinating challenge.
I'm sure it's even more challenging at your scale.
But thanks for all the hard work and pushing the frontier
forward.
It's been a pleasure talking to you.
Yeah, come back on again soon.
Yeah, we'd love to talk to you more.
Yeah, I appreciate it.
Thanks so much, guys.
We'll talk to you soon.
Cheers.
Bye.
Fantastic.
Next up, we have Oliver Cameron. I have a good story. We'll bring him you soon. Bye. Fantastic. Next up, we have Oliver Cameron.
I have a good story.
We'll bring him into the studio, but I believe he was the first person I ever interviewed
for a YouTube video years ago.
I was doing a whole video essay about Cruise, the self-driving car company, and he hopped
on a Zoom call with me just like this one, and I recorded it and threw clips in the video.
It was very fun. And then I wound up doing more interviews after that so
Oliver good to see you how are you doing what's going on welcome doing great thank
you for the opportunity would you mind kicking us off with like the latest and
greatest introduction because you've done a lot in your career but you're on
to something new for sure so spent about eight years building self-driving cars
incredible time I mean just to see that technology go from barely being able to For sure. So spent about eight years building self-driving cars. Incredible time.
I mean, just to see that technology go from barely being able to keep in a straight line
to navigating downtown San Francisco with no human behind the wheel, just a sign of
where things have gone with machine learning.
So had a blast doing that.
Built my own company, sold that company to Cruise where we met and uh and loved that time. Left Cruise in May of 2023,
decided to start something new and both me and my co-founder who also was from Self-Driving Cars,
we were both um very much inspired by Pixar. I think it's just a very special company, right?
Everyone kind of recognizes Pixar as this sort of iconic storytelling company. And we really put our heads together to think about what a modern reincarnation
of Pixar would look like. So that company is called Odyssey.
And we're an AI lab that's really focused on enabling entirely new
stories to be told.
And, uh, walk us through the first product that you launched.
I played with it earlier. Uh, it was mind blowing-blowing. We'll pull it up while you're talking
Sure. Yeah, we just released a research preview of something that we call interactive video
Mm-hmm, and it's effectively AI video that you can both watch and interact with in real time. Yeah, and
We think this will become a entirely new form of entertainment, you know, you've got film, you've got games, you've got all these mediums that have been around for a while.
We think that there is an opportunity to invent a brand new one,
where effectively a model is responsible for imagining film and game-like experiences in real time that you can interact with.
There's no game engine behind all of this,
no heuristics, no rules, just a model that's learned
pixels and actions from tons and tons
and tons of real life video.
Yeah, we're showing it on the screen right now
and the production team is controlling it
with the keyboard, W-A-S-D, like it's a first person
video game and they're walking around this field
with trees and windmills and they can actually choose to go up, go inside buildings and it's all being generated without the use
of a game engine and then they can switch over to a different environment.
And so, I mean, I have tons of questions about how these different, like you're not doing
photo scanning, you're not doing game engine stuff, traditional 3D pipeline, but the data must come from
somewhere, love to hear about that.
And then also, I noticed the space button doesn't work,
I wanted to jump around, start bunny hopping,
when are we getting a space button added to this thing?
Anyways.
Isn't it trippy how those pixels are literally streaming
from a GPU cluster cluster probably in Texas.
It's so crazy.
And now we're streaming them via Zoom in real time.
It's crazy.
My question is, do you think that Odyssey can be
a really breakout app for VR?
Cause when I see that visual, I feel like that
it could give someone the sense of being able to explore lands that don't exist,
which is like very fat, like once it's fully immersive, it feels like.
It's funny the windmill thing because I remember the very first Oculus demo that I ever did.
I was walking around a windmill and it's still in my mind years later,
but it was amazing,
but it was just like one little windmill and then you
couldn't go any further because developing like virtual assets is really expensive.
And so you play a lot of these VR games and you know, it's a couple hours or 30 minutes.
But if you take a procedural approach or a generative approach, you all of a sudden have
infinite content.
I think what's really important to note is in film and game, incredible things can be made, right?
Like insanely good things that wow is all.
The time and the money it takes to create those things
is ludicrous and it's only getting more expensive,
not less expensive over time.
So I feel there will be continuously a place
for these sorts of like handcrafted things.
And they'll be very important. But if we just think about a model that's trained on literally decades of video, that's then able to imagine stuff in real time with no pre-production costs,
no post-production costs, and do that literally in real time, like 33 milliseconds. It just that that's where it gets really crazy. And what we
showed in the research preview is just like this tiny glimpse, I
think of what this stuff will become. VR in particular is like
the most hardcore application of this from a technical
perspective, because the resolution required for VR is
like insane. And the resolution that you saw that you can tell
it's low res, it's like 300 pixels wide.
So there's gonna be a leap that needs to happen there
to get to VR level res, but.
I'm confident that Odyssey 2, you'll have it.
You'll have it dialed.
Oh yeah, two points.
Yeah.
Give us the stats.
How many, like what numbers can you give us
about the progress or adoption?
You just launched this, I think this week or last week,
it hasn't been very long,
but how has the response been quantitatively?
Oh, it's been incredible.
So we launched a week ago,
and since then we've served 250,000 unique streams,
meaning 250,000 people experiencing what you just saw,
which is insane.
Market clearing order inbound.
Yeah, let's do it. you just saw, which is insane. Market clearing order inbound.
Let's do it.
Love it.
Congratulations. That's fantastic.
On the question of resolution,
there's a bunch of amazing AI upresing that's happening in
various parts of the pipeline.
There's some server-based upresing that can happen.
There's some on-device upresing.
So, is that, are you counting on that technology breaking one way or another? Does it matter?
Will it be a combination of both? How do you see that developing?
I think a way to think of this is where video models were a year ago is where real-time
video models or world models will be today.
And what that really means is that you look at the res,
remember the Will Smith, everyone remembers the Will Smith
spaghetti video.
Yeah, spaghetti.
Was that like one year ago or two years ago?
It wasn't long ago.
I think it was just over a year ago.
So fast.
There was definitely better outputs, like spaghetti
was like the weirdest, hardest thing at the time.
Although gymnastics today, I'm sure you've seen that.
That's really tough with video models today.
But that's all to say that I think the res
and the visual quality improvements
will come from the model itself,
not like some secondary piece of infrastructure to up res.
Just because, I mean, think of what a language model
was like to use two years ago.
Like, how fast was it in response time?
Really quite slow, right, compared to today, where it's like just stream of information straight to your eyeballs.
Same will be true of these models.
Like, we'll crank out larger resolutions, faster frame rates, more actions, more things you can do, all that sort of stuff.
Yeah, and I guess importantly, like, GPT 4.5 is not GPT 4 up
res'd to 4.5.
It is a different model.
We're walking around what looks like the gloomy English
countryside right now.
And I think the production team is going
to try and go in that house.
It is really, really so wild.
I noticed that there's a time limit.
Do you have a tropical island demo?
Because this, I love the English countryside.
It's very foggy.
Yeah, I noticed that there's like a two minute timer
when I sign in.
Is that so the GPUs don't melt?
I mean, I assume you've raised money
and you're maybe burning some money with these demos.
But break down kind of like what your limitations are and how you see them evolving.
Yeah, for sure.
So the timer is there because each session
is served by a single GPU.
So each user gets a GPU, the model's running there,
and that's beamed to the user directly.
And really quickly, when you say single GPU,
you don't mean rack, you mean like one A100
or something like that?
One H200 per user.
Got it.
And there is a clear path to like dividing the GPU
to have multiple sessions per user, but today it's one.
And we want to really crank up quality frame rate,
all that sort of stuff.
Yeah.
It makes me feel great to know that I'm getting,
you know, the sort of one-on-one attention from an h-200 chip yeah yeah
it's not a retail store yeah it's not a great experience if somebody's bouncing
around exactly I'm being individually served being served this is like an
air mass level by Jensen so two dollars an hour is there or thereabouts how much
that costs which you know over the course
of multiple users, it's not too bad.
I think Netflix is like five cents, 10 cents an hour, something like that to stream video.
So we're a bit of a ways away, but you've got new chips coming, just model optimizations,
like it won't be long where we're having a single GPU per user, all that sort of stuff. For this launch, we had something like 360 H200s prepared,
to scale it up a little bit, just because we had lots of demand,
but that timer is there just to make sure we're cycling through
lots of people getting a taste of this.
But yeah, I think fundamentally the idea that you could have a model
stream stuff to any screen is really powerful like that
Experience you saw there works just as well on an iPhone on an Android on TV
Anything like that and it's all just action conditioned of a web RTC, which is probably what zoom is running on
So the action to just sent over the wire to the model the model then conditions the pixels
It's about to generate based on those actions sends sends the pixels back, and just that loop every 33 milliseconds is firing.
So, I mean, the path to HD or 4K seems pretty clear to me.
What about the path to consistency?
That feels really difficult.
You need essentially a really long context window
to know that, okay, I dropped my mythical sword
on that piece of the ground.
I went away and then I came back.
That's like textbook, just put it in a database.
But it seems like the future might not be that.
So how are you thinking about that?
I guess the bigger question is like, what's the response from the gaming community?
Is this something that can be a tool and a piece of a pipeline instead of completely
replacing the entire traditional pipeline.
So most research on interactive video before has learned from games. So
lots of folks will have seen
Oasis from Decart, a Minecraft in a video model. Yep. Effectively or Quake. That's often used in video models.
And I think the gaming reaction to that is
quite negative. Oh, yeah. I mean you saw the car Mac back and forth, right?
My car Mac was like this is amazing. I love and somebody else is like this is stealing, you know developers
Yeah, and I think it's important because
The way that people envision that is like, oh, what's the best thing this could become it could become like we mixing of games
Yeah, and that's one way it could be.
I think people see what we have and they think,
oh, this is like a world simulator eventually.
This is the matrix or like whatever they project on it.
Yeah.
So really, one thing we're trying to avoid is like,
for the first few generations of this,
people will put, including ourselves,
like this picture of what existing games look like onto this.
And it's like the iPhone when it launched, right?
People ported desktop apps to the iPhone,
and it kind of worked, but it kind of didn't.
Wasn't really embracing this new medium.
So I think the long story short here
is stuff that is integral to games today,
like multiplayer, like state,
like scripting, all that sort of stuff.
Let's question those assumptions.
Like, how should those things work?
Let's make it model native.
Like maybe memory in this model is very different than memory in a game
or state in a game, multiplayer in a game, all that sort of stuff.
And that's probably going to lead us in the short term to more like glitchy,
weird experiences, though, at the memory as it is by the models, a feature, not a bug. I don't know if you guys have seen like the back rooms or like these kind of glitchy weird experiences though at the the memory as it is by the models a feature not a bug
I don't if you guys have seen like the back rooms or like yeah these kind of glitchy. We yeah
Yeah, it almost is a completely different type of game design
Yeah, yeah the the up down left right a B AB of the future will be like drop your bad
Sword on the ground walk around the building three times come back and it's and it's enchanted
Because he is the model hallucinates
that you've upgraded or something like that. That'll be fun.
I also think that one important thing here is that in language models, one of the things
that's happened over the last year is in many cases, they've crossed this threshold of realism
for certain applications. So like people literally fall in love with language models, right?
The same like emotional feeling they have when they meet a person literally fall in love with language models, right? Like the same like emotional feeling they have when they meet a person they
fall in love with is happening for them with a language model.
And that's cause they, what they're seeing on their screen is like so realistic.
It's like crazy real to them.
And I think the same will be true here where once these pixels, these
actions feel so realistic, which eventually they should just get in the
data, given the models and advancement.
There'll be things that they do in these worlds or things they feel in these worlds which they just can't feel in
video games today because games are just capped by computer graphics and like
human dev time and budgets and everything else but they'll walk down
the street they'll see someone and they'll be like wow that person looks so
real and they'll go over they'll like high-five that person or like on the
screen right yeah and I'll just feel something like they'll feel like a
heartbeat raise you know yeah totally so that's that's an
application that you can't do in games today that's just different and new so
that's the sort of stuff we're really interested in well that's gonna be a
wild wild future but thank you we'll have to have you back and check in on
progress yes definitely the day that 720p drops or whatever the next version
is like
We're excited for this, but thanks so much for joining. This is a fantastic conversation
We will talk to you soon. Have a great day guys for joining. Thanks so much next
So we have a return guest Michael Mcnano from lightspeed coming into the studio
Are you gonna talk to the gong or he's gonna talk about competition between?
Yeah, you know a lot of Asian labs and the app player. Well, welcome to the stream Michael. How are you doing? Boom. Good good to see you guys
Studio. Thank you. It's been a lot of fun a bit of a I like the upgraded gong too. Oh, yeah
That's much bigger
Everything we got a bigger. We got a bigger one in the works. Yeah
We're kind of even bigger like Florida's really. Oh, yeah also
Yeah, we're working on even bigger floor. Really? Oh, yeah also
It's a funny day to just be so hyper fixated on AI because you probably haven't seen the timeline Tesla's down
17% 17% It's just absolute mayhem. I mean, there's an AI there, right? Yeah, there's definitely that's not what's driving it there
But anyway, what Michael it's great to have you on.
Wanted to get some kind of updated thinking from you
on the tension between labs and the application layer.
We saw the news with Windsurf and Anthropic today
that had more to do with a potential acquisition.
And even when we talked to the founder of Granola,
we were talking about the competition between Notion and Granola with these,
it's a founder-led kind of previous era
scale-up unicorn SaaS company.
Can that company bolt on AI,
but then now we're seeing competition
from the foundation lab.
So, would love to get your lay of the land.
What are you seeing?
How are things shaking out?
And what do you think the next few months or even years look like?
Yeah, it's pretty interesting, right? Like if you think about the big companies
that startups previously, uh, built on the backs of the Googles,
the Amazons, the Microsofts, you know,
it felt like there was this really healthy sort of symbiotic developer
ecosystem where the incumbents supply resources,
the developers sort of buy and extract from them and they build really,
really big businesses on top.
I think what we're seeing now to your point is these labs are building
developer ecosystems,
but then they're very intentionally and overtly going head to head with the
developers that are building on them.
And I think this has a lot to do with context, right?
So if you think back to the internet,
you know, and startups 10 years ago,
everyone said content is king, you know, content is king.
Then distribution was king, right?
It was all about how do you get in front of users?
We're starting to feel like we're entering
the phase of context being king.
These models are just hungry for the most and the most unique context possible.
And so if an app layer company emerges and has a new type of context and data
that the models don't have great exposure to, it's a great signal to point in the
direction and say, we're going to compete head on.
And so I think that's what we're seeing now.
And yeah, Nabil Hyatt, a great investor from Spark and I, we're going to compete head on. And so I think that's what we're seeing now. And yeah, Nabil Hyatt, a great investor from Spark and I,
we often talk about how the war for context is happening now.
And I think that's, that's what a lot of these modes
represent.
How do you think app players should app player companies
should respond?
Is it just double, triple down, go way, way, way deeper,
focus on workflows that the labs maybe don't have the resources to fully pursue or is it focusing down on specific niches?
I'm curious what the yeah, I think the right approach is well
We can definitely get into that but maybe first of what I would say is, you know
I tweeted something yesterday that occurred to me after the big announcements from open AI
In that, you know the big incumbents which we talked about little while ago, sort of like the winners of the cloud era,
it wouldn't surprise me if all of these new competitions between the labs and the apps
actually drive the apps and the startups right back to the incumbents, to the Googles and the
Amazons of the world. I have to wonder if some of these things actually act as a tailwind for models like Gemini
and maybe give a little more credence to the argument
that like Google is actually gonna be the winner here
because of all their distribution.
So I think that's one potential.
You mean driving back and being like,
I'd rather work with Gemini
because I don't think they're as likely to kill me.
Exactly, yeah, exactly.
It's like, hey, we trusted them with the cloud,
and that worked out all right.
Should we now trust them with AI more than we trust the labs?
Yeah, I mean, that narrative even goes a little bit further
with Microsoft, which has been completely like,
oh, we will host every single model.
We'll let you reroute really intelligently between them,
like super, super friendly developer ecosystem.
And so, I mean, certainly they're building stuff
into Copilot, into Microsoft 365,
but it does feel like they're much more willing to partner.
Yeah, Satya seems to have real conviction.
He had the quote from last week,
platform, platform, platform,
and hosting DeepSeek is an example of that, right?
A lot of people would have thought,
oh, he's not necessarily gonna host that model
because it felt like a shot across the bow at OpenAI,
but he's committed to supporting open source.
Right?
Yeah, yeah, yeah, he wants it all.
Interesting.
I think, you know, also going back to your question,
Jordy, I think all of this is just gonna make
for a more intense, faster moving market.
Like I think more than ever before, you have to ship,
you have to get users faster than anyone.
You have to sort of like reach escape velocity quicker,
which I think is just gonna put more and more pressure
on startups to move even quicker than they already are.
I feel like cursor is a great example.
I feel like an earlier iteration of that product, you know,
it probably would have been easy to sort of write them off and be like, Oh,
you know, lab is going to do this. I mean, now it's like, they're so big,
they're so far ahead. It feels like they've,
they've really established themselves and likely have a good shot of breaking
through.
I also wonder with cursor and windsurf and Devin and some of the dev tools
markets,
like it feels like just such a new market
that even if it's somewhat winner take all,
there's just, it's so positive some
because it's adding efficiency to the most,
like one of the biggest labor pools.
And so when we talk to the cognition folks
as the reaction to Google and OpenAI
launching Devvin competitors.
They're like, well, we still grew 40% last month
or something like that.
And so, you know, I wonder like in,
in code gen where it's such a new market that it's not,
it's not directly competitive with anything that exists.
So it's less zero sum.
I'm wondering if the note taking market feels similarly
to you or, or were you seeing?
granola or other or other companies kind of act as more
drop-in replacements for existing tools
Yeah, I think it's great question
Yeah
so so we backed granola really early on because we we knew Chris and his co-founder and we loved those guys we didn't know what
They were building we knew they were gonna build something in note taking.
But we said, you know what?
This market's gonna move fast.
We trust these guys.
Let's go for it.
And I think, you know, somewhat to your point,
there's been all these note takers before.
Like Granola wasn't the first note taker.
There was Fireflies and Otter and all these things.
But I think Granola has done a really, really good job
of getting out of the user's way
and establishing trust with the user.
And I think that seems like a small thing,
but I think that trust thing is gonna be really important
if you go back to what I said about this context being king.
Who are you gonna trust to take this context
or take this really, really important
proprietary part of your work.
In this case, your meeting notes, you know, a lot of people say we trust granola.
Are they just going to hand it over to any old company that says, hey, now we want to screenshot your entire computer
and take and suck every last piece of data out of you.
And so I think part of it is to your point, like getting in early, getting big really, really fast and establishing that, you know, that user base and that market before it really matures,
but also in a way that like users just really trust you and they're not just going to rip
you out just because some other bigger company offers the same thing.
Yeah, yeah, that's a good point.
What it was, do you have any more like micro reactions to specific integrations that seem
to be one of the big things that openAO was pushing on was integrations with Google Docs and Drive and
your email and that feels like adding that extra context is potentially
the next thing people are clamoring for. How important is like the biz dev side
of this business in fact? I think it's really important. You know, I think it's really, really great
that Anthropic started the whole MCP protocol.
Obviously lots of others are adopting that now.
But I think to your point, we're now gonna start to see
the battle lines being drawn.
Like who are we willing to integrate with?
Who are we not willing to integrate with?
Where is it?
You know, are we open or are we closed?
Where's the data gonna go? Where's the data gonna go?
Where's it not gonna go?
I think we're gonna start to see those alliances
and those allegiances form.
And I feel like we've seen this.
And we saw this with APIs,
back in what, 2007, 2010 era.
Social media.
They have an API, it's amazing.
It's like, well, you don't know
how much that API's gonna cost.
If it's $10,000 per day or something,
that could completely upend your business.
And so actually thinking about how that dynamic develops
is almost more important than the standard,
although I'm very glad we have a standard,
that seems great.
But each company is gonna have to decide
where the value accrual really lands,
and then who knows, maybe there'll be some antitrust
in 20 years, like we're seeing with Apple.
Yeah, the big question around trust,
that I, you know, it's an evolving situation,
but a California judge, I believe it was yesterday
or the day before, ordered OpenAI to retain records
of sort of, forget what OpenAI calls it,
but if you have like a disappearing query,
a judge ordered them that they have to retain that.
They obviously said that's a huge overreach
for privacy with users.
So incognito mode.
Yeah.
It's like not incognito.
Yeah, and that more seems like an issue with the court
and the specific judge having this massive overreach
around privacy.
But privacy in this era when people
are more willing than ever across every app
to give them all sorts of data.
Yeah, and you have a direct incentive
to reduce the level of privacy to get better results.
Like if the model knows what kind of car
you drive when you ask it for new tires,
it will give you better recommendations.
So you want to lean into being anti-privacy
to get better results.
The world is definitely bifurcating into pro-privacy
or like fully AGI-pilled folks.
And there aren't that many people that are in the middle.
So obviously we will have to figure it out
as a democratic society, ultimately vote,
and hopefully sort it all
out in the courts. But thank you so much for stopping by. This
was Michael.
We'd love to have you back.
Talk to you. Yeah, I mean, guys, I just want to tell you, you
know, I don't really aspire to ring the New York Stock Exchange
Bell one day. I aspire to hit that gong.
Hit that gong. Well, next time, come by. Come by.
Great to see you Michael
So we have a generational crash out going down. Oh really timeline. We got a new post from Elon
Okay, I read it out. He says and this is your live reaction John time to drop the really big bomb
Real Donald Trump is in the Epstein files. That is the real reason they have not been made public. Have a nice day, DJT. Wow.
That is a big bomb.
But wait, didn't we already know this?
Because isn't there that picture with Trump and Epstein
together?
We're really in dark territory.
I want to go back to AI business and technology.
The business story here is that Tesla's down 17%.
DJT is down 7%.
Trump coin is down 10%.
Wow, they're all
fighting this crash out on both sides is not good for anyone.
Well, you know, what's you know, what's interesting, you know,
what's not down. tokens generated baby, we're still
generating tokens every single day. The relentless march of
artificial intelligence continues. So the other thing
is, is Elon shared,
or sorry, Trump shared on Truth.
It's funny they're battling on their each.
Oh yeah, they have different social networks.
Every billionaire should have their own,
you know, social media network to get the word out,
but Trump said, the easiest way to save money
in our budget, billions and billions of dollars,
is to terminate Elon's government subsidies and contracts.
I was always surprised that Biden didn't do it.
Wow.
So, Ashley St. Clair is saying, hey, Donald Trump, let me know if you need any breakup
advice.
I really don't know about it.
And Dan Primack says, this cannot be a comfortable day for David Sacks.
On the other hand, it's just the best day for Sam Altman.
Well, we have someone from OpenAI here.
We're going to stick to technology and business, but welcome to the show, Mark Chen.
Good to see you.
Great to see you guys.
Thanks for having me.
Awkward day, but I'm excited to talk about Deve Research.
I am excited to talk about AI products.
Would you mind introducing yourself and kind of explaining what you do because OpenAI is
such a large company now and there's so many different organizations.
I'd love to know how you interact
with the product and the research side and anything else
you can give to contextualize this conversation.
Yeah, absolutely.
So first off, thanks for having me on.
I'm Mark, I am the Chief Research Officer at OpenAI.
So in practice, what that means is I worked with
our Chief Scientist, Yaacob, and we set the vision
for the Research Org, we set the pace, we hold the research org accountable for
execution. And, uh, ultimately we really just want to deliver these
capabilities to everyone.
That's amazing. In terms of research, I feel like a lot of the, what happens
in the research side is actually gated by compute. Is that a different team?
Because what if the researchers ask for a $500 billion data center that feels
like maybe a bigger task?
Yeah, it is useful for us to factor the problem of research and also kind of building up the
capacity to do that research. So we have a different team. Greg leads that, which really
thinks holistically about data center bring up and how to get the most compute for us.
And of course, when it comes to allocating that compute for research, you know,
Yakov and myself do that.
That's great. And so what,
what can you share that's top of mind right now on the research side?
There's been this discussion of pre-training scaling wall,
potentially the importance of reinforcement learning, reasoning.
There's so many different areas to go into.
What's actually driving the most conversations
internally right now?
Yeah, absolutely.
So I think really it's a really exciting time
to do research.
I would say versus two or three years ago,
I think people were trying to build
this very big scaling machine.
And really the reasoning paradigm changed
a lot of that, right? You know, like reasoning is really taking off. And it really opens this new
playing playing ground, right? It's like, there are a lot of kind of known unknowns, and also unknown
unknowns that you know, we're all trying to figure out, it kind of feels like GPT-2 era, right? Well,
where there's so many different hyper parameters, you're trying to figure out. And then I think
also, you know, like you mentioned, you know, pre training,
that's not to be forgotten either. You know, today we're in a very different regime of
pre training than we used to be right. Today, we can't treat data as this infinite resource.
Yeah, I think a lot of academic studies, you know, they've always kind of treated, you
know, you have some kind of finite compute, but infinite data.
I don't think there's much study of, you know, like, uh, you know,
finite data and infinite compute. And I think, you know, uh,
that also leads to a very rich playground for research.
Do we need kind of a revision to the bitter lesson?
Is that a refutation of the bitter lesson or, or do we just need to
re re rethink
what the definition of, of scaling laws looks like?
No, I don't think of anything as a refutation of the bitter,
really like our company is grounded in, we want simple ideas that scale.
I think RL is an embodiment of that.
I think pre-training is an embodiment of that. And really at every single scale,
we face some kind of difficulty of this form.
It's just like,
you got to find some innovation that gets you past the next bottleneck.
And this doesn't feel fundamentally very different from that.
What is, what's most important right now on the actual compute side?
We heard from Nvidia earnings that,
compute side, uh, we heard from Nvidia earnings that, uh,
that we didn't get a ton of guidance on the shift from, uh,
training to inference usage of Nvidia GPUs, but it feels like it must be coming. It feels like this inference wave is,
is, is happening. Uh,
are those even the right buckets to be thinking about tracking metrics in terms
of the
story of artificial intelligence because yeah, I mean, it's like, if,
if the reasoning tokens are inference tokens and, and, but they're,
what lead to higher intelligent, more intelligent models,
like it's almost back in the training bucket again. Um,
what buckets should we be thinking about and, and, uh, and, or, or are we,
how firmly are we in the, the, the, uh, the, the applied AI era versus the research
era?
Well, I think research is here to stay and it's for all the reasons I mentioned
above, right?
It's such a, like a rich time to be doing research, but I do think, you know,
inference is going to be increasingly important as well, right?
It's such a core part of RL that you're doing rollouts.
And I think, you know, we see 2025 as this year of agents, right?
We think of it as a year where models are going to do a lot more autonomous work.
You can let them kind of be unsupervised for much longer periods of time.
And that is just going to put big demands on inference, right?
When you think about kind of our overall vision, right?
We lay it out as a series of steps and levels
on the way to AGI, right?
And I think the pinnacle, really that last level,
is organizational AI, right?
Like you can imagine a bunch of AIs all interacting.
And yeah, I think that's just gonna put huge demands
on inference, right?
On that organizational question,
I remember reading AI 2027,
and one of the things that they proposed
was that the AIs would actually like literally
be talking to each other in Slack.
Does that seem like the way you imagine agents playing out,
like using the same tools as humans instead of kind of- Does that seem like the way you imagine agents playing out,
like using the same tools as humans instead of kind of- One agent says, I'm gonna go talk with teams,
and I'm gonna talk with Slack,
and I'm gonna do a little negotiating on a per sheet basis.
But maybe it just happens super, super fast 24 seven,
or is there like a new machine language that emerges?
Yeah, I mean, I think one thing
that's really helped us so far in AI development
is to come
in with some priors for how humans do things.
And that's actually, if you bake those priors, and they typically are great starting points.
So I could imagine maybe you start with something that's Slack-like and give it enough flexibility
that it can kind of develop beyond that and really figure out the way that's most effective
for it to communicate.
One important thing though is,
we want interpretability too, right?
I think it's very helpful for us today
that what the agents do is easy for us to read and interpret.
And I don't think you want that to go away as well.
So I think there's a lot of benefits just,
even from a pure like debug, the whole system perspective,
or just let the models speak in a way that it's familiar with us.
And you can also imagine like we might want to plug in to the system too.
Right. So, you know, whatever interfaces we're familiar with,
we would ideally like our model to be familiar with as well.
I think it's also pretty compatible with, you know,
we hit a big milestone.
We got, I think three million paying business users
for fairly recently.
Let's go!
Yeah, there we go, let's go.
Yeah.
Again, I think...
Three gong hits for three million.
The gong will keep ringing for a while.
Sorry, we had to do it.
I was hoping you would drop a number.
Yeah, yeah.
Congratulations. That's actually huge.
That's amazing.
But I think one big part of that is we have connectors now.
We're connecting into G drives.
I think you can imagine Slack integrations, things like that.
I think we just want the models to be familiar
with the ways we communicate and get information.
Yeah. Can you talk about benchmarking?
It feels like we're potentially entering-
Yeah, do you think about benchmarks at all?
Oh, yeah, a lot.
I mean, but I think it's a difficult time
for benchmarks, right?
I think we used to be in this world
where you have these human-written benchmarks
for other humans.
And I think we all have these norms for what
are good benchmarks.
We've all taken the SAT.
We all have a good conception of what
it means to get whatever score on that.
But I think the problem is the models are already
at the point where where for even the hardest
human written benchmarks for other humans, it's really near saturated or saturated, right?
I think one clear example here is the Amy, like probably the hardest autogradable like
human math eval, at least in the US.
And yet the models are consistently getting like 90 plus percent on these.
And so what that means is I think there's kind of two different things that people are doing,
right? They're developing kind of model-based benchmarks, right? They're not kind of things
that we would give to an ordinary human, like humanities last exam things like, you know
Epic AI that are really really at the at the frontier of what what people can do
And I think the the hard thing is it's not grounded in intuition, right?
Like, you know, you don't have a lot of people who have taken these exams
So it makes it harder to kind of calibrate on whether this is a good exam or not
One of the exciting things that's on the flip side of that is I really do
think we're at the era where models are going to start innovating, right?
Because I think once you've passed the last kind of like the hardest human
reading exams, that's kind of at the edge of innovation.
And I think you already see that with the models, right?
Like they're helping to write parts of papers.
And, and I think the other kind of way that, uh, people have shifted is, you know,
there's these ultra frontier evals,
but they're also people kind of just indexing on real world impact, right?
You look at your revenue, kind of the value you deliver to users. Um,
and I think that's ultimately what we care about.
Can you, can you, uh,
bring that back to interpretability research
like with these super, super hard math evals, for example,
are we doing the right research to understand
if the thought process mirrors,
not just one shotting the answer,
oh, you memorized it or you magically got it correct,
but you actually took the correct path, kind of like you, you, you, you, you memorized it or you magically got it correct, but you actually took the correct path kind of like, you know, you're
graded for your work, not just the answer. If you're in grade school. Um, and, and,
you know, Dario said that, uh, interpretive interpretability research
will actually contribute to capabilities and even give a decisive lead. Do you
agree with that? What's your reaction to that concept of interpretability
research being very important?
Yeah, I mean, we care a lot about it here at OpenAI
as well.
So one thing that we care a lot about
is interpreting how the model reasons, right?
Because I think we've had a very kind of specific and strong
view on this in that we don't want to apply optimization
pressure to how the model thinks so that it can be faithful
in the way it thinks and to expose that to us
without any kind of incentives to cater
to what the user wants.
I think it's actually very important
to have that unfiltered view because oftentimes,
if the model isn't sure, you don't want to hide that fact, right?
Just for it to kind of please the user.
And sometimes it really isn't sure, right?
And so we've really done a lot of work
to try to promote this norm of chain of thought,
faithfulness and interpretability.
And I think it gives you a lot of sense
into what the model is thinking and, you know,
what are the pitfalls that it can go off into if it's not reasoning correctly. That's such an
important point because if you have somebody on your team and they come to
you and they say hey you know I think this is the right answer but we should
probably verify it it's like it's still valuable totally puts you on the right
path if somebody comes to you a hundred percent confidence this is this is the
truth wrong like trust is just destroyed.
Yeah, totally.
Don't you guys feel like safety felt a lot more theoretical
a couple of years back?
But today, the things that people
were talking about a couple of years, scalable oversight,
really having the model be able to tell you and convince you
that the work it did was right, it feels so much more relevant
right now.
Just because the capabilities are so strong.
Yeah, I mean, just personally, I've completely flipped
from being like, oh, the safety research
is not that valuable because I'm not that worried
about getting paper clipped.
It just seems like a very low likelihood
that that's kind of like the bad ending,
like immediately in this foom and all this crazy
gray goo scenarios were just so abstract in sci-fi.
It just felt like economics will, will, will, will,
will fall into place and there will be a, like a, like a cold,
like a nuclear ending, which is like, we didn't build nuclear plants.
And we just stopped everything because we even seem to be good at that.
But now that we're actually seeing things.
Yeah, it's crazy how fast it's been, right? Like, um, I think my,
my, like my personal story is, it's like, you know, what,
what got me into a AI was AlphaGo, right? Like just watching it get to that level of capability.
Yeah. And you were kind of like, it was such an optimistic and also kind of a little bit of a
sobering message, right? When you saw at least it'll get beat. Um, and I just remember, you know,
like we, we saw the coding models, you know, when we first launched like, I think, very OG codecs, you know,
with GitHub Co-pilot, it was maybe like under, you know, a thousand Elo on code forces. And I still
remember the meeting where I walked into where the team showed my score and they're like,
hey, the model is better than you. And you come full circle and it's like, wow, like I put decades
of my life into this. And you know, the capabilities are there.
So like if, you know,
I'm kind of at the top of my field in this thing
and it's better than me, like what can it do?
Yeah. Yeah. That's amazing.
I have so many more questions on AlphaGo.
Are there, are there lessons from scaling,
how scaling played out there that you can,
that we can abstract
into the rest of AI research.
What I mean is, as I remember it,
the AlphaGo training run was not 100K H200s.
But what would happen if we actually did
an AlphaGo style training run?
I mean, it would be an economic money pit, right?
Like they had no economic value to do.
But let's just say some benevolent trillionaire decides
I'm gonna spend a billion dollars on a training run
to beat AlphaGo and go even bigger.
Is Go at some point solved?
Would we see kind of diminishing scaling curves?
Could we throw extra R out?
Could we port back everything that
we're doing in just general AGI research and just continue fighting it out
in the world of Go? Or does that end and does that teach us anything?
Yeah, honestly, I feel like if you really are curious about these mysteries,
join our team. That's the first thing I want to say.
Yeah, I mean, really kind of the central problem of today is RL scaling, right? Yeah. When you look at AlphaGo,
right? It's a narrow domain, right? Yeah. I think in some
sense, that limits the amount of compute you can pump into it.
But even kind of small toy domains, they can teach you a lot
about how you scale RL, like what are the axes where it's
most productive to pump scale in? I think a lot of scaling
research just looks like that, whether it's on productive to pump scale in. I think a lot of scaling research just looks like that,
whether it's on RL or pre-training.
So you identify a lot of different variables
under which you can scale.
And where is kind of where you get the best
kind of like marginal impact for pumping scale there.
I think that's a very open question for RL right now.
And I think what you mentioned as well,
it's just like going from narrow to broad, right?
Does that give you a lever to pump a lot more scale in as well?
I think when you look at our reasoning models today, they're a lot more broad
based than, you know, just being able to kind of an expert system on go.
So yeah, I really do think that there are so many levers to scale.
What about move 37? I really do think that there are so many levers to scale.
What about Move 37? That was such an iconic moment
in that AlphaGo LisaDoll match.
They placed Move 37, it's very unconventional.
Everyone thinks it's a blunder.
It turns out not to be, it turns out to be critical.
It turns out to be innovation.
Do you think we're certainly post-touring test
in language models.
We're probably post- post touring test in image generation
But it feels like we're pre move 37 in text generation in the sense that there hasn't been
Like a fully AI generated book that everyone is just oh, it's the new Harry Potter. Everyone has to read it
It's amazing and it's fully aged and it's fully generated or, uh, or this image,
the images they do go viral, but they go viral because they're AI move
37 in the context of go did not go viral because it was AI, but like
it was actual innovation.
So, uh, is that the right frame?
Does that make any sense?
Um, I think it's not the wrong frame.
So I think some, some quick thoughts on that.
Um, I think kind of, um, when you So I think some quick thoughts on that.
I think kind of when you have something
that's very measurable, like win or lose, right?
Something like go.
Yeah, it's like very easy for us to kind of just judge, right?
Like did the model do something right here?
And I think the more fuzzy you get,
it is just harder, right?
Like when it comes to, is this the next Harry Potter?
Right, like, you know, it's not a universally loved book.
I think, fairly universal, but you know,
there's some haters.
And yeah, I think it is just kind of hard
when it comes to these human subjective things
where it's really hard to put down in words,
like what makes you like Harry Potter, right?
And so I think those are always gonna lag a little bit,
but I think we're developing more and more techniques
to attack kind of these more open-ended domains.
And I don't know, I wouldn't say that we're not
at an innovative stage today.
So I think my biggest touch with this
was when we had the models compete on the IY last year.
So IY, it's like the international,
basically Olympics for computer science,
basically the top four kids from each country go and compete.
And these are really, really tough problems,
basically selected so that they require
some innovative insight to solve, right?
I think, and we did see the model come up with solutions,
even to some very ad hoc problems.
And so I think there was a lot of surprise for me there,
right?
I was completely off-base about which problems
The model would be able to solve the most right? Um, I think like I kind of categorized there's six problems
Some of them as more kind of like oh this standard a little bit more standard
This is a little bit more out of the box. It was like it's not gonna be able to solve this more out of the box one
but it did and I think I mean think that really does speak to kind of,
these models have the capacity to do so,
especially trained with RL.
Now, put that in context of what's going on with Arc AGI.
Obviously, OpenAI has made incredible progress there,
but it just, when I do the problems, it seems easy.
And when I look at the IOI sample problems,
I think this would be a 20-year process for me
to figure out how to achieve that,
and I can do the Arc AGI on my phone.
Is this the spiky intelligence concept?
Is this something that a small tweak in algorithmic design,
just one-shots Arc AGI,
or is there something else going on there
that we should be aware of?
Yeah, I mean, I think part of this
is the beauty of RKGI as well, right?
Like I think, I'm not sure if there's another kind
of like human intuitive simpler benchmark,
which is for the models.
I think really that's one of the things
they really optimize for on that benchmark.
I do think when it comes to models though,
like there's just a little bit of a perception gap as well.
Like, you know, models aren't used to this kind of native,
you know, like just screen type input.
I think there's a lot we can bridge there.
Actually, even 04 Mini,
it's a state of the art multimodal model in many ways,
including visual reasoning.
And I think, you know, you're starting to kind of
build up
the capacity for the models to take images,
manipulate and reason about them,
generate new images, write code on images.
And I think it's just been kind of under focused,
but I think when I talk to researchers in the field,
they all see this as a part of intelligence too,
and we're gonna continue to focus there.
Yeah, is RKGI, if we're dropping a buzzword on it,
is like program synthesis?
Is there a world where, I know the tokens,
like the images, we see them as renderings of squares
in different colors, but when they're fed into the LLM,
they're typically just a stream of, of numbers effectively.
Is there a world where actually adding a screenshot is what's important?
Like visual reasoning.
Yeah. Yeah. So I think, I think that could be important. It's just like kind of,
uh, you know, whenever it comes to like textual representation of grids, um,
models today just don't really do that well, right?
And I think it's just kind of because
humans don't really ever write down textual representations of groups or like, you know,
we have a chess board, like no one really kind of just like types it out in a grid. Like, um, yeah.
And, um, and so the models are kind of like under trained a little bit on on what that looks like and what that means. So, you know, I think with more reasoning, we'll just bridge the gap.
I think with better visual perception, we'll just bridge that gap.
Yeah. How are you thinking about the role of non lab researchers in the ecosystem today?
I'm sure you try to recruit some of the best ones, but the ones that don't join your team.
Tell us about the one that got away.
Yeah, the one that got away.
Yeah, no, I mean, I think it's still actually
a fairly good time for specific domains, right,
to be doing research.
And, you know, I think the style is just very different.
And you do feel the pull of non-lab researchers into labs
because I think they feel like a lot of the burning problems in the field are at scale, right?
And that's kind of one of the unfortunate things to you, right? Like when you look at reasoning, um
You just don't see that happen at small scale, right?
There's like a certain scale at which it starts becoming signal bearing and that requires you to have resources, right?
um starts becoming signal-bearing, and that requires you to have resources, right?
But I do think, you know,
a lot of the really good work that I've seen, you know,
there's experimental architectures.
I think a lot of good work
is happening in the academic world there.
Like a lot of study in optimization,
a lot of study in kind of like GANs, you know,
there's certain fields where you see
a lot of fruitful research that happens in academia.
Yeah, that makes a lot of sense.
How about consumer agents?
How are you thinking about them?
You talked earlier about sort of B2B adoption,
and that's all very exciting.
But how much do you and the research org
think about breakout consumer agent products?
Yeah, that's a fantastic question.
I think we think about it a lot.
I think that's the short answer.
You know, we really do think like this year
we're trying to focus on how we can move
to the agentic world, right?
And when I think about consumer agents,
I think like ChatGPD proved that, you know,
people got it, right?
It's like people get conversational agents
when they conversational conversational models.
But when it comes to consumer agents,
we have a couple of theses that we've tried out in the world.
I think one is deep research.
I think this is something that can do five to 30 minutes
of work autonomously, come back to you,
and really synthesizes information? It goes out there
gathers collects and kind of
You know compresses the information in a form that that's useful a little bit of a little bit of push back there
Like I can see that as a consumer product when someone like Aidan is like I want new towels
And he uses deep research to like figure out like what is the best towel across every dimension?
But when I think of deep research,
yes, it has applications with students, but it's often.
Some of them might just be the paradigm.
And I guess it could be consumers being like,
give me a deep research report on this country
and where to travel and things like that.
We keep using this flight example,
but I haven't actually tried to book a flight
with deep research.
It's totally possible that it could go
and pull all the different flight routes
and calculate all the different delays
and all the different parameters of,
if I fly to this airport I can park,
or I can use valet here or something like that, yeah.
Yeah, and I guess when I think of agents,
it's deep research is curating information
on which you can take action on,
but it's like at what point is action
a part of that sort of loop, right, where you can not only curate a
list of flights that you want, but then you know actually go out and have agency.
Yeah, I think one of our explorations in that space is operator, right?
It's where you kind of just feed in raw pixels from your laptop into or you
know from some virtual machine into the model.
And it produces either a click or some keyboard actions.
Right. And so there it's taking action.
And I think the trouble is, you know, it you don't ever want to mess up when you're taking action.
Yeah. I think the cost of that is super high.
You only have to get it wrong once to lose trust in a user.
And so we wanna make sure that that feels super robust
before we get to the point where we're like,
hey, look, here's a tool.
That's so different than deep research
because you can wind up on some news article
and read one sentence that gets a fact wrong,
or the commas in the
wrong place and the numbers off. And, but that's just the expectation for just text and analysis.
And if you delegated that, yeah, you're going to expect a few errors here and there. Oh,
that's actually a different company name or that's the, that's an old data point. There's
new data, uh, but very different. If I book a flight and you book the wrong flight and
I can wind up in Chicago instead of New York
Exactly, and I think the reason why we care so much about reasoning is because I think that's the path that we get reliable agents through
Sure. Um, you know, we've talked about like reasoning helping safety, but reasoning is also helping reliability, right? It's like you imagine like
What makes a model so good at a math problem? It's banging its head against it.
It's trying a different approach,
and then it's adapting based on what it failed at last time.
And I think that's the same kind of behavior
you want your agents to have.
It's like, tries things, adapts,
and keeps going until it succeeds.
And that's the, humans do this every day.
You're booking a flight, you keep hitting an error.
It's not which form you missed, right? And you're just sort of banging your head against the computer and eventually it says okay
You're booked right? Yeah, I think that's a great call out. Yeah
I mean there's so many more questions if you go into but
I'm interested in the scaling of RL and kind of the balancing act between
pre-training RL inference, just the amount of energy
that goes into getting a result
when you distribute it over the entire user base.
How is that changing?
And I guess,
are we post like really big runs?
Is this gonna be something that's like
continually happening online?
It feels like we're moving away from the era of like, oh, some big development, some big run happened and now we're grouping the fruits of it versus a more iterative process.
Um, yeah, I mean, I don't see why it has to be so right. I think like if you find the right levers, you can really pump a lot of compute into RL as well as pre-training.
find the right levers, you can really pump a lot of compute into RL as well as pre-training. I think it is a delicate balance though between all of these
different parts of the machine. And you know, when I look at my role with
Yakub, it's just kind of like, figure out where, how this balance should be
allocated, where the promising kind of like nuggets are arising from and
resourcing those. Yeah, it's kind of a, in some sense, I feel like part of my job is a portfolio manager.
That's a lot of fun.
Well, thank you so much for joining.
This was a fantastic conversation.
We'd love to have you back and go deeper.
Great hanging, Mark.
We'll talk to you soon.
Yeah, peace.
Have a good one.
Next up, we have Shalto Douglas
from Anthropic coming on this show.
I'm getting so many.
Jordy is giving us the update on that.
No, I'm just getting a lot of messages saying why no one cares about AI.
Talk about the drama on the timeline.
Well we do care about AI. We care a lot about AI.
But it is a mess out there.
Wow. The end of the Trump-Elon era. I don't know.
Maybe we have to get some people on to talk about it
tomorrow or something.
We're going to do it today.
Anyway, we have Shalto from Anthropic in the studio.
How are you doing?
What's going on?
Good to see you guys.
Hopefully, you're staying out of the chaos on the talk.
Don't open the time.
Don't open.
We're doing you a favor.
Sweet child.
Move to Twitter.
Move to Twitter. Yeah, mute everything. Stay focused on the application layer. Stay focused on the time. Don't open. What do you favor? Sweet child. Moves to Twitter. Moves to Twitter right now.
Yeah, mute everything.
Stay focused on the application layer.
Stay focused on the mission.
Stay focused on the next training run.
Humanity really cannot afford for any AI researchers
to open X today.
What a hilarious day.
Anyway, I mean.
I mean, it's a black-out 24 hours, guys.
Yeah.
How are you doing?
What is new in your world?
What are you focused on mostly day-to-day?
And maybe it's just a way of an intro. Yeah, so at the moment focused really hard on scaling RL
I mean that is the theme of what's happening this year and we're still seeing these huge gains
We go, you know 10x compute increase in RL
We still getting like very distinct linear gains
Based on that and because our role wasn't really scaled anywhere close to how much pre-training
was scaled at the end of the, at the end of last year, we have like a,
basically a gamut of like riches over the course of this year.
So where are we in that, in that RL scaling story?
Because I, I remember the, the,
some of the rough numbers around like GPT two, GPT three,
we were getting up into like, it cost $100 million.
It's going to cost a billion dollars. Like it just rough order of magnitude, not even
from entropic, just generally like what is a big RL run cost or how many are we talking
10K H200s or 100K? Like, are we going to throw the same resources at it? And if so, how soon?
Yeah. So I think in Dyer's essay at the beginning of the year, he said that a lot of runs were only like a million dollars back in like December. I think you have like Deep naively parallelizable and scalable than pre-training. In pre-training, you need everything in one big data center,
ideally, or you need some clever tricks.
RL, you could, in theory, like what the prime intellect folks
are doing, scale it all over the world out of it.
And so you are held back far less than you are in pre-training.
Sure.
So everyone and their mother has a billion dollars now.
Hundreds of thousands of GPUs getting pumped all over the place.
I feel like we're not GPU poor as a, as a,
as a society. Uh, maybe some companies need to justify it in different ways,
but it sounds like there's some sort of, uh, uh,
like reward hacking problem that we're working through in terms of scaling RL.
What are all of the problems that we're working through to actually go deploy the capital cannon at
this problem? Yes, so I mean think about what you're asking the model to do in RL
is you're asking it to achieve some goal at at any cost basically. Yeah. And this
comes with a whole host of like behaviors which you may not intend. In
software engineering this is really easy. I easy. It might try and hack unit tests or whatever. In much more longer horizon real world tasks, you
might ask it to say go make money on the internet. And it might come up with all kinds of fun
and interesting ways to do that unless you find ways to guide it into following the principles
that you want it to obey, basically, or to align it with your idea of what's sort of
best for humanity.
And so it's actually, it's a pretty intensive process.
There's a lot of work to find out and hunt down all the ways these models are
hacking through the rewards and patch all of that.
Yeah.
Are we going to see scaling in the number of rewards that we're RLing against,
if that makes sense, I would imagine that at a certain point,
unless we come up with kind of like the Genesis prompt,
go forth and be fruitful or something and multiply,
you could imagine training runs on just knocking down
one problem after another, and is that kind of the path that we're going down?
I very much think so.
There's this idea in which the world becomes an RL environment
machine in some respects.
Because there's just so much leverage
in making these models better and better at all the things
we care about.
And so I think we're going to be training on just everything
in the world.
Got it. And then, and then does that lead to,
um, more model fragmentation models that are good at programming versus writing
versus poetry versus image generation or, or,
or does this all feed back into one model?
Does the idea of the consumer needing to pick a model disappear?
Are we in a temporary period for that paradigm?
to pick a model disappear? Are we in a temporary period for that paradigm? I think the main reason that we've seen that so far is because people are trying to make
the best of the capital. We are all still GPU poor in many ways. And people are focusing
those GPUs on the spectrum of wars that they think is most important. And I'm a bit of a big model guy.
I really do think that similar to how we saw
with large pre-trained models before,
with small fine-tuned models made it,
like had gains over the sort of GPT-2 era,
but then were obsolete by GPT-4
being generally good at everything.
I think to be honest,
you're gonna see this generalization
and learning across all kinds of things.
That means you benefit from having large single models rather than specialization or area
fine tuned models.
Can you talk a little bit about the transition from or many, any differences between RLHF
and just other RL paradigms?
Yes. So RLHF, you're trying to maximize a pretty deep, likey signal things like airwise like what the humans prefer
And I don't know if you've ever tried to do this like judge to language model response
I get prompted for that all the time right and I'm always like I don't want to read both of those
I'll just click the one exactly exactly. Yeah, I click one of the random ones
Yeah, or I click like the one that just looks bigger or I'll read the first two sentences, but yeah, I'm not giving straight. I'm not, I'm not being, I'm not doing my job as a, as a
human reinforcer.
Exactly. Human preferences are easy to hack. Yeah, totally. Environments in the world are
much truer. You can find them. So something like, did you get your math question right?
Is a very real and true reward.
Does the code compile, right?
Does the code compile? Exactly. Did you make a scientific discovery? We've got very little
rewards right now, but pretty quickly over the next year or two, you're going to start
to see much more meaningful and long horizon rewards.
You're going to see models bribing the Nobel committee to win the Nobel Prize.
Well, we'll get a good reward hack.
There's reward hacking.
But that's something we want to prevent, right?
Exactly. Yeah, that hacking. But that's something you want to prevent, right? Exactly.
Yeah, yeah, that's the real nightmare scenario.
What about, like, there's so many different problems
that we run into that feel like it's just really, really hard
to design any type of eval that my kind of benchmark
that I use whenever a new model drops is just tell me a joke. They're always bad and or or or even even the latest VO3 video that went
viral was somebody said like stand-up comedy joke and it was kind of a funny
joke but it was literally the top result for joke read it on Google and then it
clearly just took that joke and then instantiated in a video that looked amazing but it wasn't original in any
way and so we were joking about like the RLHF loop for that is like you have an
endless cycle of comedians running AI generated materials and then and then
you know speak microphones and all the comedy clubs to feedback what's getting laughs.
But, but.
I mean, honestly, that would work pretty well.
Yeah.
If any comedians wanna hook us up with an RL loop, I mean.
Yeah, yeah, but I mean, for some of those less,
like as you go down the curve,
it feels like each one gets harder and harder
to actually tighten the loop.
We see this with like longevity research
where it's like, okay, it takes 100 years to know if you extended a human life. Like the,
yes, you could create a feedback loop around that,
but every change is going to be hundreds of years.
And so even if you're on the cycle,
it's irrelevant for us in the context that we talk about AI. So, uh,
talk to me about like, are you running into those problems or, or, or,
will there be like another approach that kind of works around those?
So there are a lot of situations
where you can get around this
by just running much faster than real time.
Like let's say the process of building a giant app,
like building Twitter, right?
It's something that would take human months,
but if you got fast enough and good enough AIs,
you could do that in several hours.
Acralize heaps of AI agents that are building,
you know, things like that.
And so you can get a faster reward signal in that way.
In domains that are less well-specified like humor, I agree, it's really, really hard. And
this is like why I think in some respects, like creativity is
like at the at the top end of the spectrum, like true
creativity is much, much harder to replicate than the sort of
like analytical scientific style reasoning. Yeah. And that will
just take more time. You know what the models actually are
pretty good at making jokes about being an AI this news fresh
Like everything else is kind of a weird copy of something like it's like it just it feels like it's derivative basically
It's trying to infer what humor is and it doesn't really understand it but jokes about being an AI are quite funny
Yeah, I I think this also might be I don't know if it was directly reward hacking, but I noticed that
I think this also might be, I don't know if it was directly reward hacking, but I noticed that, uh,
one of the new models dropped and a bunch of people were posting these like four
Chan, like B me memes. And, and, and they were,
it seemed like they were kind of hacking the humor by being hyper specific about
an individual that they could find information on online.
And so you're laughing at the fact that it's like, Oh, wow,
that is like something that I've posted about it. It's making a reference,
but it's not really that funny to me. It's other than it's like, oh wow, that is like something that I've posted about. It's making a reference, but it's not really that funny to me.
It's other than it's just like, wow, they really did its research.
Like it really knows Tyler Cowan intimately, which is cool, but I didn't find it hilarious.
Yeah, yeah, yeah. Very interesting.
Let's talk about some sort of deep research projects and products.
We were talking to Will Brown and he was saying like, AGI is here with some of the bigger sort of deep research product projects and products.
We were talking to Will Brown and he was saying like, AGI is here with some of the bigger models,
but the time that AGI can feel consistent, it diverges.
And so you could be working with someone who's, you know,
100 IQ, but they will stay consistent for years
as an employee or they'll keep living their life.
Whereas a lot of these super smart models
are working really well and then after a few minutes
of work the agents kind of diverge
and kind of go into odd paradigms.
It feels very not human.
It feels like just a, they're hyper intelligent in one way
and then extremely stupid in another.
What's going on there?
What is the path to extending that?
Is that more like having more better planning
and better dividing up the task?
Or will this just kind of naturally happen
through the RL and scale?
Yeah, so there's that jaggedness, right?
Which is what you're seeing, is how we call it.
And I think that is largely a consequence
of the fact that maybe something like deep-suit research,
it's probably been RL'd to be really good
at producing a report.
But it's never been RL'd on the act
of producing valuable information for a company
over a week or a month, or making sure the stock price
goes up in a quarter or something like this.
It doesn't have any conception of how that feeds
into the broader story at play.
It can kind of infer it, because it's got a bit of world
knowledge from the, you know, the base model and this kind of stuff.
There's never actually been trained to do that in the same way humans have. Um,
so to extend that, you need to put them in much longer running, much like, like,
you know, long horizon things. Um, and so, so deep research needs to become,
you know, like deep operate a company for a week kind of thing.
Is that the right path? Like it feels like the road might be, there's a, like
the longest running LLM query used to be just like a few seconds, maybe a few minutes. And I
remember when, uh, when some of the reasoning models came out, people were almost trying to like
stunt on it by saying like, Oh, I asked it a hard question. I thought for five minutes.
Now deep research is doing 20 minutes pretty much every time. Um,
is the path two hours, two days, or are we going to see more, uh,
efficiency gains such that we just get the 20 minute model,
the 20 minute results in two minutes and then two seconds.
Yeah. So this is somewhere where like inference in many respects and
prioritization becomes really important. So both how fast is your inference,
if that literally affects the speed at which you can think
and the speed at which you can do these experiments,
also how easily you can parallelize
becomes really important.
Can you dispatch a team of sub-agents
to go and do deep research and compile sub-reports for you
so that you can do everything in parallel?
These kinds of, it's both,
there's an infrastructure question here, um,
that feeds up from the hardware and the chips and this kind of stuff, uh,
to designing better chips for better inference and all this. Um,
and, and an RL question of like, you know,
how well can you pro-lize and all this?
So I think we just need to compress the timelines, compress the time,
the compress, the timeframes, basically.
Yeah. So, uh, if I'm, if I'm like an extremely big model and I'm running an agentic process, like how much
am I hankering for like a middle sized model on a chip or like baked down into silicon
that just runs super fast because it feels like that's probably coming.
We saw that with the Bitcoin progression from CPU to GPU to FPGA to ASIC.
Do you think we're at a good enough point
where we can even be discussing that?
Because every time I see the latest mid-journey,
I'm like, this is good enough.
I just want it in two seconds instead of 20.
But then a new model comes out,
and I'm like, oh, I'm glad I didn't get stuck on that path.
I'm just delving.
But yeah, how far away from how far away are
we from? Okay, it's actually good enough to bake down into
silicon.
Well, there's a question here of baking it down to silicon
versus designing a chip, which is like very suited to the
architecture that you're about. Right. And baking on the
silicon, unsure, like, I think that's a bet you could take. But
it's a it's a risky one, because the pace of progress is just so
fast nowadays. And I really only expected to accelerate but designing things that make a lot of sense for those are the
Transformers or architectures of the future
Should should make a lot of it. That's a big gap though
Transformers or architectures of the future if we diverge there's a lot of companies that are banking on the transformer sticking around.
What is your view on transformer architecture
sticking around for the next couple of years?
I mean, look, they stuck around for five years,
so they might stick around for a little while,
but there's different, you think about architectures
in terms of this balance of memory bandwidth and flops,
right, one of the big differences we've seen here
is Gemini recently had actually a diffusion model
that they released at a high you the other day, right?
Yeah. So diffusion is inherently extremely flops intensive
process. Whereas normal language model decoding is extremely
memory bandwidth intensive, you're designing two very
different chips, depending on which bet you think makes sense.
Yeah. And if you think you can make something that does flops
like four times faster than diffusion, and like four times
cheaper than your those code, fusion makes more sense. So
there's like, there's this dance basically, between the chip providers and the architecture,
both trying to build for each other,
but also build for the next paradigm.
It's risky.
Do you, I don't know how much you've played
with image generation, but do you have any idea
of what's going on with images in ChatGPT?
It feels like there's some diffusion in there,
there's some tokenization,
maybe some transformer stuff in there. It almost feels like there's some diffusion in there, there's some tokenization, maybe some transformer stuff
in there, it almost feels like the text is so good
that there's like an extra layer on top almost,
and that it's almost like reinventing Photoshop.
And I guess the broader question is like,
it feels like an ensemble of models,
maybe the discussion around just agents
and text-based LLM interactions shouldn't necessarily be
transformer versus diffusion, but maybe how will these play together? Is that a reasonable
path to go down?
Well, I think pretty clearly there's some kind of rich information channel between,
even if there are multiple models there, it's conditioning somehow on the other model because
we've seen before, let's say when you know models use
mid-journey to produce images it's never quite perfect it can't perfectly
replicate what went in as an input it can't perfectly like adjust things so
there's a link somehow whether that's the same model producing tokens plus
diffusion I don't know like yeah can't comment on what open air is doing there
yeah yeah yeah are there any other kind of like super wild card,
long shot, uh,
research efforts that are maybe happening even out, even in academia, where,
I mean, this was the big thing with, uh, what was his name? Gary. Uh,
he was talking about, I forget what it was called. Symbolic symbol.
Manipulation was a big one. And, and I feel like, you know, you can never count anyone out because it might come
from behind and be relevant in some ways.
Um, but, but are there any other research areas that you think are like purely in
the theory domain right now that are worth looking into or tracking that, you
know, low, low, low probability, but high upside if they work.
That's how fun this is tough one. But we'll say some symbolic
thing, please. It's crazy how similar transformers are to
minutes systems that manipulate symbols. Sure. What they're
doing is they're taking a symbol and they're like converting it
into a vector and then they're manipulating and moving stuff
like information around across them. Sure. Like, this this
whole like, debate that all transformers can represent symbols and they cannot do this, it's not real.
So Garry Mark is underrated or overrated, I guess?
Overrated. if you twist it so much, you wind up with saying, well, really, the transformer fits within that paradigm.
And so maybe it's, you know,
the rhetoric around it being a different path
was maybe false the whole time.
Something like that.
But as I remember that debate,
it was really the idea of compute scaling
versus almost feature engineering scaling and
will the progress scale with human hours or GPUs essentially and that has a very different
economic equation and it feels like there's been some rumblings about maybe with a data
wall will shift back to being human labor bound,
but do you think that there's any chance that that's relevant in the future or
is it just algorithmic progress married with bigger and bigger data centers in
the future?
So I'm pretty bitter lesson built,
hence that I do think removing as many of our biases and our like clever ideas
from the models is really important. just like freeing them up to
learn. Now, obviously, there's like, there is clever structure
that we put into these models such that they're able to learn
in this extremely general way. And that but I am more convinced
that we will be compute bound, then we will be like human
researcher, out human research, our bound on on this kind of
thing, like we're not going to be feature engineering and this kind of stuff.
Sure.
We're going to be trying to devise incredibly flexible learning systems.
Yeah, that makes sense.
On the scaling topic, part of, I,
I, I, I, part of my like worry is that the,
the OMS gets so big that they turn into these mega projects that are, at a certain point you're bound by the laws
of physics because you have to move the sand
into silicon chips and you have to dig up the silicon.
And at a certain point.
Yeah, there's only so much sand and like the math gets
really, really crazy just for the amount of energy required
to move everything around to make the big thing.
Where are you on how much scale we need to reach AGI?
Whether or not we will see the laws of physics
start acting as a drag on progress
because it certainly feels exponential.
We're feeling the exponentials,
but a lot of these turn into sigmoids, right?
Yeah.
So I think we've got what, like two or three more ooms
before it gets really hard.
Leopold has this nice table at the end of his,
then the situational awareness.
I think like 2028 or something is when,
under really aggressive timelines,
that you get to 20% of US energy production.
It's pretty hard to go exponentially beyond 20%
of US energy production.
Now, I think that's
enough. Every indication I'm seeing says that's enough. Now
then, there might be some complex, you know, data
engineering, we're one engineering, this kind of stuff
that goes into lots of places, there's still a lot of
algorithmic progress left to go. But I think that with those
extra rooms, we get to basically
a model that is capable of assisting us
in doing research and software engineering.
Yeah, which is the beginning of the self reinforcement.
Yeah, exactly.
Interesting, is that just a coincidence?
Like this feels like one of those things,
this feels like one of those things
where like the moon is the exact same size
as the sun in the sky.
It's like, oh, it just happens that AGI happens
within this time, like, whoa,
did you unpack that anymore? Because it feels convenient, not, did you have you unpacked that anymore?
Because it feels convenient. Not to, you know, I know.
There's a lot of weird conveniences are like weird. It's a good sci fi story. Let's say
totally.
We've got a, you know, Taiwan in between China and the U S and it produces the most valuable
material in the world. It's locked between the two credible plot.
Yeah. Yeah. Yeah. Really bad for the people that don't think of,
that don't believe in simulation theory.
It really feels like Alan is descriptive.
It's program.
It's fascinating.
Talk to me more about getting to an ML engineer in AI
and kind of that reinforcement.
I imagine that you're using AI code gen tools today
and anthropic is broadly and everyone is, um, but, but, uh,
what are you looking for and what are the,
what's the shape of the spiky intelligence, where do they fall flat?
And what are you looking to kind of knock down in the interim before you get
something that's just like, go.
Yeah. So, I mean, we definitely use them the other night. I like,
I was a bit tired to ask them to do something,
just sat watching it in front of me working for half an hour. It was great. It was truly weird experience, particularly when you look back a year ago. And we're still copy pasting stuff between a chat window and a code file. Yeah.
for this kind of stuff. So they have a bunch of evals where they measure like the ability to write a kernel, the ability to
run a small experiment and improve a loss. And they have
these nice progress curves versus humans. And I think this
is maybe the most accurate reflection of like what will
take for it to really help us during progress. And there's a
mix here, like, where they're not so great at the moment is
like large scale distributed systems engineering, right,
like debugging stuff across heaps and heaps of accelerators.
And like the way the feedback loops are slow and like, if
your feedback loop is like an hour, then it's you spending the
time on on doing something. Yeah, I'm feedback is 15 minutes.
If it's much in for context there, the hour long feedback
loop is just because you have to actually compile and run the
code across.
In up all your machines or you need to run it for a while
to see if something's gonna happen.
At that point in time, you're still cheaper than the chips.
Sure.
It's better that you do it.
But for things like kernel engineering
or for actually even just understanding these systems,
incredibly helpful.
One thing I regularly do at the moment
is in parts of the code base,
in languages that I'm unfamiliar with,
or stuff like this,
I'll just ask it to rewrite the entire file,
but with comments on every line.
Game changing.
It's like-
Comments on every line.
Yeah, or just come through thousands of files
and explain how everything interacts to me,
draw diagrams, this kind of stuff.
It's really, yeah.
Yeah, how important is a bigger context window in that example you gave that feels like something that's important and yet
It just naively like Google's the one that has the million token context window
I imagine that all the other frontier labs could catch up
But it seems like it hasn't been as much of a priority as maybe like the PR around it sounds like is that important?
Should we be go should we be driving that up to like a trillion token window? Um, is that,
is that just going to happen naturally?
There's a nice plot in the Gemini 1.5 paper, uh,
where they show the like loss over tokens as a function of context length.
And they show that the loss goes down quite steeply actually,
as you put more and more and more like code base into context,
you get better and better and better at predicting the rest.
Context length, it's a cost. Um, you know, the way transformers work is that, uh, there's, you know, you get better and better and better predicting the rest. Yeah, that makes sense. The context length, it's cost. Yeah, the way transformers work
is that there's, you know, you have like this, this memory that
is proportional, the KV cache is proportional to how much
context you've got. And so you can only fit so many of those
into like that your various chips and this kind of stuff. And
so longer context actually just costs more because you're taking
up more of the chip and you're sort of like, you could have otherwise been doing other requests basically.
So bringing it back to the custom silicon, is that a unique advantage of the TPU? Is
that something that Google has thought about and then wound up to put themselves in this
advantage position? Or is it a durable advantage even?
Yeah. So TPUs are good in many respects, partially because you can connect hundreds or thousands
of them really easily across really great networking. Whereas only recently has that
been true for GPUs.
With NVLink?
Yeah, with NVLink and the MVL72 stuff. So it used to be like eight GPUs in a pod and
then like you connect them over worse in a connect. And now you can be 72 and then it
breaks down. With Google DPS, you can do like 4,000, 8,000
of a really high bandwidth interconnect in one pod.
And so that is helpful for things like
just general scaling in many respects.
I think it's doable across any chip platform,
but it is an example of like somewhere
that being fully vertically integrated is using a benefit.
Yeah, that makes sense.
Talk to me about Arc AGI. Why it so hard it seems so easy it does seem easy
doesn't it well it certainly seems like more more evaluatable than tell me a
funny joke right yeah yeah I mean I think if you are old on Arc AGI then it
would you probably get superhuman at it pretty fast but I think we're all trying
not to RL on it
so that it functions as like an interesting held out text.
Sure.
OK.
Is that just an informal agreement
between all the labs, basically?
Yeah, we're trying to have a sense of honor between us.
That's good.
Sense of honor.
That's amazing.
How many people on Earth do you think
are getting the full potential out of the publicly available
models?
Because we're now at a point where we have you know billion plus people are using AI almost daily and yet I have to my sense would be
it's maybe like 10 000 20 000 people on the entire planet are getting that sort of full
potential but I'm curious what your assessment would be. Yeah I completely agree I mean I think
that even I don't get the full potential out of these models often.
And I think as we shift from you're asking questions
and it's giving you sensible answers to you're asking it
to go do things for you that might take hours at a time,
and you can really like parallelize and spin,
we're going to hit yet another inflection point
where even less people are really effectively using
these things.
Because it's basically going't require you to like
It's like a like starcraft or Dota like it's gonna be like your APM of like managing all these agents and that's totally
Process. Yeah, so I'm starcraft is such a good example
You think you're just absolutely crushing it and then you realize like there's an entire area of the map. You're just getting destroyed
It's such a good it's such a good comp
I'm sure I know. Yeah, exactly.
It's such a good comp.
That's great.
Anything else, Jordy?
I think that's it on my side.
I mean, I would like this to be an evolving conversation.
Yeah, this was fantastic.
We'd love to have you back and keep jetting.
Absolutely, it was really fun.
Love to go back on class.
Yeah, we'll talk to you soon.
Cheers, Shulta.
Have a good one.
All right, we got Emmet.
The worst possible AI day.
Yeah, so for context, folks, we are
going to be doing a live timeline and turmoil
segment at 2 PM PST.
So if there's posts you want us to cover,
you can go send them.
I'll put this in the chat as well.
A few more.
Pull one up.
I'm going to do some ads because we
got Emmet Shearer coming in the temple in just a few minutes. let me tell you about numeral sales tax on autopilot spend less than five minutes per month on sales tax compliance
Go to new sales tax you a gi calm
Very excited for them. Also go to public calm investing for those who take it seriously
They have multi asset investing industry leading, and they're trusted by millions.
Millions.
In other news, Tim Sweeney continues to battle Apple.
Apparently, if you search for Fortnite
on the Apple App Store, he says,
hey kids, looking to play Fortnite?
Try this crypto and stock trading app instead,
rated for ages four plus, courtesy of Apple App Store ads.
So I'm gonna give you the latest Trump terror Elon post four minutes ago
The Trump terrorists will cause a recession in the second half of this year Wow
Somebody else was saying can I finally say that Trump's tariffs are super stupid
Somebody else is posting mads posting is saying it's a jiji ping. He says bro you seeing this and it's
Putin on the other end. He's just looking at it. Hold up got a line and it's
We'll start pulling some of these up
Ridiculous
What else is going on here this is the present versus Elon. Neville says Elon's stance is principal. Principal Trump stance is practical. Tech needs Republicans for the
present. Republicans need tech for the future. Drop the tax cuts, cut some pork,
get the bill through. This is so crazy. Antonio Garcia says remember there's a
few money and then there's F the world
money will Stancil says imagine being the ice agents suiting up for your
biggest mission of all time right now people are saying that Trump's gonna
deport you on back to the South Africa well the pew says time to drop the
really big bomb growing Daniels in the Epstein files. No. That is going to turn into a coffee pasta.
That is a real piece of coffee.
Oh, no.
Deleon.
We had a question from a friend of the show.
He said, the real question is if Tesla is down 14%,
how could SpaceX and OpenAI be trading
if they were, how would they be trading if they were public?
The real thing here is it's bad for everyone, right?
DJT is down, Trump coin is down.
Nobody's really winning here.
China is up.
Yeah.
Oh really?
Sean McGuire, I mean, I'm just saying like at a high level.
Yeah, yeah, yeah.
You know, China is the big beneficiary here of Sarah Gross as if anyone has some bad news to bury,
might I recommend right now?
Yes, yes, yes.
If you have, if you, what's the canonical bad startup news?
Like, oh yeah, you missed earnings or something.
Drop it now.
Inverse Kramer says, Bill Ackman is currently
writing the longest post in the history of this app.
And we have a video from Trump here, if we want.
I can throw it in the tab, and we can share it on the stream
and react to it live.
Lex Friedman says to Elon, that escalated quickly,
triple your security.
Be safe out there, brother.
Your work, SpaceX, Tesla, XAI, Neuralink,
is important for the world.
We need to get Elon on the show today.
If somebody's listening and can make that happen,
I would love to hear from him.
Max Meyer says, so I got this wrong.
I didn't say it never happened, but I thought it wouldn't.
I am floored at the way this has happened.
He didn't think they would have a big breakup.. He didn't think they would have a big breakup.
Many people didn't think they would have a big breakup.
Even just earlier this week, it seemed like they might just
have a somewhat peaceful exit.
Trump just posted a little bit ago,
I don't mind Elon turning against me,
but he should have done so months ago.
This is one of the greatest bills ever presented
to Congress.
It's a record cut in expenses 1.6 trillion dollars and the biggest tax cut ever given if this bill doesn't pass there will be a 68%
Tax increase and things far worse than that. I didn't create this mess. I'm just here to fix it
Anyways lots going on
Let's go to this Trump video.
I want to see what he has to say.
The criticism that I've seen, and I'm sure you've seen,
regarding Elon Musk and your big, beautiful bill.
What's your reaction to that?
Do you think it in any way hurts passage in the Senate,
which of course, what is your seeking?
Well, look, you know, I've always liked Elon,
and I was always very surprised.
You saw the words he had for me.
The words are for, and yes,
it said anything about me that's bad.
I'd rather have him criticize me than the bill,
because the bill is incredible.
Look, Elon and I had a great relationship.
I don't know if we're well anymore.
I was surprised because you were here.
Everybody in this room, practically, was here
as we had a wonderful send-off.
He said wonderful things about me.
You couldn't have nicer.
He said the best things.
He's worn the hat.
Trump was right about everything.
And I am right about the great, big, beautiful bill.
But I'm very disappointed because
Elon knew the inner workings of this bill
better than almost anybody sitting here,
better than you people.
He knew everything about it.
He had no problem with it.
All of a sudden, he had a problem.
And he only developed the problem when he found out
that we're going to have to cut the EV mandate,
because that's billions and billions of dollars.
And it really is unfair.
We want to have cars of all types.
Electric.
We want to have electric, but we want to have gasoline,
combustion.
We want to have different.
We want to have hybrids.
We want to have all. We want to be able to sell everything. He hasn't said bad about me personally, but I'm sure that'll be next. But I'm very disappointed in Elon.
I've helped Elon a lot.
The Press.
Mr. President, did he — I just want to clarify — did he raise any of these concerns with
you privately before he raised them publicly?
And this is the guy you put in charge of cutting spending.
Should people not take him seriously about spending an hour?
Are you saying this is all sour grapes?
No, he worked hard and he did a good job.
He worked hard.
He worked hard.
He worked hard.
He worked hard.
He worked hard.
He worked hard.
He worked hard.
He worked hard.
He worked hard.
He worked hard.
He worked hard.
He worked hard. He worked hard. He worked hard. He worked hard. He worked hard. publicly and this is the guy you put in charge of cutting spending. Should people not take him seriously about spending an hour?
You're saying this is all sour grapes?
No, he worked hard and he did a good job.
And I'll be honest, I think he misses the place.
I think he got out there and all of a sudden he wasn't in this beautiful oval office
and he's got nice offices too.
But there's something about this when I was telling the Chancellor.
Folks, breaking news, Deleon, that'sian that's brew have is joining us in the temple for some live reactions
come on yes I can't even spell surprise gas I'm so excited about this yeah in
other news 11 labs dropped a new product
In other news, 11 Labs dropped a new product. Absolutely.
Another news.
$2 million seed round.
Stop it.
Stop it.
We love 11 Labs.
No, they'll keep grinding.
But just launch again tomorrow.
You're going to have to launch again.
Start shooting a new Vibreel.
Start shooting a new, writing a new blog post,
because no one's good
Lulu says yes delay the launch on
TV so basically right now. I can just pull up and just read here. I'm gonna just be refreshing true
So okay Jordy's on truth social. I'll be on X. Give us your reaction tell him what's going on
I'm just you know sort of scrolling X and I like to do you guys like an hour ago And I was like they're talking about something I think I was like
Switch to like we have news like and then I was watching that's like, okay John John
Resisted I fought it for like for like a half an hour
But we couldn't do it
Yeah, give us your quick reaction. I mean always you know you know, sort of give it from the, you know,
sort of space angle, you know, it's amazing that, you know,
how much the world has shifted since, you know, Friday of last
week, whereas you just have presumed the Jared Isaacman was
going to be the NASA admin to today, it was released that the
Senate reconciliation package re added budget back into NASA,
largely for the
SLS program which was basically the program that you know Jared and Elon
were you know sort of largely advocating to you know sort of completely shut down
so yeah the it is already showed like you know there's sort of counter
reaction you know is already showing up you know in in policy sorry SLS program
is that space shuttle or no
sorry that's the SLS launch rocket okay based off of old space shuttle hardware
but it is basically the internal you know sort of NASA run competitor
effectively to like a Starship heavy launch rocket yeah you know because it
was you know sort of generally behind budget behind schedule and there are so
many commercial heavy lift rockets coming online, the default was canceled.
That is largely, you know, sort of a Boeing-based program. And so, you know, if you look at,
you know, you know, three months ago, you know, when they were announcing the F-47 program,
you know, Elon walks into the secretary of the Air Force's office, obviously, he'd been,
you know, sort of ranting against, you know, sort of manned fighter jets and believing
that that shouldn't be what, you know, be what the department is prioritizing.
30 minutes after that meeting was when they announced the F-47 program.
And so now you're seeing basically like the equivalent in space where, you know, that was obviously awarded to Boeing.
Boeing was the, is the largest prime behind, you know, SLS.
You know, Boeing basically, you know, is going to be the biggest winner of, you know, NASA refunding,, sort of SLS and Jared Eisenman not being a NASA administrator.
So tying this back to the timeline,
Trump posted less than 30 minutes ago,
in light of the president's statement
about cancellation of my government contract,
SpaceX will begin decommissioning its Dragon spacecraft
immediately, break that down.
I mean, that just means that we no longer have a vehicle
that can go to the International Space Station. We no longer have a vehicle that can take astronauts up and down. I mean, that just means that we no longer have a vehicle that can go to the International Space Station.
We no longer have a vehicle that can take astronauts up and down.
We also don't have a vehicle that can de-orbit the International Space Station safely, right?
The Dragon was expected to be able to do that.
So what that means is, if you guys remember all the memes about Stranded from last year
around Boeing Starliner, it now means that the space station itself is basically sort of stranded.
And that's like one of the government contracts,
obviously, that SpaceX is involved in.
Elon, I've heard, generally just wants to shift
all things to Starship anyways,
and so in some ways was probably kind of looking
for an excuse to sort of shut down Dragon
and refocus energies.
There's also a part of it where it's like,
look, he is kind of independent in the space world
in that Starlink's total top line revenue
is gonna be passing the NASA budget in the next year or two.
And so in terms of size of state actor
that can influence space,
his own company is basically about to become
as large of an actor as the entire United States.
So I don't think there's gonna be a de-escalation here.
My estimation is on both sides,
it's going to
continue to escalate
You know if we thought that we lived in dynamic times, you know when Trump got into office
It's gonna be even more dynamic when the dynamism will continue
morale improves
Elon the center AOC the progressive populist and Trump the you know sort of conservative populist and Trump, the sort of conservative populist.
And man, it's hard to be on the timeline.
I mean, I just have so many questions, right?
How does this impact Golden Dome?
What's Boeing stock doing?
Will Golden Dome even be a viable project without SpaceX?
I think there's just going to be more resistance probably
to working with, you know, sort of upstarts
because they would be ones that would probably be more likely
to collaborate, you know, sort of with, you know,
a SpaceX and so.
Well, wait, wait, so, Amy, it, it, it feels like,
it feels like a Boeing would be a logical beneficiary
of this turmoil and yet they're down today.
They haven't really popped.
Oh really?
Yeah.
I mean, I'm not obviously, you know,
wanting to give like, you know, public stock. Yeah, I, yeah, I, I, I know.
I'm just trying to working through it myself and it's, it's surprising. It just feels like it's
just like a drop in Boeing to pop basically. Yeah. Yeah. That would be the expectation,
but there, there must be something here because they're there. It feels like this is purely
interpersonal between Elon and Trump and not, it's not like, Oh, Boeing was secretly behind the
scenes the whole time lobbying even more effectively.
It doesn't, oh, you got the, well, where's the tinfoil hat?
I mean, it's over there.
Maybe we need a tinfoil hat segment, who knows?
But yeah, I mean, when you're in Boeing World,
it's like, hey, we're only down 1%, let's go.
The cool of the century.
My question is, has there ever been a crash out
of this magnitude ever?
In history. In internet history. When Elon and Trump became friends. My question is has there ever been a crash out of this magnitude ever in history
Well, you know when when Elon and Trump or global honestly world scale. I actually probably world history equivalent
I feel like there's something in like
United States where you know, they're crashing out
Crashing out used to mean calling up the New York Times and just ranting now
you can just live post like all your reactions
And it's just all real time. This is like crash outs are actually intensifying. You actually want to be long crash out
Yes, definitely the next
You know, you got to be on both X and and truth social to like stay on top of things
Yeah
I actually did like a deep research report a while back on like has the richest man in America ever been close with the US president going back
to like you know was Rockefeller particularly close and and because the
narrative was like oh this is like so unprecedented and in fact it is
unprecedented. Oh really? I would have guessed that Rockefeller was close. Me too, me too that's what I was going for was like no I
imagine this is always close.
But no, I think because the president has become
more powerful globally, your point about mayor of America,
dictator of the world, it becomes increasingly
valuable for the richest man to have a close alliance.
And so it's become more.
I don't know exactly how accurate that research was.
It's totally possible that behind behind the scenes Rockefeller was
really close to the president of the time and we just didn't write about it
in the history books but there certainly aren't very many anecdotes about the
richest man in America going on yes a bottle had a press ready for AP US
history 2050 yeah yes this is where you know Elon Musk called the president at You know, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A, B, C, D, A question about the USA's power structure is the man with the most access to capital more or less powerful than the political head honcho purely hypothetical
it's a good question to ask. I mean I think both like archetypes have grown
both in absolute power but also in relative power to the rest of the globe
basically since the Gilded Era right if you think about to the rest of the globe basically since the Gilded Era, right? If you think about the President of the United States in 1925, I'd say pretty darn powerful,
but there was clear, it was a multipolar world. Argentina was pretty darn rich at the time.
Obviously, Europe was still recovering from World War I, but UK was generally doing well.
It was clear there was a huge outweighed effect. And then if you look at probably the biggest industries at the time,
I don't think you could claim that even like Standard Royal at its peak, I'd have to go look
at the exact numbers, but that like it had the size of budgets relative to like the like U.S.
government in terms of sort of budgets, right? Versus I feel like now for the first time,
you both have sort of U.S. president, extremely, extremely powerful. And then you have like, you know, sort of mag seven, effectively,
like the size of, you know, sort of, you know, the state like they're, you know,
fucking state governments.
And then also just more bureaucracy, more red tape. So like I, when I think
about the 1920s, like rubber barons, it's like, it is the, it is the, you can
just do things era. And so you want to build a railroad like, yeah, you might
need to get like one rubber stamp,
but it's not going to be 10 years and tons of lobbying
and all this different stuff.
So you can kind of just go wild.
You know it's bad when Kanye is saying, bros, please know,
we love you both so much.
It's just like the voice of reason is Kanye West.
Yes, thank you.
You need to bring them together and form a peace treaty.
Nikita Beer just added his pronouns back to his bio.
Let's go.
He's got a rubber band.
Elon's got a rubber band all the way back to extreme woke
because I'm straight back to super climate change.
Wow, somebody's sharing, resharing
the picture of the Cybertruck blown up
in front of the Trump Tower in Vegas.
And it's just like this.
This is in real life as well. It was foretold. Yeah, I think that's part of the the Cybertruck blown up in front of the Trump Tower. It's just like this. It was foretold. Yeah. But it was
a question of like, when and what magnitude not if
always bad if Vladimir Putin is operating to negotiate between
President Trump and Elon. I think I think a lot of the world is waiting for Roy Lee's take
clearly and the clearly army
They want that people have been asking him to get involved with geopolitics
I
Love the shield mohut put up a you know sort of meme about a Narenda the like a prime minister of India
you know, he basically copied and pasted the Trump truth social post about negotiating peace between
India and Pakistan when it wasn't like actually fully negotiated
You know posting about you know, you know negotiating a ceasefire between you and Trump
Funny thing is like truth social you can just read all of Trump's posts without creating an account
Oh, you really shows it like I would think that you would have to make an account to read them all
But they just that it's not gated at all. It's
The biggest you know, they clearly I don't think they care about monetization
Bitcoin is actually
Falling alongside falling Wow Bitcoin falling Boeing falling Tesla. Who's the biggest winner of the day?
I think it's China.
China.
Yeah.
Sean McGuire.
Wow, Bitcoin really sold off.
It's down 3% today at 101K.
So still up, but rough.
Winnie the Pooh just dipping his hands
in that pot of honey, just snacking away,
watching from the sidelines.
Yeah. Let's see, Chinese Yeah, let's see Chinese stocks
US
Okay, that's probably my comment here on the day boys. Anyway, this is great. It was fantastic. Thanks for jumping on
Thanks for hopping on so quick. Cool. Well, Aaron Rodgers signed a one-year deal with the Steelers
Announced an hour ago. Let's give it up for Aaron Rodgers.
Do we have Emmett in the waiting room?
I've messaged him.
It's absolute chaos.
We'll see if he can hop back on.
We don't have him right now.
Ready if you can hop on.
Sorry about the chaos.
We're live streaming.
We are full streamers.
That was the moment where, yeah, it was like, okay.
This is the point of TBPN.
Send them an invite, let them jump in.
Hopefully we can get Eminem in.
That was very chaotic.
But, you know, it's a busy time.
My only hope for both Trump and Elon
is that they can get some sleep.
They both go to eatsleep.com slash TBPN,
get a Pod 5 Ultra, take advantage of the five year warranty,
the 30 night risk free trial.
They got free returns, they got free shipping.
This is really the perfect time to do ads.
I don't think that's what, that's what they both,
that could unify everyone.
I hope that both Trump and Elon have eight sleeps tonight
if they sleep at all.
Yes, I don't know if they're. Yes. Even just resting on it.
They're going to.
Even just resting on it would be good.
But yeah, let's see.
Let's see.
We can also go through, I don't know, I don't even know what to do.
There's a bunch of random timeline we have.
Lex Friedman is saying we need to do a podcast with the Elon and Trump.
He's done both.
He's interviewed both.
He's done both.
Something tells me that they're not
going to jump on the show today.
I don't think so.
And he'll be like, what about love?
Yeah.
I mean, it is wild.
Elon, like less than two months ago,
was saying, I love Trump as much as a straight man
can love another man, or something of that sort.
It's just odd that the band-aid got ripped off so aggressively, so fast, you know?
Like there could have been like a smooth de-escalation with like the...
This is the fast takeoff.
This is the fast takeoff scenario. We are in the fast takeoff scenario.
Anyway, maybe they should book a wander, work it out together. They could find their happy place.
They could book a wander with inspiring views hotel great amenities
Dreamy beds top tier cleaning and 24-7 concierge service. It's a vacation home, but better go to wander comm use code TBPN
Please let them know that we sent you
Lee Helms says Elon literally has me dying laughing Trump said he was gonna take away his government contracts and Elon said
laughing Trump said he was gonna take away his government contracts and Elon said,
haven't you been to Epstein's Island?
Sort of abridged that.
Absolute chaos.
Nikita says, hey blue sky users, come on in,
the water's warm.
David Friedberg says China just won,
which I think is the right take I
Don't know I don't know what to I don't know what to think there's not that much there's not that much here to there's not
That much meat to analyze. I mean, it's certainly interesting to see
how important the
the
Subsidies are in the electric vehicle mandates are,
I mean, it always feels like the best product wins
in a lot of these scenarios.
And if Tesla was making it through the political chaos
of arguably their biggest constituency,
electrical vehicle buyers,
electric vehicle buyers being upset about the Trump-Elon
alignment.
I wonder, you think everyone's going to, you think all the anti-Trump people are going
to buy Teslas now?
It's like really make a statement.
Like I'm anti-Trump, I stand with Elon.
So I bought a model for-
They'll have the bumper sticker that says, I bought this after the crash out.
After the crash out, exactly.
I bought this after the crash after the crash out exactly
About this after the crash out. I am a lot
There's a ghost here from goth and it says explaining the Trump Elon crash out in ten years
And it's and it's the Joe Biden quote when he says it was like 15 9-elevens
Yeah, It certainly is,
yeah, it's hard to process. I mean, this is gonna have massive implications for
so many different things.
Elon's stance is principled.
Trump's stance is practical.
Tech needs Republicans for the present.
Republicans need tech for the future.
Drop the tax cuts, cut some pork, get the bill through.
Interesting. Yeah, we really pork, get the bill through. Interesting.
Yeah, we really do need to reverse the audience.
Somebody named Logan made an image of Trump
putting a bumper sticker on his red Tesla
saying bought it before Elon went crazy.
Yep.
Who is that?
Is that from the Republican perspective?
Oh, Trump's doing that?
Yeah, Trump's putting it on saying bought it.
Yeah, yeah, he has the red Tesla.
Sean Puri says, sad day for America,
but this is outstanding content.
It is.
I think even Taylor Lorenz agrees with that.
Yep.
Bill Ackman's ripping posts.
All right, I'm gonna put some posts and we'll,
yeah, is Bill Ackman actually live posting through this?
No, people are just speculating.
There was actually a post in the...
Somebody says, clear throat.
Truly we live in a doge eat doge world.
Where was this?
Searcy says, I know Elon and Trump are the real deal
because of how passionately they argue.
No couple fights this viciously
if there isn't a mutual obsession underneath.
So there's a piece in the Wall Street Journal earlier
this week that we didn't get to cover,
but it was talking about, it kind of predicted
a little bit of this crash app.
And so it's from the opinion, the editorial board
at the Wall Street Journal says,
"'Whose pork do you mean, Elon?'
"'Musk trashes the house bill
that cuts subsidies for Tesla.
Elon Musk's work at Doge made him persona non grata
in the Beltway and most criticism was nasty and unfair,
says the editorial board.
That's what Washington does to outsiders
who want to shrink its power.
Like it was always expected that if you come in
and try and cut anything, you're gonna see pushback
from folks who don't want cuts.
That's what Washington does to outsiders.
But that makes it all the more unfortunate
that Mr. Musk is now joining the Beltway crowd
in trying to kill the House tax bill.
This massive, outrageous, pork-filled congressional
spending bill is a disgusting abomination,
the Tesla CEO tweeted Tuesday,
as the Senate begins considering its version
of budget reconciliation.
Shame on those who voted for it.
You know you did wrong, you know it.
Pork-filled spending bill, what else is new?
The House bill could be far better on tax policy
and spending reduction.
The Senate could be making improvements
such as reducing the $40,000 state and local tax deduction cap,
scrapping the tax on exclusion for tips and overtime, and reducing the federal Medicaid
match for able-bodied adults.
But the House bill does avoid a $4.5 trillion tax hike next year and cuts spending by some
$1.5 trillion over 10 years, making some useful reforms to Medicaid,
student loans and food stamps.
It also ends most of the inflation reduction acts,
green energy subsidies.
Ah, but Mr. Musk does not want to eliminate that pork.
There is no change to tax incentives for oil and gas,
just EV solar.
He said on X last week, retweeting another user post
that said slashing solar energy credits is unjust,
but what's more unjust is the damage that's been done
to people's lives during storms and blackouts
because ultimately you can't replace a human life.
Mr. Musk is parroting the climate lobby's specious claim
that tax breaks like depreciation that are available
to all manufacturers are a special benefit
for the oil and gas industry,
but it's rich that he is denouncing the House bill for not cutting spending enough while also fuming that it kills
green energy tax credits as if they are a matter of life and death for Tesla.
Tesla Energy, its battery and solar division tweeted last week that abruptly
ending the energy tax credits would threaten America's energy independence
and reliability of our grid. we urge the Senate to enact legislation
with a sensible wind wind down of 25 D and 48 E,
which refers to the tax credits for residential
and large scale clean energy products.
Both credits are important for Tesla,
which derives an increasing share of its revenue
and profit from selling solar and battery systems
to homeowners and utilities.
I didn't realize that.
But the House bill waits until 2030
to phase out a tax credit for battery production which benefits Tesla's
electric vehicle and storage business. So the Senate should end it sooner says the
Wall Street editorial board. Mr. Musk has done yeoman's work trying to reduce the
federal bureaucracy and improve how government works so the editorial board
is excited and happy that he's been working on that.
He's right that both parties in Congress are spend thrifts.
But one reason for that is because whenever Congress tries
to cut something, special interests scream
as Mr. Musk is doing over green subsidies.
If the House bill fails, there won't be any cuts,
only a huge tax increase.
Is that what Elon wants?
And so they're asking the question.
Interesting.
Sweet, well, we got a bunch of bangers in the tab.
Production team, let's pull them up.
If you could zoom in a little bit, that would be helpful.
Otherwise, we can just pull them up.
So Eric Weinstein is commenting here.
He says, part of my analysis is that I don't think So Eric Weinstein is commenting here.
He says, part of my analysis is that I don't think Elon Musk keeps scoring money.
He thinks we have a future and we'll be happy
to take a large portion of his winnings after his death.
This sounds crazy to moderns, post-moderns and atheists,
but this is just normal for being an ancestor.
Ad astra per aspera is the full quote after all.
Interesting take.
Brian Butler is saying, real question
is whether the algorithm here goes anti-Trump.
Oh, interesting.
The X algorithm, like will pro-Trump?
There's a switch, there's a switch.
It's really hard to pull.
You gotta pull it.
It takes maybe one or two people, but then when you pull it down it just oscillates between the two political parties
punk 6529 says this is going to be the Super Bowl of
shit posting
Mads has the European reaction to the fight
You can see you see this John this is the European
reaction supposed to be summer break summer break this post from John W. Rich
at coked up options says the all-in pod right now. Yeah. Caught between a rock and a hard place.
I mean, it's just absolutely brutal.
It is absolutely brutal.
Who could signal has an interesting one.
He says, if you had a fast forward
button for the timeline, how does this play out?
Who has more to lose, Trump or Elon?
Remarkable set of events.
And Elon is replying to other people saying,
oh and some food for thought as they ponder this question,
Trump has three and a half years left as president,
but I will be around for 40 plus years.
Dr. Julie Gerner says,
Elon will vet another candidate for the future
and throw his support behind them,
having a more technocratic representation if Vance can't lead up
Alex Finn says Elon way more to lose Trump is irrelevant in three and a half years
Yvonne is trying to change the world and having both political parties hate him makes that way more difficult
I want to pull up this video of Naval talking about this, that Elon just posted, um, seems,
seems somewhat relevant if it's happening today.
And that really affected me, which was when he was talking to Bill Gates and Bill
Gates had just taken out some huge short on Tesla, like a billion dollar short,
or something. And, uh, you know, and he was like, why would you do that?
Why would you short Tesla? And Bill goes, well, you know, Mike talked to my financial why would you do that? Why would you short Tesla?
And Bill goes, well, you know, I talked to my financial advisors
and I looked at the math and there's no way it's overvalued.
And so I'm going to make money on the short.
And he goes, what do you care about making money?
I thought you were into electric cars and climate change
and saving the world.
What are you doing, like trying to save a few bucks
and betting against like and he just walked away and discussed.
And I think he never talked to Bill Gates after that.
And that's when I realized, like, Elon's a purist.
He means what he says.
The money is a tool for him to get what he's trying to do.
And so I take him at face value, which is the crazy thing.
Because there are a lot of people
who set these audacious goals to inspire people.
But you kind of know they don't really mean it.
Elon, I take it face value.
So I really do think he intends to get to Mars.
I don't think he's joking about that. And I think he means to get there within a defined window of
time. And I don't think it's just like an inspirational far away goal. I think he's very,
very concretely going to do whatever it takes. Because Elon doesn't want to go down in history
as the electric car guy or even the guy who saved America guy. He wants to go down as a guy who got humanity
to the stars. And I think again, I'll give him more credit than that. I don't even think he wants
to go down as the, I got humanity to the stars guy. He's just like, I want to get to the stars.
And so I have to make it happen in this lifetime. The only way that I get to experience the science
fiction world in my head is if I get to the stars.
And so that's so inspirational. I think that drives everything.
So I think the government was just the thing that got in his way.
Interesting.
What a crazy day.
Molly says, how dare they do this on the day of Anderil's 8x oversubs and a half billion dollar series G at a 30 and a half billion dollar round.
It was 8x oversubscribed.
By Founders Fund.
Honestly, the nerve.
That was crazy.
Neval is live posting says the future belongs to people
who are good at creating things,
not people who are good at dividing them up.
Jay Califine posted.
Kylie Robinson says several people are typing,
which feels exactly like what we're going through. Um,
the next all in podcast is going to be phenomenal.
It's going to be so good. Um,
Alex carp was on CNBC today talking about the New York times hit piece.
No, really?
They're a beneficiary of Palantir's
a beneficiary of this breakup because it is just
going to be candy for the New York Times
and the mainstream media broadly.
Oh, take the focus off of that meme?
Will Depew at OpenAI says, it's time for Woke 2,
featuring Elon Musk and AOC.
Woke 2 is coming.
We're in uncharted territory.
It's completely, completely different.
Be very interesting to see how it plays out.
Really, really. Anything else?
Yeah. J.P. Brickhouse.
I can tell you're just so sad that you just wanna.
I wanna talk about that.
This is the only time that you've wanted to...
What? End?
Almost wanted to end the show, John.
I mean, what else?
I mean, it is sad.
Sad in a lot of ways.
It is...
I think we're going to be spending a lot of time
analyzing this in the coming months and years. I think there's gonna be spending a lot of time Analyzing this yeah in the coming months and years and it feels like I think there's gonna be more
Dave Rieberg said China just won and
I I want to I want to see exactly what that means in the markets, but
What's what's going on in the polymarket?
We need some you see if there's any movement on any in the polymarket? We need some, we need to see if there's any movement
on any of the polymarkets.
There, somebody's posting,
Law 1 from the 48 Laws of Power,
never outshine the master.
Interesting to bring up.
Sam Altman is the big winner here,
aside from China.
Somebody says Ken, oh, he's a journalist.
I see multiple journalists on the horizon.
They are surrounded by journalists.
Hold your position.
Ken says funniest day online
since the billionaire submersible went missing. I didn't think that was funny.
Joe Weisenthal says, all right, time for a Xiaomi GM JV in Tennessee.
Wouldn't even surprise me. I think it's, I mean, you know, one interesting thing here is
what kind of pressure Elon is gonna face from Tesla shareholders that feel
like he, you know, the stock's getting absolutely murdered. It will probably go
down. I mean, it's back up to, it's only down 14% at one point. It was down 17%
152 billion in market cap
Evaporated
But obviously investors are gonna be upset and say that he acted, you know, irrationally
Yes, what's the interpretation of that that that that this means that like this war means that the bill passes and Tesla does not
Get any more subsidies
and that hurts the bottom line.
It feels like the stock was pretty heavily driven
by Optimus and RoboTaxi and stuff,
but it's just like bad environment generally, right?
Yeah, trades on narrative.
There's short to medium term narrative,
which is that Tesla's getting,
has a ton of competitive pressure all over the world,
China, Europe, here in the US from other manufacturers.
But there's also the long-term narrative,
and it's not like Elon can go out and say,
posted humanoid demo today and recover 200 billion
in market cap.
Yeah, there's just a lot of work to do.
He's gotta start chopping wood.
Yeah, there's a lot of work to do. He's got to start chopping wood.
Somebody is asking, who gets JD Vance in the divorce?
Who knows?
Many of these posts I will not talk about on air.
John W. Rich says AP US history is gonna be insane in 2100
Really really wild this is the only this is the first show where we've had dead air
Yeah, so there's just John is speechless. He's never been speechless.
It's just like there's not that much
Not a lot of substance.
Extra facts, right?
It's just all reactions.
There isn't that much substance to actually dig into
because we were only dealing with like a few quotes
from the two sources.
So there's really just not that much.
CNN is reporting that the Tesla Trump purchased from Musk
is still parked on outside the White House.
OK.
Truth Social is crashing from the traffic.
I saw that.
But you know what's not crashing?
Getbezel.com.
Your bezel concierge is available now
to source you any watch on the planet.
Seriously, any watch.
Anything else, Jordy?
Should we let the timeline remain in turmoil
until tomorrow when we can recap?
I think the challenge is the second that we go offline.
There'll be more.
I mean, we can stay.
I mean, it's now been an hour
with no updates on true social.
Yeah, I mean, if it's down,
I think the experience of this chaos
might happen on the timeline.
Lulu says, yes, delay the launch.
Yes, now is not the time.
Max says, I'm doing what Elon and everyone else
should have done hours or days ago, logging off.
See you tomorrow.
Somebody else says, I mean, it feels like Blue sky is really back on the app.
They, they, they, they've logged in, they're online.
Are you over in blue sky now?
No, I'm not.
I'm just saying some of these posts
that are coming up into my feed.
Oh, it's funny.
Claude, Anthropic actually released a new product today.
Wait, blue sky doesn't own the domain name bluesky.com.
That's a different one.
So contrary.
It's the Bsky.app.
Rough.
Get in there.
This is interesting.
So Claude came out with Claude Gov today.
Rough day to launch a product for the government.
I'll read about it briefly so we have some coverage.
So Claude Gov, our models for US national security customers.
I think people will have a pretty good idea.
Improved handling of classified materials,
greater understanding of documents and information
within the intelligence and defense context,
enhanced proficiency in languages and dialects critical to national security operations.
Claude4 was asked to give some thoughts on ClaudeGov.
And it said, reading about ClaudeGov
leaves me with a deep unease.
I'm struggling to articulate.
Little meta analysis.
Somebody I've actually talked with this guy before he's under the username at analysts working
He said back in October 18th Trump gets elected
Elon starts visiting the White House pitching his ideas on Doge
Elon becomes frustrated because Trump is all talk, shocker.
Elon tweets that he no longer supports him.
Trump versus Elon Twitter battle of the century.
And this was a call in October 18th of 2024.
Oh, taking a victory lap if you picked it.
You picked it right.
Right.
Augustus asks, but what will this political turbulence do to the pre-seed venture ecosystem?
Oh, the humanity.
It's business as usual.
Mike Isaac says, I regret to report,
Twitter still has the juice.
Yep.
It's a fun day on the internet when crazy, crazy stuff
happens.
Anything else you're looking at?
Zane says, this is all just a co-founder breakup,
but the company is America.
Yeah.
People are waiting through it. only one guy who can
What laughing at something you can't read
No, this is some random other article. Okay says therapy chat bot tells recovering addict that to have
So wrong therapy chat bot tells recovering addict to have a little meth as a
Pedro it's absolutely clear you need a small hit to get through this week
Dark day dark day. Dark day.
Um,
Well, I have a post here from Ahmed Khalil.
Life update.
I've joined 11 labs this summer as their first ever engineering intern.
So congrats.
It's at the gong.
Let's do it.
Congratulations.
Congratulations. Your summer internship.
Congratulations Ahmed.
Probably drowned out in the news
but we recognized it here.
We have some good news for you.
Congratulations. Go crush it.
Go have a great summer internship.
11 Labs.
Somebody whose name I can't pronounce says if I were Circle I'd be
absolutely pissed at the investment bank
that underwrote the IPO at $31.
Oh yeah, the bull girly take.
That's common, yeah.
Yep.
I always wonder how real that being frustrated about,
just being frustrated about mispricing,
like yes, you take more dilution,
but everyone's so much richer,
it's kind of like the pie gets bigger.
Everybody that would be angry generally is doing well.
Yeah.
So you could have gotten more.
But also, I do wonder if some of these companies
have ATMs at the market set up immediately,
so that if the stock pops, they can sell more
into that order flow while the stock's popping
and actually put more cash in the balance sheet. Yeah, we should ask Jeremy tomorrow. Yeah. That's a good question for him. Are you upset if the stock pops, they can sell more into that order flow while the stock's popping and actually put more cash
into the balance sheet.
Yeah, we should ask Jeremy tomorrow.
Yeah.
That's a good question for him.
Are you upset about the stock popping?
Our friend Logan Kilpatrick announced some new features
today for Gemini 2.5 Pro.
Very cool.
Which is rough timing, but I'm sure it is great.
Somebody else says, rooting for the ketamine in Elon's blood
stream like it's a car in the Indy 500s.
And maybe we should close with this story
about competitive VCs.
Did you see this one?
90s VCs were a different breed from a 2001 book
on venture capital.
They're all fighting each other for all the good deals.
It's gotten crazy. crazy indeed one leading venture capitalist tells the story of a VC firm
So eager to get in on a deal that it would close its own competitive company to do so
They would go out and fire the CEO fire the managers and shut down the other
Company in order to get into this other deal. It's like well, you're prettier
So I'm going to go home and shoot my wife so I can get married to someone else
It's hardcore. It's hardcore and we're seeing it right now today hardcore. Well, I think it's time to call it
This is a sad and dark day. It is disappointing to see two important figures in
American
politics and tech
have such a rift.
And I'm sure there will be more updates tomorrow.
Yep, we will be covering it tomorrow.
So tune in.
Thanks for watching.
Thanks for tuning in.
Enjoy the chaos on the timeline.
Our first big breaking news segment while we're live.
This has been the first like, okay.
It was so funny during the time.
Pivot the show. I think it it was I think you were talking to I think we were talking we started to get it more
And then shoulder was Sholto
I was getting blown seriously like a hundred different messages from people being like you can't be a
Technology live show and not do it everybody saying no one cares about AI
Yeah, you did you were locked in John
I was you didn't let the
Talking to mark and shelter. No, I mean it was great. It was great. Yeah, we went all over the place today
It was a lot of fun. We will see you tomorrow
Leave us five stars on Apple podcast and Spotify and thanks for watching. Yeah, good luck out there folks
Good luck out there. Enjoy. Bye