Big Technology Podcast - Anthropic's Labs Lead On Fable's Capabilities + Building AI-Native Products — With Mike Krieger
Episode Date: June 24, 2026
Mike Krieger is the head of Anthropic Labs and co-founder of Instagram. Krieger joins Big Technology Podcast live from the Big Technology AI Summit to discuss what it's like inside Anthropic the week the government forced the company to pull its frontier models, Fable and Mythos, off the market. Tune in to hear Krieger describe how working with Fable changed the way he builds (queuing up a full night of work before bed and waking to find it finished in an hour), why he insists Anthropic's safety warnings are material rather than marketing, and how Anthropic navigates being both a platform and a product as it competes with the companies building on top of it. Wired senior correspondent Lauren Goode joins as a co-interviewer. Hit play for a rare look inside the lab from the person building Anthropic's next breakout product.
AI Agent documentary: https://www.gravitee.io/ai-agent-documentary
Transcript
Anthropic is not just a massive model builder, it's a massive product builder as well,
with products like Claude and Cowork that have taken off like crazy over the past few months.
And Claude Code came out of a place that few of us know much about called Anthropic Labs.
And Anthropic Labs is an organization within Anthropic that is working on building the next level of frontier products with AI at the center.
And so today we are lucky to hear from the person
running that lab, Mike Krieger, who is the co-founder of Instagram and now the lead of Anthropic Labs
at Anthropic. We're going to welcome him on stage along with Lauren Goode of Wired, who will join me as a
co-interviewer. Mike and Lauren, let's hear it for both of you guys.
All right, Mike, so chill times in Anthropic land.
Nothing going on.
Slow week.
Slow week.
You want to start, Lauren?
Yeah, first of all, I mean, we want to talk about what you're working on at labs and explain your role to folks.
But I want to ask you first, how close are you right now to the situation with the White House?
Less than in my CPO role.
So I transitioned about five months ago into this Labs role.
I think in the CPO role I would have been deep in it.
Now, obviously, you know, we want to restore it.
And as a product person, I want to make sure that that gets access,
but I'm less close to it than, you know, in that sort of C-level role that I had before.
Okay.
Alex, you have a follow-up?
Oh, yeah.
I have like eight follow-ups.
Definitely.
Well, there was a guy named Ben on X who said,
I will mail Anthropic an original copy of my long-form birth certificate if they will
enable Fable for me again.
I sound like those lunatics who are obsessed with 4o now.
Will you take Ben's long form?
I don't know that we'll take his long form.
But it has been interesting.
I mean, Fable was only available for a few days, but every time I've
tweeted since then, they have not read whatever I was tweeting, and they've mostly been like,
"Bring back Fable." Which, like, at Instagram we got rid of Gotham. Do you remember Gotham, the filter?
And then for the next eight years, all I heard was "Bring back Gotham." So
it struck a nerve. But Fable will come back before Gotham did. But yeah, it's clearly
the folks that have gotten to use it and started incorporating it. It's actually really interesting.
I've learned to not really trust day-of or even week-of model reactions. You don't really know
until you've put it through its paces.
And so I almost just completely block out the noise
in the first couple of days of any new model release.
Because I don't know, everybody has maybe their toy example thing
that they like to do with the new model.
But it's hard to actually put it through its paces
until you've actually had real work done with it.
And I think people were just starting to do that.
And then, you know, we had to sort of pull back Fable.
But I remember in December when we put out Opus 4.6,
it was like this interesting time
where everybody went home for the holidays
and a lot of people had that week off between Christmas and New Year's.
And then they came back and were like, oh, I spent a lot of time and I really get why Opus is good and I'm going to do it.
So I don't think Fable has had that opportunity yet.
But despite that, though, I mean, this was a pretty big reaction from the Trump administration.
And I think everyone here, especially if you were listening to the Alex Stamos session, understands what's going on.
But this happened within a few days of the model release.
If I can give Wired some credit, Wired just reported last night that it was due to Anthropic's
relationship with, or having given access to the model to, SK Telecom, which could have raised
flags within the administration. How surprised were you by how immediate that backlash was?
Yeah, I think the sort of reaction decision was surprising and we were sort of immediately engaged
with them to, you know, restore access as well. And so at the same time, you know, the one thing
that we think a lot about internally is, you know, there used to be a poster on the Facebook wall
when we were still there that was like, "Every day feels like a week."
And I think that's becoming true in AI.
And I think a good thing to remind ourselves of in general in the industry is we're dealing
with unprecedented times.
We're dealing with new situations and they can develop really quickly as well.
And so I think also developing the capabilities and the connections to make sure those
conversations can happen quickly is really, really important.
I have a new motto suggestion for you, for the wall.
Move fast and jailbreak things.
I don't think they're going to use that, Lauren.
Okay.
You know, we just heard from Alex that some of these capabilities have been available on, you know, previous models.
Or, you know, you can find bugs with other models that are available today.
So why do you think Anthropic got singled out on this front?
I don't know why, like, Fable specifically was singled out, you know, again,
Fable being the non-cyber-intention model as well.
I think the thing that does change over time is there's, you know, the capabilities
if you knew what to prompt and knew what you were looking for, and then there's the capabilities...
I think, you know, I'm not sure I resonate with the "juicing high school athletes" metaphor
from Alex, but, you know, the uplift that you get. Uplift is a thing that we think
about a lot when we think about model safety. So if you look at our model cards, for example,
one of the ways we look at risk in the bio domain is comparing the uplift from a layperson
using the model versus, you know, an expert, or a layperson just using the internet, and
seeing what the comparison there is. And so that is one trend that, you know, has been
progressing as the models get more capable. And so maybe less why Fable is singled out and
maybe more what the overall trajectory is. Interesting. So we just had Ara Kharazian, the lead
economist at Ramp, on the podcast. And one of the interesting things that he said was, you know,
we had during the Pentagon situation, there were these headlines, okay, this company will not
use Anthropic anymore, but actually the data from Ramp shows that spending actually increased
to Anthropic models. It was apparently a good publicity moment for the company. So I think it sort
of plays into a debate that we have here on the show about like how much of this is like real
concern for the issues and how much of it is, you know, from Anthropic is marketing. And like,
we have somebody from Anthropic here who can actually shed some light on that. We just had
Stamos talking a little bit about it from his perspective on the security side, but we're lucky to have you
here today. So is it material or is it marketing or some combination? I mean, I think it's one of
the hardest things, to really deeply believe that something is true and not marketing. And it's
not even just Anthropic. I think, you know, people are generally right to be
skeptical of any company saying anything, and you should put it through your own filters as well.
But for me personally, I was like, no, but it's real. And we both deeply care about
safety and, to the extent that we are being vocal about anything, it is to either
help paint the picture of what very likely is coming, or we believe is coming, or what we've
already seen and spotted. You know, for example, in the Mythos case, really just looking at
vulnerability scanning and bug finding, and doing it in partnership with companies that
were in that Project Glasswing initial announcement. And so the technology is also,
you know, it is doing
really incredible things, and therefore even calling out what we see as what is happening,
I think, can seem hypey.
I wish I could press a button and make everybody believe that we are not being hypey.
I realize that's not the reality that we operate in, but at least from my perspective,
we try to call it like it is.
My understanding is that within labs in particular in research and development at Anthropic
that you're using the best AI models to actually build new products, to prototype new products
and sort of test out your thesis.
So is this ban essentially now limiting your ability to do that within labs?
Yeah, I mean, definitely Fable is the best model I've ever used.
And it's not to say that work has stopped, but
we are using models that are less good than that.
And it is also, I mean, maybe the reverse of distrusting the first-week response is what happens when you don't have
Fable. And obviously the Twitter reaction from the people that had already kind of
gotten into the model and were using it was strong. But I'd say even in my personal use, I'm like,
oh, I'm on Opus 4.8 and it's good. I'm still productive. I'm doing work. But,
and we can go into how my work changed with Fable or, you know,
the models of that sort of family, it is noticeable for sure.
Yeah, I think we'd like to know
that. I mean, you know, we, the public, had access to Fable for like a half a minute.
Groups in Project Glasswing have had access to Mythos.
We don't really know what the difference is between using a model that you can use today
and using one of these super Anthropic models.
So what actually could you do differently with a Fable or a Mythos?
I think for me it's the sort of scope and scale of delegation.
And all these things are really imperfect.
Like people say, oh, is this now a level five software engineer
or level six? But anybody who's used these models extensively knows that they're still spiky
in capabilities, right? In some ways, in many ways, they're a better engineer than me. And in other ways,
I was complaining today that it had missed a descender, like on the G, that bottom part is called
the descender. I'm like, how did you put that in the UI? And it's clipped. And of course,
there's vision capabilities that need to improve and there's debugging capabilities and
there's sometimes even just sort of human common sense that, you know, we're way better at than
the models are.
But overall, I think the big shift for me working,
and it was really interesting because it sort of coincided
with me going back into a builder role.
So I really got to see going from using these models
as an executive, where you're trying to make the most of them,
but you're not going to have it write all your email.
And I think strategy still needs to come from you,
and then you can use the models to sort of pressure test it.
But going back into a builder role
and going from, okay, I am delegating
chunks like, please fix this bug, or I'm thinking of implementing this feature, let's go back and forth,
to something that ends up being much more sort of, all right, like, I got this bug report from one of our users,
or I have this notion of something that I want to build, like, can you sketch out two or three ways in which we could do it?
Right, that seems plausible.
Often I find, actually, sometimes it'll give me the sort of explanation or proposal and be like,
okay, that actually is over my head.
Like, you are clearly way smarter than me, like, explain it to me like I'm maybe not five, but at least, you know, not you.
and it will sometimes explain it that way, but then it will go build it and get it right
at a very, very, very high rate.
And I think that starts really changing how you operate.
Like I moved much more to, before going to bed, making sure that I had queued up for
Fable enough chunky work to last
what I would have called the whole night. And I would check in later and it had gotten it done in an hour
and it was just hanging out for the next seven hours.
But really delegating much more of a goal than just a...
So one task, for example.
Yeah.
Like, give us an example of one task you would hand to it.
I mean, here's a kind of crazy one, which is for the programmers in the audience.
Like, I had written one of our labs projects in Python.
That's like the language I know.
Well, Instagram was all Python.
And for like some not super exciting reasons, we actually needed it to be in TypeScript to deploy it.
And I was like, all right, that's going to be like, you know, at Instagram, we for years talked about moving from Python to PHP, or Hack, the Facebook language, after the acquisition.
And at least when I was there, we never did.
But basically, we have a feature called dynamic workflows where you can have it also break down the task into a lot of sub-tasks.
And I trusted it to not just do the individual action, but here's a whole language conversion of hundreds of thousands of lines of code at that point.
Go off and do it.
Go plan it.
Go execute it.
Go verify the work.
Double verify the work.
And then I came back to the work being complete.
So that level of like this is a big sort of chunky task.
So you're basically saying it was faster.
It did it in an hour, you're guessing,
with Fable,
compared to what it would have been before.
I think the main difference is in the past,
it would be like, "Great, I did it."
And you'd be like, you kind of took a shortcut here,
or this is not quite right, or I need to go verify it,
or like, oh, you cut this corner.
It's like the managing interns thing
that everyone's been saying for the past year.
Which is very offensive to interns, by the way,
but yes.
Yeah, exactly.
I don't know.
Have you managed an intern?
Yeah, that's true.
And you were saying it was more correct.
So it's faster,
it's more accurate, more reliable,
and then, according to the U.S. administration, dangerous.
I think the other piece is that it has like a greater,
theory of mind is the wrong word,
but sort of like theory of project,
so that it's less, you know,
oh, I'm going to make this change,
and it'll say, great, I'll make this change.
But really, if you've done like software engineering at scale,
the best engineers kind of keep in mind
all the disparate parts of how this thinks,
and they also see around the corners,
like, I can make this change,
but if I don't do it in this way,
then the next change is going to be
incrementally harder. And I think that's been a significant difference I've seen in that kind of
class of model. So I think when we talk about Anthropic Labs, right, people think of Claude
Code because it is really a breakout product. And it sounds like you've been tasked with basically
figuring out what the next Claude Code is. Would you say that's an accurate description of what
you're doing at labs? And also why does Anthropic need labs? Labs. Yeah. It's also maybe worth thinking
about why we needed labs in 2024 when I arrived and why we needed labs today, because I think
that the answer kind of shifts. I started Labs, the original Labs team, with Ben Mann, who's one of
the co-founders of Anthropic, in my third week at Anthropic, and it had been something that had
been bubbling under. And at the time, the reason was really different. All of our product
engineering, the team was 25 people, and we didn't have the models, really. Like, we had Claude,
when I joined it was Sonnet 3, like, even Opus 3. Those were for their time good models,
but you weren't going to, they were not even interns, right?
They weren't even IC3 engineers.
So if you have a team of only 25 or 30 engineers,
they are working on like the next incremental thing.
And we were feeling like the models are starting to get better,
but we don't have any products that sort of show that off.
Like a good litmus test for me is when we get ready to release a model,
do we have either a product or a demo or some other illustration
of something that is very different?
And it gets harder over time.
Like with Fable, you know, even
illustrating that we can task it for this longer amount of work.
So really Labs at the time was, let's make sure
our products don't fall behind the model exponential that's happening.
And yeah, so Claude Code came out of that initial one
because in the rest of the product,
people were thinking about coding,
but nobody sort of had the space to go and think about,
well, what if we totally change the form factor
and we embrace the fact that the models were going to evolve in this way.
And the two most useful thought exercises
we do in Labs: one is, visualize the gap between what the models can do today and how most
people use them. Can we close that gap? That's one. And the other one is, imagine what the models are
bad at now that they're actually going to be really good at in six months. And let's make sure we have a
product ready for that by then. I think those are the two guiding questions for Labs. And then also out of
that first incarnation came computer use. Computer use was different, though, because when we built it,
it was really bad. Like, we tried a bunch of products with it. And this was around, you know,
Sonnet 3.5, and we'd be like, Claude, can you help me, you know, clean up my desktop?
And it would click the thing, and it would delete the file.
You're like, this is not safe for release.
We're definitely not going to go and build this or ship this.
But we had that product so that every new model that we'd release, we'd first check it
internally and say, did computer use get better?
And we'd tell the research team how it'd gotten better or worse until the moment where
we said, it's good enough.
We're actually going to put a product out around this.
It also gives you this sort of beacon into the future that then you can kind of measure
your future products against. But then compare it to now. So we have a thriving product team.
There's Cowork. There's, you know, Claude Code has grown a lot. We have our platform. And now I think
it's actually much less about none of these product teams are doing this sort of thinking. And I think
it's much more that the models are advancing really quickly. And even our capability to interact with
them needs to evolve. So one of the things we collaborated on between Labs and Claude Code that we shipped
today is Claude Code artifacts.
Having Claude Code not just be able to type back to you, but also sort of draw a picture
or give you an illustration.
And that partially came from spending a lot of time in Labs saying just a text box and a big
text response is not going to cut it anymore.
Like when I mentioned that the models feel like they're way smarter than me when they
talk to me.
Sometimes I'm like, can you draw me a picture?
Because this is what I actually need to fully understand this.
But it's really what we've been thinking about is, you know, yes, we have a lot more
products.
You know, we actually have a lot of consolidation to do in our products.
That's another initiative that we have.
But within that, we still have an opportunity to make things much more accessible to a person that does not spend all of their time thinking about prompting and the exponential and the difference between high, low, and medium effort.
Like, there's a lot we can still do there.
But Mike, so it puts people using Anthropic models in an interesting place, right?
You know, Cursor, I think, just sold for $60 billion to SpaceX.
And someone put this meme on Twitter that, like, you know, Cursor would have sold for $300 billion if it
wasn't for this guy. And it's a picture of Boris Cherny, the person who created Claude Code.
And so for companies that are going to build on top of Anthropic technology, you know,
they're going to wonder, do I want to partner with Anthropic or is Anthropic going to go
ahead and build the product that I'm going to want to build, potentially even after partnering
with them. Yeah. I mean, we'll take the agentic coding side, then I think the broader
aspect of, you know, being both a platform and a product, I think is really interesting.
When we take on projects, the goal is often to sort of push that area of the industry forward.
So, you know, there were AI coding editors, and some of them were really good.
But nobody was quite thinking about it in as free-form a way as we got to think about it with Claude Code.
And now a lot more products have that flavor than I think would have otherwise.
And so I think, and you can call me out on this, Alex,
if we're ever entering an industry where it's like, all you're doing is the same thing everybody else is doing,
but you've got the Anthropic brand,
I feel like that's a bad use of our time
and a bad use of our either labs or product team time.
Like if we're going in somewhere,
it should hopefully be to say,
all right,
we think that the direction of travel is this way.
We can build a product of that.
And then by the way,
there's no world,
nor should there be a world where like all the products
are Anthropic products.
That'll be a bad world, right?
So like, that is hopefully either creating new space for companies
or sort of showing the way
where other products can incorporate.
Yeah.
It would almost be like working for a tech company
that has like social, messaging,
video...
Right,
Mike?
Yeah.
Okay.
Well, there was some question,
for example, when,
you know, Anthropic launched a product
that was seen as competitive to Figma,
and you had been on the Figma board
prior to that, and I think you stepped down.
Is that correct?
Yeah.
And so
it's a good question that Alex has brought up, I think,
where Silicon Valley is known for this really
healthy, vibrant, risk-tolerant
startup ecosystem, and when the big companies
start coming in with tons of venture capital
and, you know, a lot of resources,
people say, well, wait, are they essentially just going to steal my idea?
Yeah.
No, I think our dual existence, and it's something that other companies have to navigate,
y'all talked about Amazon a lot in the previous panel,
like they have to navigate this role where they are both the infrastructure provider,
they obviously have a very large e-commerce business,
they do video, but they also serve video.
And then, you know, by and large, customers can live in that dual world of like,
okay, I'm using their infrastructure, knowing that they are also using their infrastructure to do that.
And I think, you can talk to our customers and see how well we're doing it.
The thing I always try to do is like at least approach it with a lot of transparency.
So the Cursor example is an interesting one, where Michael and I talked a lot over, you know, time around here's where things were heading.
And, you know, similarly with the other products that we think about, I think it's a couple of things.
It's transparency.
And then it's shared building blocks.
Like, yeah, I think in general, and I actually don't think there's any cases where this is even true.
Like we're trying to build on top of the same capabilities that are available
elsewhere. The last time I was here in the Commonwealth Club on the stage was our
healthcare day at the beginning of the year, and we didn't ship, like, Claude healthcare, only
we have it, nobody else has it. We shipped a bunch of plugins and skills and
MCPs and complementary abilities. So that's how, I'm not claiming it's easy or that
it's a straightforward thing, but it is how we're trying to navigate what is
admittedly a complicated sort of situation. Speaking of startups, Anthropic is still
technically a startup but you're worth a lot of money. I mean what's
the latest valuation? Is it? 965. 965 billion dollars or something like that.
I sold Instagram for a billion, right? Start up. In 2010, but I thought of us, yeah. Right.
Financials have changed quite a bit since then. And yet Anthropic has positioned itself.
It is, you know, a PBC, and it's positioned itself as sort of a more ethical company around
building AI. And I'm wondering if you could talk a little bit about how you see that positioning
in Anthropic's role in particular, changing the culture of the Valley.
I think back to how Google, in the beginning of the 2000s,
really changed the culture of Silicon Valley in so many ways.
And how do you see Anthropic's culture now dictating this next era?
Yeah, that's a really interesting question.
Maybe I'll start, like, inside, and I think there's an external component, too.
I think the reason I joined in the first place,
so I was winding down my second startup,
and knew I wanted to go work at a frontier lab
because I'd started to use these models for coding,
and they were bad at coding,
but I could see that they were as bad as they were ever going to be
at coding, they were going to improve. And I had started building on top of these APIs. So the
startup I was doing was called Artifact, and we did sort of AI-powered news recommendations.
And I read a lot of Big Technology via Artifact back in the day. It was a lot of things we added, so...
But not Wired. You know, you guys had a really hard paywall, to be honest.
Fair enough.
We didn't do great on my subscription. Because I can get one for you.
Okay. It's actually really funny.
Making deals.
Yeah, making deals. It's the login cookies.
It was like really hard to keep people.
I know, I know.
Please escalate this to Condé Nast.
I know.
And email login is very hard to do in an hour.
But I was building on top of the APIs and be like, wow, okay, they're able to do really interesting things.
But what ultimately made me go to Anthropic was, they walk the walk and they really deeply believe in trying to make AI go well for humanity.
And that is like in the water internally.
and I think has been why I think the company has remained as cohesive as it has even as we've grown.
And I think that it's like a testament also to the co-founders there on how often they are talking about this as well.
It was a surprise for me, coming from a world where at Instagram we did a weekly all hands,
and we talked about product 95% of the time.
And maybe 5% of the time we talked about something else that was going on in the world or around the company.
Probably maybe underselling or like go-to-market.
Maybe it was like 80-20, but it was definitely a very, very heavy product.
And I remember at Anthropic, about six months in, myself and Kate Jensen, who's one of the leaders in the sales organization, did a joint all hands where we talked about, you know, how we're doing product and go-to-market together.
And people were like, this is so great.
I finally understand our product strategy and like what we have been doing.
It's like, oh, right, this is not, quote, unquote, a product company.
You know, it is a very mission-driven AI company with like a very strong sense of like why it exists in the world.
I think in terms of the overall impact on the valley, it remains to be seen.
I think positive signs that I've seen, or interesting signs that I've seen,
are a renewed interest in philanthropy across the board.
And I think that's something that has been written about.
And I think it will be an interesting sort of outflow.
Again, who knows how all of this goes.
But depending on how it goes, it could mean a lot of interesting new sort of philanthropic deployment.
And then I think the other piece is, you know, the conversation around how AI could or should go
is one that is happening in real time with the technology versus retrospectively,
which I think has been the case for other technology waves.
And I think that is a good thing.
Hi, everyone, Alex Kantrowitz here.
I want to tell you about a documentary I've made with Gravitee to explore the future of AI agent security.
To find out if we're truly ready for autonomous agents, I sat down with MIT professor Ramesh Raskar,
former White House CIO Theresa Payton,
Michelin's Group Chief Data and AI Officer, Ambica Rajagopal,
and Sharon Gai, a former executive at Alibaba.
They each offer unique insights into this evolving landscape.
We conclude with Rory Blundell, CEO of Gravitee,
to discuss the path forward.
With Gravitee leading the way,
join us on this journey.
You can watch the full documentary at the link in the show notes.
Mike, you know, you talked a little bit about Anthropic has this gap that it sees between the capabilities of the models and where everybody is building products.
And with labs, what you try to do is get ahead of that so you can show people what AI might be able to do now and six months from now.
So please tell us.
Please tell us what you're building, where you see the potential, and what people should be on the lookout for.
Yeah, throw a roadmap.
Oh, if I can throw an ant.
What you're building now, but also if you have a pie-in-the-sky, Elon Musk data-centers-in-space type ambition, I want to hear about that too. Tell us. Tell us everything.
Great. We have 13 minutes. Go. Exactly. The rest of the monologue in my product. I think maybe two themes I'm really excited about that we've been exploring a lot. The first one is giving Claude an environment where it has more agency and it also has more self-knowledge. And I'm going to unpack that,
because that's like a lot of AI words.
But I'll give you an example of where we are currently doing a bad job of this.
Like if you are in a Claude project and you make a file with Claude,
you're like, that's great.
Can you add it to our project?
Claude will be like, no, you have to go download the file and go drag and drop it into this thing.
And you're like, what?
Until yesterday, I would have said the same thing about Claude Design and Claude Code,
where if you're in Claude Code, you're like, cool,
I need a design for this thing that we're building.
Or you're in Claude Design and you make a mockup and you want to go build it.
It'd be like, cool, here's a zip file.
And you're like, what?
And I think, so that's a little bit of interoperability.
But in general, this theme of giving,
if you give Claude a lot of notion of its environment,
I was talking to actually a customer, like an API customer,
and one of the things that they were experimenting
was actually even giving Claude, like,
a secure version of their source code
while it's running in the agent loop in their product
so that if it hits an issue, it doesn't go like,
I don't know, I hit an issue.
It can be like, well, it's probably this thing,
you know, at least when it's talking to one of the sort of maintainers
of the software.
So that overall theme, and of course you have to do it with safeguards and be really careful about what you unlock with it.
It sounds kind of obvious, but it's actually night and day in terms of how expressive these products end up being able to be.
And you can even see it going from maybe like core chat or classic chat in Claude.ai to something like Co-work, where it's got a little bit more agency and it has a runtime and it's able to sort of understand a little bit of its environment.
But I think we are at like 10% of the journey about where we could go.
Actually, one of the reasons I think people got excited about things like OpenClaw is seeing a harness that is modifiable, where you can talk to it about things.
And you don't ever get the sense of like, oh, sorry, I can't do that.
You're going to have to go to this setting screen and turn it on.
It's just a thing it has access to, and hopefully with, like, the right guardrails and permissions.
So that's like theme one that I'm like extremely excited about.
And I think if we do it right should actually like transform all of our products like from head to toe.
So the other piece is, and I'll maybe like share like the, the, not the internal product we're working on, but like the phrase I got as feedback was, like, I think closing the gap, I mean, I talked about closing the gap between capabilities and reality.
I think it's also closing the gap between how people understand their own work and then how the actual day-to-day is to do that work.
I was talking to somebody internally who's on our privacy team.
And to move a ticket from like one queue through another one,
via the task tracker into another one was like eight different steps of copying and pasting,
of like manually moving a pretty, you know, like kind of annoying to have to do,
probably error prone.
I have to like keep spot checking it.
And we helped her with one of our Labs projects to basically like make that not a pain.
And she's like, ah, this is the first time in my career,
and she's, like, been working for 30 years,
where, like, what's in my head
and what I am using is, like, now this.
Like it is now closed.
And I want to, like, bring that feeling to everybody. Like, you know, of course Claude unlocked a lot of, you know, non-technical people being able to code.
But, like, we're still asking people to understand way too many concepts, like,
what is, you know, the difference between my sandbox environment and production, or connecting MCP as myself or others, or how should I store data? And of course you can't abstract
everything, but if you combine both of those themes, if you give Claude a lot of self-knowledge and you're creating an environment where it can work and actually solve
complex problems for people in, like, repeatable ways,
I get, like, very, very excited about that.
And your moonshot?
Not letting you off the hook, what's your moonshot?
Moonshot?
Yeah.
Nothing in space, although I guess we're, you know,
we're talking to SpaceX about spacey things.
But you're talking to SpaceX?
I mean, those are, right, right, right.
For compute, yeah.
It was exploring extra orbital.
What was the phrase?
Something about exploring like post-orbital world things.
Definitely not my department.
But yeah, there's stuff in.
Are you?
So Labs isn't working specifically with the team on compute?
Right, exactly.
Or chips.
Separate.
Totally separate.
Okay.
Okay.
So your moonshot.
Yeah.
Do you personally believe in data centers in space?
I had a conversation.
I'm far from a data center expert, but I talked to somebody who is a person who sends things
to space who is not Elon Musk.
And that's what you would say.
Yeah.
And they were really bullish, and I was, like, trying to talk about why, and it was basically,
like, effectively infinite
power if you convert it well and, you know, infinite land.
And I was like, okay, I can buy that.
I mean, I think they feel good about the shielding you have to do.
Again, clearly not my area of expertise.
But after that talk, I was like, okay, I see it, you know, even if it's going to take
a few years.
At first, admittedly, I thought it was a crazy idea in general.
But now I'm like, oh, I actually really can understand why this might make sense.
When you were talking earlier about the ways that the work in Claude is going to get compressed
in all those steps, I couldn't
help but think of tokens and how, you know, maybe it's good for your business model in the short term
if people have to take so many steps and use so many tokens. But tokens have become this unit
of economics that we're using to describe the industry now, and people are token maxing,
and now they're tokenizing. And one, I want to hear where you are on that
spectrum, if you're a token maxer. And two, is there a near future in which the industry is not
actually measured by tokens? You know, it goes the way of MIPS or dialogue
or some other, you know, there's some other unit of measurement
that actually defines the economics of this era.
Yeah, I think both of those are really interesting questions.
It was interesting earlier this year when you started hearing about,
like, companies that have like dashboards showing, like, who used it the most?
And we, of course, have, like, internal metrics as well.
And we found that there's not a lot of correlation between, like,
the person who's using the most tokens and, like, the person that I, like...
it would be an interesting thought experiment, actually, to do at your companies:
like, write down the 10 people that you think are most productive,
and then, like, get your top 10 token users and see how closely
they correlate. At least for us it wasn't that correlated. It seemed dangerous to sort of like
purely glorify the like maximum usage. Obviously it's like very gamable. But even beyond that,
I think it's, you know, yes, you can ask Claude to do 10 different variants on something.
But if you thought about it deeply, maybe you would choose two that you thought were most
promising, and a third one if you then had, like, some iteration on that as well. So I would
not say I'm, like, a token maxer. Actually, the tokeniest thing was that conversion thing I did, which was just, like,
a couple million tokens. There's, like, a lot of tokens that it took to convert
the thing from Python to TypeScript.
But I think people are being more thoughtful about these different pieces.
And one of the things we look at, whenever we look at a model launch,
is not just model intelligence,
but we're also really thinking about model intelligence and effort
and token efficiency as that combination.
And I think that's a big lever we have to improve,
is how do we continue to be more and more token efficient for a given task
so that you can also, hopefully you don't have to think very hard about this.
We can do this automatically.
But we're able to tune the solution to the problem a little bit more.
And then to your second question, yeah, I, you know, when I was still in the CPO seat,
I was thinking a lot about sort of outcome-based pricing as something that would be really interesting to do.
If you could do it, of course. If you talk to, like, the Sierras and Fins of the world that have, like, a really clear,
like we kept this, you know, we were able to solve this customer request and not have it go escalated.
Like, that's really clear.
It gets so much fuzzier on these, like, tasks that we actually ask Claude these days.
Like, I had a strategy document.
I used Claude to critique my strategy document.
Like, what was the outcome?
It's like, well, I don't know.
It's like, tell me how the strategy goes six months from now.
It feels like it's going to be very hard to capture that as well.
But I would like to see some more experimentation around.
Can you better capture what it's worth to the individual and then, whether the company,
and then can we find the best way to do that as well?
And I guess the most concrete thing we've done to move towards that:
we have a product called Claude Vantage agents where we'll run all of the infrastructure for you
in terms of doing all of the, you know, agentic harness and calling the tools, et cetera.
And you can either do it in sort of the normal mode,
which is you give it tasks, it will go through tokens,
it'll tell you when it's done,
or we have an outcome-based mode where you can say,
here's what good looks like, here's a rubric,
go and do it and it'll go off and make it more outcome.
So like if everybody had moved on to that API,
then I think maybe we could have a different sort of output-based pricing,
but we'll see how that gets adopted.
John or the guys in the back, do we have the random image?
Can we show the random image?
If we can, great.
I'm excited, the random image.
Oh, here it is.
Okay, it's just because we didn't
have a good label for it. So we just called it the random image because it might come up at any point.
But this is a chart from the Financial Times. Speaking of utility, where it shows the amount of
app releases that have come out, which are skyrocketing, and then apps with significant usage,
which seem to be going down, and app reviews, which seem to be going down. So Mike, I'd love to hear
you respond to what we're seeing in the image here. Is it possible that like everybody's coding
and releasing, but we're not really seeing a big boom in productivity? That's really, I mean, I think
there's definitely a power law in app usage in general. It'd be interesting to see if any of those
app releases became one of the apps with significant usage.
We could take it. We could take it down. Yep, go ahead. I think it ties into something I've been thinking a lot about,
which obviously my background is in consumer, and I've been wondering what the consumer AI
breakouts will end up being. And I don't know that we've seen a lot of them yet. And I think part
of it is, you know, I don't know how far back that chart goes, but when we were releasing Instagram,
it still felt a little bit wild west in terms of the apps.
People were excited about apps and like two kind of random people released an app
where we were able to get to like number one in photos and video within three months, right?
I think that is much harder now when you think about how consolidated the top 10 is
and how much time is spent on like the TikToks and reels of the world.
It's a lot, right?
And so I think getting that breakthrough consumer experience I think is really, really hard.
So I think that is as much a story about how sort of consolidated consumer,
products are these days, number one. Number two, how entrenched or how powerful it is to have
that sort of data, data gravity, like the data gravity of something like your Google Docs or
in your Google Doc. So even if somebody has a like 2X better AI powered, you know, doc editor,
you're going to move all your stuff? Maybe, probably not. So I think that it speaks to, you know,
the things that are sticky. I think about a lot is like the hard stuff is still hard, like
making something people want still really hard.
We have amazing models internally, and not all of our products work, right?
And so I think that's a bullish sign for product people like me
because it means that I think we hopefully still add value.
But I think that chart is maybe another place.
It's harder in many ways than ever to break through,
even if you can code more quickly.
And could we have done Instagram in a month instead of three or four? Probably.
But we got there after, like, a long, winding, twists-and-turns process.
You had, I think, 18 people at Instagram when you sold it?
13.
With these tools, how many people do you think you would have had?
It's really interesting because of those 13.
Just give us a number.
Everyone's like, oh, one billion dollar, one person startup.
How close could you guys have gotten?
I think we could have gotten there with like four to six, you know?
Okay.
Yeah.
Or the thing that we would have done, if we'd grown it,
we'd be able to do things in more than a single track.
Like, Instagram was, if you ever watched my five-year-old play soccer now,
by which I mean like the ball is there and every single person runs to the ball?
Like that was our product team.
It was like, video, go.
And everybody like goes and works on the one thing.
And like we'd be able to like play positions.
Like Android we built in about a month for Instagram.
We could have done it probably in a week with the models.
And to build Android, we took everybody off iOS.
And we all like relearned to code Android OS.
And then we went off and did
that, and like for that whole month we were barely shipping updates on iOS.
So I think you can be a lot more, actually a really good example.
There's a Labs project I have internally that helps accelerate how Anthropic engineers
like code and do code review.
And that project, I am maintaining an iOS and an Android version of, and I basically
have the Claude that works on the iOS one basically, like, ping the Android one and be like,
hey, I implemented this.
Sorry, Android users, it's still the second one, even in the AI world, sorry.
And then the Android version is like, okay, I'm going to do this.
Oh, that doesn't count because that feature doesn't make sense here.
I'm going to drop it.
And of course, we wouldn't have been able to delegate all of that on Instagram,
but we sure could have done a lot by having sort of platform parity.
Like this dream of platform close to parity is now actually quite doable.
You're probably going to get calls now from the remaining six or seven people
on your Instagram team going, was I, did I make the cut in the new era?
Also, it sounds like you probably could bring Gotham back now if you really wanted to.
I forget if we eventually, I mean, I think for April Fool's maybe we brought
it back one day.
Yeah.
Do we have time for one more question?
Yeah, yeah.
My last question for you is you worked on a product that now as it has evolved is in many ways
ethically fraught because of some of the harms that people are concerned about with children.
And when you talk about the fact that there hasn't really been a big breakout consumer app
for AI, I think there has, and it's chatbots, right?
And chatbots have also led to some real dangers and harms for young people.
And so when you are building in Labs, how are you thinking about the, you know, the potential
harms and the risks that come with just making this technology that much better?
Yeah.
I mean, I think there are certainly products that we have either prototyped or conceptualized
and been like, this product, this sounds so hype-y.
I hate this.
But like this product, if shipped would be bad for the world or like would nudge people
in the wrong direction.
Or even if we did it right, the, like, wrong or, like, more morally fraught version of this would
be actively, we think bad.
And so I think asking that question a lot internally makes a difference.
And it's a luxury to have core products and models that are doing really well.
So that's in some ways an easy decision, even if we think it could get a lot of use.
But yeah, I think going back to an earlier conversation, I think front loading it is really valuable and really thinking through like
it is now more normalized to have people at a company, and definitely Anthropic does, like, economists thinking about the impact of the thing that you're building
on the world, and that just was not the case along the years on most of social media, I think.
Mike, it's always great to speak.
Thank you again for bringing your insight today.
And let's see it again soon.
Let's hear from Mike and Lauren.
Thank you.
Great job.
Thank you.
