Tetragrammaton with Rick Rubin - Greg Brockman (Part 2)
Episode Date: February 28, 2026. Returning to continue his conversation in Part Two, Greg Brockman is a cofounder and president of OpenAI, the company behind ChatGPT. ------ Thank you to the sponsors that fuel our podcast and our team: AG1 https://DrinkAG1.com/tetra ------ Athletic Nicotine https://www.AthleticNicotine.com/tetra Use code 'TETRA' ------ LMNT Electrolytes https://DrinkLMNT.com/tetra Use code 'TETRA' ------ Sign up to receive Tetragrammaton Transmissions https://www.tetragrammaton.com/join-newsletter
Transcript
Tetragrammaton.
The first website I ever built that got users was this amazing experience.
I had this idea to build what I call the reverse Turing test.
So in the Turing test, to determine whether a machine is intelligent, you have a human who talks to
another human and talks to an AI.
And the goal is to figure out which of these are the human, which is the AI.
So I built a website that turned this into a competitive game where you have both humans
are talking to each other, and they're each talking to an AI. They don't know which terminal is
which. And the objective is to figure out which of your terminals is the other human before the other
human does. And so the optimal strategy is to ask questions that kind of discern, am I talking to a human
or a bot? But while still acting kind of bot-like, because if you act too human-like, then you'll lose
because the other person will figure out who you are. This was 2008. And I just taught myself how to code.
I'd gone online to W3Schools tutorials, did HTML, JavaScript, PHP, CSS.
And I remember that I built this game, and it's a two-player game.
So I was sitting in the lobby.
So in case anyone showed up, they'd have a good experience and have someone to play against.
Lobby where?
I made it so that there was like a game lobby, just on my website.
I see.
So I just sit there, just waiting, waiting with this open screen, just sadly waiting for someone to show up.
And for like two weeks, no one showed up.
But then one day, it was the most glorious day.
I got 1,500 hits from StumbleUpon.
Wow.
Yeah, if you remember StumbleUpon, it was like an early, you know, site that would send people to random websites.
And it was an amazing moment where that day, there were always three or four games going constantly.
I'd sit in the lobby and someone joined within a couple minutes.
And I remember this feeling that this thing was in my head and now it's in reality.
And now these people are all enjoying what I built.
And I want to keep chasing it.
Yeah.
The more you played, would you get better at that game?
Yes.
Yes.
I got quite good at it.
And it was interesting, actually.
I focused a lot on improving the bot.
My bot was like very rudimentary
and the way that it would work is I kept a database
of all the previous games.
And then I tried to, in any particular conversation,
match that conversation to the most similar one
and then reply with what the human had said then.
And it actually kind of worked, right?
For any sort of chit-chatty thing that it had seen already,
then you have a pretty good database of replies.
But anything more sophisticated,
and of course it would just fall apart.
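The bot described above is essentially nearest-neighbor retrieval over past games: store every (context, human reply) pair, then answer a new message with whatever a human said in the most similar past context. A minimal sketch, assuming Python (the original was PHP) and invented names throughout:

```python
# Minimal sketch of a retrieval chatbot: reply with the human answer from
# the most similar previously seen conversation context. Illustrative only.
from difflib import SequenceMatcher

class RetrievalBot:
    def __init__(self):
        # Each entry: (conversation context, the human reply that followed it)
        self.memory = []

    def record(self, context, human_reply):
        self.memory.append((context, human_reply))

    def reply(self, context):
        if not self.memory:
            return "hi"
        # Find the stored context most similar to the current one
        best = max(self.memory,
                   key=lambda pair: SequenceMatcher(None, context, pair[0]).ratio())
        return best[1]

bot = RetrievalBot()
bot.record("hello how are you", "pretty good, you?")
bot.record("what's your favorite color", "blue I guess")
print(bot.reply("hi, how are you doing"))
```

As the transcript notes, this works for chit-chat it has seen before and falls apart on anything novel, since it can only replay old replies.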
Can you remember at all where the idea came from for that game?
Well, the way that I got excited about AI was by reading Alan Turing's paper on the Turing test.
So this is his 1950 paper called Computing Machinery and Intelligence.
I was reading it shortly before building this game, and it had the most inspirational ideas in it.
Because first he asked the question, can a machine ever be intelligent?
And he says, look, I don't know what intelligence means.
Everyone's going to have their own definition.
So let's define a test.
He defined the Turing test.
But then he says, how will you ever program an answer to this test?
You will never program it.
It's just too hard to write down all the rules of this person says this and that.
Instead, you will need to build a machine that can learn its own answer to this.
You have to build a learning machine.
So if you can build a machine that is a child machine that learns like a human child,
and you can then have a human who gives it rewards and punishments as it does good things and bad things,
then that's how you will solve this test.
And here's the wild thing.
That is exactly what we've been doing.
This is like Alan Turing in 1950,
projecting how we will first build these unsupervised models that learn,
they sort of observe the world, have all this knowledge in them.
And then we do this reinforcement learning process.
We give the machine rewards and punishments in order to achieve the objective that you have in front of it.
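The rewards-and-punishments loop described here is, in modern terms, reinforcement learning. A toy value-learning sketch, purely illustrative and not any lab's actual method: the agent tries actions, receives rewards, and nudges its value estimates toward what it observed until it settles on the better action.

```python
# Toy sketch of learning from rewards: a two-action problem where the agent
# nudges its value estimates toward observed rewards. Illustrative only.

q = [0.0, 0.0]   # the agent's value estimate for each of two actions
alpha = 0.1      # learning rate: how far to move the estimate each step

def reward(action):
    # The environment: action 1 is rewarded, action 0 is not.
    return 1.0 if action == 1 else 0.0

for step in range(200):
    if step < 20:
        action = step % 2                  # early on, try both actions
    else:
        action = 0 if q[0] >= q[1] else 1  # later, exploit the best estimate
    r = reward(action)
    q[action] += alpha * (r - q[action])   # move estimate toward the reward

best_action = 0 if q[0] >= q[1] else 1
```

After enough trials, the estimate for the rewarded action approaches 1.0 and the agent reliably picks it, which is the "rewards and punishments" mechanism in miniature.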
And I remember talking to my co-founder Ilya about this.
And I was like, how did Turing know?
And he said, there's a reason that Turing is Turing, right, that he's just so smart.
And that's why, yeah, he really had the vision.
Why do you think it took so long from 1950 to now?
Very simple answer. It's compute. It just was not enough compute.
It's like no matter how smart Turing was, he did not have a computer that he could implement his test on.
Was there a feeling in the early days for you that AI was sort of a failed thing?
The ideas have been around since the 50s and it hasn't really worked.
So this for me was always a great mystery.
It was so clear, the Turing test vision was so clear.
It's like, that's the thing you need to do.
Because as a programmer, I have to understand the solution to the problem.
But the point that Turing makes is that you can have a machine that comes up with its own solution to the problem.
And I was like, think of how many problems I have no idea how to solve.
And maybe the machine could do it.
That's what I want to do.
I want to help that thing come into existence.
And I remember after building this game, I showed up at college.
at Harvard and there's still 2008.
And I was very excited to do research with a natural language processing professor.
Were you studying math at Harvard?
I was going to study math, and I got nerd-sniped by computer science, but my number one thing was
going to be math.
I actually was originally intending to do math, chemistry, and philosophy, triple major.
But I ended up realizing that I just loved the practicality of a building with computers.
And so I went to this professor and I asked him if I could do research with him.
he said, yes, no problem. And he gave me this parse trees problem. And I remember looking at the
parse trees. I was like, this is never going to scale. What is that? I don't know what that is.
Parse trees were like an old-school NLP, like natural language processing, approach.
So the idea of like, imagine you take sentences and you figure out like where the object is and where the
noun is, kind of like the things that people do in elementary school. Like linguistics.
Yeah, like linguistics as an approach for how you're going to do natural language processing. And you can get some
kind of simple looking stuff out of that because it'll make reasonable looking sentences.
But that's never going to scale to having a conversation like this.
It was just so clear to me.
I was like, this is not what Turing was talking about.
I'll go do things that are useful.
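For readers unfamiliar with the parse-tree approach being dismissed here, a toy sketch of the idea: hand-written grammar rules that diagram a sentence into subject, verb, and object, like sentence diagramming in elementary school. The grammar and function names are invented for illustration; real NLP systems of that era were far more elaborate, but they shared this rule-writing character.

```python
# Toy parse-tree sketch: a tiny hand-written grammar that diagrams
# "Det N V Det N" sentences into a nested (S (NP ...) (VP ...)) tree.
GRAMMAR = {
    "Det": {"the", "a"},
    "N":   {"cat", "dog", "mat"},
    "V":   {"sat", "saw", "chased"},
}

def parse(sentence):
    """Return a nested-tuple parse tree, or None if the rules don't match."""
    words = sentence.split()
    if len(words) != 5:
        return None
    d1, n1, v, d2, n2 = words
    if (d1 in GRAMMAR["Det"] and n1 in GRAMMAR["N"] and v in GRAMMAR["V"]
            and d2 in GRAMMAR["Det"] and n2 in GRAMMAR["N"]):
        return ("S", ("NP", ("Det", d1), ("N", n1)),
                     ("VP", ("V", v), ("NP", ("Det", d2), ("N", n2))))
    return None

tree = parse("the cat chased a dog")
```

Every new word and sentence shape needs another rule, which is exactly the scaling problem the transcript points at.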
And instead, I actually got into programming languages.
So I was very excited about that.
I took a class.
I wanted to do more research there.
And I think about programming languages,
and you have all this power of, like, a compiler is a computer program
that takes a different program and makes it better in some way.
So it usually takes it from a high-level form.
and puts it into a form the machine can understand.
Usually optimize it so it's faster
and really take the intent of a human
and translate into a form that the machine
can then take care of some of the details.
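The compiler description above can be sketched with a toy example: translating a high-level arithmetic expression into instructions for a simple stack machine that the "hardware" can execute. Everything here is invented for illustration.

```python
# Toy compiler sketch: turn an arithmetic expression into stack-machine
# instructions, then execute them. Illustrative only.
import ast

def compile_expr(source):
    """Compile an arithmetic expression into stack-machine instructions."""
    ops = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}

    def emit(node):
        if isinstance(node, ast.Constant):
            return [("PUSH", node.value)]
        if isinstance(node, ast.BinOp):
            # Post-order: operands first, then the operator
            return emit(node.left) + emit(node.right) + [(ops[type(node.op)],)]
        raise ValueError("unsupported syntax")

    return emit(ast.parse(source, mode="eval").body)

def run(program):
    """Execute the compiled stack-machine instructions."""
    stack = []
    for instr in program:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append({"ADD": a + b, "SUB": a - b, "MUL": a * b}[instr[0]])
    return stack[0]

program = compile_expr("2 * (3 + 4)")
```

The human writes `2 * (3 + 4)`; the compiler takes care of the details of ordering the low-level operations, which is the translation of intent described above.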
Because for me, that was the spirit,
is I want a machine that can solve problems that I can't,
that will empower me in ways that I am unable to reach,
that will bring me to new heights.
And not just for me, but for everyone.
And so I would do that.
I got very into the Harvard Computer Society
where we would build services for the Harvard community.
So we would host email and web hosting and other technical services.
We built different web applications for the community.
And so this was very much the ethos that I had.
And it really wasn't until I was already at Stripe where I was doing a startup and building things,
but paying attention to the community.
And I just kept seeing people talking about deep learning.
It felt like every day, if you went on Hacker News, which was this website,
that many engineers would post content on,
you'd see something about deep learning for X.
And I remember wondering what is deep learning.
At the time, it was basically impossible to figure out
because you'd go to deeplearning.com or dot org or whatever it was,
and it just said deep learning is a new approach to AI.
But didn't say what it was.
It didn't say what it was.
It made no sense.
But I remember that I had a friend in the field,
and so I went and talked to that person,
and he started introducing me to other people in the field,
And I just kept getting introduced to all my smartest friends from college because they were all in the field now.
How many people were in the field at that time in the world?
It was a small community, very small community.
I don't know the overall number; a thousand at most.
And it was rapidly growing because the thing that I learned is that there was this moment in 2012 that really unleashed the current deep learning revolution.
And in many ways, everything had been building up to that moment.
But this was the moment that really cemented the sense that there's something real here for many people.
And this was the creation of AlexNet, which was an image recognition paper that competed on this benchmark.
Explain what that is.
So the idea is that in I think 2006, 2008, a lab at Stanford created a competition where they gathered millions of high-resolution images from across the web.
At this point, we would consider it a small dataset.
At the time, it was massive and unprecedented.
And it categorized these images into a thousand different categories that humans would label, saying this is a specific type
of cat and this is a specific type of bird and this is a specific type of airplane.
So a thousand different categories of images.
And the goal was, can you create a program that can categorize a new image
into one of these thousand buckets?
So can you recognize whether there's a cat or a dog in an image? And people would compete
in this, and compete hard, and it would take all of these 40 years' worth of
computer vision research ideas, very similar to the feeling of those parse trees.
You would have these different techniques that would do, like, edge detection and things like that that are very specific.
And if you think about what are the rules for recognizing a cat, it's like, well, maybe you look for an eye and another eye, but how do you recognize if there's an eye? And do you look if there's a nose?
And then, but okay, the orientation could make it very complicated to actually program this.
Relationships.
Exactly.
So you have to talk about these relationships.
It's very hierarchical.
If you think about the process of you have to see how all these different pieces fit together and then how those pieces relate to other pieces.
very complicated. And so people were not getting very good results here. It felt like a total
impossible challenge. And it's a study of the real world. It wasn't really technical. It was observational.
Yes. Because the thing that people would do is that they would say, how do I think humans do this?
Or what's the process I have in my head? Or let's write down some symbolic way to tell the machine how to pursue a
process. And this was very successful in some domains, right? For example, chess is one where, in the 90s, we built great machines to solve it.
but very unsuccessful in other domains like computer vision.
There's only one page of rules for the game of chess.
Simple rules and a small search space.
So you could have a computer that would basically just say,
you know what, I'm going to look through all the possibilities,
and that is how I will win, which, by the way, was not enough for Go, right?
Go, simple rules, but massive search space.
And so you needed something more like human intuition on top of the search.
And for computer vision, you needed something that was entirely like intuition.
And so that's where the neural nets came into play.
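The hand-crafted "edge detection" techniques mentioned above boil down to convolving an image with fixed, human-designed kernels; the neural-net move was to let the network learn those kernels (and deeper ones) from data instead. A minimal pure-Python sketch of one such fixed filter, a Sobel-style vertical-edge detector (the image and kernel are invented for illustration):

```python
# Sketch of a classical hand-designed vision step: convolve an image with a
# fixed edge-detection kernel. Neural nets learn such kernels from data.

def convolve2d(image, kernel):
    """Valid-mode 2D convolution (cross-correlation) over nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# A vertical-edge detector: responds where brightness changes left-to-right.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# 4x4 image: dark left half, bright right half -> strong vertical edge.
img = [[0, 0, 9, 9]] * 4
edges = convolve2d(img, SOBEL_X)
```

A classical pipeline chained many such hand-designed filters and rules; AlexNet stacked layers of learned filters, with deeper layers combining edges into parts and parts into objects, the hierarchy described above.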
And so a team of researchers, Geoffrey Hinton,
Ilya Sutskever, and Alex Krizhevsky,
created a neural net that won this competition.
And it didn't just slightly win it.
It just blew everything else out of the water,
like a massive jump.
Did they have a different vision of it than everyone else?
Their approach was neural nets.
No one else believed in it.
I see.
And it's actually very funny,
the inside story on how that result came to be.
Because Alex Krizhevsky was a grad student in Geoff Hinton's lab,
and he was working on very fast convolutional kernels for GPUs.
So he basically was programming graphics processing units, GPUs,
which now are what people use for deep learning.
And everyone felt bad for him.
It was like, that's just an engineering project.
He's just writing these very fast kernels.
Who cares?
Like, we're off doing all this cool research.
He's just an engineer.
That's not valuable.
And he had some cool results on like this small image recognition data set.
And people didn't really care.
But Ilya saw that and he instantly knew what to do with these kernels, right?
He realized this is a breakthrough in the making because when ImageNet had come out, this big data set,
he had felt like this was this grand challenge that was just so impossible if you could solve it.
It would be so great, but you just need to be able to put enough compute into it.
And he sees these kernels that were going to use the compute of the GPU very efficiently.
And he said, we need to put these two things together.
Don't apply it to this other data set that you're using.
ImageNet is the thing.
And then Jeff Hinton's contribution was a management trick, because Alex Krizhevsky really hated writing papers.
He had a review paper coming up.
And Jeff told him, each week that you get a 1% improvement on the dataset, I will push back the deadline on your review paper by one week.
And he did this like a dozen times, two dozen times in a row.
Yeah.
And so it was just one of these things where it's just like,
Alex just kept grinding at the problem and the numbers got better and better.
And so you see the way the progress in this field happens is you need the right theory.
You need the right objective, the right sort of underlying approach.
You need the right engineering, right?
You need to really implement it.
You need to push hard on the problem.
You need to not give up even when it feels impossible.
And you need the right spirit, right?
You need to know that like it's worthwhile and you need to have that desire to keep going even in the face of everything else.
And so I think those three things together were what unlocked this particular result in this particular moment.
And when they submitted to the competition, everyone in the computer vision field was like, what just happened, right?
This impossible problem has basically now been solved.
What was the reaction to that?
It was one of these things where within the computer vision community, it was seismic, right?
I think people very quickly went from saying neural nets are a total dead end, like you're kind of a fraud if you're doing neural nets, to
only neural nets.
But there was a time
when you were a fraud
if you were doing neural nets.
Absolutely.
Yes.
Until the breakthrough.
That's right.
And actually the history
here is also very fascinating.
So there's this paper
you can find on Wikipedia
somewhere from 1995
that talks about the history
of the deep learning
booms and busts.
So it's really before
all the current waves.
And if you read it,
that the things they're saying
in there are the exact same things
people would say to us
throughout the whole history
of open AI.
Even back in 1965,
they would say these neural net people have no new ideas.
They just want to build bigger computers.
And so it makes you realize that history is written by the victors,
that there was a very concerted campaign waged against this whole direction,
saying symbolic systems, yes, neural nets, no.
And that the neural net people knew what they wanted to do.
They wanted bigger computers, deeper neural nets.
And that the symbolic systems people got in very cozy with funding agencies
and really just poisoned the well and said this whole thing is kind of BS.
And so that's what killed it in the 70s:
they claimed that it was overhyped,
that there are all these groups working on it
with no results,
and to put the nail in the coffin,
there was this result that showed that a single-layer
neural net couldn't solve a particular problem.
So therefore, the whole thing's dead.
And of course, the neural net people were like,
but just let us go beyond a single layer.
Like, we know what to do.
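The single-layer result alluded to here is usually cited as the XOR problem from Minsky and Papert's 1969 book Perceptrons: no single-layer perceptron can compute XOR, but a two-layer network can, which is exactly the "let us go beyond a single layer" point. A hand-wired sketch:

```python
# XOR: impossible for a single-layer perceptron, easy with one hidden layer.
# Weights here are hand-chosen for illustration, not learned.

def step(x):
    """Threshold activation: fire (1) if the weighted sum is positive."""
    return 1 if x > 0 else 0

def xor_net(a, b):
    h1 = step(a + b - 0.5)        # hidden unit 1: fires if a OR b
    h2 = step(-a - b + 1.5)       # hidden unit 2: fires unless a AND b
    return step(h1 + h2 - 1.5)    # output: fires if both hidden units fire

truth_table = [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

The hidden layer computes intermediate features (OR-like and NAND-like), and their conjunction yields XOR, a combination no single linear threshold can express.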
But it was all like one of these things
where the establishment killed it.
The sort of centralized elite said no.
And then the crazy thing is, in the 80s,
what this paper says is the reason
neural nets came back
was because of the democratization of compute, right?
It went from being that you have these professors who guard all the compute to suddenly all
these PhD students have their own computer.
And so the professor can't tell them they're not allowed to do neural nets.
And so suddenly people are doing it again.
And so to me, it's so interesting that this theme is universal.
It's like always been true.
It's not a new thing over the past 10 years.
It's something that's been there for 60, 70 years.
So what happened after the adoption?
Well, so after the AlexNet result,
people outside the field of computer vision would still pooh-pooh it.
They would say, well, oh, it works for computer vision,
but neural nets have nothing to do with machine translation, right?
Because you have these fixed-sized images and stuff like that.
And for machine translation, you have these variable windows and things like that.
2014, you have sequence to sequence.
It didn't get the same massive step function you did with AlexNet,
but you could just see, yeah, you're going to push on that,
and this is going to be the only thing you need.
And what happened is you effectively had the walls between departments
being torn down.
And it's this beautiful unification of you thought that you had all these different domains
of computer vision, machine translation, speech recognition.
And it's, nope, you just have AI.
You just have deep learning.
It sounds like any time that the fringe groups can come together, something much more
interesting can happen.
Yes.
Are you familiar with, I think it's Arthur C. Clarke's first law?
No.
It's if an elderly but distinguished scientist tells you that
something is possible, they're almost certainly right. But if an elderly but distinguished scientist
tells you that something is impossible, they're almost certainly wrong. That's great. I love that.
Open AI started in your living room in San Francisco. Describe the living room to me.
There was a big open space and we had a black wood table that was this big oval shape.
There were some couches, a big screen TV. Day one, there
was no whiteboard. And I remember two researchers were debating something. They turned to write something
on the whiteboard. There wasn't one. And I was like, I could get a whiteboard. And so I felt like
I was adding value from day one. Yeah. So who was in the room that first day? In the room that first day
would have been Sam Altman, Ilya Sutskever, Wojciech Zaremba was there. I think Vicki Cheung,
Pam Vagata, John Schulman, Andrej Karpathy probably would have been there at the time. Some of the
them were finishing up their work elsewhere, their PhDs. I apologize if I'm forgetting anyone
else. But that was, you know, the founding team, the founding vibe was we had this great objective
of we really wanted to help build AI and have it be something that was a positive force for humanity.
And we did not have a thesis on how we would do it. And so that was where we started.
And at what stage of the AI revolution were we when you were in that room? What was known,
what was not known?
So this was the very beginning of 2016.
At this time, we had...
Not so long ago.
Not so long ago.
10 years.
10 years.
10 years. Time flies.
Yes.
That's wild.
And time compresses in this field.
It's crazy.
We had gone through at that point four years of this deep learning revolution.
And so one thing that was clear was that it was like there was this early phase where the fruit was
just hanging on the ground, because you could just take a GPU, take a neural net, point it at
a new problem, and it's going to work and it's going to get you awesome results. And so new
architectures, it was kind of the heyday of basic research in a lot of ways. So individual researchers
being able to come up with a novel idea in a few months, prove it out, get an awesome paper.
It would be an unprecedented thing. They would kind of like define a field. So that was the
moment that we were in. It wasn't yet the moment of grand engineering. It wasn't at the moment
of large scale compute because you wanted as much compute as you could get, but you
couldn't really get more out of more GPUs, right? It really was that you had one GPU, and for orchestrating
many of them together, we didn't have good techniques for how to actually get good returns from that.
And I remember in the very early days working on the first engineering projects to support the researchers.
And I observed two researchers building with two engineers. And the way that it would go is that
researchers would say, here's the system I want, here are my requirements. And then the engineers would go
off and build something and come back a few days later, and they would project it up on my TV.
And then they would go line by line, spend a whole afternoon just debating every single line.
I remember looking at that and thinking, this is never going to end.
It's so slow.
It takes too long.
Too long.
Not going to work.
So instead, I ended up working on the project.
And I would work in a very tight loop with the researcher.
And I would say, here are five ideas.
You would say, these four are bad.
And I say, great, that's exactly what I wanted.
And so just really this not trying to push my own ideas, but really trying to learn the other person's perspective, the other person's view on the world and try to then say, okay, I could translate it in all these different ways and to try to just tease out what truth is, what reality is.
Typically, are the researchers, engineers as well or no?
In this field, they are much closer to engineers.
And there are some people who are really at that intersection.
For example, Jakub Pachocki, who's our chief scientist, one of the things that really
distinguishes him is that he really has his foot in both worlds, that he has deep theoretical
understanding. He has a PhD in optimization, but he also really knows how to build systems and
has done it many times. And so it's a unique skill set. It's very valuable. And so the thing
that I found was that for engineers to add value in this field, you have a pretty high bar because
these researchers, they all know how to code. They can build their own things. So you have to do better
than they would on their own. And that's in contrast to if you're just building for doctors, say, right? Most
doctors probably don't know how to code. So the bar to do better at building something for them
than they could build on their own is relaxed. And so a lot of what we focused on is how do you
make sure you're empowering and moving forward the researchers? How well did the handful of people
in the room know each other? So there was a subset of people who had all gone through
PhD programs together or some of them had interned together. So there was a set of people who
knew each other and there were a set of people who were newer to each other. But at this point,
we'd actually already gone through some formative events. So we had really throughout the back
half of 2015, I'd been doing all the recruiting to find just who were the best people in the field.
Was the first meeting, the meeting of Open AI, or was it a get-together that turned into Open
AI? I'd say that the very first moment that was really the get-together that set things in motion
was a dinner in July of 2015. And that that was with Sam, that was with Ilya, that was with
Elon that was with a handful of others. And the question there is, is it too late to start a lab
that can actually really get to AGI, right? Why would it be too late? Well, it felt like DeepMind
kind of had it, right? DeepMind had all the talent, they were part of Google with all the
compute, that it felt like maybe AGI was very close. And can you actually get together a group of
great people and really go for this? Did you start it in competition with Google, would you say?
I don't think of it as competition, but I do think of it as complementary, right?
That I think that my view on how AI should go.
Yeah.
This is very foundational, is that I think that AI is something that everyone deserves
to participate in.
And to me, it felt like we're going to build these incredibly powerful systems
and that how they play out for humanity to uplift everyone is something that is the single
most important thing that can happen and contributing to that and helping steer that in a
direction that makes sure it actually is
beneficial to everyone, like that's the thing that I want to do. And so to me, it felt like it's not about the
back and forth on who has the best benchmark. It's really on how do we build systems and overall
society that integrates with those systems, right? These things are going to co-evolve together
that is a much better world than the one that we have today. And that that isn't something that
any one group can do on their own.
How big of a revolution did you know it was then?
Could you see where we are now then or no?
Then it felt like if it was going to work at all,
it kind of had to feel like this.
And I think we're not done.
Describe the personalities and strength and weaknesses
of every person in the room.
Well, I'd say that Sam, I think, is a visionary.
And I think that Sam is someone who gives permission to dream big ideas.
And I think he also cares a lot.
I think he cares a lot about people.
And I think that he is someone who is like always very optimistic about how we can
find a way to configure any solution to a problem.
And so I think that he is someone where you feel like, hey, this is never going to work.
He will find a solution.
But he is always someone...
It opens your mind that it can be solved.
That's right. And maybe it's not the best solution yet. That's right. But once you know
it can be solved, then you can work on a better one. That's right. And it's not detached from reality,
right? It's like kind of connected to, like I think he's been, he's like an excellent sort of
facilitator for researchers and for this overall sort of shepherding of this technology into the
world. Ilya, again, I think is a visionary. I think he is someone who I remember in that
very first couple of days, he said, I have this idea I've been thinking about for how we
can solve what's called unsupervised learning, how we can observe the world.
What is that?
So if you think about how a human baby learns, just by observing the world, right, there's
no one saying this is the right thing.
No one's doing input.
Exactly.
It just happens.
It just happens.
And this was always, to me, this always felt like a crazy concept of how can the machine
ever learn without someone telling it, whether it's doing a good job or not.
Yeah.
But we figured it out.
I remember that he had a lot of ideas on how to really push it into the machine.
Even in that room, would there be people who think that's too far?
So in that room, we immediately started trying to write down ideas and that the energy was just palpable.
One of the steps along the way to everyone coming to that room was this offsite in November of 2015.
So I mapped out the whole field or all the best people and kind of been asking people for who do you know who's great.
And then they would introduce me to people.
And so I just kind of kept track of, I kept getting introduced to this guy called Wojciech.
I was like, all right, Wojciech's probably the one I should go after.
And we had narrowed down to a set of people, but they were all kind of like, okay, well, who else is joining?
Like, I'm interested, but who else is in?
You're like, how do I collapse this?
And I asked Sam what to do, and he suggested to bring everyone to an offsite.
And at this point, actually, I remember I was very grateful to John Shulman who had said that he would be in.
So I was at least not the only one who had committed.
And I think maybe there was one or two others who were kind of there.
But it really was a group of people who had not yet coalesced into a team.
Yeah.
And we brought everyone out.
We were in my apartment.
We got on the bus.
We drove up to Napa.
And it was just this day where everyone clicked, right?
That the energy was just so smooth, right?
It was that flow state in human form.
And I remember that we wrote up on this flip chart.
And I have a picture of this flip chart.
The plan.
There's a three-step plan.
Step one was solve RL, which is reinforcement learning,
that is learning from these rewards and punishments.
Two is solve UL.
That is unsupervised learning.
That is observe the world and just absorb the information.
And then three is gradually learning more complicated, in quotes, things.
And this actually is what we've been doing for a decade.
It's actually crazy.
Right?
It's like we really set out the vision.
And so it's very tight to the original vision.
It's just the growth of the original vision.
It is.
It really is.
When I look at everything we have done,
It has been in service of the same goal with really the same almost technical approach and the same ethos underlying it.
Any of the other people in the room who had particular either expertise or views or that were different than the others that are worth describing?
Yeah, I'd say Wojciech, and I still work very closely with him, is also a unique character.
He is extremely good at idea generation, and he will come up with very creative ideas to any problem.
and then he is also someone who is not at all attached to his own ideas.
Right?
Because if you're someone who's going to take offense, then it's really hard to be that generative.
Well, if you want it to be as good as it can be, they can all be your ideas.
That's right.
What I usually do when working with him is really think about, okay, well, what are the bounds on what we should think about, right?
Here are the places that we don't want to go or here's kind of the place that we need to end up.
And then the idea generation process just ends up with this great flywheel.
How did you end up at Stripe?
So while I was at MIT, I was building more and I did more startups.
And from each one, I felt like I learned another thing not to do.
And eventually I kind of felt like I knew enough that I could be successful at doing a startup.
But I was missing a key component, which is having an idea.
And I was pattern matching off of my friends who had been in the computer club, started their own startup.
They had gone to a master's program, come up with a cool idea there.
I was like, well, clearly that's what you need to do.
You need to go to grad school.
You saw the path.
Yes.
But the path that I saw was actually too much of the beaten path, right?
The beaten path is, you're in your PhD program.
You invent something there and you turn that technology into a startup.
And I remember just feeling like, okay, I'm 21.
I'm just too young to do real things in the world.
This path is the only thing that's possible.
But at the very least, I can start meeting people doing startups because I had taken a startup class and it was just not useful.
I was like, all right, like, this is not going to get me to where I want to go.
And so I decided I'd meet these people doing startups.
And literally the next day, I got an email from some people working on a payment startup in Palo Alto.
And I was like, well, my new thing is to meet these people and learn to pattern match over time.
And I remember when I met Patrick, we just clicked.
What was his background?
Well, so he had been at MIT.
And so he and his brother.
Did you know him from MIT?
No, I did not.
Yeah, but we had mutual friends because he had gone to MIT.
John had gone to Harvard.
So it was that community that they were asking around.
Yeah.
And my name, of course, came up in both circles.
And I remember I flew out and it was like raining and kind of miserable.
And I remember opening the door and just like when we first started talking, it was just an instant connection.
Right.
I think we just like had this, you know, different backgrounds in many ways, right?
But also we had a very similar technical perspective.
And like even for very nerdy things.
I mean, we had the same, like, split keyboard that we used, this Kinesis keyboard.
We both were Dvorak, though the actual layout that we used was different.
And we were talking about how to like build the firewalls for the systems that they had.
And you're talking about, you know, different things in the kernel.
And so it was just like we had this like technical connection.
Yeah.
And I think that what I also really appreciated about Patrick and John is that they were my age.
And they were already out there doing things, right?
They were already doing a startup.
And I was like, I don't think that's possible.
One of the two of us is wrong.
I really want to know who it is.
Yeah.
And then what happened?
So the weekend was great.
They said, you should join.
I said, let me go think about it.
I went back to school.
John happened to be in town on Thursday.
And so I was like, you know, I probably should make my decision.
And I was like, you know what?
I want to do this.
Because again, back to that algorithm of dream of the what if this works.
I'm like, I don't know anything about payments.
Like this is not the problem that I've grown up passionate about.
But these people...
But you get to solve a real problem?
Yeah, solve a real problem.
Exactly.
And these are the people that I feel like I can learn from or that I can work with.
And I remember John is like, will you do it?
I was like, all right, fine, I'll do it.
And I just remember feeling like, okay.
Like that they want me.
Exactly, they want me.
That's a good feeling.
It was a really good feeling.
Yeah.
And so I decided that on Thursday.
Friday I spent telling my teachers that I was out.
How did that go?
It was a little tough.
Was it emotional for you?
It was definitely, it felt like the end of something.
Yeah.
And, you know, Harvard, when I told them I was leaving, they said, you're coming back.
You're coming back.
Exactly.
And I think it's probably for their numbers and, you know, that kind of thing.
But it definitely felt like, okay, I see how this goes.
MIT was a little bit more like, okay, talking to the relevant person, they said that, you know, you have to check in every six months or so, because if too long goes by, then, you know, maybe you'll have to reapply and we're not really sure. And it just was a very different vibe.
And I remember talking to the professors. The professors were supportive, and I think this is a thing that they see. But they were also, you know, I think a little sad to see me go, like, mid-semester.
Yeah, I really appreciated the mentorship they provided. Like, I was in the middle of the operating systems course, which is this famous course at MIT taught by these, like, extremely good professors. And it was just such a cool class. And I was very, very sad to not get to implement further projects.
And so it was definitely kind of giving up on some sort of exposure to an experience.
And it was also hard to say goodbye to the people, right?
There were all these people that I'd gone there to learn from and work with.
And the funny thing is many of them ended up coming out to the valley, got to work with them at
Stripe, got to spend a bunch of time with them in other ways.
So it was much less of a goodbye than I thought it was at the time.
Yeah.
Growing up where you grew up and moving as quickly as you did, did you feel different?
I definitely felt different, yes.
Was it lonely?
Did you feel like an outsider?
I definitely felt different.
I definitely didn't fit in, right?
And there was like a lot of things that other kids were into that I just didn't understand.
Like, I remember being on the school bus in kindergarten and other kids were like singing along to the radio.
And I didn't know any of the words.
Yeah.
And I just felt like I don't even know how to bridge this gap.
And so I had a number of moments like that where I just felt like there was just something about me that like doesn't quite match.
And some of it was about activities like a lot of the kids would hunt.
And that was not something that my family did at all.
And so there was just something very different.
I did play hockey for a little bit.
I was not very good.
But I was goalie.
So I started to try to do some activities that would match.
But I remember one thing I really cared about was carving an identity for myself.
Because if you're different and you don't really feel like you have an identity, then it's lonely. But if you're different and you have an identity,
something that defined you. Yes. Then you're charting your own path. And for me, it was being a smart
kid. Like I remember in elementary school that we had a weekly spelling quiz. And the way it would work
is on Monday, the teacher would give you the 10 or 20 words for the week. And then at the end of the
week you'd be tested on them and you'd kind of have this pre-test on Monday where you'd have to
write them down as the teacher says them. And if you got it wrong, you'd have to go ask one of
the other kids who got it right how to spell it. And normally I'd get them all right. But I remember
one week, I got one of the words wrong. And I remember another one of the smart kids in the class,
he got it right. And I was so ashamed that I would have to go and ask him for the right answer,
because that would be just eroding this core identity that I had. Wow. That sounds like it was
really healthy, though, that you got to do that, and it set you up to not have to have all the
answers. Yes. Yes. I think it was not necessarily a thing. It was a good story. That's right.
So then when you got to meet someone like Patrick, was it a feeling of, oh, he's like me? Yes.
Must have been a great feeling. It really was. It really was. And he had done a startup before,
so he and John had previously. And he was 21 and he had already done a startup. Exactly. Yes. How did that
happen? So they did a startup called, oh man, I haven't thought about this for a while.
I'm not going to remember all the details. But I think, okay, so Patrick and John grew up in Ireland,
and I think that Patrick had met Paul Graham, who runs Y Combinator, this startup incubator,
through the LISP community, so through the programming language that they were very into. And I think
that that was his connection to then doing YC, doing a startup, that startup got sold to a company
called Live Current Media. They worked for the acquirer for some time, but clearly that was not
the thing that they wanted to do with their lives. How early were you in Stripe? So early days of
Stripe, there were four of us. There was John, there was Patrick, there was Darragh Buckley, and there was
me. And when I first was there, there was some infrastructure and there was a payment processor.
It also wasn't clear exactly how we were going to proceed, because one idea was, well,
we build some apps and then use this payment processing we're building to power those apps.
And so that's how you actually build something.
And eventually the payment processor will probably be the thing.
But these apps could be too.
And so there was a time tracking thing that Patrick had been working on that was one of the potential ideas.
So it's still in this nascent form, but still something that we could see like the direction of travel.
And the funny thing, by the way, is that I mentioned that I did this startup class at MIT,
and that as part of that you're supposed to build, like, a mock startup.
And the one that I ended up building was a payment processor.
And so I spent all this time trying to figure out how do you do payment processing online?
And realizing, it's horrible.
It's painful.
It's like trying to go to PayPal and trying to read their documentation,
trying to figure out how to sign up for something.
It was just so opaque.
And you just realized, how can it possibly be so bad?
Yeah.
And that was the fundamental unlock for Stripe was this realization that you can do payments processing better.
I remember that the initial website said payment processing doesn't need to suck, right?
And it was just like, that is the ethos.
It just doesn't need to.
It can be better.
And I really like that spirit.
I remember talking to a VC at some point.
This was pre-launch, but we had gotten a lot of buzz.
And he was saying, okay, what is your secret sauce?
And I was like, well, we just do it better.
And he's like, no, no, come on.
I know that's what you say to everyone, but he's like, what's the actual secret sauce?
And I'm like, I don't know what to tell you.
Like, that is what we do is just focus on every single detail and get it right.
So it really wasn't creating something new.
It was already PayPal.
It was just a much better PayPal.
Exactly.
Is that right?
It was.
And it was really focusing on the details of the whole experience end to end, right?
And really thinking about just because if you've gone through it yourself and you realize like this part, like how do you sign up for an account?
How do you actually even connect to the APIs to the actual way the computers talk to each other?
What are the different parameters you program against?
how do you figure out which programming language to use or which wrapper to use, all these things,
each one of those can add a ton of friction. And so if you just focus on, okay, I'm going to make this
one good, this one good, this one good, this one good, make it good for myself, make it good for my friends.
Actually, this will be something that makes it good for everyone.
And why did you leave Stripe?
Well, I had been at Stripe for almost five years, four and a half years. And about four years in,
I think I started to consider whether or not this was what I wanted to do for the long term.
And the way that I kind of view it, is like, okay, if you think of your career in five-year chunks,
and five years is about the right length because less than that is kind of hard to do something
significant.
And I remember feeling like, okay, I had gotten this company to a place where it was going
to succeed with or without me.
Yeah.
And then the question was, do I continue or do I go and do something new?
I was very excited about doing the idea of a startup.
I remember talking to Patrick, and I think he had some very convincing reasons, too, to stay.
One point he made is that it's very hard to assemble a group of people that can do significant things in the world.
And best case scenario, you go start some startup and have some success with it.
And then five years later, you'll have formed that group of people that can do stuff.
But you already have it here.
Yeah.
Right?
We already have this tight group that was able to accomplish things.
And so why walk away from that?
So it was very tough.
It was not easy.
I remember I cried
when I told Patrick that I was out. Like, yeah, it was very, very tough.
He and John threw a juice party for me.
It was a very, very nice send-off.
And the reason that I decided, I remember feeling like, as I thought that through,
I was like, well, if it really is so hard to build that group of people who can accomplish
significant things, I've got to get started now.
Yeah.
It sounds like you left because the mission of Stripe wasn't the mission that you wanted to focus on.
That is true.
Yeah.
It's a beautiful mission, and it's one I very much support, but it's different to say,
is this one where I will just need to pursue it in any form?
And did you leave knowing what was going to be next, or did you leave thinking, I'm going to
figure it out?
More of the latter.
I had a list of three different areas I might focus on.
Number one was AI.
Number two was VR slash AR.
Number three was programming education.
And for me, it was very clear.
If I can contribute to AI in some way, okay, I'm doing that.
But it wasn't clear to me, do I have the skills, the time, all those things.
And so the other two were kind of backup options.
I understand there was a point in time that you wrote a chemistry textbook.
That is true.
When was that?
That was also 2008.
So after high school, I took a year off.
And I had spent much of my high school doing academic, both college courses, but also competitions.
I got very into math competitions, got very into chemistry competitions.
And I remember in 10th grade,
because I'd taken chemistry in ninth grade, I took some chemistry competition that my mom had found online and just kind of did it on a lark.
And I got best in my region.
I was like, okay, that's cool.
My state didn't really have it,
so I went to Minnesota to do it.
I took the statewide one and I got best in the state.
And they invited me to the training camp for the international chemistry Olympiad.
So it's top 20 kids in the nation.
And for this, they send you some textbooks ahead of time.
They say, please read chapters 1 through 8 of this organic chemistry textbook.
This was the textbook that would be used for the camp.
And so I read through chapters 1 through 8.
I didn't take it that seriously.
I was like, look, I'm just destined to succeed.
Like, it's just going to work.
Like, it happened so far.
And I remember I showed up at the chemistry camp.
And the other kids had read not just those first eight chapters.
They'd read all the books.
They'd read not just the whole one book, but all these big fat, like physical chemistry,
all these other ones.
And I was just like, wait, what?
You can do that?
These other people are doing that.
And I remember just feeling so demoralized and crushed for the two weeks of the training
because it was just like these people knew all these things that I didn't know.
And we were supposed to be studying for the final exam.
I was playing cell phone games in my room.
I just gave up.
I was like, there's just no hope.
Yeah.
And so they announced the top four to go to the international.
They said these four people, they weren't me.
Yeah.
They announced two runners up.
They announced those two. It wasn't me. I was like, I'm obviously number 20. Fortunately, they don't tell you
the ranking of the rest. It was, like, so bad. And I remember at the beginning of the next school year,
looking back at how I'd spent that summer. And I felt like I had this amazing opportunity in front of me
and I'd squandered it. And it was a horrible feeling. It was this feeling that I had just coasted,
that I just sort of believed that talent alone would just get me to where I wanted to go. I didn't have to
work hard. And I was like, I never want to feel that again. And so I took it seriously.
That year I started taking physical chemistry, I took organic chemistry, I took all these things. I spent a lot of time looking at old chemistry competitions and learning them, and looking through a bunch of different material in order to become a real competitor. And that year I made the team, the top four, went to the International Chemistry Olympiad, got a silver medal. It was a very, very awesome experience. I really enjoyed it. But for me, the big takeaway was this feeling of, you always have to work hard.
Yeah.
And, you know, one of my favorite quotes is the cycling quote.
It never gets easier.
You just go faster.
And I think that that is something I very much lived by.
So how did you end up writing the book?
Well, I felt like I'd come up with a very unique way of looking at chemistry,
a very mathematical first principles approach to it.
Because if you read most chemistry textbooks, it just says, well, basically memorize these reactions,
here are these chemical properties.
Here's these compounds.
But I would always ask why.
Like, why is it that way?
Does it have to be that these atoms interact in this way?
Does this compound have to be this color?
Like, how do you derive it from first principles?
And so the approach that I'd taken in order to be good at the competitions was to really
try to figure out the underlying rules, not the information.
Which wasn't in the textbook that you read.
It's not in the textbook.
You have to distill it down yourself, right?
And maybe they try to communicate in some way, but it's like it's just not the thing that they
spend their time on.
And so I structured the book.
I felt like I had a different way of teaching chemistry, and I structured it in a way that was inspired by some of my friends who had done something similar in math, in this math forum called the Art of Problem Solving, where it's a very Socratic method. So the book has just questions.
But they're intentionally scoped so that each one builds up. And so the first one starts from knowledge you should have. If you just think a little bit, you're like, oh, I can see how this works. And the next one builds on the previous thing, and the next one builds on the previous thing. In the case of chemistry, you need some experimental results, so you say, like, here's this double-slit experiment. And then you're like, okay, well, what does it mean for a particle versus a wave? And then, okay, if it means it's both a particle and a wave, then, you know, and so forth.
And this was something that I really wanted to communicate to others. I really cared about not just this approach living in my head, but other people being able to benefit from it. Now, I never finished. I made it for about 100 pages. You can find it on my website if you're interested. But I felt like that ethos was something I really wanted to carry forward.
You basically wrote the book that you wish you could have read.
That's exactly right.
That's great.
Yes.
How is coding similar or different to other activities?
So the way that I think about coding is that you deeply understand some process.
You write it down in a very obscure way we call a program.
And then anyone can get the benefit of that thinking, right?
People don't need to write the code.
They don't need to understand the sort of mechanics of what went into it.
And I think that there are other very cerebral domains that are like this,
like mathematics, right, where you think hard about a problem, you write it down in an obscure way
we call a proof, but no one reads those, right?
Only, like, the five mathematicians who care about a particular domain will really deeply read it.
And so I think that what makes coding stand apart to me is it's almost like magic.
You sort of have this vision in your head, and just by describing it, somehow it comes to be.
And so in some ways it's like management, right, that you have a computer that is there to
perform the function that you have in mind, the vision that you have.
and that it carries it out in a very literal fashion when you write a program.
And I think that that to me is something I've never seen in really any other traditional domain.
Like it feels like many other things that you might do, that you just don't get that same leverage, right?
If you have this vision and somehow it comes into reality and that you don't have to physically move things in the world, it just comes to be.
Would you describe it more like a language or more like math?
I would describe it more like math.
But the thing about math that I think there's a misconception for is that math is not one plus one, right?
It's not about these like mechanical calculations.
Math is about the underlying structures of the universe.
It's about understanding how different objects, different ideas relate to each other in this deep conceptual way and the sort of symmetries and the relationships between objects look very, very different.
And I think that programming is like that.
It's really about understanding how should a website work or what does someone want and what are all the different ways that something in a corner case should behave.
If there's an error, how should you handle it?
All of these things, they feel somewhat mundane.
But if you really look at the underlying architecture, you're thinking about you have all these systems that are talking to each other,
you have data stored in these different forms.
You have ideas like encryption that are brought to play.
And how do you orchestrate all of that in a way that delivers something useful?
And so to me, I think it's about the beauty of mathematics that's reified into useful form.
Is math an overlay on nature or is nature an overlay on math?
That's a great question.
I think math is the fabric of the universe in many ways.
Like, the thing I love about math is that it is the set of rules that are true in any reality.
Like I remember in middle school, high school, starting to learn biology.
And you learn all of these details about how different cell types work and all these different processes about chlorophyll and photosynthesis.
But I remember feeling like, okay, this is something that happens to be true right here for this specific organism or way of studying things.
But is this universally true?
Like, does this have to be true always?
Does this have to be how things work?
And the thing that I love about math is that it is purely decoupled from observation.
It is the set of things that must be true.
And so I think for me, math is this deep, immutable understanding of what is even possible.
And then nature is an instantiation in a specific form.
Do you think it was possible for nature to exist before the understanding of math?
Was math needed for nature to exist?
I'm of two minds on this one because to me math feels like it exists independent of everything else.
It's really a first principles, deep truth that doesn't matter what perspective you take.
Like if we were to meet an alien from, you know, hundreds of light years away, they would have the same mathematics.
We would probably have something in common in that way and maybe nothing else in common.
But at the same time, I also feel like there's a question I struggle with sometimes.
of do we discover mathematics or do we invent it?
Is it a deep truth or is it our perspective on it
that makes it come into reality?
And I think that a naive view is to say,
well, it's already there without us.
But in some ways, if there's no one there to appreciate it,
it's a little bit like the tree that falls in the forest
with no one to hear it. Is the sound there?
And so I think that there is something about nature
and the fact that we are here and that we are observers,
that we are thinking beings that causes math to have meaning
rather than just be an abstraction with no significance.
If you were to explain what the coding process feels like to someone who doesn't code,
how do you describe what it feels like you're doing?
So the most beautiful part of the coding process is when you're in flow state.
And there, everything just kind of clicks.
You have some objective.
Perhaps it's something simple, like you want to change the color of a button.
Perhaps it's something complex, like you want to build a database or a big distributed system
with many computers talking to each other.
But you have some objective in your mind that you want to see in reality.
And you have a partial implementation.
Maybe you're starting from a blank slate.
Maybe you have written some code and it sort of works or it implements a subset of what you're
looking for.
And then you try it.
You test it out.
And you see that there's a gap between what you have in your head and what you observed
in the system.
And sometimes it's a subtle gap, which usually you think of as a bug.
You wrote the code.
It was supposed to do something.
It did something different.
And then in that case, you start to form a mental model of, well, I know how the code is written.
I see this observation that's different from what's expected or perhaps was expected, but is not yet what I have in my mind.
And you think about what's the gap.
Maybe you know immediately: oh, there's the specific line, I probably messed it up, this value is too high, I should go back and check.
Maybe it's you don't quite know.
So you think about, well, what would give me more information to figure out where it comes from?
And so often you add what we call telemetry, adding observability so that you can get numbers out from the middle of the code about how fast different parts are running.
You can get little log lines, the code effectively saying, I did this, this happened.
Here's some event.
And then from looking at that trace, which is almost the history of what has happened within the program,
you can go back and trace it out.
And so in many ways, it's almost like you're, in a very detailed fashion, instructing someone else for how to perform some process.
And you have some understanding of it and you want to write out all the rules for exactly how it works in every single corner case.
And that whenever you see some undesirable outcome, you go back and say, well, what did I get wrong in the rulebook?
And you try to really have enough understanding of the trail by which that outcome occurred so that you can go back and make the appropriate changes.
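The telemetry idea described above, log lines and timing numbers emitted from the middle of the code so you can reconstruct the history of what happened, can be sketched like this. The function names are illustrative, not from any particular codebase:

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("telemetry")

# Wrap a function so that every call leaves a trace: the code
# effectively saying "I did this, this happened", plus a number
# (elapsed time) you can read out from the middle of the program.
def traced(fn):
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        log.info("enter %s args=%r", fn.__name__, args)
        result = fn(*args, **kwargs)
        elapsed_ms = (time.perf_counter() - start) * 1000
        log.info("exit %s result=%r elapsed=%.2fms", fn.__name__, result, elapsed_ms)
        return result
    return wrapper

@traced
def normalize(values):
    # A deliberately simple process: scale values so they sum to 1.
    total = sum(values)
    return [v / total for v in values]
```

Calling `normalize([1, 1, 2])` both returns the answer and logs an enter/exit pair, which is exactly the trail you read back when the observed behavior differs from the model in your head.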
How has vibe coding changed the process?
Vibe coding is a very fascinating moment.
And I think that what is happening right now is software engineering is changing entirely.
So I remember the first time that I really vibe coded.
This was for a live demo of one of our early coding models.
We built a little web interface and you could talk to the model to ask it to write some JavaScript.
And so I on a live stream had it build a little game and asked the people watching the live stream to suggest a feature.
We implemented the feature just by talking to the computer.
And to me, the deep thing that's happening is that
computers have always been created in order to help humans, right?
That's the whole point.
And that we contort ourselves to the machine when writing code, whether that's a low-level computer
programming language or all this skill and machinery for how you actually tell whether the
computer did the thing that you wanted.
And what vibe coding is, is it's moving the machine closer to the human.
And so you instruct it. You still need, depending on how good the model is
and how difficult the task is, some mental model of how it's going to solve the task.
If you just say, build me an awesome website, well, what is an awesome website?
So, okay, fine, I want there to be a button here or I want there to be this type of functionality.
You are effectively acting not as an individual contributor, not as a software engineer, but as a manager.
But a manager who's still very accountable for the outcome.
And I think what is happening is that the models have been incrementally getting better over the
past year in a significant way. And I think there was a real turning point in December of 2025,
where it was the first time that for many of the expert engineers that I know, these models
went from being kind of nice, kind of useful to they can actually do incredibly hard pieces
of work. And so it's really shifted from being a thing for demos and just if you don't
really know a programming language, you can get something done quickly, but it's not quite right.
to this is driving serious work and really accelerating what people can do.
Is there still a reason to code the old way?
I think that coding by hand in some ways is like handwriting, like penmanship, like
calligraphy that there is an art to it and that there is an understanding of how everything
fits together by operating at that level.
There's a way in which it's like mathematics, where you still really want to know the basics:
even if you're not going to do a bunch of hand calculations,
you probably should still know how to do long multiplication.
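As a small aside, the long multiplication mentioned here is the grade-school hand algorithm: multiply by one digit at a time with carries, then sum the shifted partial products. A sketch, for illustration only:

```python
# Grade-school long multiplication on digit lists: multiply by one
# digit at a time, carrying, then accumulate the shifted partials.
def long_multiply(x, y):
    xd = [int(c) for c in str(x)][::-1]   # least-significant digit first
    yd = [int(c) for c in str(y)][::-1]
    result = [0] * (len(xd) + len(yd))    # enough room for all digits
    for i, a in enumerate(xd):
        carry = 0
        for j, b in enumerate(yd):
            total = result[i + j] + a * b + carry
            result[i + j] = total % 10    # keep one digit in place
            carry = total // 10           # carry the rest leftward
        result[i + len(yd)] += carry      # final carry of this row
    digits = "".join(map(str, result[::-1])).lstrip("0")
    return int(digits or "0")
```

Knowing the mechanics at this level is the point of the analogy: you rarely run it by hand, but you understand what the calculator, or the model, is doing for you.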
And there's something about what coding is really about
is understanding abstraction.
It's really about understanding how systems fit together,
how pieces will interoperate.
And that is something that you want to become an expert at.
And so if you're too far from the details
and that you're just pushing a system
and you don't really understand how it works on the inside,
I think that that is a limiting factor.
As the models get better,
I think that the nature of what
you as a human need to really take accountability for and responsibility for will change.
The machine will be much better at a bunch of mechanical things, a bunch of the design things,
how interfaces work.
Each of those will come over time.
But at the end of the day, it's your vision.
And that if you care about how your vision is implemented, then knowing the nuts and bolts
of how the machine is going to do it, at least having a good mental model of it, that is
something that will pay dividends.
Can you see a time when that won't be necessary?
You tell the machine what you want, it codes it the way it wants, and you can still do iterations and improve it.
But is there a time when the act of coding will be maybe like Latin is today?
I think of it almost like an ocean that is rising, right, where the ocean level is the capability of the models and that you have these islands and that every so often an island gets totally covered by the water, but there's other islands that are even higher.
And I think that coding is like that, where the islands are difficulty of problems.
And so I think what we're seeing is some islands have already been covered.
That, for example, I have a test prompt that I've used for every model for a number of years
to build a particular website that was one of the first websites I ever built.
When I first built it by hand, it took me months.
Then when I used an early one of our Codex models, it probably took me five hours,
six hours, something like that.
With our latest model, it just codes it up in a minute.
And I just don't even have to touch the details. And it actually does a much better job
than I ever did, even with the previous iterations. That's really interesting. You really feel
the power of what is at your fingertips and the fact that you are now empowered to do even more.
But there are still mountains that we have not covered. And the question is, are those mountains
infinite or finite? And I do think that maybe the best analogy for where we're going is that
people are going to, rather than being individual contributors, become managers of agents,
and then they'll become middle managers, right, moving up the food pyramid, eventually becoming
CEOs of this organization of agents. And the one thing that I actually don't have an answer for is,
yes, all the mechanical skill, all of the deep debugging, all of the architecture, all of these
things, those, you can see how the machine will get very, very good at that. But owning the outcome,
right, the accountability, the picture of, is this doing what you want? I don't see
a line of sight, not to say that it can never happen, but I don't know how that is something
that you would ever transition out of the human. And I think that that is something that is
deeply human and sort of something that is unique to being a person with an intention,
with a life, with relationships. And so I think that there is an involvement in the process
that you will always care about. And just like if you're someone who is having a house built,
you may not care that much about where every nail is going, but you care overall
about the outcome, and you care a lot about whether the people building it are doing it in a way that
you'd be happy with, because you're the one who, at the end of the day, if it's built poorly,
is going to be on the hook. How different is a poorly built model versus a well-built model
in terms of how it functions? In other words, if you can describe something that you want to
work a certain way, and if it works that way, and the code would be done elegantly if you were to do
it yourself, and maybe less elegantly if the machine were to do it, but it still did
the thing you wanted it to do, would it matter?
So this is something we see very concretely within OpenAI.
And we are taking a very agent-first approach to software development.
What does that mean?
We have set a goal for March 31st, so very soon, and we want two things.
One is that the default tool that all of our people reach for is an agent rather than a text
editor or a terminal.
So really this becomes the tool
that you find the most reliable and the first thing you apply to every problem.
The second is that the default way that people use these tools is something that has been
explicitly evaluated as safe and secure and that those two factors to me are what it means
to be agent first.
So it doesn't mean that you never go into the details, but it means that most of the time
you don't have to and that you've built both trust with the system, but also that we've
architected it in a way so that we can have trust in the whole organization
as an outcome. Now, if you drill into some of the individual points of how we direct people,
one of the hard decisions was really about how do you make sure that the code base doesn't
turn into AI slop, stuff that works, but somehow is just not very good. And we have many years
of making sure that it doesn't turn into human slop, right? That we have individuals who write code,
and then you have people who review code, and you have a large system of incentives: if
you're writing good code that over time is maintainable and that other people can build on,
then you get promotions and you have good performance reviews and all of these things.
And we need to bring this to bear in the agent world.
And so we have a mantra of "say no to slop," so we tell all the human reviewers that
whoever is submitting the code still needs to be accountable for it.
You as a reviewer should hold an even higher bar than you would for a human on quality,
make sure that the person actually understands what's there.
It doesn't mean that they really know every single line, but that they can really sign off and say, yes, this improves our codebase rather than regresses it.
And one of our best engineers, as he's been playing with these models, has found that the way he wants to balance is that he wants to control the interfaces.
So he goes and still writes by hand, here's the components, here's how they fit together, maybe here's the file structure.
But the details of how it gets implemented, which often are quite intricate and complicated, those he outsources to the machine.
And so I think that the difference is really, if you don't pay attention to a particular aspect of what the model's doing, will it be something when you look under the hood that you're proud of?
And that right now we're evaluating that.
We're seeing in what circumstances the answer is yes versus no.
But we're saying that our requirement is that if someone does go look, that the answer should always be, yes, I am proud of this.
Or alternatively, you can have some sections of code where you don't care whether they're
good or not, as long as they meet the correctness specification. And so if you have good ways
of verifying that that section is correct, then you can have very highly optimized code that's
extremely hard for anyone to build on. But you don't think of it the way that we usually think
about code, as this evolving artifact. You view it as a one-off: okay, I produced this; I'll produce
a totally different version, never build on top of it. Is most of what's happening like building
blocks being made, and then maybe the way they link together is more casual.
Yes. Would that be a way to describe it?
I think that's a pretty good way to describe it.
And I think what's going to happen is that the size of the building blocks is going to increase over time.
And the way in which the human oversees is also going to move up that level of abstraction over time.
Because right now, where you look at every individual agent, that's going to feel totally barbaric and slow even just this year.
Because you'll want an overseer agent that's looking at the work of all these different ones and flags to you: this particular
detail doesn't look quite right over here. This agent seems to have gone off the rails. This,
you know, doesn't look like what you wanted. And so I think that really figuring out how do you
as a human manage a larger and larger fleet of agents? And in many ways, the fundamental
measure is really how much compute an individual human marshals. That is going to be the most
important metric for future productivity. What would you say are the biggest technological revolutions
you've witnessed over the course of your life, each one?
Well, I remember growing up in North Dakota and reading a Time magazine article about Silicon Valley
and feeling like I was born too late.
All the exciting things.
That was already happening.
It was happening.
And I wasn't there.
Yeah.
I was too young.
It just felt like there's only so many good ideas in the world, only so much innovation
that's possible.
And I can see it happening right now.
And I am not part of it.
So that was a moment that I felt the fomo.
I felt like, wow, I don't think I'll ever
see anything like this again.
Yeah.
And I remember later things like mobile phones and the shift there.
It was never something I was very passionate about.
Like, there are a lot of people who wanted to build apps.
For me, it never really felt like the thing that I was deeply attracted to.
And I think that there's so many different pieces of technology that add up to modern life.
Think about, for example, Uber.
Imagine describing Uber to someone in like 1950, right, to Alan Turing when he's writing his
paper on AI.
So you've got to explain computers.
You have to explain the internet.
You have to explain GPS.
You have to explain so many things.
And then the net effect of all this crazy technological innovation
is so that you can get a car to you in a few minutes.
And in some ways, it feels like a trivial outcome.
But in some ways, it's so deep.
And you realize that all these magical pieces of technology,
and they truly are magic, right?
If you had lived before them, they would feel almost impossible.
But now we take them for granted.
And now we can use them to just make our lives sometimes more convenient, but sometimes it's to really enhance us and to accelerate us.
And to me, that was the underlying theme of each piece of technology that we've created.
In each of the pioneering situations that you've been involved in, how much do you know in advance of what it's going to be?
I would say, I always go in with a vision, right?
I always go in with thinking about the, just set aside all the reasons this might fail, just dream of-
What it can be.
What it can be.
Like, what is even possible, right?
Within the laws of physics, right?
There's no point in dreaming of the impossible, but how it could go.
And I think that Stripe was like that of thinking about, well, we could build something that will be this amazing payment behemoth, right, that actually makes payments more accessible.
They make payments more accessible.
More things happen.
Yeah.
That sounds great.
And the people there, great people, you can learn and grow together and you could do more things in the world.
So that to me was something that felt like, okay, I see what this could be in that way, many details to be figured out.
Open AI, you think about AGI.
It's like if you can actually build an AGI into existence in a way that lifts up everyone, there's no better thing to work on.
That is the most amazing thing you could hope to do in a technological sense.
And then for projects with an open AI, I think usually it has the same kind of flavor to it where I remember there are moments where I see a demo or I see a result.
You see a little initial curve and you just realize this is going to work.
I remember at some point, actually, Wojciech, one of my co-founders, saying that the thing about this field that is so remarkable is every idea works.
As long as it's theoretically motivated and the math works, it's
going to happen, right? It will actually succeed. You'll actually get good results. But the challenge
is really figuring out which ones are going to be the fastest path to your objective. And so it really
feels like there's fruit lying on the ground. And because it's all lying on the ground, sometimes
that's the challenge. There's such a massive opportunity space. But figuring out the path through,
you need these proof points. And I think we're sitting in the middle of one right now. Yeah.
Very, very clear with these agents getting so good at not just software, but they're going to get
very good at all knowledge work. I think this year we're going to have a very big transition
in how work is done. I think one of the biggest issues with AI in general is that because
it's so open-ended what it can do, it's hard for people to imagine what it's going to do. It's the tool
that can do anything. Yes. It's hard. That's right. Yeah, we have this problem with ChatGPT,
right? You show up at ChatGPT. There's a text box that can do anything. Yeah. But what do you want it to do?
Many people are just stuck on the blank page. And I
think that actually the dual of it shows the opportunity with AI. I think that ultimately
AI in my mind, AI is opportunity. Like that is why we build it. The opportunity is that if you do have a
vision, if you do have something you want, if you do have a particular way you think things
should be done, if you have agency that you wish to see enacted in the world, we have the tool
for you. There's lots of questions as these tools get better, like where do
humans fit in, what does it mean to be human, all these things.
And I think that agency, that drive, that vision, those things, those are something that we as humans
have to contribute.
Yeah.
What is AGI and is it very clearly delineated?
I think of AGI as not just a system that can do any intellectual task that humans can do,
but one that can really be this force multiplier for an individual human, to the extent that they can operate as
the visionary, as the CEO.
And so it's that level of empowerment for the individual.
And to me, it's really not just the technical abstraction of this system that theoretically
could do something.
It's how it's actually deployed into the world.
That's a new definition.
Yes.
It's how I've grown to think about it.
Yeah.
Because if you look at our mission, it's not just a technical problem, right?
You can define the technical AGI, but I think from the very, very beginning of OpenAI, we
were unsatisfied with the idea that we would just write papers, right?
You could write all the papers you want to describe how an AGI would be built, but there's no impact.
The impact is the instantiation into people's lives.
And I think that what superintelligence will be is something beyond that signpost in capability.
And I think that the outcomes we want from it are to help us solve problems that are totally out of our reach.
Great.
Right.
And I think that solving diseases, I think that space travel,
I think that there are many, many problems that we see.
Is chat GPT the primary product of OpenAI?
No.
What is?
The thing that we ultimately are selling is intelligence on demand for your problem.
ChatGPT is one instantiation of that, and it's massively popular.
Almost a billion active users,
every single week. But that's not the end of it. We have an API that many business customers
build on top of, and that is also growing absolutely phenomenally. Our newest product is called
Codex, and it's transforming how software is built within OpenAI and really within the
industry as a whole. And that, again, is something that we're just seeing this takeoff.
And the thing about what Codex is, it's really two things. It is a general-purpose agent harness,
one that's able to use tools, any kind of tools, any kind of application, and it's a system
that knows how to write code. But if you think about that first thing, that is extremely
valuable and repurposeable for any knowledge work task you might want. Anything you might want to do
with your computer can be expressed in terms of an agent doing some things, orchestrating some tools.
And so we're starting to apply it to things like Excel, PowerPoint, being able to create these
artifacts that people in business produce and various business functions. And we're spending a lot of
time to actually make our model very, very capable at this. And so I think what we're going to
see by end of year is a very different product surface where every knowledge worker will have
this tool that really gives them superpowers. What is agentic AI? I would think of agentic
AI as a model that you don't just talk to like in chat, but that it's hooked up to tools.
And at an implementation level, it's almost like the model can chat with you, the human,
but it can also chat with the system.
And it can say, please run this command.
Please create this spreadsheet.
Please look this thing up in someone's email.
So it needs access to the outside world.
It's actually able to take action.
It's not just like ChatGPT, where its impact is more cerebral, right?
As you talk to it, it talks back to you. With agentic AI, it's actually embedded in the real world and can take action.
But one thing to note about agents is that we've been increasing the time that an agent can run quite significantly.
So you can have agents now that do useful work over the course of the day.
And the space of how you apply agents is very large because you can apply lots of compute in parallel.
You can have many agents that are working on one
task, and they're able to produce things that would take humans a very, very long time to do.
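The agentic loop described here can be sketched in a few lines of Python. The names and the shape of the interface (`run_command`, `agent_loop`, the action dictionary) are illustrative assumptions, not OpenAI's actual API; the point is just the cycle of the model deciding, acting through a tool, and seeing the result.

```python
# A minimal sketch of an agentic loop: the model can either reply to the
# human or "chat with the system" by requesting a tool call.
# All names here are illustrative, not OpenAI's API.

def run_command(cmd: str) -> str:
    # Stand-in for a real tool: a shell, a spreadsheet, an email search...
    return f"output of {cmd!r}"

TOOLS = {"run_command": run_command}

def agent_loop(model_step, user_message: str, max_steps: int = 10) -> str:
    """Ask the model what to do next; execute any tool it requests,
    show it the result, and repeat until it gives a final answer."""
    history = [("user", user_message)]
    for _ in range(max_steps):
        action = model_step(history)       # model decides: answer or act?
        if action["type"] == "final":
            return action["text"]          # done; reply to the human
        result = TOOLS[action["tool"]](action["arg"])  # act in the world
        history.append(("tool", result))   # model sees the tool's output
    return "step budget exhausted"
```

Long-running agents are essentially this loop repeated many times, with the history growing as the agent works.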
Tell me a bit about the mechanics of ChatGPT. How does it work? At the core, ChatGPT
is powered by a language model. And you should think of a language model as a system that
takes in some text and then outputs some other text. Now, the text doesn't need to be literally
English language. It can be images. It can be video.
It can be really any sort of modality.
And the output, similarly, doesn't need to be literally language.
So it's not just language.
It's not just language.
How long has it not just been language?
Well, to be fair, even from the very beginning, it wasn't just language because there was code.
You could do code.
I see.
Right.
But it was always text.
That was what we started with.
So GPT-4, we had a downstream project of that called GPT-4V, which was the first time that we had vision
and that you could have a model that would actually recognize images,
and you put text in and images in and it would output text.
And since then, we've trained models that take in sound,
take in images, take in text, output sound, et cetera, et cetera.
And so you actually can have a full voice conversation with models that we have produced.
And that's part of ChatGPT as well.
And so I think of it as these models are general purpose intelligence processors.
And when you create them, the way that you train them is that they look at data.
But they don't really learn the data.
They learn the underlying rules that created the data.
And that's what makes them smart is that they're sort of general purpose understanding machines.
And you can point them towards whatever task you have representative data and information to train for.
Do you have a name for the big brain?
Well, we have names for our different models.
So we call...
But the models are the use part of it.
But the thing that learns everything.
We don't have a name for the overall training system.
So there's names for various components.
For example, we have a training system that does a component of the system.
We have different systems for how we actually take a trained model and serve it.
But now also training and inference are starting to come together, because you do reinforcement learning where the model teaches itself, similar to what Turing was talking about.
And so there's kind of an overall orchestration system.
and each of these components, they have their own names, they have their own concepts,
and there's really a whole industry that's built up around how to train models for helping people in particular ways.
Is all of the knowledge in one big base, and then is it divided up to do these different things?
So we thought that an AGI would be literally one giant model.
And we've been helped by these architectures: there's this thing called mixture of experts, for example,
where, rather than running information through the whole network every time,
it runs through just smaller parts of the network.
And so there's an opportunity during the training process to specialize.
So the training process could choose to say, well, I want to specialize this for language and this for text or this for programming.
In reality, it's a little bit more complicated than that.
But I would say that the big unlock has been to realize that, well, the most useful tools that humans are creating for language and for vision and things like that are models.
And these big models, they're expensive to run, but they can use tools.
So why not use smaller models too?
And so I think that we're heading towards a world of this menagerie of different models.
And you're seeing this almost Cambrian explosion right now in the field of all these open source models, people are training,
and that we train models of all sorts of different sizes.
And so there's not really just one system anymore that there's these models that can talk to other models
and that are specialized for different purposes.
and that really introduces a diversity of approach
and means that you can specialize
in all sorts of different ways.
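The mixture-of-experts routing idea can be sketched like this. Real MoE layers use a learned gate over large neural sub-networks; this pure-Python toy, with made-up names like `moe_forward`, just illustrates the routing: every expert gets a gate score, but only the top-k experts actually run.

```python
import math

# Toy mixture-of-experts routing: a gate scores every expert, but only
# the top-k experts actually compute, so most of the network stays idle
# for any given input. Illustrative only.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the k highest-scoring experts and mix their outputs."""
    probs = softmax(gate_scores)
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return sum(probs[i] / norm * experts[i](x) for i in top)
```

During training, the gate learns which experts to send which inputs to, which is where the specialization he describes comes from.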
What is pre-training and what's post-training?
I would think of pre-training as a phase
where the model observes the world and learns from it.
At a technical level, the way that we implement it
is with what we call next-token prediction.
So you show a model a sequence
and you ask what should come next.
And that sequence can be anything.
For example, it could be a public post for some site on the internet.
And so, what is the word the human wrote next here?
It could be Einstein's thoughts.
It could be something very deep and significant.
And if you think about it, if you can predict every word out of Einstein's mouth,
you are at least as smart as Einstein.
And the important thing here is that what the model is incentivized to do is to learn
not just where the nouns are, where the commas go, those kinds of surface statistics.
It is incentivized to learn the underlying rules of this distribution, like where does this data come from?
Why is it here?
Because if you're just sort of parroting back what someone else said, it's not going to be helpful
as soon as you're looking at something new, which is what these models are trained to do.
It's really about the underlying rules and generation, and deeply understanding any new situation
that the model is in.
So that's what pre-training is.
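The "predict what comes next" objective can be shown in miniature with a bigram model: it learns, from a tiny corpus, the distribution of what follows each token. A real model learns far deeper rules than these surface counts, but the training signal is the same; the function names are invented for illustration.

```python
from collections import Counter, defaultdict

# Next-token prediction in miniature: count which token followed which
# in the training data, then predict the most frequent successor.

def train_bigram(corpus):
    counts = defaultdict(Counter)
    for seq in corpus:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1     # observe what came next
    return counts

def predict_next(counts, token):
    """Return the token most often seen after `token` in training."""
    return counts[token].most_common(1)[0][0]
```

The gap between this and a frontier model is exactly the point he makes: the model is incentivized to learn the rules that generated the data, not just these surface statistics.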
And at a technical level, it is an extremely interesting problem: you sort of
scale these up to massive numbers of compute devices,
and the actual problem that you have to solve is you put data through the network,
then you pass it backwards.
So you kind of see,
how do I have to adjust all the connections in order to have gotten a slightly better answer from the forward pass?
Right.
So the idea is, as you pass data through,
you get to some prediction,
some output.
And then you see,
okay,
it's almost like if you have all of these wires that are connected
to produce some result,
how do you adjust the tautness of all these wires
in order to have gotten a slightly better result?
And that's how the machine is ultimately programmed,
is that you do this forward, you do this backward,
and then you have a step called an optimizer step
where you adjust all those parameters
and you do it again and again and again,
and do this at very large scale.
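The forward / backward / optimizer-step cycle he describes can be shown on the smallest possible "network": a single weight fitting y = 2x. The backward pass computes how to adjust the one connection to have gotten a slightly better answer; real training does the same thing across billions of parameters. A hypothetical sketch, not any production training stack.

```python
# One "wire" trained by the forward/backward/optimizer-step loop.

def train(data, lr=0.1, steps=100):
    w = 0.0                              # the single adjustable connection
    for _ in range(steps):
        for x, y in data:
            pred = w * x                 # forward: push data through
            grad = 2 * (pred - y) * x    # backward: gradient of squared error
            w -= lr * grad               # optimizer step: adjust the wire
    return w
```

Run on points drawn from y = 2x, the weight converges toward 2; doing this again and again at very large scale is, in essence, what a pre-training run is.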
And so any individual data point,
even any individual large source of data,
it doesn't really matter, right?
Because you're talking humanity scale learning.
This thing is observing the whole world.
It's a little bit like if there was some part of your childhood that you forgot,
you're not going to forget all this knowledge.
It's like really about the underlying background, understanding of reality.
So that's pre-training.
And at a technical level, you slice up this computation across many devices.
And you're always trying to pump more efficiency out of it. And it's not just larger models.
We also have all sorts of innovations on architecture so that we have models that are shaped in different ways.
And you map those.
A lot of the engineering challenge is trying to understand how
the hardware works, where its weaknesses are, how it fails, and then how do you design the
architectures in a way that is most amenable to that?
So the output of pre-training, and usually these runs could be a month, they could be multiple
months, maybe our longest one was somewhere around nine months, big team of people in
order to keep that thing running.
GPT-4, for example, I was very involved in the pre-training and built a lot of that training
stack.
At 2 a.m. I would wake up because the run was down, and go and fix it.
Like, that's what you got to do.
And it's like every hour that the job is down,
every minute that it's down, you just look at the number of GPUs that are sitting idle and you think
about how many dollars are being wasted. But more importantly, you're just missing out. It's a missed
opportunity. And you really feel that in a very concrete way. And then the post-training? Post-training,
you take the output of pre-training. So you take this model that knows a lot of things, it's seen the world.
And you try to tell it how it should use that knowledge. What's the right behavior in different circumstances.
So it's almost like when you have a child that's observed the world and learned a lot, 20 years old, now it's time to go to college.
Now you specialize and teach for a specific domain.
But one thing that's different about these models is you're not putting new knowledge in necessarily.
You're really almost pruning down what it already knows because the amount of compute in the pre-training process versus in the post-training process is typically very, very out of balance.
So the post-training process is usually a couple days, that kind of thing.
And a lot of how we teach the model, we've evolved these techniques, but classically, we
do it through feedback, that we would have some way of saying that, well, let's train a reward
model.
So another AI that can judge what this AI is doing.
And you train that one usually through saying these are good, these are bad, or here's two
possible generations from an AI, which one's better than the other.
And then this reward model judges the pre-trained model and then gives it feedback.
And from there, it's able to shape its behavior.
So we really have the ability to take this model that knows kind of everything.
I remember talking to Alec Radford, who is one of our researchers, who would describe it as:
these pre-trained models, they're less like a human and more like a humanity, like everything's in there.
And then we do our best to steer it.
We don't always get it right.
But I think that's a lot of what we've been working on and improved.
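The feedback mechanism he describes, training a reward model from pairwise comparisons and using it to judge outputs, can be sketched as follows. The linear reward and the Bradley-Terry-style loss are a common textbook formulation, not OpenAI's exact recipe, and the feature representation is invented for illustration.

```python
import math

# A toy reward model: score an answer by a weighted sum of its features,
# and train the weights on comparisons of "this answer beats that one."

def reward(w, features):
    # Stand-in for the learned judge: higher score means "better answer".
    return sum(wi * fi for wi, fi in zip(w, features))

def pairwise_loss(score_better, score_worse):
    """Low when the preferred answer out-scores the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(score_better - score_worse))))
```

Minimizing this loss over many human comparisons is what shapes the judge, and the judge's scores are then used as the feedback that steers the pre-trained model's behavior.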
So the pre-training is like the Library of Congress, let's say.
I'd say pre-training is like the Library of Congress,
and then post-training is almost taste,
giving the machine a sense of which of those books it would like
or what to do with that information once it's retrieved.
For pre-training, where's all the information coming from?
Is it the Internet?
I would think of the classic approach
has been publicly available data on the Internet.
What's been changing is that as these AIs have gotten much smarter,
that actually you really want to train them on their own data in some form.
For example, reinforcement learning where the machine goes out and tries to solve a task and you learn.
Everything it learns in trying to solve the task is now part of its knowledge base.
That's the idea.
That's really good.
And again, it's not just the knowledge itself.
It's also the skills.
That's what we're really getting at.
Like our dream has always been to have a reasoning model that is just a pure reasoner and that is able to in any new situation be able to figure out the right thing to do and to do a great job there.
And sometimes having background knowledge is helpful, but it's really about those smarts and the ability to adapt very quickly that is like the real thing that delivers the value.
How much of the human hand is involved in the post-training?
It's been changing as well.
It really used to be that we would have these large campaigns, and sometimes we still do, but most of the data that we would train on would be from humans who would painstakingly label these different examples.
And the thing is that as the tasks that the models are capable of have gone up, there's
way less to learn from most examples.
So you really need domain experts who are deep, deep in their field.
And so some of the tasks that we produce are these incredibly complicated ones: like, look
up, you know, this finance report from this specific year and judge how this one compares to
that one. What does it mean for the strength of the business? You know, like paragraphs of a prompt.
There's a very specific answer,
and it requires
some domain expertise to do,
and can require a dozen hours
from that domain expert to accomplish.
So we've really moved up
the sophistication of the task
and the way that we teach the machine.
And it makes sense, right?
As the machines get smarter,
we really need to figure out
where are they breaking down
and it becomes much less
about massive volumes of data
and much more about
this very high taste,
very targeted,
like,
what are
the most important problems that we want this AI to solve, and then trying to teach the machine
accordingly?
My takeaway from the AlphaGo story was that the computer made a move that no human would
have made, and that's why it won.
Yes.
And if you're teaching the AI how to act more responsibly as a human, wouldn't that undermine
its ability to make the move that the human wouldn't make?
And doesn't that undermine the whole AI premise?
So the objective of our research in many ways is to achieve that type of AlphaGo moment,
but in science, in coding, in these other domains.
Because that is exactly what you want, is you want new knowledge discovery.
And you're exactly right.
If all you're doing is just learning from what's been done,
it feels like how are you going to go further?
But the thing that that perspective misses
is that the way that we are now extending the training
is to reinforcement learning.
It's not just take the public data,
take the humans providing yes and no,
but actually have tools that let you test things out in the world.
Like how do humans discover new things?
Sometimes we think deep thoughts,
but I think that usually it's through experiment.
Right?
we tried something, it didn't really work, we realized, oh.
And I think that part of what's going on is the universe is almost this massive computer
and that it has far more compute in it than our brains do.
It has far more compute in it than our GPUs do.
And so that's why there's something to be learned from an experiment
because there's this computational process that is just unimaginable in terms of how sophisticated it is
and that you can kind of shape it to your will, right?
You can kind of have a ball that rolls down an incline of this height and that height.
And then if the AI is able to propose that experiment and see the experimental results,
maybe it has a robot that sets that up for it.
Maybe it's a human who performs it either way.
That is how you can actually discover new knowledge.
And I think that the idea of actually having experiments, and then being able to compress all
this into this model, and for it to understand the underlying rules of how all that data was generated,
that is something that we are just starting to see the fruits of.
If the AI proved something in physics that negates what's in the current textbooks,
is that a safety problem or is that a breakthrough?
It's already happened.
Tell me the story.
So there's a physics professor who has been a vocal skeptic of AI.
Yes.
And we convinced him to use our latest unreleased
AI system. And he gave it a particular hypothesis in quantum physics that he's been planning on
working on this whole year with his collaborators. It's a very hard problem that people are
pretty sure that there's a particular answer to it. Yeah. And our AI proved that actually the
opposite was true. And his reaction was that this is the first time that it's felt like the
system is thinking, right? There's new knowledge in there. There's something very, very novel.
And that paper is being submitted to be published.
But I think it's a very significant moment and very representative of things to come.
Yeah, I think the most potential in AI is when it does things that humans don't know are the right answer.
Yes.
That's what's exciting.
It is.
And if the corporate perspective is to prevent that from happening, which I could see an argument for: we can't rock the boat,
this is accepted, the science is accepted. That's where I get most concerned about AI. The biggest
potential downside is that it can't do what it's able to do. I think this is a really important
point. And I think that the bigger framework around it really is that this technology, we have
the ability to steer it. And that's actually one of the deep motivations for why we created
OpenAI: AGI, I think, will happen with or without us, but we think that we can help
be an influence on it playing out in a direction that we think is more positive for the world
and that that is our ambition and our aspiration.
I think there's a lot of value to be delivered by the actual creation of the technology,
the core really being that we think it's something that should be available to everyone,
and it should be something that just like humans can question the wisdom.
And it's always tough when humans question the wisdom, right?
There's a lot of antibodies that try to prevent that.
But that is also how society moves forward.
And I think that we're going to have to make choices as a society.
It's not just for any one company, any one individual to decide.
It's something that as a society we should decide what are the rules of the road.
And a lot of how we've thought about it is there should be broad bounds that society decides
an AI can never cross.
Within that, you really need people to have the empowerment to pick,
that people need to have an AI that can represent them, whether it's about their values or whether
it's about being able to question. And there may be contexts where people don't want that.
That should be their right. That should be their choice. And there should be contexts where
people are trying to discover new science or whatever it is. And that should also be their choice.
And I think that we really have this philosophy, which is very different from others in the field, of self-empowerment,
and that this technology really is for everyone.
Is there an AI bubble?
I think that we will find that we were under ambitious on compute.
I think that where we are going and we're seeing the proof points of it
is a world where knowledge work is amplified by compute power.
I saw someone tweet saying, if you're taking a job, ask how many tokens are in your budget.
It's kind of a joke right now.
It's not going to be a joke later this year.
And I think that the degree to which these tools are changing software engineering,
like the software engineers who have not tried this don't feel it yet.
But those who have, they feel it, they feel it in their bones.
And I think that we're going to see that across finance.
We'll see this in sales.
And people will be able to do so much more.
And what we're already seeing is individuals within OpenAI who want 100 GPUs,
a thousand GPUs dedicated just to them.
Yeah.
And if you think about 1,000 GPUs for an individual, and you have a
thousand such individuals, you're at a million GPUs already, and there's not 10 million GPUs in
existence. So you can only scale so far. So I think that we are in a world where we are seeing
what's coming, what this technology is capable of. When we look at how good the models are getting,
when we see our own revenue curves, and we look at the things we cannot launch because we do not have
the compute for it, all of these things together mean that it is actually quite rational why
all of the hyperscalers, why we are all trying to build compute.
Like, I think where we're going is compute will be a basic human right.
Like, I think that for people to be economically productive, and even for their own lives,
the more compute they have, the higher quality of life they can have.
And so I think we're going to have to be in a world where everyone has access to compute.
Will it always be GPUs or might something replace the GPU?
I think that compute is fundamental and that the hardware is always changing.
And we're already seeing lots of different approaches.
And even Nvidia, for example, has acquired Groq, which is a different approach to a compute
device.
And there's lots of interesting approaches people are taking.
And so I think that GPU is a good stand-in for the different types of accelerators.
We ourselves, for example, have a custom accelerator that we're working on called an
intelligence processor.
And I think that there is lots of room to improve the efficiency and scalability of this technology.
What's the most expensive part of the whole operation?
Compute.
Compute.
Where does electricity fit in?
I'd say you can almost think of AI as a manufacturing process from electricity to intelligence
and that we use electricity as one input to how we actually do the computation.
So electricity is almost like the water that drives the whole system.
Does it get more and more efficient over time?
Yes.
But one note is that as we increase the efficiency, we increase it a lot.
Like if you look year over year, we tend to cut our prices for the same level of intelligence by 100x sometimes.
Like literally, like you can just see it from our price drops.
Yeah.
So like where we were for GPT-3 in 2020, and that was a 175 billion parameter model, you can get that level of intelligence
now from, I haven't looked, but a billion-parameter model, something like that.
Like, it's something that you could run on your phone for free, no problem whatsoever.
OpenAI versus Anthropic versus Gemini versus Grok versus DeepSeek.
Tell me about what's different about those companies.
For me, I really focus on us.
I think that what competitors are helpful for is almost as like a pace car.
Just to get a sense of how you're doing.
Sometimes they can point out that, oh, here's a particular feature that we didn't think we
could implement.
They did it.
Okay, we could probably do it.
So I think it's helpful for that.
But the way that we've always proceeded is that we invest the most in basic research in the actual paradigm shifts.
And you can see this where with language models, with the reinforcement learning paradigm,
and I think there's upcoming paradigms too that we are embracing and capturing and really putting long-term investments in.
And I think that is one thing that really stands out.
It's actually interesting.
I had a candidate today who we were pitching on a long-term research project.
And he was saying, oh, I was surprised.
I thought from the outside, it looks like you guys are always doing things fast.
And I didn't realize that you make these long-term investments.
And the reason that it looks like that is because we have a pipeline of them and they come to fruition one by one.
And I think we have the smartest models, right?
If you look at the reinforcement learning stack, I think we have something that is very unique and new.
I think each company has its own niche.
Like I think when it comes to consumer chat, ChatGPT is by far the most widely used.
I think Google did a very good job last year of building better models than they had previously,
and that they have a natural distribution.
Like it's always front and center to me, the fact that Google has a lot of compute,
they have a lot of talent, they have a lot of users, they have a lot of these natural advantages.
And I think that Anthropic is coming on the scene and they've focused very hard on coding.
And one thing that I think that they did well was,
that we were focused on the benchmarks in some ways.
We were focused on academic programming competitions.
We had amazing numbers there,
but we didn't focus as much on how these models will be used in the world.
I see.
And so training on these like messy repos
and how people are actually using it.
And that was a lesson that we were delayed on,
but we gathered a team, focused very hard on that.
I think we are caught up.
And I think that people are really seeing the fact that we are definitely
on takeoff. So in some ways having the other companies does give you an idea of other
things to be focusing on. It's a big picture. That's right. And there's so much in this world,
like the whole space of knowledge work is so large that figuring out exactly what to focus on is
sometimes the hardest problem. Since everybody's models are being optimized, it seems,
for the same benchmarks, does that end up being a limitation on what's being done because
everyone's focusing on this small group of tests? It can be.
And I think it has been in the past.
If you look at where things are going, I think that we are in a bit of a post-benchmark world.
What really matters is ultimately the benchmark of: are people actually using it?
Yeah.
Is your revenue growing, like those kinds of benchmarks that are impossible to game.
Because the problem with the academic benchmarks is they're very easy to get 100% on them, right?
You just train on the test set, right?
You contaminate.
The numbers on their own don't necessarily mean much.
So sometimes you'll see a model from a particular company,
or a particular model comes out,
and it's got really good numbers that are too good to be true.
And it always is.
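The "train on the test set" failure he describes is often caught with overlap checks between training data and benchmark questions. Here is a minimal sketch of the idea; the function names and thresholds are illustrative, not any particular lab's tooling:

```python
# Toy n-gram overlap check for benchmark contamination.
# If a large fraction of a test item's n-grams appear verbatim in the
# training corpus, the item was likely trained on, and the benchmark
# score for it is meaningless.

def ngrams(text, n=8):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def contamination_score(test_item, train_corpus, n=8):
    """Fraction of the test item's n-grams found in the training corpus."""
    test_grams = ngrams(test_item, n)
    if not test_grams:
        return 0.0
    train_grams = set()
    for doc in train_corpus:
        train_grams |= ngrams(doc, n)
    return len(test_grams & train_grams) / len(test_grams)

train = ["the quick brown fox jumps over the lazy dog near the river bank today"]
leaked = "the quick brown fox jumps over the lazy dog near the river bank today"
fresh = "a completely different sentence about training language models at scale properly"

print(contamination_score(leaked, train))  # 1.0: every 8-gram is in the training data
print(contamination_score(fresh, train))   # 0.0: no overlap
```

A real pipeline would hash the n-grams and stream over terabytes of data, but the decision rule is the same: high overlap means the "too good to be true" number should be discarded.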
Would there be an argument to not participate
in the benchmark tests?
So actually, there's two answers.
One is that the way that we develop these models
is through very good evals,
so that we actually do want benchmarks that tell us,
they're never perfect, they're always proxy metrics,
but that really tell us are we on track.
So we've made a system called GDPval, for example,
that's an eval that shows how useful our models are
on a number of knowledge work tasks,
and hill climbing on that is actually great.
At some point, you'll saturate.
And usually once you start getting into 80%,
on these benchmarks or something, 90%,
usually it means that you're done
and there's no point getting to 100%
because usually that means you're doing something
very specific for this benchmark that doesn't make sense.
And so I think it's as long as you're in this range
where it's like giving you a meaningful signal, fantastic.
And you don't want to focus on too many of them,
because usually
you want well-constructed benchmarks
that give you some good signal.
In previous years, there have been times when actually the right answer was you take an average across a bunch of benchmarks that are themselves not that good.
And then that gives you actually a pretty good reliable signal too.
But I think that the important thing is that you don't wake up every day thinking about how do I move the number on this benchmark.
You use it as a proxy.
You use it as a side metric.
And if you focus on it too much, then you get what we call Goodharting.
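His point that averaging many individually weak benchmarks can still give a reliable signal is ordinary variance reduction. A toy sketch, with all the numbers invented for illustration:

```python
import random

# Toy illustration: each benchmark is a noisy proxy for true model
# quality, but the average of many noisy proxies tracks quality well.
random.seed(0)

def benchmark_score(true_quality):
    # One noisy benchmark: true quality plus heavy measurement noise.
    return true_quality + random.gauss(0, 10)

def average_score(true_quality, n_benchmarks=25):
    # Averaging shrinks the noise by sqrt(n_benchmarks).
    return sum(benchmark_score(true_quality) for _ in range(n_benchmarks)) / n_benchmarks

# Two models 5 points apart in true quality: a single noisy benchmark
# often ranks them in the wrong order, the average rarely does.
single_correct = sum(benchmark_score(75) > benchmark_score(70) for _ in range(1000))
avg_correct = sum(average_score(75) > average_score(70) for _ in range(1000))
print(single_correct, avg_correct)  # averaging gets far more comparisons right
```

The flip side is the Goodhart failure he names next: the moment you optimize any one of these proxies directly, it stops being a proxy.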
What do developers do?
And are they assigned jobs or do the developers pitch you ideas?
So two big aspects to what we do.
There's research and deployment.
In research, we are creating new models, and that requires both research and engineering
to be joined at the hip.
In deployment, that is usually about taking the fruits of research and bringing it to the
world.
There, there's a very high bias toward engineering, but you need to be high context.
You need to really understand that research, because if you don't, you're not going to do
a good job of bringing it to reality.
What the engineers actually do:
usually on the research side, there is some idea that we're pursuing, and we usually
form teams.
Sometimes these are very small teams, two, three people that are working on some novel idea
that's very high risk, may or may not work.
And there you just want to get some signs of life and you need to have enough patience
because it never works the first time.
Sometimes these are, we know it's working.
We already have signs of life.
We scale it up.
So usually the life cycle of an idea is that once we have an idea, we tried it out, we got
the signs of life.
We then put more people on it.
And a lot of the engineers are focused on things like:
the system is breaking too frequently,
let's go figure out why.
It's too slow.
The data that it's importing, we're getting too small of a download rate, like trying to
figure out what's going on there.
There's often building distributed systems so that we're able to process things or be
able to take outputs from the model and be able to observe them and be able to see how things
are doing, building dashboards.
So there's a lot of work to be done.
And the actual ML engineering is a very sophisticated art,
and it requires a lot of deep domain expertise.
And there aren't that many people who do it.
There's writing the actual kernels,
so the actual code that runs on the GPU
that turns the low level representations
of all this data into objects that can be computed upon.
So it's almost like, you know,
you think about how your eyes turn visible light
into some signal in your brain.
And once you're there,
it's almost like we have these people who are doing
the low-level neuron engineering.
So all of that is
very sophisticated, takes a long time to build up the expertise.
And there's a lot of techniques.
They're at that intersection of the machine learning and the engineering.
For example, as you scale up the size of the model,
the way that people classically did this was wrong,
that it turned out that you would get bad results,
but you wouldn't always know, right?
Should it be the case that as you scale up the model,
that you get very smooth curves?
There's no deep reason a priori that you should think so.
But one thing we realized at some point is that actually
we need to have different ratios between different parameters and get those things right.
And we developed this technique called muP for how we set the initialization.
And that actually gives you a much more straight line as you go.
And so there's a lot of these things where even knowing if something is wrong is not easy.
And deep expertise and partnership between people with different expertise is what yields great results.
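The flavor of width-aware scaling he's describing can be sketched like this. This is a toy illustration of the general idea, not the published muP parametrization or OpenAI's implementation, and the constants are invented:

```python
import numpy as np

# Toy width-aware scaling: initialization scale and per-layer learning
# rate are explicit functions of fan-in, so behavior stays comparable
# as the model gets wider, instead of silently degrading.

def init_layer(fan_in, fan_out, rng):
    # Variance scaled by 1/fan_in keeps activation magnitudes
    # roughly constant as width grows.
    return rng.normal(0.0, 1.0 / np.sqrt(fan_in), size=(fan_in, fan_out))

def layer_lr(base_lr, fan_in, base_fan_in=256):
    # Shrink the learning rate as the layer gets wider, relative to
    # the width where base_lr was tuned.
    return base_lr * base_fan_in / fan_in

rng = np.random.default_rng(0)
for width in [256, 1024, 4096]:
    w = init_layer(width, width, rng)
    x = rng.normal(size=width)
    # Activation scale stays near 1 regardless of width.
    print(width, float(np.std(x @ w)), layer_lr(1e-3, width))
```

The payoff, as he says, is that even detecting a mis-scaled model is hard; with rules like these, the loss curve stays a "straight line" as width grows instead of drifting for reasons no single run reveals.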
Have you followed the Clawdbot, OpenClaw story at all?
Of course.
Tell me your thoughts.
I love it.
So I think that, and I've spent time with Peter, who's the developer of OpenClaw, I think he's great.
And to me, OpenClaw encompasses two things.
First of all, it is an AI system that you can hook up to tools, and then it is always on and runs and is able to take action.
And I think that there's a hacker spirit that I really like about it of saying we have these tools, we have these models.
There's this massive overhang.
The models are so much more capable than what we're using them for.
Let's try.
Let's see what it can do.
And there's a second thing, too, which is that there's a gap that we have to fill of what I call the security architecture, right?
The trust architecture of, well, you have an AI that's hooked up to many things.
But how do you know it's going to do the right thing?
How do you know that it will send the right messages?
And people will post things on Twitter that are kind of fun, like someone whose wife was texting him and his Clawdbot replied,
and it's 2 a.m. and the baby's crying and the Clawdbot is replying and the wife isn't having it.
Like, you know, it's kind of funny to read the text, but you think about where this goes,
we need better guardrails, right?
We need systems that are engineered for trust and safety.
And that's a lot of what we focus on, right, that we really think about not just the,
let's build the technology, make the capability, but how do we actually bring this to the
world in a scalable way? We're transforming our own enterprise on the basis of this technology.
We want to help transform every enterprise.
And so a lot of how we're thinking about this is that I think it's a great sign of things to come.
And I think that to really scale it to everyone requires making it incredibly easy for people to set up,
but also making sure that the default way that things are done is safe.
And that that is something that we are investing in pretty heavily.
And there are deep choices that we made.
For example, there's one vision of how you could build AI,
rewinding 10 years, which is that you keep it all secret.
And then you put together the finishing pieces and no one knows that you're doing it.
So you have no pressure to deploy.
So you really have time to get it all right.
And then you push the button and question mark, question mark, but then benefit the world.
And I looked at that plan and I was like, I don't think I can sign up for this.
It just feels wrong.
You know, first of all, from a technical perspective, if you never encounter reality,
how will you be certain that you actually have put in place the right
safety systems?
You think about Clawdbot.
It's nice that it's starting as kind of a fun project.
It's clear that for it to get hooked up to very sophisticated,
important systems and to be trusted with a lot of responsibility,
we need to develop new technology, but we're going to learn and iterate in a loop.
And so to me, it always felt important to get this technology right.
We have to encounter reality at each step along the way.
But the second is legitimacy: if you're
going to build technology that's going to change everyone's lives,
I think people need to know about it.
People need to be included in that.
And so if you look at ChatGPT, we made a very deliberate decision to say,
we think that this is technology for the world to be included in for our ethos,
for what we started this company for.
It's for everybody.
Tetragrammaton is a podcast.
Tetragrammaton is a website.
Tetragrammaton is a whole world of knowledge.
What may fall within the sphere of Tetragrammaton?
Counterculture. Tetragrammaton. Sacred geometry. Tetragrammaton. The avant-garde. Tetragrammaton. Generative art.
Tetragrammaton. The tarot. Tetragrammaton. Out-of-print music. Tetragrammaton. Biodynamics. Tetragrammaton. Graphic design. Tetragrammaton. Mythology and magic. Tetragrammaton. Obscure film. Tetragrammaton.
Beach culture. Tetragrammaton. Esoteric lectures. Tetragrammaton. Off-the-grid living.
Tetragrammaton. Alt spirituality.
Tetragrammaton. The canon of fine objects.
Tetragrammaton. Muscle cars.
Tetragrammaton.
Ancient wisdom for a new age.
Upon entering, experience the artwork of the day.
Take a breath and see where you are drawn.
