No Priors: Artificial Intelligence | Technology | Startups - What’s Beyond GitHub Copilot? With Copilot's Chief Architect and founder of Minion.AI Alex Graveley
Episode Date: March 30, 2023

Everyone talks about the future impact of AI, but there's already an AI product that has revolutionized a profession. Alex Graveley was the principal engineer and Chief Architect behind GitHub Copilot, a sort of pair-programmer that auto-completes your code as you type. It has rapidly become a product that developers won't live without, and the most leaned-upon analogy for every new AI startup: Copilot for Finance, Sales, Marketing, Support, Writing, Decision-Making. Alex is a longtime hacker and tinkerer, open source contributor, repeat founder, and creator of products that millions of people use, such as Dropbox Paper. He has a new project in stealth, Minion AI. In this episode, we talk about the uncertain process of shipping Copilot, how code improves chain of thought for LLMs, how they improved product performance, how people are using it, AI agents that can do work for us, stress testing society's resilience to waves of new technology, and his new startup named Minion.

No Priors is now on YouTube! Subscribe to the channel on YouTube and like this episode.

Show Links:
Alex Graveley - San Francisco, California, United States | Professional Profile | LinkedIn
Minion AI

Sign up for new podcasts every week. Email feedback to show@no-priors.com
Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @alexgraveley | @ai_minion

Show Notes:
[1:50] - How Alex got started in technology
[2:28] - Alex's earlier projects with Hackpad and Dropbox Paper
[07:32] - Why Alex always wanted to make bots that did stuff for people
[11:56] - How Alex started working at GitHub and Copilot
[27:11] - What is Minion AI
[30:30] - What's possible on the horizon of AI
Transcript
The most fun stuff was like these people that would pop up and be like, I don't program,
but I just learned how to write this hundred-line script that, like, does this thing that I need.
It's like, oh, my God.
This is the No Priors podcast.
I'm Sarah Guo.
I'm Elad Gil.
We invest in, advise, and help start technology companies.
In this podcast, we're talking with the leading founders and researchers in AI about the biggest questions.
Everyone talks about the future impact of AI.
But this week on No Priors, we're talking with the architect of arguably the first AI product
that has already revolutionized a profession, GitHub Copilot.
Alex Graveley was the principal engineer and chief architect behind this product,
a sort of pair programmer that auto-completes your code as you type.
It's rapidly become a product that developers won't live without, and the most leaned upon
analogy for every new startup, Copilot for finance, for sales, marketing, support, writing,
decision-making, everything.
Alex is a longtime hacker and tinkerer, open source contributor, repeat founder, and creator of
products that millions of people use, such as Dropbox Paper.
I have huge respect for the range of work he's done, ranging from hardware-level security
and virtualization to real-time, collaborative, web-native, and AI-driven UI.
He has a new project in stealth, Minion.
Just as a heads-up, we recorded this podcast episode in person,
and you'll notice we don't sound as crisp as we usually do.
Still, I think you'll enjoy listening to this conversation.
Let's get into the interview.
Welcome to the podcast, Alex.
Hi.
So let's start with some background.
How did you end up working in tech or AI?
Tech was earlier.
I started, you know, really young,
and I got really into Linux when I was 14 or so.
And, yeah, it was right around the time.
when the web was like a new thing and you had to work to kind of get on the web.
The idea of helping in the open and making things freely available so other people could
learn from them, like I was learning from them, seemed great. So I went and spent many years
just working on open source stuff. Yeah, spent many hours compiling kernels and hacking on stuff.
And yeah. And what was the thought process behind starting, like, Hackpad?
Oh, Hackpad. Yeah.
Yeah, so I had just finished like four or five years at VMware, and I wanted to get into
startups.
I knew that.
And then so I left VMware and I started working on an education startup like many of us do.
Many, many founders start with like the idea of an education startup.
It's like a rite of passage.
Yeah.
So I spent, I don't know, nine months working on that.
Turns out education is very hard.
So after nine months, I was like, all right, this isn't going anywhere.
I don't know if there's a value prop here.
I mean, that was the value: I learned that, like,
you have to make something that is both achievable
and that people want to pay for or spend their time on.
So, yeah, then I was just kind of fishing around.
I was living in, like, a warehouse in San Francisco with
a bunch of Burning Man people.
And we were having trouble organizing large-scale Burning Man projects.
And so I forked Etherpad and started hacking on it,
recruited a friend in the camp
community to start working on it with me. And yeah, it just grew from there. Before we did YC,
we had many of the large Burning Man camps using it to organize their builds.
Can you describe the product experience? Oh, yeah. It was a real-time text editor, kind of like
Google Docs. Google Docs is the only other one that did that at that time. It was kind of nice
because it would highlight who said what. So you could go track down. If somebody had a contribution,
be like, oh, what did you mean? Or I heard you say something about this. So that's very useful in like
large-scale anonymous, pseudo-anonymous groups where you don't really know who has ownership
over what, like Burning Man camps. And let's see. And then we did YC, and then we got a bunch of,
well, many of those companies using it. My hack was to go do YC and try and get all the companies
doing YC to use my product, which many of them did. I also took like very extensive notes of all
the YC presentations using the product that everyone would then, like, look at. And we were
able to get Stripe. And Stripe used us for many years, actually. I think we were
their first knowledge base. They used it for a long time. A bunch of other big companies as well.
And post Dropbox acquisition, you worked on what became Paper. How did you think about what you
wanted to go work on next? I spent the next few years kind of poking around at stuff. I knew that
I wanted to make a robot that does stuff for you. So there was a company called Magic. So I went to
work at this company called Magic.
They were doing this, like, text-based personal assistant.
You remember this one?
I think just like everybody starts an education startup.
Magic is one of those names that keeps cycling,
and there's a really cool AI company right now called Magic as well.
So I feel like there's also these names that kind of persist from generation to generation,
which I think is really cool.
I'm sure you know, it's code gen.
Yeah, yeah.
No, yeah.
I haven't seen a demo yet, but it sounds like they're doing the right thing.
There's a few people doing that kind of whole repository changes.
So it seems like a great
direction. It was all ops-based. It was super heavy on ops. So, you know, they had teams of people
and it was 24 hours and they would cycle in and, you know, they'd lose context. And, like,
they're all busy because they're trying to, like, deal with lots of people with lots of requests
all the time. And so really, it was like a crash course in, like, human behavior, right?
Like, what do people do under stress? How do they act? What do they say? What can you train? What can't
you train? Like, can you bucket stuff? And the answer is no. Like, humans are complicated, especially in
text form. The beauty of the web and, like, traditional UIs, right, is that you fill them in.
And like, if it doesn't do what you want, then you make the decision to either go forward or not go
forward, right? Text has this annoying property of you can be all the way at the 99% mark and then
change the goal entirely. Yeah, so it's complicated. It's hard to understand what people want.
It's hard to understand the complexity involved, especially when you're dealing with the real
world, flights get delayed, passwords get lost.
Do you think we're the last generation to deal with that?
In other words, it feels like we're about to hit transition point in which agents can
actually start doing some of these things for us for real, when before I think all
these products really started with these operations heavy approaches.
I remember, you know, there was a really early sort of personalized search engine called
Aardvark.
And similarly, if you kind of look under the hood, it was a lot of ops people and there
was a little bit of algorithm, right?
That was the one where you would, like, sort of describe what you're good at or something,
and then they would try and send questions to you.
Yeah, exactly.
They'd kind of route things,
but I think there's actually people
doing some of the routing at least.
I can't exactly remember, right?
But I think it was, you know,
I think a lot of people wanted to build
these really complex sort of bots
or agents that were doing really rich things
and the technology just wasn't there.
It feels like, for a period of time,
there were also some startups.
I remember one company that I got involved with
that was trying to do like virtual scheduling and assistance,
you know, six, seven, eight years ago.
And again, it felt like it was a little bit early,
maybe.
Yeah, it was a little bit early.
Really cool.
Clara Labs.
Clara, that's the one.
Yeah, yeah, yeah.
Yeah, I remember that, Sarah.
And then we had Operator.
Do you remember Operator?
I remember Operator.
Yeah.
And then, like, on the question answering stuff, we had Jelly.
So there's a whole series of these.
Yeah, what we were able to do with Magic.
As, like, an aside there, you know, I started working there because I wanted to work on the AI part.
I think somewhere in there, Facebook M started as well.
And it was just, it was a fun place to try and learn everything I could about solving those problems.
So, yeah, before Transformers, there was sequence-to-sequence, which was kind of the previous iteration.
You were able to take all the histories of the chats between assistants and people and train a little model on it, little by today's metrics, and run it.
And it would show some gray text in the little text bar and people could edit it and hit enter.
The operators.
Yeah, yeah, the operators.
And, yeah, we would measure how much time they spent typing with it and
without it. And so it would save like half an hour across 100 people per day, across an
eight-hour shift. So yeah, that was like my first shipping AI product. And then how'd you end up
going to Microsoft? Oh, yeah, there's a bunch of other stuff along with it. So after that, I got
into crypto. My friend was doing hCaptcha, which was like sort of a CAPTCHA marketplace,
which is now something like the number one or number two
CAPTCHA service in the world, which is crazy. So I kind of launched that. That was fun.
Annoyed people the world over for many man-hours in aggregate, and then left that to work
with Moxie on a cryptocurrency for Signal. So that was really fun, complicated, and it all worked
in a few seconds. We were shooting for Venmo quality. When you think about crypto in the context of
AI, because people talk about it in a few different contexts, right? One is you have programmatic
sort of money as code running, and so that could create all sorts of really interesting things.
from an agent-driven perspective, but then the other piece of it is identity.
And some people think, I mean, World Coin would be one example, but there's other examples
of effectively trying to secure identity cryptographically on the blockchain in an open way,
and then using that identity in the future to differentiate between AI-driven agents and people.
Do you think that's going to be important, or does that stuff not really matter in terms
of the identity portion of not only crypto, but just like how we think about the future of agents?
It's a good question.
The honest answer is I think we're going to go through a many-year period of
extreme discomfort where AIs pretend to be things or confuse people or extract money from your
grandparents or drain people's life savings in ways that are scary. And, you know, OpenAI is
trying to do their best, but for some reason the focus has been on OpenAI doing everything.
And instead of, like, we should go build the systems that prevent that. We should go pass the
legislation that drops the hammer on people doing that stuff. We should go do all of that kind of
thing. Unfortunately, it seems like we're going to need some really bad things to happen before
we align correctly. I'm not really scared about AIs killing us, although I'm very grateful that
there are people that are thinking about it. I'm more worried about bad people using new technology
to hurt us. Yeah, Illia from NEAR has some really interesting thoughts on this because he was one of the
main authors, or he was the last author, on the Transformer paper before he started NEAR. And he's brought
up these concepts of, like, how do you stress test society relative to the coming wave of AI, which I think is
an interesting concept. Yeah, it's a great way to look at it. Like, it's not as bad as it could be,
right? If you think about it, most of the things that you would want to spam either have a
spam blocker or are somewhat difficult to create an account on. So doing a better job of
sock puppet account filtering is going to be really important going forward. You know, I like
what Cloudflare is doing with their kind of fingerprinting instead of visual CAPTCHAs, which are
not good enough anymore. One thing that is kind of a saving grace here is
that many of the things that you would want to do cost money.
So calling everybody costs money.
Texting everyone should be hopefully illegal soon, but also costs money.
Maybe not enough money to prevent these things, but...
Probably not enough as agents can make money, right?
They can just look at the tradeoffs of cost.
Yeah, I think it's interesting.
I guess I would say it's not, you know, it's somewhere in the middle.
It's like, imagine there's a, you know, North Korea has been trying to do this to us for a long time, right?
Now there's a North Korea that has more resources or is more distributed or whatever.
We have some mitigations.
We need more.
We need to be thinking about it a lot more.
How did you end up at GitHub and how do you end up working on copilot?
While I was working on MobileCoin, my dad's kidneys failed and I tried to donate a kidney.
And they found a lump in my chest as part of the scans they do.
And I had to have most of my right lung removed in 2018.
And so that was a big deal.
And so took some time.
It's weird.
Healing from internal injuries takes a lot longer than you think.
Anyway, happy story is that I haven't had cancer now for over four years.
And my dad got a kidney transplant also.
So things are good.
And so after, I don't know, I guess I was recovering for quite a while.
And then I went and begged my friend for a job.
I figured I should start working again.
That was the transition, I guess.
So I worked on some random stuff at first.
I converted GitHub to using their own product to build GitHub, which is kind of fun.
They weren't, yeah.
I think people still use Codespaces now to build GitHub, which is pretty cool.
But yeah, then this kind of opportunity to work with OpenAI came up.
And because I had been tracking AI in the past and was pretty aware of what was going on, I jumped on it.
Was that proposed by OpenAI or by GitHub, or who kind of initiated it at all?
So I don't know the exact beginnings.
I know that OpenAI and Microsoft were working on a deal for supercomputers.
So they wanted to build a big cluster for training.
And there was a big deal that was being worked out.
And there was some software kind of provisions thrown in, I think Office and Bing probably.
And GitHub was like, oh, okay, well, maybe there's something GitHub can do here.
I think OpenAI threw a small fine-tune over and was like, here's a small model trained
on some code, see if this is interesting, you know.
So we played around with it.
What's small?
I don't know.
This was, I have to remember now.
I think this is before I knew very much.
So it was definitely not a DaVinci-size model, that's for sure.
I don't know what size it was.
Yeah.
And so it was just like I learned later that this was basically a training artifact.
So they had wanted to see what introducing code into their base models would do.
I think it had positive effects on chain of thought reasoning.
Code is kind of linear.
So you can imagine that, you know, you kind of do stuff one after another and the things before have an impact.
And, yeah, it was not that good.
It was very bad.
It was, I think, just, like I said, just an artifact and a small sample of GitHub data that they had crawled.
And this was before, actually, I joined.
Me and this guy, Albert Ziegler, were the first two after Oege.
Oege got a hold of this model and started playing with it.
And he was able to say, like, well, you know, like it doesn't work most of the time,
but here it is doing something, you know, and it was only Python at that time.
Here it is, you know, generating something useful.
We didn't really understand anything.
So that was enough to like, okay, well, you know, go fetch a couple of people and start working,
see if there's anything in there.
We didn't really know what we had.
So the first, you know, task was to go test it out, see what it did.
We crowdsourced a bunch of Python problems in-house, stuff that we knew wouldn't be in the training set.
And then we started work on fetching repositories and finding the tests in them so that we could basically generate functions that were being tested and see if the tests still pass.
There had been a brand new pytest feature introduced recently that allowed you to see which functions were called by the test.
So you go find that function, zero the body, ask the model to generate it, and then rerun the test and see if it passed.
And I think it was less than, I don't know, 10%, something like that.
The dimensions are kind of like: how many chances do you give it to solve something,
and then how do you test whether it's worked or not, right? So for the standalone tests,
we had people write test functions and then we would try to generate the body, and if the test
passed, then you know it works. And in the wild test harness, we would download a repository,
run all the tests, look at the ones that passed, find the functions that they call, make sure
that they weren't trivial, generate the bodies for them, rerun the tests, and if they
passed, then you get your percentage.
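(For reference, here is a minimal sketch of that kind of in-the-wild harness. It is not the actual GitHub tooling: generate_body is a hypothetical stand-in for the model call, passing tests are read from pytest's plain summary output rather than the coverage-style pytest feature Alex mentions, and multi-line signatures and decorators are ignored for simplicity.)

```python
# A rough sketch of the "functions under test" evaluation loop described above.
# Not the actual GitHub harness: generate_body() is a hypothetical model call,
# and passing tests are parsed from pytest's short summary (-rA) output.

import ast
import subprocess
from pathlib import Path


def passing_tests(repo: Path) -> set[str]:
    """Run pytest and return the ids of tests reported as PASSED."""
    result = subprocess.run(
        ["pytest", "-q", "--tb=no", "-rA"],
        cwd=repo, capture_output=True, text=True,
    )
    return {
        line.split()[1]
        for line in result.stdout.splitlines()
        if line.startswith("PASSED ")
    }


def generate_body(signature: str, context: str) -> str:
    """Hypothetical stand-in for the code model: return a body for the signature."""
    raise NotImplementedError("plug a code model in here")


def evaluate_function(repo: Path, source_file: Path, func_name: str) -> bool:
    """Blank one function's body, let the model regenerate it, rerun the tests."""
    baseline = passing_tests(repo)
    original = source_file.read_text()

    tree = ast.parse(original)
    target = next(
        node for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and node.name == func_name
    )

    # Keep everything up to the `def ...:` line, splice in the model's body,
    # then resume after the original function.
    lines = original.splitlines()
    body = generate_body(lines[target.lineno - 1], context=original)
    patched = (
        lines[: target.lineno]
        + ["    " + line for line in body.splitlines()]
        + lines[target.end_lineno:]
    )
    source_file.write_text("\n".join(patched))

    try:
        # Success means every test that passed before still passes.
        return baseline <= passing_tests(repo)
    finally:
        source_file.write_text(original)  # always restore the original code
```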
Yeah, I mean, it was something like some very, very low percentage up front,
but we knew that there was kind of a lot more juice to squeeze.
So like getting all of GitHub's code into the model,
and then a bunch of other tricks that we hadn't, you know,
we hadn't even thought of at that time.
Yeah, and eventually, you know, it went from, you know,
less than 10% in the wild test to over 60%.
So that's like one in two
tests it can just generate code for, which is insane, right?
Somewhere along the way, you know, there was like 10% to 20% to 35% to 45%, you know,
these kind of like improvements along the way.
Somewhere along the way, we did more prompting work so that the prompts got better.
Somewhere along the way, they used, you know, all the versions of the code as opposed to
just the most recent version.
They used diffs so that it could understand small changes.
Like, yeah, it got better.
And so, but when we first started, we were just,
We didn't know what we had. We were just trying to figure it out. At the time, they were thinking in terms of, like, maybe you can replace Stack Overflow or something, you know, do a Stack Overflow.
Was that the first, like, product idea? I think you guys had for it.
I don't know that we had that idea. I think that was kind of like a...
Yeah, yeah, yeah, yeah. It was more like, if you made something that competed with Stack Overflow, because we have all this code, wouldn't it be nice to leverage it?
Yeah, and so we made some UIs, but, like, early on, it was bad.
So it would be like, you'd watch it and it would run, and most of them
would be bad.
And there'd be, like, one success, and you'd be like, oh, sweet, I got a success, but I had
to wait, you know, some number of seconds for it.
Was the test user group just, like, the six of you, or, like, some larger group?
The first iteration was just like an internal tool that people would, that help people
write these tests.
And then we wanted to see if maybe we could turn that into some UI that people would use,
if there was some way to cover up the fact that only one in ten things passed.
Right? So we tried a few UI things there.
And then it was actually OpenAI
that was like, we're testing these model fine-tunes,
it'd be nice if we could, like, test them more quickly.
What about doing, like, a VS Code extension that just does autocomplete? And, uh, all right, sure.
Why not?
And yeah, so we did auto complete at first.
And that was kind of, that was a big jump, you know, because they were still thinking
in terms of, like, Stack Overflow, you know, but this, it's like, you know, I didn't have any idea,
basically. I didn't know how to beat Stack Overflow with this thing. But we could play with some
stuff in VS Code that was maybe closer to the code. First we did autocomplete and that was kind of
fun. It was useful. It would show this little pop-up box like Autocomplete does and you could pick
some strings. And so that format actually, you know, the usage was, you know, it's fine. It wasn't
the right metaphor exactly, right? But you've got like this code generated mixed in with the
specific terms that are in the code, and it's a little, it's not exactly the same thing.
We tried things like adding a little button over top of empty functions, so you would go
generate them, or you could, like, hit a control key, and it would create a big list on
the side that you can choose from, or there's a little pop-up thing.
So basically we tried every single UI we could think of in VS Code.
And multiple generations, like the list, that didn't work.
Yeah, none of them really worked.
I think lists were like, you know, maybe you'd get one generation per person per day.
And this was just a small sample.
It's just, like, a few people that were interested at GitHub, language nerds,
or people that had written tests for us, and OpenAI people.
Yeah, so very early on, I had this idea that it should work like Gmail's gray text autocomplete,
which was, I was like enamored with that product.
It was like, it was the first quote unquote large language model deployment in the
wild. It was fast. It was cool. The paper's great. They give you all sorts of details in how
they do it, all the workarounds they had to do. So that was always in the back of my head, you know.
It was bad also. It was like, you know, those completions are not good. But it seemed like the right
thing. Anyway, somewhere along the way, after I tried all the UIs, you know, I sort of came up
with some idea for it, but VS Code didn't support this. So I tried to hack it in. Finally came up with
a way to hack it in, enough to make a demo, like a little
demo video.
Was it hard to, like, build real support for it within the organization?
It's a little complicated.
I guess, you know, we were pretty much a skunk works project.
No one knew about us.
So if we'd go to, like, the VS Code people, we'd be like, hey, we need you to go
implement this very complex feature.
And they're like, I don't even know who you are.
Like, what are you talking about?
There was definitely some politicking that happened to get the VS Code people to dedicate
some resources to that on a short, short time frame.
Like, we were moving really fast.
You know, it was less than a year from beginning to
public launch.
Was there a certain metric where you're like,
this is good enough?
Like we need to actually put it in the public product?
Which specifically with the completions or like the UI?
It was completions.
It was.
Yeah.
There was,
I mean,
we had a long,
we had a nice long window of public access before GA.
So where it was free and you could use it.
And we did a bunch of optimizing for different groups of people that would be,
you know,
okay,
well,
you know,
do we have more experienced people?
Do we want more new people? Do we want people from this area or this area?
And that gave us a bunch of really good stats.
So we were able to learn that, for instance, like, speed is the only thing that matters.
Yeah, it's something crazy.
Like, every 10 milliseconds is 1% fewer completions that people would do.
That adds up.
10 milliseconds is pretty fast.
We learned that because somewhere in our first few months of public release,
we noticed that Indian completions were really low.
Like, for whatever reason, they were just significantly lower than Europe.
So network latency to India?
Yeah, and it turns out because OpenAI only had one data center, so it was all in Texas.
And so it was going, if you can imagine, you're like typing.
And so that goes from India through Europe, over the water, down to Texas and back and back and back.
And now if you've typed something that doesn't match the thing that you requested, then that's useless.
So you don't get a completion.
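(To make the arithmetic concrete: applying the rule of thumb Alex quotes, roughly 1% fewer completions per extra 10 milliseconds, a tiny sketch with made-up latency numbers looks like this.)

```python
# Back-of-the-envelope illustration of the rule of thumb quoted above:
# roughly 1% fewer accepted completions per extra 10 ms of latency.
# The example latencies below are hypothetical, not measured values.

def completion_penalty(extra_latency_ms: float) -> float:
    """Approximate fraction of completions lost to added round-trip latency."""
    return (extra_latency_ms / 10) * 0.01

examples = {"same region": 20, "cross-continent": 150, "India to Texas and back": 300}
for route, extra_ms in examples.items():
    print(f"{route}: +{extra_ms} ms -> ~{completion_penalty(extra_ms):.0%} fewer completions")
```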
By the way, I know it's obvious, but that's happened on every single product I've ever worked on.
And, you know, like when I was at Google, like, I worked on a variety of mobile products.
Same thing.
You know, page load times.
Obviously, search in general, 100 milliseconds difference, like, makes a big shift in market share.
So, yeah, that's kind of nuts.
That's how much speed matters.
Yeah.
And so once we figured that out, we knew that we had something awesome, right?
Like, people that were close to Texas were like, this is freaking great.
Like, they were like, you know, we had a Slack channel and people were posting in it all the time.
And the most fun stuff was like, these people that would pop up and be like, I don't program,
but I just learned how to write this 100-line script
that does this thing that I need.
It's like, oh, my God.
I definitely feel like I now speak different languages
that I don't actually know the syntax of,
which is very exciting.
Yeah, it turns out these models are really great at finding patterns.
So once we had the UI mechanism that worked,
so we knew we were onto something there,
and then it was just this, like, squeezing as much performance as we could get out.
We basically never found the bottom of, like, you know,
we made it as fast as we could,
and it was still improving completions.
And then I'm, like, okay, okay, I know there's this plan for, like, Azure to run OpenAI in six months.
We need you to do that in the next month.
So, like, can we, let's figure out how to make this happen.
So, because we wanted to run a bunch of GPUs in Europe, so we could hit Asia.
You know, there was no other place that we could run them. We couldn't run them in Europe.
We could run them on the West Coast.
We could run them in Texas at the time.
So that was, and Microsoft stepped up there.
We got it running.
And then pretty much after that, we launched.
And, yeah.
Were you surprised by the uptake post-launch?
No.
No.
I mean, our retention rate was 50%.
Like, it never went, like, months later, it was still above 50%.
By, like, weekly cohort, which is, like, insane.
Yeah.
Right?
And we didn't, we didn't know if people would pay for it.
That was one thing.
I lobbied pretty hard for going cheap and capturing the market.
How did you guys think about inference
cost for this thing at the beginning?
Oh, yeah.
We were, our estimates were wildly off, wildly off.
Yeah.
So we got estimates that were like, you know, it'll be 30 bucks a month per user
on average, right?
And then once Microsoft was able to, like, kind of do their Azure infrastructure,
we were able to then, like, fork off little bits so we could do more accurate projections.
And there's a bunch of like moments where like, how much is it going to cost?
You'd like wait for these results.
The first big one was 10 bucks a month.
It'll cost 10 bucks a month.
And I was like, oh, my God, it's so much cheaper than we expected.
And then we could optimize it.
And that was like we hadn't even optimized on price yet.
Right.
And then we optimized and priced a bunch.
And yeah, now it's less than that.
So like, it was very fortuitous, right?
Like we were thinking like, okay, well, maybe it's, maybe it's enterprise only because that's the only people
who are going to be willing to pay for this.
Like 30 bucks a month is not, it's a lot.
And that's, like, with no
margins, right? For 40% of your code, it's not a lot.
Yeah, that's the thing. We know that. We know that now. Where do you extrapolate all this
like three to five years out? Are they basically going to be just like agents writing code
for us in certain ways? Is it 95% of code is written by co-pilots and, you know, humans are
kind of directing it? Like, what do you kind of view the world evolving into in the next couple
years? And next couple, I mean like three to five, not 20. Yeah, it's hard. I don't know.
I think it's hard for me to, you know, it's hard for me to imagine what that world looks like
because it's such a shift from like I have my hands on something and I know that it's right
to where we are now, which is I mostly know it's right or I have a sense that it's right,
but I have to test it and see it run to know that it's right to then just write this
and I'll trust that it's going to work.
Like those are pretty crazy transitions, right?
Yeah.
They exist.
Like, you know, for sure.
But you could also imagine, like, certain ways to do,
like, a code review post some chunk, or some other sort of quality check. Yeah, I think,
if that's the goal we want to get to as people, like, I think every barrier to that is achievable.
So we can code review only the dangerous parts or only the confusing parts, or we can do things like
train a model on functions before and after changes to say, like, okay, this looks like a more polished
version of this function would be this. Or, yeah, we can do things like, you know, just start with the very
basic, you know, very basic main loop and then add everything piece by piece with tests, so that
you know what works and what doesn't, and then just have the AI keep generating
what's the logical next feature. Like, all these things will get figured out. So if that's what
we want to do, that's what's going to happen. So what's the idea behind Minion? Yeah, I think I mentioned
making bots that do stuff for you. It's a broad topic. And I think it's where we see things going.
You know, the next few years are going to be about AIs taking action, not just answering questions or writing copy, but actually helping us in our daily lives.
Things like organizing my schedule or booking flights or finding a trip for me to take or doing my taxes or telling me which contacts I haven't talked to in a long time and should reach out to, you know.
There's a lot of stuff that we can do by giving AIs access to information and letting them act on that information in a controlled way that checks to make sure that we're aligned.
And yeah, I think that'll be a really fun future.
Almost like you can imagine Copilot applied to everyday activities, right?
Like, Copilot gives you a little bit of help.
So I want Minion to give you a little bit of help outside of your code editor.
How do you decide to work on Minion specifically?
Minion specifically.
So basically, I stopped working at Magic.
I quit because I couldn't figure out how to hook up the AI to data.
I was like, in order to improve the quality, I need a PhD in math.
Like, I don't know what to do now.
And it just sort of, the models got better enough where that specific problem seems solvable.
Yeah, so the tech got better.
And so that specific problem is where interacting with the real world broke down.
In the past, you know, like I said, flights get delayed, prices change, not just, like, a little bit but all the
time. You know, they might not have the seat you want at the concert that you want to go to. And so,
you know, AIs are this kind of compression of everything in their training set, but they're not
a real-time mechanism anyway. So that was the idea: like, okay, well, I think we can
work on this old problem of how to make a bot do stuff for people. That's what we want.
Let's go make it.
I don't know.
It's like maybe I can use the excitement from Copilot to launch into something which is incredibly hard,
but which I believe the technology is around for, if we can figure it out.
So yeah, that's it.
That's the only reason I've ever gotten into startups, actually, is like, okay, I want to do a startup so that I can do a harder startup.
You know, or I want to do a project so I can do a harder project.
Is this one sufficiently hard?
This one's hard.
Yeah.
It's good and hard, I think.
Part of that is there are these fun things that you kind of learn along the way that
keep you engaged, right?
Like the thing with code, with Copilot, is like,
it turns out code is pretty special, right?
Like you can run it.
So if an AI generates some code and it runs,
then you know something about that code
that you wouldn't necessarily know from the text.
And accomplishing things, like doing agent-like work on the internet
or in apps, has a lot of these similar kinds of properties
where it's like, oh, yeah, you can learn something here.
We can learn from what people do
where we can see if it's a success or failure and then learn from that,
or we can optimize what works based on what doesn't work
or figure out what's annoying and try to improve it.
So I think these things all compound in some really interesting way.
You know, I think the goal, right, is, you know, straight out of sci-fi, right?
You want to make a thing where you say, hey, computer, file my taxes,
and it does the right thing, you know?
And I think we can get there.
It's certainly in the next few years.
But it's also fun to think about how to break these things down and it turns out breaking down tasks in the same way that humans do.
Take a complex task.
You figure out, you know, you write a list of things to go through it.
Same kind of thing works for AIs as well.
So you break down complex tasks.
Figure out if there's any information you need.
Maybe you have to write some code.
Maybe you have to ask some questions.
Maybe you have to query some data.
Same things you would do.
Humans don't usually write code to do their taxes.
But sometimes it's kind of the same thing. Going
through a list of your pay stubs, you know, that's executing a for loop.
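(A generic, heavily simplified sketch of that kind of break-it-into-steps loop, purely illustrative and not Minion's actual architecture: llm is a hypothetical stand-in for any language model call, and the human confirmation step reflects the "controlled way that checks to make sure that we're aligned" idea mentioned earlier.)

```python
# An illustrative task-decomposition loop, in the spirit described above.
# Hypothetical sketch only: llm() stands in for any language model call,
# and a human confirms each step before anything is acted on.

def llm(prompt: str) -> str:
    """Hypothetical model call returning plain text."""
    raise NotImplementedError("plug a language model in here")


def plan(task: str) -> list[str]:
    """Ask the model to break a complex task into an ordered list of steps."""
    raw = llm(f"Break this task into short, numbered steps:\n{task}")
    return [line.strip() for line in raw.splitlines() if line.strip()]


def run(task: str) -> None:
    """Walk through the steps one at a time, keeping the human in the loop."""
    for step in plan(task):
        if input(f"Next step: {step}\nProceed? [y/n] ").strip().lower() != "y":
            print("Skipped.")
            continue
        # Gather any information the step needs, then carry it out.
        result = llm(f"Carry out this step and report what you did:\n{step}")
        print(result)
```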
Also, I like datasets that don't exist.
So people clicking on stuff on the web is a gigantic data set, which is currently unowned.
And so I think that's pretty exciting.
I think there's a lot of stuff to learn from that.
So you're hiring.
One thing we've talked about is, like, it's kind of a funny thing to try to hire for building
products with this new set of technologies,
in that, like, working in machine learning for the last decade may not help you that much.
How do you think about the people you need?
Yeah, it's been strange.
It seems very bimodal.
Like, very senior people get it, and very young people get it.
And so I don't know that the middle has really caught up yet or realizes that...
Are we on the other side?
Are we in the middle?
No, I just meant the middle in terms of, like...
It's like...
I'm very young.
No, it's not an age thing. It's not an age thing. It's more like you tell someone who's very naive and you tell someone who's very experienced, like, I can make magic. Here's how I can do it. And then the very naive person is like, that's awesome. And then the very experienced person is like, ah, maybe. Yeah. But the response from almost everybody else is that's not possible or I don't see it. Yeah. A number of companies I know are trying to build that intuition internally now. And often this is the first time in a really long time that I've seen certain founders come back and
start coding again. You know, they'll have a multi-hundred-person company and they'll get so
enamored by what's happening that they'll just dive in. A number of them have told me that
they feel like some of their team members just don't have that intuition for what this can
actually do. And so they don't even know what to build or where to start or how interesting
or important it is. And so it's almost like this founder mindset is needed to your point in
terms of this is a really interesting new set of capabilities and how to actually learn what
these are and then how do I apply them? And I think people lack natural intuition for what
this does right now. It's definitely not intuitive. Yeah. I mean, I would say I have some
intuition now, and people that are, again, on that experience spectrum have some intuition. But it's
not intuitive. There's many kinds of different learners and there's many kinds of different
programmers, right? It takes a certain kind of programmer to be comfortable with this idea of like,
I don't know what's going to work. Let me try some stuff. It turns out that it's similar attributes to the web,
or product making, being able to test stuff, look at the right metrics, find the right
metrics. Yeah, those are useful skills, I think, that are reusable. But the intuition
and the tenacity that it takes to, like, you know, this doesn't work at all. What do I do?
That's still rare, I think, in that field, because there's so much uncertainty that it's easy to
go, like, I tried 10 things and it didn't work. So, like, this is impossible. Like, okay.
I think the natural reaction of somebody looking at something that has 10% performance at the
beginning is like, why even bother,
we're not going to get there, right?
That's a great point. The crazy thing is that these
things work at all, right?
And not only that, but they scale
and they improve with scale,
right? Most things break
at scale. Almost everything breaks
at scale. And that's what, you know,
that's what you hire people to deal with.
You know, every doubling, usually
something breaks and you're going to fix it.
It's a pretty crazy thing to think of
what kind of emergent properties might
still be out there,
if we can get these things 10 times bigger.
So I'm always in favor of people pushing those limits, you know?
Like, it doesn't make sense.
Like, even the best people can't explain what's going to happen when you, you know, go bigger.
We built the Large Hadron Collider for that reason, too, you know?
We didn't know.
I don't know.
We think maybe we'll find some stuff.
These endeavors are valuable.
And, like, yeah, I'm grateful to be alive during this time.
It's a great note to end on.
Thank you for joining us for the podcast.
Yeah, thanks so much.
Sure.
Thank you for listening to this week's episode of No Priors.
Follow No Priors for a new guest each week and let us know online what you think and who in AI you want to hear from.
You can keep in touch with me and Conviction by following @Saranormous.
You can follow me on Twitter at @EladGil. Thanks for listening.
No Priors is produced in partnership with Pod People.
Special thanks to our team, Synthel Galdia and Pranav Reddy and the production team at Pod People.
Alex McManus, Matt Saab, Amy Machado, Ashton Carter, Danielle Roth, Carter Wogan, and Billy Libby.
Also, our parents, our children, the Academy, and Tyranny.ML, just your average friendly AGI world government.