The a16z Show - GPT-3: What's Hype, What's Real on the Latest in AI
Episode Date: July 30, 2020
In this episode -- cross posted from our 16 Minutes show feed -- we cover all the buzz around GPT-3, the pre-trained machine learning model from OpenAI that’s optimized to do a variety of natural-language processing tasks. It’s a commercial product, built on research; so what does this mean for both startups AND incumbents… and the future of “AI as a service”? And given that we’re seeing all kinds of (cherrypicked!) examples of output from OpenAI’s beta API being shared -- how do we know how good it really is or isn’t? How do we know the difference between “looks like” a toy and “is” a toy when it comes to new innovations? And where are we, really, in terms of natural language processing and progress towards artificial general intelligence? Is it intelligent, does that matter, and how do we know (if not with a Turing Test)? Finally, what are the broader questions, considerations, and implications for jobs and more? Frank Chen explains what “it” actually is and isn’t and more in conversation with host Sonal Chokshi. The two help tease apart what’s hype/ what’s real here… as is the theme of 16 Minutes. Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Transcript
Hi, everyone. Welcome to this week's episode of 16 Minutes. I'm Sonal, your host, and this is our show where we talk about the headlines, what's in the news, and where we are on the long arc of tech trends. We're back from our holiday break, and so this week we're covering all the recent and ongoing buzz around the topic of GPT3, the natural language processing-based text predictor from the San Francisco research and development company OpenAI. They actually released their paper on GPT3 in late May, but only released their broader commercial
API a couple of weeks ago, so we're seeing a lot of excitement and activity around that in particular,
although it's all being called GPT3. So we're going to do one of our explainer episodes. It's a 2x
explainer episode going into what it really is, how it works, why it matters, and broader
implications and questions while teasing apart what's hype, what's real, as is the premise of this show.
But before I introduce our expert, let me just quickly summarize some of the highlights.
So while GPT3 is technically a text predictor, that actually reduces what's possible because, of course,
words and software are simply the encoding of human thought, to borrow a phrase from Chris Dixon,
which means a lot more things are possible.
So we're seeing, and note, these are all cherry-picked examples, believable forum posts, comments,
press releases, poetry, screenplays, articles.
Someone even wrote an entire article, headlined "OpenAI's GPT-3 may be the biggest thing since bitcoin,"
and then revealed midway that he didn't actually write the article, but that GPT3 did.
We're also seeing strategy documents, like for business CEOs,
and advice written entirely in GPT3,
and not just words,
but we're seeing people design using words to write code
for designing websites and other designs.
Someone even built a Figma plugin,
again, all of it showing the transmutability
of thoughts to words to code, to design, and so on.
And then someone made a search engine
that can return answers and URLs in response
to, quote, "ask me anything,"
which is something anyone who's been in the NLP space knows.
I was at PARC when we spun off
Powerset back in the day, and that's always been sort of a holy grail of question answering,
which you know all about, too, having worked in this world, Frank. Now, let me introduce you,
our expert in this episode. Frank Chen has written a lot about AI, including a primer on AI,
deep learning and machine learning, a pulse check on AI, what's working, what's not, a microsite
with resources for how to get started practically and do something with your own product and your own
company, and then reflecting on jobs and humanity and AI working together. You can find all of that
on our website. Frank, to start things off, what's your favorite example of GPT3 so far?
Mine is founding principles for a religion written in GPT3. I'd love to hear your favorite and also
your quick take on why the excitement to start us off before we dig in a bit deeper.
My favorite out of the whole thing is it's doing arithmetic. So if you ask it, what's 23 plus 67,
like just arbitrary two-digit arithmetic? It's doing it. This is a natural language processing
model. And so basically it got trained by feeding it lots and lots of text. And out of that,
it's figuring out, we think, how to do arithmetic, which is very, very surprising. Because you
don't think that exists in texts. The excitement potentially is promising signs of, you know,
progress towards general artificial intelligence. So today, if you want to do very highly accurate
natural language processing, you build a bespoke model. You have your own custom architecture,
you feed it a ton of data. What GPT3 shows is that they train this model once, and then they
throw it a whole bunch of natural language processing tasks like fill in the blank or inference
or translation. And without retraining it at all, they're getting really good results compared to
fine-tuned models.
Before we even go into teasing apart what's hype, what's real, let's first talk
about the it. What is GPT3? So we have two things. One, we have a machine learning model. GPT is actually
an acronym. It stands for generative pre-trained transformer. We'll go through all those in a sec.
But thing one is we have a pre-trained machine learning model that's optimized to do a wide variety
of natural language processing tasks,
like reading a Wikipedia article
and answering questions from it,
or guessing what the ending of a story should be,
or so on and so on.
So we have a machine learning model.
The thing that people are playing with
is an API that allows developers
to essentially ask questions of that model.
So instead of giving you the model
and you program it to do what you want,
they're giving you selective access via the API.
One of the reasons they're doing this is that most people don't have the compute infrastructure to even train the model.
There have been estimates that if you wanted to train the model from scratch, it would cost something like $5 to $10 million of cloud compute time.
That's a big, big model.
And so they don't give out the model.
And then two, the controversy around this thing when they released the first version was they were worried that if they just gave the raw model out, people would do nefarious things with it,
like generate fake news articles that you would just, like, saturation-bomb the web with. And so they're like,
look, we want to be responsible with this thing. And so we'll gate access via API. So then we know
exactly who's using it. And then the API can be a bit of a throttle on what it can and can't do.
Right, while also helping them learn. And just as a reminder, APIs are application programming
interfaces, which we've talked a lot about on the podcast. If people want to learn more, they can go to a16z.com/api
to read all our resources and explainers. There's so much we have on this whole
topic. But the key underlying idea, and this goes to your point about the cost of what it would
take if you were trying to build this from scratch, is APIs give developers and other businesses
superpowers because they lower the barrier to entry in this case for anyone being able to use AI
who doesn't necessarily have a whole in-house research team, et cetera. And so that's one of
the really neat things about the API. But I do want to correct one misconception that folks out there
aren't aware of when it comes to GPT3. What they're describing as GPT3, they're
actually playing with OpenAI's API, which is not just GPT3. Obviously, some of the technical achievements of
GPT3 are in the API, of course, but it's a combination of other things. It's like a set of
technologies that they've released, and it's their first commercial product, in fact. So that's just to get
people a little context on what the it is and isn't there. Let's go ahead and go a level deeper
into explaining what it is. In their paper, they describe it simply as an auto-regressive language
model. Can you share what it is and kind of the category this fits in? Yeah. So the broad category
of things it fits into, it is a neural network or a deep neural network. And architectures basically
talk about the shape of those networks. At the highest level, visualize it as: something
comes in on the left and then I want something to shoot out on the right side. And in between
is a bunch of nodes that are connected to each other. And the way in which those nodes are
connected to each other, and then the connection weights, that's essentially the neural network.
GPT3 is one of those things.
Technically, it's called a transformer architecture.
This is an architecture for neural networks that Google introduced a few years ago.
And it's different than a convolutional neural network, which is great for images.
It's different than a recurrent neural network, which is good for simple language processing.
The way the nodes are connected to each other results in it being able to do essentially computations on large sentences
filled with different words
and doing it concurrently
instead of sequentially.
So RNNs, which were
the former state of the art
on natural language processing,
they're very sequential.
So they'll kind of go through
a sentence a word at a time.
Recurrent, right?
Exactly.
These transformer networks
can basically sort of
consider the entire sentence
in context while it's doing its computations.
One of the things that you
classically have to do
with natural language processing
is you have to disambiguate words.
I went to the
bank. That could mean I went to go withdraw some money, or it could mean I went right up to the edge of the
river, because we have ambiguity in these words. The natural language processing system needs to figure out,
well, which sense of bank did you mean? And you need to know all the other words around that sentence
in order to disambiguate it. And so these transformers consider large chunks of text in trying to
make that decision all at once instead of sequentially. So that's what the transformer architecture does.
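To make the "all at once instead of sequentially" point concrete, here is a minimal sketch of scaled dot-product self-attention, the core operation of a transformer. It is a simplification and not OpenAI's implementation: the word vectors are random stand-ins, the learned query, key, and value projections that a real transformer applies are omitted, and a GPT-style model would additionally mask out later words. The function and variable names are illustrative assumptions.

```python
import math
import random


def softmax(row):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Unprojected scaled dot-product self-attention over a whole sentence.

    Each word's output depends only on the input vectors, never on another
    word's output, so every row could be computed in parallel.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score this word against every word in the sentence, itself included.
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        weights = softmax(scores)
        # The new vector is a weighted mix of all the words' vectors.
        mixed = [
            sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(dim)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


sentence = ["I", "went", "to", "the", "bank"]
random.seed(0)
# Random stand-ins for word embeddings (a real model learns these).
vectors = [[random.gauss(0, 1) for _ in range(4)] for _ in sentence]

outputs, all_weights = self_attention(vectors)
for word, weight in zip(sentence, all_weights[-1]):
    print(f"'bank' attends to {word!r} with weight {weight:.2f}")
```

The last row of weights shows how much the vector for "bank" draws on each of the other words, which is the mechanism a trained model can use to pick the right sense of an ambiguous word.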
And then what OpenAI has been doing is basically training this type of neural network
with the transformer architecture on larger and larger data sets.
Conceptually, think of it as you have it read Wikipedia, and think of that as Generation 1.
Generation 2 is going to have it read Wikipedia and all of the open source textbooks that I can find.
This generation, they trained it on what's called Common Crawl.
It's kind of the same thing that Google uses to search and index the internet.
There's an open source version of that.
Think of it as robots go on to every webpage.
They gather the text.
And now we're using that as the training set for GPT3.
Yeah, something like half a trillion words, I believe.
Yeah, it's a crazy number of words.
And then this thing has two orders of magnitude more parameters than the previous attempts.
It's something like 175 billion parameters, which are,
for the purposes of this conversation, a way of measuring the complexity of a neural network.
Right. GPT2 had 1.5 billion.
And in between GPT2 and 3, Microsoft did one that was 17 billion, right?
So like, there is a bit of an arms race here going on, which is like how big are your neural networks?
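As a quick check on the "two orders of magnitude" remark, the arithmetic can be worked through using only the parameter counts quoted in the conversation (1.5 billion, 17 billion, and 175 billion). The variable names are illustrative.

```python
import math

# Parameter counts as quoted in the conversation.
gpt2_params = 1.5e9
microsoft_params = 17e9
gpt3_params = 175e9

ratio_vs_gpt2 = gpt3_params / gpt2_params
ratio_vs_microsoft = gpt3_params / microsoft_params

# log10 of the ratio gives the number of orders of magnitude.
print(f"GPT-3 vs GPT-2: {ratio_vs_gpt2:.0f}x "
      f"({math.log10(ratio_vs_gpt2):.1f} orders of magnitude)")
print(f"GPT-3 vs the 17B model: {ratio_vs_microsoft:.1f}x")
```

So the jump from GPT-2 to GPT-3 is a little over two orders of magnitude, and roughly one order of magnitude over the 17-billion-parameter model mentioned in between.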
What does it mean?
Because the paper is called "Language Models are Few-Shot Learners."
And I remember this movement in one-shot learning where you can learn on very few examples.
but honestly what you just described to me sounded like almost a trillion examples
when you think about what it's ingesting as an input.
So can you actually explain what few-shot even means in this context?
Yeah. So first, they trained this model on the internet.
Basically, what came in as input on the left side was reams and reams and reams of text,
all the text they could get their hands on, and they cleaned it a little.
And so this is very traditional deep learning.
It is not itself a zero-shot or a few-shot approach.
It's deep learned, which means I have incredible amounts of input text.
What they mean in the context of this paper around zero-shot and few-shot is the model can perform a variety of natural language processing tasks.
So a good example of it is analogies.
King is to queen as water is to what, right?
In the context of this system, what you can do is you could give it an example of that, and they call that one shot, which is, I'm going to give you an example of an analogy that's completely
filled out, and then I want you to fill out more analogies. Another task would be pick the right
ending of a story, and I will give you one example with the correct answer. So I'm just going to
give it to you once. Now, typically what happens when you do traditional neural network learning,
you take an example, you give it to the system, and you tell the system the right answer.
The system uses that right answer to basically readjust the neural net. It's called backpropagation.
And the theory is that as it adjusts the weights inside the neural network, it will get that answer more correct the next time it sees it.
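The weight-adjustment loop described here can be sketched with the smallest possible "network": a single weight. This is an illustrative assumption, not how any production model is trained. With one weight, backpropagation reduces to one gradient step per example, whereas real networks apply the chain rule through many layers. The feature value, target, and learning rate are made up for the example.

```python
def predict(weight, feature):
    # A one-weight "network": the prediction is just weight * feature.
    return weight * feature


def squared_error(prediction, target):
    return (prediction - target) ** 2


def training_step(weight, feature, target, learning_rate=0.05):
    """Show the model the right answer and nudge the weight toward it."""
    prediction = predict(weight, feature)
    # Derivative of (weight * feature - target)^2 with respect to weight.
    gradient = 2 * (prediction - target) * feature
    return weight - learning_rate * gradient


weight = 0.1                 # arbitrary starting weight
feature, target = 2.0, 1.0   # one labeled example (say, 1.0 means "hot dog")

losses = []
for _ in range(5):
    losses.append(squared_error(predict(weight, feature), target))
    weight = training_step(weight, feature, target)

print([round(loss, 4) for loss in losses])
```

Each pass through the loop shows the model the same labeled example, and the error shrinks every time, which is the "it will get that answer more correct the next time it sees it" behavior.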
And so everything up until this point has basically been, if I give you enough examples, I'm going to be able to tell whether that picture has a hot dog in it or not.
I will be able to generalize the features of a hot dog and I will basically deduce hot dogness if you just give me enough pictures and you tell me hot dog or not.
What's going on here is they train this model once, and then they give it one example.
That example doesn't adjust the weights of the model.
It really just primes the system to basically prepare it to answer this type of question.
So you basically tell it, look, I want you to work on fill in the blank.
And I'm going to give you one or a few examples, few-shot, of this, and then we'll go from there,
but those examples that you give it don't adjust the weights of the model.
It's one model to rule them all.
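The priming described above amounts to putting worked examples into the text that is sent to the model. A minimal sketch of assembling such a prompt follows. The `Q:`/`A:` layout, the function name `build_prompt`, and the sample analogies are assumptions for illustration, not a format documented by OpenAI, and no API call is made here.

```python
def build_prompt(task_description, examples, query):
    """Assemble a prompt: worked examples first, then the open question.

    Nothing here touches model weights; the examples only live in the text.
    """
    lines = [task_description, ""]
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
        lines.append("")
    lines.append(f"Q: {query}")
    lines.append("A:")  # left open for the model to complete
    return "\n".join(lines)


query = "king is to queen as man is to"

# Zero-shot: only the task description and the question.
zero_shot = build_prompt("Complete the analogy.", [], query)

# One-shot: a single filled-out example precedes the question.
one_shot = build_prompt(
    "Complete the analogy.",
    [("hot is to cold as up is to", "down")],
    query,
)

print(one_shot)
```

Going from zero-shot to one-shot to few-shot only changes the length of the `examples` list; the model itself is identical in every case.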
And this is kind of how humans learn.
They don't need to see a thousand, 10,000, 100,000 examples of hot dogs before they can start reliably telling whether it is hot dog or not.
It's like how children learn language.
Yeah, exactly.
Babies, before they can say cat and dog, can recognize the difference between cats and dogs, and they didn't see a million of them.
In fact, they can't say the words dog and cat yet.
And so maybe something like this is going on in the brain, which is you have this
sort of general processor, and then it instantly knows how to adapt itself to solve a lot of
different problems, including problems it had never seen before. And so I'm going to go back to my
favorite example of what GPT3 was used for. Like, how in the world did it deduce the rules
for two-digit arithmetic by reading a lot of stuff? And so maybe this is the beginnings of a general
intelligence that can rapidly adapt itself. Now look, I don't want to get ahead of myself. It falls
apart on four-digit arithmetic, and so it's not generally smart yet. But the fact that it got
all of the two-digit addition and subtraction problems right by reading text, like, that's crazy
to me.
The general takeaway is that it does some complicated things really well and some really easy
things really badly, and this is actually true of most AI. The researchers have a huge section
on limitations where, quote, GPT-3 samples can lose coherence over sufficiently long passages,
contradict themselves, and occasionally contain non-sequitur sentences or paragraphs. Now, of course,
as an editor, that made me laugh because that's also true of human writing. So I was like,
okay, this is also true of the writing I've seen and edited. So I don't know who's talking here.
Help me tease apart where we really are in this long arc. I'm having a hard time knowing what's
real, what's not. Like, help me kind of understand what is this thing, really at this moment in time.
So we have the most sophisticated natural language processing pre-trained model of its kind.
The natural language processing community has basically divided the problem of understanding language
into dozens and dozens of sub-tasks.
And task after task after task, GPT3 goes up against the state-of-the-art, the best-performing system.
And basically what the paper does is lay out, okay, here's where GPT-3 is approaching state-of-the-art.
Here's where it's far away from state of the art.
And that's basically all we know: compared to state-of-the-art techniques for solving that particular natural language processing task,
how does it perform?
We're really in the research domain.
Right.
So if you were to ask me, can I build a startup on it?
Can I build the world's best chat bot on it?
Can I build the world's best customer support agent on it?
I was going to ask you that.
Yeah.
I think it's really too early to tell whether you can build any of those things.
The hope is that you could, and long-term, really the hope is having built a model like this and exposed an API,
you could take any Silicon Valley startup that wants to solve a text problem, chatbots, or pre-sales support, or post-sales customer support,
or building a mental health app that talks to you.
All of those things will get dramatically cheaper and faster and easier to build on top of this infrastructure.
If this works, you have this generally smart system that's already been trained,
then you show it a couple examples of problems that you want to solve,
and then it will just solve them with very high accuracy.
All you have to do as a startup or a programmer is to say,
hey, look, I'm going to give you a couple examples of the type of problem that I want solved.
And then that priming is going to be enough for the system to get very accurate results,
and in fact, sometimes better results than if you had built the model
and fed it the data sets yourself.
So that's the hope, but we just don't know yet.
That's a really good reminder because they themselves are like,
this is early days, it's research, there's a lot of work to be done,
but it's also really exciting, as you're saying,
because this is one of the most advanced natural language models we've seen.
So the question I have then on the startup and building side,
what would it take to, what are the kinds of considerations
to make it more practical and scalable?
I mean, for one thing, the size,
you described how the transformer has this ability to sort of
comprehend so much at once without doing it in kind of this RNN model.
But the trade-off of that is that it's so slow, or may not be able to fit on a GPU.
So I'd love to have a quick take from you on what are the things that need to happen
to make something like this more usable, etc.
I think what's going to need to happen is that the OpenAI product team
is going to have conversations with dozens and dozens of startups that are using their
technology.
And then they successively refine the API and improve the performance
and set up the security rules and all of that,
so that it becomes something as easy to use as, say, Stripe or Twilio.
Stripe or Twilio were very straightforward:
send a text message or process a payment.
This is a lot more amorphous, which is, hey, I can do SAT analogies.
How's that relevant for my startup?
Well, there's a bit of a gap there, right?
You have a startup that's like, hey, I need my document summarized,
or I need you to go through all of the complaints we've ever gotten
and give me product insight for product managers.
And so there's basically a divide there.
It needs to be closed over time.
Right. So what does this mean with the data world? Because one really interesting thing to me is on one hand, APIs give you superpowers kind of democratizing things. On the other hand, it kind of makes things a bit of a race to the bottom then because then you have to differentiate on kind of private, proprietary, these other elements. So do you have thoughts on what that means?
Yeah. I mean, the hope for something like a GPT3 is that it's going to dramatically reduce the data gathering, cleansing, cleaning process, and frankly building the data model as well, your machine learning model.
So let me try to put it in economic terms.
Let's say we put $10 million into a series A company,
and then $5 million of it goes to getting data and cleaning it
and hiring your machine learning people,
and then renting a bunch of GPUs in Amazon or Google or Microsoft,
wherever you do your compute.
The hope is that if you could stand on the shoulders of something like GPT3,
and it'll be a future version of it,
you would reduce those costs from $5 million to $100,000.
You're basically making API calls,
and the way you program, quote unquote, this thing is you just show it a bunch of examples
that are relevant to the problem that you're trying to solve.
So you show it texts where you had a suicide risk.
And you don't need to show it a bunch because it's pre-trained.
And you show it a new text that it hasn't seen before.
And you ask it, what is the risk of suicide in this text exchange?
The hope is that we can dramatically reduce the costs of gathering that data and building
the machine learning models,
but it's really too early to tell whether that's going to be practical or not.
So we know what it means for startups, but how do the incumbents respond in that kind of a world?
It seems almost inevitable that the big players, there might be an AWS potentially, right,
that could make this a given in their services.
Like this kind of bigger question around this business model of AI as a service.
Yeah, so the first thing I'll say is this is OpenAI's first commercial product,
which is interesting, right?
Recall that OpenAI started as a research institution.
So we'll sort of see what the pricing is.
If this works, the scenario that I described earlier, which is dramatically reducing the time it takes to build a machine learning inside product, then all of the public cloud providers and other startups will offer competing products because they don't want to let OpenAI just take all of the sort of text understanding ability of the internet, right?
Google Cloud and Microsoft and Amazon and Baidu and Tencent,
they're all going to say, hey, look, I can do that too.
Build your application on me.
Now, I will say that because of the large costs of training the model,
so I mentioned estimates ranging from $5 to $10 million to train this thing once,
and obviously they didn't train it once to get to where they were.
They trained it multiple times as they did the research process.
And so this is not going to be for the faint of heart.
It's going to come on the back of a lot of money
with very skilled scientists using enormous infrastructure.
But to the extent that this product works,
then you're going to have very healthy competition
among all of the incumbents.
You might even have new players
who figure out a different angle on it.
You know, it's really fascinating watching the people who have access.
And basically the recurring theme is that it's not like plug and play.
It's obviously not built and ready for that yet.
The prompt and the sampling hyperparameters matter a lot;
priming is an art, not a science. So I'm curious for where you think the knowledge value is going to go
in the future. What are the sort of the data scientists of the future going to look like for people
who have to work with something like this? Now, granted, the models are going to evolve, the
API will evolve, the product will evolve. But what are the skills that people need to have in
order to really do well in this world coming ahead? It's really too early to tell, but it is a
fundamentally different art of programming. So if you think of programming to date, it's basically
I learned Python and I learned to be efficient with memory and I learned to write clever algorithms that can sort things fast.
That's a well-understood art: thousands of classes, millions of people know how to do that.
If this approach works, basically, there is this massive pre-trained natural language model.
And the programming technique is basically I show you a couple examples of the tasks that I want you to perform.
It'll be about what examples do I show you, and in what form do I show you?
And do I show you the outliers or do I show you some normal ones, right?
And so if this approach works, it'll all be about how do you prime the model to get the best accuracy for the real world problems you actually want your product to solve?
Programming becomes what examples do I show you as opposed to how do I allocate memory and write efficient search algorithms?
It's a very different thing.
Vitalik Buterin, the inventor of Ethereum, described this when he was observing some of this buzz around GPT3:
quote, I can easily see many jobs in the next 10 to 20 years changing their workflow to human describes, AI builds, human debugs.
There's a lot of speculation about how this might affect jobs.
It can displace customer support, sales support, data scientists, legal assistants, and other jobs like that are at risk.
But do you have thoughts on the labor and jobs side of this, like just sort of the broader questions and concerns here?
The way I think about this generally is informed a lot by Erik Brynjolfsson and other people.
So if you think about a job as a set of tasks, some tasks will get automated, and then some
tasks will be stubbornly hard to automate, and then there will be new tasks, and so think of
jobs as sort of an ever-changing bundle of tasks, some of which are performed by humans today,
some of which get automated, and then there are new tasks.
And so what Vitalik describes, if this AI stuff works,
being able to prime the AI system with the right examples and then being able to debug it at the end, those are two new tasks.
No human on the planet gets paid to do that outside of AI researchers today.
But that could be mainstream knowledge work in 10 years, which is you pick good examples and then you debug it at the end.
So you have these brand new tasks that are generating economic value and people get paid for them that didn't exist before.
I find it very fascinating what you said, by the way, because what it also means to me is it becomes more inclusive for more people to enter
worlds that might have been previously closed off to a certain class or type of programmers or
people who have certain technical skills because let's say you're very good at describing things
and it's more of an art than a science and you're very good at sort of fiddling with and hacking
at things, you might be better at tuning something than someone who went through, like, years and years of elite
PhD education.
I think the machine learning algorithms will
invite more people who would otherwise be discouraged from pursuing
careers they wouldn't have naturally risen to the top of. So I think you're right.
What do you make of the concern? There were concerns that the answers GPT3 gave, that it
predicted, were rife with racism or stereotypes. What do you make of the data issues around that?
Okay, we're going to feed it every piece of text on the internet, and then we're going to
ask it to make generalizations. What could possibly go wrong? A lot could possibly go wrong.
If you look at it, the heart of this system is basically, I'm trying to guess the next word,
and the way I make my guess is I go look at all the documents that have been written ever
and I ask what words are most likely to have occurred in those documents, right?
You're going to end up with culturally offensive stereotypes.
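The "guess the next word from what has been written" idea can be illustrated with a toy count-based predictor. To be clear about the assumption: GPT-3 is a neural network, not a lookup table of word counts, so this is only a sketch of the general principle that a next-word predictor echoes whatever pattern dominates its training text. The three-sentence corpus and the function names are made up.

```python
from collections import Counter, defaultdict


def train_bigram_counts(corpus):
    """Count which word follows which across every sentence in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def guess_next_word(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# A tiny, deliberately skewed corpus (hypothetical).
corpus = [
    "the bank approved the loan",
    "the bank approved the mortgage",
    "the bank denied the loan",
]

counts = train_bigram_counts(corpus)
print(guess_next_word(counts, "bank"))    # the majority pattern wins
print(guess_next_word(counts, "unseen"))  # nothing to go on
```

Because "approved" follows "bank" twice and "denied" only once, the predictor always says "approved". Swap in text that carries a stereotype and the same mechanism reproduces the stereotype.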
And so we need to figure out how do we put the safety rails, how do we architect the APIs?
I'm glad the OpenAI researchers and the community around them are being very careful about this
because we obviously have to.
How do we basically teach it the social norms we want it to emit as opposed to the
ones that it found by reading texts.
Another whole philosophical sidebar, but really important,
is if you think about the internet as a sum total of human knowledge and other things that reflect
many of the realities in the world, which are atrocious and awful in many cases, the flip side
of it is it's a lot harder to change the real world and people and behavior and society and
systems, but probably a hell of a lot easier to change a technical system and be able to do certain
things. So to me, what's implicit in what you said is that there's actually a solution,
I don't mean to be solutionistic, but that's within the technology that you don't necessarily get from IRL in real life.
Yeah, that's exactly right.
And if we're in algorithm land, so to speak, where we are, right, with GPT3 and its descendants, let's say GPT17 gave you a text document, right?
It wrote a text document for you.
You could take that document and put it through whatever filter that you wanted, right, to filter out sexism or racism,
and that layer could be inspectable and tunable to everybody.
You didn't know how GPT17 came up with its recommendations,
but you have this safety net at the end,
which is you can filter out things that you don't want.
So you have this second step that you can actually put into your system.
So you don't have to depend just on the first thing.
You can catch that at a subsequent stage.
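The two-stage idea, an opaque generator followed by an inspectable filter, can be sketched as below. This is a deliberately crude assumption: a keyword blocklist stands in for whatever filter one would actually choose (in practice likely a trained classifier), and the blocked terms are neutral placeholders. The names `BLOCKED_TERMS`, `passes_filter`, and `filter_outputs` are hypothetical.

```python
import re

# Placeholder terms standing in for whatever the filter's owner wants to block.
BLOCKED_TERMS = {"slur_a", "slur_b"}


def passes_filter(text, blocked_terms=BLOCKED_TERMS):
    """Return True if none of the blocked terms appear as whole words."""
    words = set(re.findall(r"[a-z0-9_']+", text.lower()))
    return words.isdisjoint(blocked_terms)


def filter_outputs(candidates, blocked_terms=BLOCKED_TERMS):
    """Second stage: keep only the generated texts that pass the filter."""
    return [text for text in candidates if passes_filter(text, blocked_terms)]


# Stand-ins for documents produced by the first-stage model.
candidates = [
    "A friendly reply to the customer.",
    "A reply that contains slur_a somewhere.",
]

print(filter_outputs(candidates))
```

Unlike the generator, this layer is short enough to read, audit, and tune, which is the "inspectable and tunable" property the conversation points to.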
Right. You can have sort of a system of checks and balances.
So a broad meta question.
One of my favorite posts was from Kevin Lacker,
and he basically gave GPT3 a Turing test.
And he tested it on these questions of common sense, obscure trivia, logic.
And one of the things he observed is that, quote,
we need to ask it questions that no normal human would ever talk about.
And so he said if you're ever a judge in a Turing test,
make sure you ask some nonsense questions
and see if the interviewee responds the way a human would.
Because the system doesn't know how to say I don't know.
And this goes at this question of what does a Turing test tell us.
And there's been a lot of work, as you know, over the years,
about the modernization of the Turing test,
like in 2016, Gary Marcus, our friend Gary Marcus,
Francesca Rossi, and Manuela Veloso published an article,
"Beyond the Turing Test," in AI Magazine.
Barbara Grosz of Harvard wrote a piece
called "What Question Would Turing Pose Today?" in AI Magazine in 2012.
And she basically starts by saying that in 1950,
when Turing proposed to replace the question,
can machines think with the question,
are there computers which would do well in the imitation game,
at the time, computer science wasn't a field of study.
You know, Claude Shannon's theory of information was just getting started.
Psychology was only just starting to go beyond behaviorism.
And so what would Turing ask today?
He'd probably propose a very different test.
And so the question I really wanted to ask you is,
how do we know if the thing is measuring what it's supposed to measure
or answering what it's supposed to answer
or that it's getting smarter, I guess?
This is more a philosophical question
than an engineering question.
So why don't I say what we know,
and then I'll wildly speculate on the other stuff?
That's great. That's life and science.
Go for it.
So basically, if you read the paper,
you'll see that it compares GPT3's performance
against various other state-of-the-art techniques
on a wide variety of natural language processing tasks.
So for instance, if you're asking it
to translate from English to French,
there's this thing called the BLEU score.
The higher the BLEU score, the better your translation.
And so every test has its measure.
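To make the idea of a translation score concrete, here is a simplified, sentence-level sketch of a BLEU-style calculation. It is an illustration under assumptions, not the exact metric used in the GPT3 paper: standard BLEU is computed over a whole corpus, typically with n-grams up to length 4 and possibly multiple references, whereas this sketch uses one reference, n-grams up to length 2, and no smoothing. The function name `simple_bleu` is hypothetical.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def simple_bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    multiplied by a brevity penalty. Returns a value between 0 and 1."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Penalize candidates that are shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)

score = simple_bleu("the cat sat on the mat", "the cat is on the mat")
```

A candidate identical to the reference scores 1.0 and one sharing no words scores 0.0, which matches the "higher is better" reading given above.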
And so what we do know is we can compare GPT3's performance versus other algorithms, other systems.
What we don't know is how much does it really understand?
So what do we really take away from the fact that it aced two-digit arithmetic?
Like what does that mean?
What does it understand of the world?
Does it get math?
Let's say you had a system that was 100% accurate on every two-digit arithmetic problem that you ever gave it.
It's perfect at math, but it doesn't get it.
It doesn't know that these numbers represent things in the real world.
But what does that mean to claim that it doesn't get it?
That's a philosophical question.
Right.
It's philosophical because the question then becomes, does it even matter if it comes to applying things practically?
Because I think about this from the world of education.
You know, there's a big focus on metacognition and the awareness of knowing what you know and don't know.
But at a certain point, if the kid is doing well on the test and the test is applicable to the world and they can basically survive and do well, does it even matter if they really understood what arithmetic really means,
as long as they can solve the problem when they go to the store:
I give you a dollar, I get five cents change back? You know what I mean?
That's exactly right. And if you generalize that out to other tasks that humans solve in the real world,
imagine you just got good at 100 and then 1,000 and then 10,000 of these tasks that you had never seen before.
Let's say descendants of GPT3 got that good at a wide variety of language tasks.
What does it mean to insist, but it doesn't get the world? It doesn't get language.
That's fantastic. I'd love to get
your perspective on how we think about this broader arc of innovation that's playing out here.
Daniel Gross called GPT3 screenshots the TikTok videos of nerds, and there's something to that.
It's kind of created this inherent virality.
So I'm curious for your take on that.
On the one hand, some of the most important technologies start out looking like a toy.
Chris Dixon paraphrased a really important idea from Clayton Christensen about how disruptive innovation happens.
But a lot of the people who are researchers really emphasize this is not a toy.
This is a big deal.
There are a lot of TikTok-ish-like videos that are coming out of the whole playground,
which is basically a place where you can try out the model.
And on the one hand, people are saying it's a toy because they're in the sandbox
and they're basically having fun feeding it prompts.
Some of those examples are actually really good.
And some of those are like comically bad.
Right.
So it feels toy-like.
The tantalizing prospect for this thing is that we have the beginnings of an approach
to general intelligence that we haven't seen this much progress on before.
To date, if you wanted to build a specific system for a specific natural language
processing task, you could do that: custom architecture, lots of training data, and lots of hand-tuning,
and lots of, like, PhD time. The tantalizing thing about GPT3 is it didn't have an end use
case in mind that it was going to be optimal for, but it turns out to be really good at
a lot of them, which kind of is how people are. You're not tuned to like learn polka or double-entry
bookkeeping or learn how to audio edit a podcast. You didn't come out of the womb with that. But your
brain is this general purpose computer that can figure out how to get very, very good at that
with enough practice and enough intentionality. Well, it's really great that you use the word tantalizing
because if you remember the Greek myth root behind it, Tantalus was destined to constantly get this
like tempting fruit dangling above him as punishment, and it was so close yet so out of reach
at the same time. So bottom line it for me, Frank. It's tantalizing, right? Now look, there's a limit to
how big these models can get and how effective the APIs will be once we sort of, you know,
unleash them to regular programmers. But it is surprising that it is so good across a broad
range of tasks, including ones that the original designers didn't contemplate. So
maybe this is the path
to artificial general intelligence.
Now look, it's way too early to tell.
So I'm not saying that it is.
I'm just saying it's very robust
across a lot of very different tasks
and that's surprising
and kind of exciting.
Thank you so much for joining this episode
of 16 Minutes, Frank.
Awesome. Thank you, Sonal, for having me.
