The a16z Show - GPT-3: What's Hype, What's Real on the Latest in AI

Episode Date: July 30, 2020

In this episode (cross posted from our 16 Minutes show feed) we cover all the buzz around GPT-3, the pre-trained machine learning model from OpenAI that’s optimized to do a variety of natural-language processing tasks. It’s a commercial product, built on research; so what does this mean for both startups AND incumbents… and the future of “AI as a service”? And given that we’re seeing all kinds of (cherrypicked!) examples of output from OpenAI’s beta API being shared, how do we know how good it really is or isn’t? How do we know the difference between “looks like” a toy and “is” a toy when it comes to new innovations? And where are we, really, in terms of natural language processing and progress towards artificial general intelligence? Is it intelligent, does that matter, and how do we know (if not with a Turing Test)? Finally, what are the broader questions, considerations, and implications for jobs and more? Frank Chen explains what “it” actually is and isn’t and more in conversation with host Sonal Chokshi. The two help tease apart what’s hype/ what’s real here… as is the theme of 16 Minutes.

Transcript
Starting point is 00:00:00 Hi, everyone. Welcome to this week's episode of 16 Minutes. I'm Sonal, your host, and this is our show where we talk about the headlines, what's in the news, and where we are on the long arc of tech trends. We're back from our holiday break, and so this week we're covering all the recent and ongoing buzz around the topic of GPT-3, the natural language processing-based text predictor from the San Francisco research and development company OpenAI. They actually released their paper on GPT-3 in late May, but only released their broader commercial API a couple of weeks ago, so we're seeing a lot of excitement and activity around that in particular, although it's all being called GPT-3. So we're going to do one of our explainer episodes. It's a 2x explainer episode going into what it really is, how it works, why it matters, and broader implications and questions while teasing apart what's hype, what's real, as is the premise of this show. But before I introduce our expert, let me just quickly summarize some of the highlights. So while GPT-3 is technically a text predictor, that actually reduces what's possible because, of course, words and software are simply the encoding of human thought, to borrow a phrase from Chris Dixon,
Starting point is 00:01:05 which means a lot more things are possible. So we're seeing, and note, these are all cherry-picked examples, believable forum posts, comments, press releases, poetry, screenplays, articles. Someone even wrote an entire article, headlined OpenAI's GPT-3 may be the biggest thing since Bitcoin, and then revealed midway that he didn't actually write the article, but that GPT-3 did. We're also seeing strategy documents, like for business CEOs, and advice written entirely in GPT-3, and not just words,
Starting point is 00:01:34 but we're seeing people design using words to write code for designing websites and other designs. Someone even built a Figma plugin, again, all of it showing the transmutability of thoughts to words to code, to design, and so on. And then someone made a search engine that can return answers and URLs in response to, quote, ask me anything,
Starting point is 00:01:55 which, as anyone who's been in the NLP space knows (I was at PARC when we spun off Powerset back in the day), has always been sort of a holy grail of question answering, which you know all about, too, having worked in this world, Frank. Now, let me introduce you, our expert in this episode. Frank Chen has written a lot about AI, including a primer on AI, deep learning and machine learning, a pulse check on AI, what's working, what's not, a microsite with resources for how to get started practically and do something with your own product and your own company, and then reflecting on jobs and humanity and AI working together. You can find all of that
Starting point is 00:02:29 on our website. Frank, to start things off, what's your favorite example of GPT-3 so far? Mine is founding principles for a religion written in GPT-3. I'd love to hear your favorite and also your quick take on why the excitement to start us off before we dig in a bit deeper. My favorite out of the whole thing is it's doing arithmetic. So if you ask it, what's 23 plus 67, like just arbitrary two-digit arithmetic? It's doing it. This is a natural language processing model. And so basically it got trained by feeding it lots and lots of text. And out of that, it's figuring out, we think, how to do arithmetic, which is very, very surprising. Because you don't think that exists in texts. The excitement potentially is promising signs of, you know,
Starting point is 00:03:19 progress towards general artificial intelligence. So today, if you want to do very highly accurate natural language processing, you build a bespoke model. You have your own custom architecture, you feed it a ton of data. What GPT-3 shows is that they train this model once, and then they throw it a whole bunch of natural language processing tasks like fill in the blank or inference or translation. And without retraining it at all, they're getting really good results compared to finely tuned models. Before we even go into teasing apart what's hype, what's real, let's first talk about the it. What is GPT-3? So we have two things. One, we have a machine learning model. GPT is actually an acronym. It stands for generative pre-trained transformer. We'll go through all those in a sec.
Starting point is 00:04:14 But thing one is we have a pre-trained machine learning model that's optimized to do a wide variety of natural language processing tasks, like reading a Wikipedia article and answering questions from it, or guessing what the ending of a story should be, or so on and so on. So we have a machine learning model. The thing that people are playing with
Starting point is 00:04:35 is an API that allows developers to essentially ask questions of that model. So instead of giving you the model and you program it to do what you want, they're giving you selective access via the API. One of the reasons they're doing this is that most people don't have the compute infrastructure to even train the model. There's been estimates that if you wanted to train the model from scratch, it would cost something like $5 to $10 million of cloud compute time. That's a big, big model.
Starting point is 00:05:09 And so they don't give out the model. And then two, the controversy around this thing when they released the first version was they were worried that if they just gave the raw model out, people would do nefarious things with it, like generate fake news articles that you would just, like, saturation-bomb the web with. And so they're like, look, we want to be responsible with this thing. And so we'll gate access via API. So then we know exactly who's using it. And then the API can be a bit of a throttle on what it can and can't do. Right. While also helping them learn. And just as a reminder, APIs are application programming interfaces, which we've talked a lot about on the podcast. If people want to learn more, they can go to a16z.com slash API to read all our resources, explainers. There's so much we have on this whole
Starting point is 00:05:50 topic. But the key underlying idea, and this goes to your point about the cost of what it would take if you were trying to build this from scratch, is APIs give developers and other businesses superpowers because they lower the barrier to entry in this case for anyone being able to use AI who doesn't necessarily have a whole in-house research team, et cetera. And so that's one of the really neat things about the API. But I do want to correct one misconception that folks out there may not be aware of when it comes to GPT-3. What they're describing as GPT-3, they're actually playing with OpenAI's API, which is not just GPT-3. Obviously, some of the technical achievements of GPT-3 are in the API, of course, but it's a combination of other things. It's like a set of
Starting point is 00:06:33 technologies that they've released, and it's their first commercial product, in fact. So that's just to get people a little context on what the it is and isn't there. Let's go ahead and go a level deeper into explaining what it is. In their paper, they describe it simply as an auto-regressive language model. Can you share what it is and kind of the category this fits in? Yeah. So the broad category of things it fits into, it is a neural network or a deep neural network. And architectures basically talk about the shape of those networks. At the highest level, visualize it as something comes in on the left and then I want something to shoot out on the right side. And in between is a bunch of nodes that are connected to each other. And the way in which those nodes are
Starting point is 00:07:13 connected to each other and then the connection weights, that's essentially the neural network. GPT-3 is one of those things. Technically, it's called a transformer architecture. This is an architecture for neural networks that Google introduced a few years ago. And it's different than a convolutional neural network, which is great for images. It's different than a recurrent neural network, which is good for simple language processing. The way the nodes are connected to each other results in it being able to do essentially computations on large sentences filled with different words
Starting point is 00:07:48 and doing it concurrently instead of sequentially. So RNNs, which were the former state of the art on natural language processing, they're very sequential. So they'll kind of go through a sentence a word at a time.
Starting point is 00:08:01 Recurrent, right? Exactly. These transformer networks can basically sort of consider the entire sentence in context while it's doing its computations. One of the things that you classically have to do
Starting point is 00:08:12 with natural language processing is you have to disambiguate words. I went to the bank. That could mean I went to go withdraw some money, or it could mean I went right up to the edge of the river, because we have ambiguity in these words. The natural language processing system needs to figure out, well, which sense of bank did you mean? And you need to know all the other words around that sentence in order to disambiguate it. And so these transformers consider large chunks of text in trying to make that decision all at once instead of sequentially. So that's what the transformer architecture does.
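The "all at once" idea can be sketched in a few lines. This is a minimal illustration only, assuming made-up two-number vectors for three words and leaving out the learned projections and multiple heads a real transformer uses; the function names here are hypothetical and not from OpenAI's code. Each position's output is a weighted blend of every position in the sentence, and no output depends on another output, which is why the positions can be computed concurrently rather than one word at a time.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def self_attention(vectors):
    """For each position, blend all positions' vectors, weighted by similarity."""
    dim = len(vectors[0])
    outputs = []
    for query in vectors:
        # Score this position against every position in the sentence.
        scores = [dot(query, key) / math.sqrt(dim) for key in vectors]
        weights = softmax(scores)
        blended = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
        outputs.append(blended)
    return outputs

# Toy vectors standing in for three words (the numbers are invented for illustration).
sentence = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
contextual = self_attention(sentence)
```

A recurrent network would instead carry a running state from the first word to the last, so the second word could not be processed until the first was done.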
Starting point is 00:08:45 And then what OpenAI has been doing is basically training this type of neural network with the transformer architecture on larger and larger data sets. Conceptually, think of it as you have it read Wikipedia, and think of that as Generation 1. Generation 2 is going to have it read Wikipedia and all of the open source textbooks that I can find. This generation, they trained it on what's called Common Crawl. It's kind of the same thing that Google uses to search and index the internet. There's an open source version of that. Think of it as robots go on to every webpage.
Starting point is 00:09:22 They gather the text. And now we're using that as the training set for GPT-3. Yeah, something like half a trillion words, I believe. Yeah, it's a crazy number of words. And then this thing has two orders of magnitude more than the previous attempts. It's something like 175 billion parameters, which are, for the purposes of this conversation, a way of measuring the complexity of a neural network. Right. GPT-2 had 1.5 billion.
Starting point is 00:09:48 And in between GPT-2 and 3, Microsoft did one that was 17 billion, right? So like, there is a bit of an arms race here going on, which is like how big are your neural networks? What does it mean? Because the paper is called Language Models are Few-Shot Learners. And I remember this movement in one-shot learning where you can learn on very few examples. But honestly what you just described to me sounded like almost a trillion examples when you think about what it's ingesting as an input. So can you actually explain what few-shot even means in this context?
Starting point is 00:10:18 Yeah. So first, they trained this model on the internet. Basically, what came in as input on the left side was reams and reams and reams of text, all the text they could get their hands on, and they cleaned it a little. And so this is very traditional deep learning. It is not itself a zero-shot or a few-shot approach. It's deep learned, which means I have incredible amounts of input text. What they mean in the context of this paper around zero-shot and few-shot is the model can perform a variety of natural language processing tasks. So a good example of it is analogies.
Starting point is 00:10:55 King is to queen as water is to what, right? In the context of this system, what you can do is you could give it an example of that, and they call that one shot, which is, I'm going to give you an example of an analogy that's completely filled out, and then I want you to fill out more analogies. Another task would be pick the right ending of a story, and I will give you one example with the correct answer. So I'm just going to give it to you once. Now, typically what happens when you do traditional neural network learning, you take an example, you give it to the system, and you tell the system the right answer. The system uses that right answer to basically readjust the neural net. It's called backpropagation. And the theory is that as it adjusts the weights inside the neural network, it will get that answer more correct the next time it sees it.
Starting point is 00:11:44 And so everything up until this point has basically been, if I give you enough examples, I'm going to be able to tell whether that picture has a hot dog in it or not. I will be able to generalize the features of a hot dog and I will basically deduce hot dogness if you just give me enough pictures and you tell me hot dog or not. What's going on here is they train this model once, and then they give it one example. That example doesn't adjust the weights of the model. It really just primes the system to basically prepare it to answer this type of question. So you basically tell it, look, I want you to work on fill in the blank. And I'm going to give you one or a few examples, few-shot, of this, and then we'll go from there, but those examples that you give it don't adjust the weights of the model.
Starting point is 00:12:30 It's one model to rule them all. And this is kind of how humans learn. They don't need to see a thousand, 10,000, 100,000 examples of hot dogs before they can start reliably telling whether it is hot dog or not. It's like how children learn language. Yeah, exactly. Babies before they can say cat and dog can recognize the difference between cats and dogs, and they didn't see a million of them. In fact, they can't say the words dog and cat yet. And so maybe something like this is going on in the brain, which is you have this
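As a quick check on the "two orders of magnitude" remark, the two parameter counts quoted here can simply be divided. This is plain arithmetic on the figures as spoken, not a claim about exact model sizes; the variable names are illustrative.

```python
import math

gpt3_params = 175e9   # "something like 175 billion parameters"
gpt2_params = 1.5e9   # "1.5 billion"

ratio = gpt3_params / gpt2_params
orders_of_magnitude = math.log10(ratio)
print(f"About {ratio:.0f}x larger, roughly {orders_of_magnitude:.2f} orders of magnitude")
```

The ratio comes out a little over one hundred, so "two orders of magnitude" is a fair rounding.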
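The weight-adjusting loop described above can be sketched with the smallest possible "network": a single weight. This is an assumption-laden toy, with a made-up input, target, and learning rate; in a real deep network, backpropagation is the procedure that computes this same kind of gradient for every weight across many layers.

```python
def train_step(weight, x, target, learning_rate=0.1):
    """One supervised update for the model prediction = weight * x."""
    prediction = weight * x
    error = prediction - target
    # Gradient of the squared error (error ** 2) with respect to the weight.
    gradient = 2 * error * x
    return weight - learning_rate * gradient

weight = 0.0
x, target = 1.0, 3.0   # invented example: the "right answer" for input 1.0 is 3.0
losses = []
for _ in range(20):
    losses.append((weight * x - target) ** 2)   # how wrong the model is before the update
    weight = train_step(weight, x, target)
```

Each pass nudges the weight toward the value that gives the right answer, so the recorded loss shrinks from step to step.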
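Priming, as described here, amounts to writing the examples into the text that is sent to the model. A minimal sketch follows, assuming a simple "Q:/A:" layout; the function name and the layout are hypothetical choices for illustration, not OpenAI's documented format, and no model or API is called.

```python
def build_few_shot_prompt(task_description, examples, query):
    """Assemble a text prompt: instruction, worked examples, then the open question."""
    lines = [task_description, ""]
    for question, answer in examples:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
        lines.append("")
    lines.append(f"Q: {query}")
    lines.append("A:")   # left blank for the model to complete
    return "\n".join(lines)

# One worked example makes this a "one-shot" prompt; adding more makes it "few-shot".
prompt = build_few_shot_prompt(
    "Complete the analogy.",
    [("king is to queen as man is to", "woman")],
    "hot is to cold as up is to",
)
print(prompt)
```

Nothing in this step touches the model's weights. The examples live only in the prompt, which is the contrast with the training loop being drawn in the conversation.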
Starting point is 00:13:00 sort of general processor, and then it instantly knows how to adapt itself to solve a lot of different problems, including problems it had never seen before. And so I'm going to go back to my favorite example of what GPT3 was used for. Like, how in the world did it deduce the rules for two-digit arithmetic by reading a lot of stuff? And so maybe this is the beginnings of a general intelligence that can rapidly adapt itself. Now look, I don't want to get ahead of myself. It falls apart on four-digit arithmetic, and so it's not generally smart yet. But the fact that it got all of the two-digit addition and subtraction problems right by reading text, like, that's crazy to me. The general takeaway is that it does some complicated things really well and some really easy
Starting point is 00:13:48 things really badly, and this is actually true of most AI. The researchers have a huge section on limitations where, quote, GPT-3 samples can lose coherence over sufficiently long passages, contradict themselves, and occasionally contain non-sequitur sentences or paragraphs. Now, of course, as an editor, that made me laugh because that's also true of human writing. So I was like, okay, this is also true of the writing I've seen and edited. So I don't know who's talking here. Help me tease apart where we really are in this long arc. I'm having a hard time knowing what's real, what's not. Like, help me kind of understand what is this thing, really, at this moment in time. So we have the most sophisticated natural language processing pre-trained model of its kind.
Starting point is 00:14:29 The natural language processing community has basically divided the problem of understanding language into dozens and dozens of sub-tasks. And task after task after task, GPT-3 goes up against the state-of-the-art, the best-performing system. And basically what the paper does is lay out, okay, here's where GPT-3 is approaching state-of-the-art. Here's where it's far away from state of the art. And that's basically all we know is compared to state of the art techniques for solving that particular natural language processing task. How does it perform? We're really in the research domain.
Starting point is 00:15:07 Right. So if you were to ask me, can I build a startup on it? Can I build the world's best chat bot on it? Can I build the world's best customer support agent on it? I was going to ask you that. Yeah. I think it's really too early to tell whether you can build any of those things. The hope is that you could, and long-term, really the hope is having built a model like this and exposed an API,
Starting point is 00:15:28 you could take any Silicon Valley startup that wants to solve a text problem, chatbots, or pre-sale support, or post-sales customer support, or building a mental health app that talks to you. All of those things will get dramatically cheaper and faster and easier to build on top of this infrastructure. If this works, you have this generally smart system that's already been trained, then you show it a couple examples of problems that you want to solve, and then it will just solve them with very high accuracy. All you have to do as a startup or a programmer is to say, hey, look, I'm going to give you a couple examples of the type of problem that I want solved.
Starting point is 00:16:07 And then that priming is going to be enough for the system to get very accurate results, and in fact, sometimes better results than if you had built the model and fed it the data sets yourself. So that's the hope, but we just don't know yet. That's a really good reminder because they themselves are like, this is early days, it's research, there's a lot of work to be done, but it's also really exciting, as you're saying, because this is one of the most advanced natural language models we've seen.
Starting point is 00:16:31 So the question I have then on the startup and building side, what would it take to, what are the kinds of considerations to make it more practical and scalable? I mean, for one thing, the size, you described how the transformer has this ability to sort of comprehend so much at once without doing it in kind of this RNN model. But the trade-off of that is that it's so slow, or it may not be able to fit on a GPU. So I'd love to have a quick take from you on what are the things that need to happen
Starting point is 00:16:57 to make something like this more usable, etc. I think what's going to need to happen is that the OpenAI product team is going to have conversations with dozens and dozens of startups that are using their technology. And then they successively refine the API and improve the performance and set up the security rules and all of that, so that it becomes something as easy to use as, say, Stripe or Twilio. Stripe or Twilio were very straightforward.
Starting point is 00:17:22 Send a text message or process a payment. This is a lot more amorphous, which is, hey, I can do SAT analogies. How's that relevant for my startup? Well, there's a bit of a gap there, right? You have a startup that's like, hey, I need my document summarized, or I need you to go through all of the complaints we've ever gotten and give me product insight for product managers. And so there's basically a divide between there.
Starting point is 00:17:44 It may be closed over time. Right. So what does this mean for the data world? Because one really interesting thing to me is on one hand, APIs give you superpowers kind of democratizing things. On the other hand, it kind of makes things a bit of a race to the bottom then because then you have to differentiate on kind of private, proprietary, these other elements. So do you have thoughts on what that means? Yeah. I mean, the hope for something like a GPT-3 is that it's going to dramatically reduce the data gathering, cleansing, cleaning process, and frankly building the data model as well, your machine learning model. So let me try to put it in economic terms. Let's say we put $10 million into a series A company, and then $5 million of it goes to getting data and cleaning it and hiring your machine learning people, and then renting a bunch of GPUs in Amazon or Google or Microsoft,
Starting point is 00:18:31 wherever you do your compute. The hope is that if you could stand on the shoulders of something like GPT-3, and it'll be a future version of it, you would reduce those costs from $5 million to $100,000. You're basically making API calls, and the way you program, quote unquote, this thing is you just show it a bunch of examples that are relevant to the problem that you're trying to solve. So you show it texts where you had a suicide risk.
Starting point is 00:18:57 And you don't need to show it a bunch because it's pre-trained. And you show it a new text that it hasn't seen before. And you ask it, what is the risk of suicide in this text exchange? The hope is that we can dramatically reduce the costs of gathering that data and building the machine learning models. but it's really too early to tell whether that's going to be practical or not. So we know what it means for startups, but how do the incumbents respond in that kind of a world? It seems almost inevitable that the big players, there might be an AWS potentially, right,
Starting point is 00:19:27 that could make this a given in their services. Like this kind of bigger question around this business model of AI as a service. Yeah, so the first thing I'll say is this is OpenAI's first commercial product, which is interesting, right? Recall that OpenAI started as a research institution. So we'll sort of see what the pricing is. If this works, the scenario that I described earlier, which is dramatically reduce the time it takes to build a machine learning inside product, then all of the public cloud providers and other startups will offer competing products because they don't want to let OpenAI just take all of the sort of text understanding ability of the internet, right? Google Cloud and Microsoft and Amazon and Baidu and Tencent,
Starting point is 00:20:13 they're all going to say, hey, look, I can do that too. Build your application on me. Now, I will say that because of the large costs of training the model, so I mentioned estimates ranging from $5 to $10 million to train this thing once, and obviously they didn't train it once to get to where they were. They trained it multiple times as they did the research process. And so this is not going to be for the faint of heart. It's going to come on the back of a lot of money
Starting point is 00:20:39 with very skilled scientists using enormous infrastructure. But to the extent that this product works, then you're going to have very healthy competition among all of the incumbents. You might even have new players who figure out a different angle on it. You know, it's really fascinating watching the people who have access. And basically the recurring theme is that it's not like plug and play.
Starting point is 00:21:01 It's obviously not built and ready for that yet. The prompt and the sampling hyperparameters matter a lot. Priming is an art, not a science. So I'm curious for where you think the knowledge value is going to go in the future. What are the sort of the data scientists of the future going to look like for people who have to work with something like this? Now, granted, the models are going to evolve, the API will evolve, the product will evolve. But what are the skills that people need to have in order to really do well in this world coming ahead? It's really too early to tell, but it is a fundamentally different art of programming. So if you think of programming to date, it's basically
Starting point is 00:21:34 I learned Python and I learned to be efficient with memory and I learned to write clever algorithms that can sort things fast. That's well understood art, thousands of classes, millions of people know how to do that. If this approach works, basically, there is this massive pre-trained natural language model. And the programming technique is basically I show you a couple examples of the tasks that I want you to perform. It'll be about what examples do I show you, and in what form do I show you them, and do I show you the outliers or do I show you some normal ones, right? And so if this approach works, it'll all be about how do you prime the model to get the best accuracy for the real world problems you actually want your product to solve? Programming becomes what examples do I show you as opposed to how do I allocate memory and write efficient search algorithms?
Starting point is 00:22:26 It's a very different thing. Vitalik Buterin, the inventor of Ethereum, described this when he was observing some of this buzz around GPT-3, that, quote, I can easily see many jobs in the next 10 to 20 years changing their workflow to human describes, AI builds, human debugs. There's a lot of speculation about how this might affect jobs. It can displace customer support, sales support, data scientists, legal assistants, and other jobs like that are at risk. But do you have thoughts on the labor and jobs side of this, like just sort of the broader questions and concerns here? The way I think about this generally is informed a lot by Erik Brynjolfsson and other people. So if you think about a job as a set of tasks, some tasks will get automated, and then some
Starting point is 00:23:13 tasks will be stubbornly hard to automate, and then there will be new tasks, and so think of jobs as sort of an ever-changing bundle of tasks, some of which are performed by humans today, some of which get automated, and then there are new tasks. And so what Vitalik describes, if this AI stuff works, being able to prime the AI system with the right examples and then being able to debug it at the end, those are two new tasks. No human on the planet gets paid to do that outside of AI researchers today. But that could be mainstream knowledge work in 10 years, which is you pick good examples and then you debug it at the end. So you have these brand new tasks that are generating economic value and people get paid for them that didn't exist before.
Starting point is 00:23:55 I find it very fascinating what you said, by the way, because what it also means to me is it becomes more inclusive for more people to enter worlds that might have been previously closed off to a certain class or type of programmers or people who have certain technical skills, because let's say you're very good at describing things and it's more of an art than a science and you're very good at sort of fiddling with and hacking at things, you might be better at tuning something than someone who went through like years and years of elite PhD education. I think the machine learning algorithms will invite more people who would otherwise be discouraged from pursuing careers they wouldn't have naturally risen to the top of. So I think you're right.
Starting point is 00:24:34 What do you make of the concern? There were concerns that GPT-3, these answers that it gave, that it predicted, were rife with racism or stereotypes. What do you make of the data issues around that? Okay, we're going to feed it every piece of text on the internet, and then we're going to ask it to make generalizations. What could possibly go wrong? A lot could possibly go wrong. If you look at it, the heart of this system is basically, I'm trying to guess the next word, and the way I make my guess is I go look at all the documents that have been written ever and I ask what words are most likely to have occurred in those documents, right? You're going to end up with culturally offensive stereotypes.
Starting point is 00:25:13 And so we need to figure out how do we put the safety rails, how do we erect the APIs? I'm glad the OpenAI researchers and the community around them are being very careful about this because we obviously have to. How do we basically teach it the social norms we want it to emit as opposed to the ones that it found by reading texts. Another whole philosophical sidebar, but really important, is if you think about the internet as a sum total of human knowledge and other things that reflect many of the realities in the world, which are atrocious and awful in many cases, the flip side of it is it's a lot harder to change the real world and people and behavior and society and
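The "guess the next word from what was written" idea can be illustrated with a deliberately tiny stand-in: a table of which word followed which in a training text. This is a large simplification, since GPT-3 is a neural network rather than a count table, and the corpus and function names below are invented for the sketch. It does show the point being made: the guess can only echo whatever is most common in the text it was given.

```python
from collections import Counter, defaultdict

def train_bigram_counts(text):
    """Count, for every word, which words followed it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def guess_next_word(followers, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_counts(corpus)
best_after_the = guess_next_word(model, "the")   # "cat" follows "the" most often here
```

If the training text pairs certain words together often, for good reasons or bad, those pairings are what comes back out.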
Starting point is 00:25:48 systems, but probably a hell of a lot easier to change a technical system and be able to do certain things. So to me, what's implicit in what you said is that there's actually a solution, I don't mean to be solutionistic, but that's within the technology that you don't necessarily get from IRL in real life. Yeah, that's exactly right. And if it were in algorithm land, so to speak, where we are, right, GPT3 and its descendants, let's say GPT17 gave you a text document, right? It wrote a text document for you. You could take that document and put it through whatever filter that you wanted, right, to filter out sexism or racism. and that layer could be inspectable and tunable to everybody.
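The second-stage filter idea can be sketched as a separate, readable layer that sits after the text generator. The sketch below assumes the simplest possible rule, a list of blocked terms, with a placeholder word standing in for anything offensive; a real system would more likely use a trained classifier, and every name here is hypothetical.

```python
def make_output_filter(blocked_terms):
    """Build a filter whose rules are a plain, inspectable list of terms."""
    blocked = {term.lower() for term in blocked_terms}

    def passes(text):
        # Normalize each word: strip surrounding punctuation and lowercase it.
        words = {word.strip(".,!?;:\"'").lower() for word in text.split()}
        return blocked.isdisjoint(words)

    return passes

passes_filter = make_output_filter(["badword"])   # placeholder term
candidates = ["This draft is fine.", "This draft has a Badword, sadly."]
approved = [text for text in candidates if passes_filter(text)]
```

Because the rule list is ordinary data, anyone can read it and tune it, even when the model that produced the candidate text is a black box.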
Starting point is 00:26:31 You didn't know how GPT17 came up with its recommendations, but you have this safety net at the end, which is you can filter out things that you don't want. So you have this second step that you can actually put into your system. Do you don't have to depend just on the first thing? You can catch that at a subsequent stage. Right. You can have sort of a system of checks and balances. So a broad meta question.
Starting point is 00:26:53 One of my favorite posts was from Kevin Lacker, and he basically gave GPT3 a Turing test. And he tested it on these questions of common sense, obscure trivia, logic. And one of the things he observed is that, quote, we need to ask it questions that no normal human would ever talk about. And so he said if you're ever a judge in a Turing test, make sure you ask some nonsense questions and see if the interviewer responds the way a human would.
Starting point is 00:27:16 Because the system doesn't know how to say I don't know. And this goes at this question of what does a Turing test tell us. And there's been a lot of work, as you know, over the years, about the modernization of the Turing test. Like in 2016, Gary Marcus, our friend Gary Marcus, Francesca Rossi, and Manuela Veloso published an article, "Beyond the Turing Test," in AI Magazine. Barbara Grosz of Harvard wrote a piece
Starting point is 00:27:38 called "What Question Would Turing Pose Today?" in AI Magazine in 2012. And she basically starts by saying that in 1950, when Turing proposed to replace the question "Can machines think?" with the question "Are there computers which would do well in the imitation game?", at the time, computer science wasn't a field of study. You know, Claude Shannon's Theory of Information was just getting started. Psychology was just only starting to go beyond behavior.
Starting point is 00:28:02 And so what would Turing ask today? He'd probably propose a very different test. And so the question I really wanted to ask you is, how do we know if the thing is measuring what it's supposed to measure or answering what it's supposed to answer or that it's getting smarter, I guess? This is more a philosophical question than an engineering question.
Starting point is 00:28:23 So why don't I say what we know, and then I'll wildly speculate on the other stuff? That's great. That's life and science. I'll go for it. So basically, if you read the paper, you'll see that it compares GPT3's performance against various other state-of-the-art techniques on a wide variety of natural language processing tasks.
Starting point is 00:28:42 So for instance, if you're asking it to translate from English to French, there's this thing called the BLEU score. The higher the BLEU score, the better your translation. And so every test has its measure. And so what we do know is we can compare GPT3 performance versus other algorithms, other systems. What we don't know is how much does it really understand? So what do we really take away from the fact that it aced two-digit arithmetic?
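For readers who want a feel for what a BLEU score measures, here is a minimal sketch in Python. It is a simplified, unsmoothed, single-reference version written for illustration: `simple_bleu` and `ngrams` are names made up here, tokenization is plain whitespace splitting, and the result is on a 0 to 1 scale rather than the 0 to 100 scale scores are often reported on. Published results typically come from standard tooling, so treat this as a sketch of the idea (clipped n-gram overlap with a reference translation, plus a penalty for translations that are too short), not as the metric used in the paper's evaluation.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU against a single reference, on a 0..1 scale.

    Geometric mean of clipped n-gram precisions for n = 1..max_n, multiplied
    by a brevity penalty. No smoothing: any n-gram order with zero matches
    makes the whole score 0.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by how often it appears in the reference.
        clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: candidates no longer than the reference are penalized
    # in proportion to how much shorter they are (no penalty at equal length).
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


identical = simple_bleu("the cat is on the mat", "the cat is on the mat")
partial = simple_bleu("the cat sat on the mat", "the cat is on the mat", max_n=2)
disjoint = simple_bleu("completely different words", "the cat is on the mat")
```

An exact match scores 1.0, a translation sharing no words with the reference scores 0.0, and the near miss lands in between. Note that the score only measures overlap with a reference translation, which is exactly why it says nothing about whether the system "understands" the sentence.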
Starting point is 00:29:07 Like what does that mean? What does it understand of the world? Does it get math? Let's say you had a system that was 100% accurate on every two-digit arithmetic problem that you ever gave it. It's perfect at math, but it doesn't get it. It doesn't know that these numbers represent things in the real world. But what does that mean to claim that it doesn't get it? That's a philosophical question.
Starting point is 00:29:27 Right. It's philosophical because the question then becomes, does it even matter if it comes to applying things practically? Because I think about this from the world of education. You know, there's a big focus on metacognition and the awareness of knowing what you know and don't know. But at a certain point, if the kid is doing well on the test and the test is applicable to the world and they can basically survive and do well, does it even matter if they really understood what arithmetic really means, as long as they can solve the problem when you go to the store, that I give you a dollar, I get five cents change back? You know what I mean? That's exactly right. And if you generalize that out to other tasks that humans solve in the real world,
Starting point is 00:30:00 imagine you just got good at 100 and then 1,000 and then 10,000 of these tasks that you had never seen before. Let's say descendants of GPT3 got that good at a wide variety of language tasks. What does it mean to insist, but it doesn't get the world? It doesn't get language. That's fantastic. I'd love to get your perspective on how do we think about this broader arc of innovation that's playing out here. Daniel Gross called GPT3 screenshots the TikTok videos of nerds, and there's something to that. It's kind of created this inherent virality. So I'm curious for your take on that.
Starting point is 00:30:35 On the one hand, some of the most important technologies start out looking like a toy. Chris Dixon paraphrased a really important idea from Clayton Christensen about how disruptive innovation happens. But a lot of the people who are researchers really emphasize this is not a toy. This is a big deal. There are a lot of TikTok-ish-like videos that are coming out of the whole playground, which is basically a place where you can try out the model. And on the one hand, people are saying it's a toy because they're in the sandbox and they're basically having fun feeding it prompts.
Starting point is 00:31:05 Some of those examples are actually really good. And some of those are like comically bad. Right. So it feels toy-like. The tantalizing prospect for this thing is that we have the beginnings of an approach to general intelligence that we haven't seen us make this much progress on before, which is to date, if you wanted to build a specific system for a specific natural language processing task, you could do that. Custom architecture, lots of training data, and lots of hand-tuning,
Starting point is 00:31:36 and lots of, like, PhD time. The tantalizing thing about GPT3 is it didn't have an end use case in mind that it was going to be optimal for, but it turns out to be really good at a lot of them, which kind of is how people are. You're not tuned to like learn polka or double-entry bookkeeping or learn how to audio edit a podcast. You didn't come out of the womb with that. But your brain is this general purpose computer that can figure out how to get very, very good at that with enough practice and enough intentionality. Well, it's really great that you use the word tantalizing because if you remember the Greek myth root behind it, Tantalus was destined to constantly get this like tempting fruit dangling above him as punishment, and it was so close yet so out of reach
Starting point is 00:32:23 at the same time. So bottom line it for me, Frank. It's tantalizing, right? Now look, there's a limit to how big these models can get and how effective the APIs will be once we sort of, you know, unleash them to regular programmers. But it is surprising that it is so good across a broad range of tasks, including ones that the original designers didn't contemplate. So maybe this is the path to artificial general intelligence. Now look, it's way too early to tell. So I'm not saying that it is.
Starting point is 00:32:54 I'm just saying it's very robust across a lot of very different tasks and that's surprising and kind of exciting. Thank you so much for joining this episode of 16 Minutes, Frank. Awesome. Thank you, Sonal, for having me.
