a16z Podcast - a16z Podcast: Revenge of the Algorithms (Over Data)... Go! No?

Starting point is 00:00:00 Hi, everyone. Welcome to the a16z Podcast. I'm Sonal, and I'm here today bringing the band back together again, the AlphaGo band, I guess. I don't know how else to describe us, but we have the head of our deal and research team operation, Frank Chen. We have Steven Sinofsky, a16z board partner. Just to give people some quick context, you don't have to have heard our previous podcast on AlphaGo. But when AlphaGo, the algorithm produced by DeepMind, beat the world champ in Korea in playing Go, which is an ancient Chinese game, we had a podcast where we discussed a lot of the themes and some of the broad things around that. And what we'd like to talk about today is their latest paper, but not just only specifically, but more broadly, what this means for where we actually are, what's hype, what's real in AI or artificial intelligence. So welcome, guys. Welcome. So, like, one thing that, like, you read this paper, and the paper's published in Nature, it's pretty dense. It has 17 authors on it. And so it's quite the force. But the thing that sort of jumps out is the paper, the blogs, everybody, the first thing you read is one step closer to creating general purpose

Starting point is 00:01:04 AI, and immediately, like, my AI winter fears and hype antenna pop up, because, like, everything is one step closer. But, like, you could take very, very, very tiny little steps, or you could overhype them. And we've heard these promises throughout history, and especially around board games. So we solved checkers, and people were like, oh, one step away from general intelligence. Then we solved chess. Same thing. We're one step away from general intelligence. Then we did the, oh, I can find bacteria that causes infection. Therefore, we must be one step away. And I think the fallacy is that because we're doing these things that are considered sort of high cognition, like, smart people play chess, smart people figure it out, then solving one thing that a smart person does must lead us

Starting point is 00:01:50 to the next thing that a smart person does, which will lead us to the next. And that's always been the fallacy, which is it actually hasn't quite generalized. Well, in fact, they don't even really compound all that much. They're all fairly discrete. The one thing that's different now is that they are all building on these new artificial intelligence or machine learning techniques and taking a whole bunch of data, training a model on it, and developing solutions that beat all their algorithmic ones. And that was a big thing about the first AlphaGo, was like, this is data over algorithms.

Starting point is 00:02:25 And all of a sudden, here we are: algorithms over data, or a model over data. But, you know, Frank, you have an interesting way of looking at this because, you know, you play Go and you understand that. You play Go? I didn't know that. Enough, more than me. I only play chess. I don't even know Chinese chess and Go. All right. Yeah. But like, it's not like a generalized problem. You know, it comes with like a whole bunch of constraints and things that make it solvable by algorithm. Yeah. So what are some traits about Go that are not like the real world? All the rules are completely

Starting point is 00:03:00 well known. The state of play is completely well known, right? In the real world, mostly we live in a fog of war situation where we know some things and we don't know other things. It would cost us something to go figure out. In Go, and in particular the representations they chose, the entire board state is known to both players, right? So there's a lot

Starting point is 00:03:17 of things that aren't like the real world at all when we do these board games. Now, having said that, why are people getting breathless yet again around one step closer to AGI? That said, the authors don't claim generalized intelligence. They're talking about being able to apply some of these techniques in other domains. Yeah. So you give credit where credit's due. Let's sort of review what they actually achieved. And they are, like, very impressive achievements. So what they achieved was they now have a Go player that beat all of their other Go players. AlphaGo Zero. Right. Well, they named them by the people they beat. So they had AlphaGo Fan, AlphaGo Lee. This one's AlphaGo Zero. So AlphaGo Zero, their latest one, has beat every iteration, which has beat all of the best human players. So they have the best of Go-playing algorithms in the world.

Starting point is 00:04:00 And the difference in how they got to this one is there was no training data from human games. So all the other ones had been bootstrapped by, let me go watch what the human players do and see if I can mimic that. And that very broadly speaking is the approach to machine learning called supervised learning. And not a trivial amount. I mean, they had 100,000 games that they started from to train it on like 48 of Google's TPUs for days on end.

Starting point is 00:04:29 That's exactly right. So a fleet of machines, a ton of data. This one started from nothing, which is it didn't even really know the rules. It had a loss function, which is reinforcement learning's way of sort of improving the algorithm over time. So let me just give a quick intuitive explanation of reinforcement learning. So it's exactly like the game that you used to play when you were a kid called hotter or colder. Somebody would go hide something.

Starting point is 00:04:52 And then you would try to discover it. Are you getting hotter? I'm getting hotter. Hotter, colder, colder again, right. And then the hotter or colder is the loss function. That is the thing that tells you if you're getting closer or not. It's basically trial and error. That's exactly right.
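The hotter-or-colder intuition can be sketched in a few lines of Python. This is only a toy illustration of trial and error guided by a single feedback signal, not AlphaGo Zero's actual training procedure; the function name, the number line, and all parameters are assumptions made up for the example.

```python
import random


def hotter_or_colder(target, start, max_steps=1000, seed=0):
    """Find a hidden spot on a number line using only hotter/colder feedback."""
    rng = random.Random(seed)
    position = start
    distance = abs(target - position)  # the feedback signal; the seeker never sees `target` directly
    steps = 0
    while distance != 0 and steps < max_steps:
        move = rng.choice([-1, 1])  # trial: try a random step left or right
        new_distance = abs(target - (position + move))
        if new_distance < distance:  # "hotter": keep the move
            position += move
            distance = new_distance
        # "colder": discard the move and try again
        steps += 1
    return position, steps


found, steps_taken = hotter_or_colder(target=7, start=0)
print(found, steps_taken)
```

The seeker knows nothing about where the target is; it only keeps moves that the feedback scores as an improvement, which is the sense in which the feedback "tells you if you're getting closer or not."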

Starting point is 00:05:05 And so that's the fundamental approach in AlphaGo Zero, which is they have this loss function that describes, are you more likely to win the game, having made this move or not? Right. So that's all they had. And then they didn't have human input. They didn't have human games. They basically said, I have this loss function that tells you whether you're more likely to win this game or not.

Starting point is 00:05:22 And then it played itself. One other important thing I think it has, it also knows, it has the codified rules of the game. That is the one human input it actually got. Right. And it makes it a very, very constrained problem because there's a whole bunch of the decision trees like hot or cold.

Starting point is 00:05:39 You could climb up the sofa. You could leave the house. You could go all over the place for days on end. Whereas this just tells you, no, it can only be on the floor. These are the rules. It's not going to be under the sofa. It's a 19 by 19 board.

Starting point is 00:05:50 Very little of real life is actually like you get the rules told to you that way. It's black and white pieces. There are rules. Yeah, you switch turns, like all of that stuff matters a lot. It does. And one more thing to add, by the way, too, because you mentioned the 48 TPUs, it's really significant that they got it down to four in this case. Yeah. That's a huge, like, on the power side, energy, like, you know, it's a simple architecture.

Starting point is 00:06:12 It's a simple architecture. And thinking from the point of view of a software developer, now you're back to one machine. You're not like, oh, my God, I need to go rent this massive cloud with massive storage and massive interconnects, and like I need to figure out how to provision the cluster and manage the cluster. You're back down to one machine. This stuff was pretty impressive. It's, you know, they did four TPUs. It was three days of playing.

Starting point is 00:06:33 Three days total. Like not, like it's very achievable, you know, like on your Amazon credits. That's exactly right. And if you think about sort of the approach that most startups take to artificial intelligence today, they basically take the supervised learning approach. And step one, raise money so that you can go get a data set that's annotated, train your neural network, make recommendations, right? And you could be one or five or ten or fifty million dollars in getting that data set depending on how complicated

Starting point is 00:07:03 the data set is. Right. So the reason people are so excited about this is, look, this had no data aside from rules of the game. It basically played itself. And by day three, it was better than everything that had been trained before. Nearly an order of magnitude. Yeah. It was about an order of magnitude on all of this. It was 48 TPUs versus four. It was three days versus 40. It was 30 million training games versus 4.9 million. So order of magnitude improvement on all of those dimensions.
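As a quick check on the "order of magnitude" claim, the ratios can be worked out directly. The numbers below are the figures as quoted in this conversation (taking the 48 TPUs mentioned earlier), not values verified against the paper, and the dictionary keys and function name are just illustrative.

```python
# Figures as quoted in the conversation: (earlier AlphaGo, AlphaGo Zero).
quoted = {
    "TPUs": (48, 4),
    "days of training": (40, 3),
    "training games (millions)": (30, 4.9),
}


def reduction_factors(figures):
    """Return how many times smaller the second figure is than the first."""
    return {name: before / after for name, (before, after) in figures.items()}


factors = reduction_factors(quoted)
for name, factor in factors.items():
    print(f"{name}: about {factor:.1f}x fewer")
```

On these quoted figures the reductions come out at roughly 12x, 13x, and 6x, so "nearly an order of magnitude" is a fair summary of two of the three dimensions and somewhat generous for the third.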

Starting point is 00:07:27 And so like let's give credit where credit is due. This is a very impressive technical achievement. And then the question that we sort of entered the session with, okay, does this make us more likely to be able to create an artificial general intelligence where the learning algorithms generalize across domains? In other words, can I take the breakthroughs here and make a better pick-and-pack robot for Amazon or make a better health care predictor to discover whether you have cancer? Or sales force forecasting, code generation, or a whole bunch of stuff.

Starting point is 00:07:58 And the interesting thing is that also the techniques here, this is just proof of something that people have been talking about for a long time. Oh, yeah. Reinforcement learning's been around for ages. And so part of what's interesting to me is that reinforcement learning has been around for a long time. Obviously, the modeling has been around for a long time. Self play. All of these things.

Starting point is 00:08:18 And this, like so many times these steps are like somebody new to the domain looking and pulling together a bunch of unrelated things. And just coming up with a very elegant, incredibly elegant solution. And it's super impressive. But it's not clear it generalizes. And I think that was one of the things that jumped out for me in reading about it, you know, when you hear, oh, and then the next step on this train of generalized intelligence is drug discovery, protein folding, and quantum chemistry and materials. And it's like, all of a sudden I'm trying to figure out protein folding. Well, like, what are the constraints on protein folding? Well, we know there are amino acids and we know that they have to be in three dimensions, but actually nobody

Starting point is 00:08:58 else knows that. There's no codified rules, like nobody has the rules of protein folding. So there's no rules, there's no perfect understanding of the search space. There's like, what's the loss function? Like, how would you even write a loss function? I do want to push back a little bit though, because far be it from any of us in here to hype this up. But there's something unique happening here, which at least I perceive this in the paper, which is that I was struck by the analogy to evolution. Like, this is how human beings have evolved.

Starting point is 00:09:24 This is evolution that we learn by trial and error on a massive million scale. So I don't want to completely dismiss the idea that we can get to some kind of generalized intelligence. I mean, of course, I understand and agree, but what are the limits and what are the possibilities that can actually take us there and where are we constrained? Just to break that down a bit more. Yeah. So I love the evolution analogy, right? Because in the paper, they talk about how, you know, when it started out, it was making sort of naive moves. It was sort of greedy and trying to capture all the tokens. And then it got to very sophisticated patterns that humans have discovered over thousands of years playing this game, teaching this game, codifying it in books. And it figured that out. And not only did it figure that out, it figured out things that humans haven't quite codified. Right. So if you play more games, right?

Starting point is 00:10:11 sort of, it developed an intelligence that humans haven't yet because it's going through its own reinforcement thing and thousands of generations of games playing each other, sort of arrived at places that maybe humans would have gotten to if we played another thousand years, but like, you know, it figured out in three days. So I think there is something incredibly profound going on here where you're basically, you're accelerating natural selection cycles in computers. Now, where I think the analogy breaks down is, oh, and therefore we can apply this approach to every other problem and it's just going to be a straightforward application of this set of ideas to those problem sets and then we're going to have evolution at that

Starting point is 00:10:49 scale. In other words, treating different problems in exactly the same way. And I think that's where Steve and I have a little skepticism. Yeah, well, and I think also like the key with all of these, if you just take an abstract view of it, you know, you have this awesomely elegant solution to an intractable problem, which is just, on any measure, super cool. But that doesn't make it generalizable to any other space. There are many, many super cool things that surface and you have to be careful as like an engineer or a founder or a person applying this to go, okay, well, what are the elements of this solution that one would need to have as a precursor to applying it? And we saw the same thing with all of the work on supervised learning. Like first you need a data set that is

Starting point is 00:11:34 clean, and then it has to be, like, labeled really well, and then you have to have a neural network model, and you have to do all the weights and all this model. And so there were all these things where you couldn't just say, hey, I'd like to make the most money in our Q4 sales, let's machine learn our way there. And you're like, you can't just show up on Thursday and, yeah, and do that. Every year, salespeople, they develop a model for how they want to sell, who the customers and what the prospects are. They know the rules. Like, there are all these rules. Like, there are this many salespeople. They can only call so many people per day. They know which customers to call.

Starting point is 00:12:06 They know what the quotas are going to look like. They know what the product is. And you start to think, well, maybe there is something here because it's a space where, like, actually the history might not be as applicable to supervised learning as you would like. And so if there was a way to look at this through the lens of, like, what would be the optimal system to navigate or what would be the sales matter? And I think that there are more things like that than fewer. I don't know how many, you know, properties of nature

Starting point is 00:12:33 are amenable to this? Because for the most part, we don't understand them. Right. It's limited by what the human knows, right? We can't codify those rules. What are the domains that you see some transferability of these types of things? Well, I love the idea of sort of sales forecasting, right? Because essentially what you're doing, the intuition is that if I could play act my salespeople doing a million different things in a million different situations against a million different sets of prospects, and I could sort of simulate that, then maybe best practices would just come out of that, right? Because essentially that's what I'm doing in real life. I raise a ton of money. I hire a sales leader. That sales leader hires a bunch of people,

Starting point is 00:13:08 gives them a playbook, and has them call a bunch of prospects. I can only run so many experiments, right? Because every call is an experiment. So the idea would be if I could simulate what happens inside these calls and simulate 10, 100, 1,000, a million times more, then I'd get much better best practices emerging from that. So super intriguing idea. Off the top of my head, you start wanting to think about like, okay, what about, you know, cybersecurity? And you think you're looking at your code and you know in any given moment, like the type of code that can be in a specific place. And you know the rules and the syntax and it's well understood. And you know patterns that are also bad.

Starting point is 00:13:48 And so today what you do in the people that have tried to apply machine learning to code, they just have a lot of examples of like missing equal signs or missing semicolons or operators that are wrong. But if you think about it, like, the syntax of the language is completely fixed. And so you're back at very much a rule-based kind of, you know, rule-based is a constraint. These are the only ways you can put the symbols together. What you don't have is right or wrong, win or lose, with code. And you don't have a loss function that's that objective. There are interesting things to me, like there's just complexity of a line, that is a very common measure. Like, boy, something with three sets of parentheses in it, it's likely to have a bug in it

Starting point is 00:14:31 just because it has three sets of parentheses in it. You know, something that's missing, you know, some enclosure kind of rules like using brackets, not using brackets. These kind of things you can actually flag, like you can think of Lint just flags them in C and C++ as just sort of high risk behaviors.
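The "three sets of parentheses" rule of thumb is easy to sketch as a rule-based check. This is a crude illustration of flagging high-risk lines by a fixed rule, not how Lint or any real linter works; the function name and the `max_paren_pairs` threshold are hypothetical.

```python
def flag_risky_lines(source, max_paren_pairs=3):
    """Return (line_number, text) for lines with at least `max_paren_pairs` parenthesis pairs."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Count matched pairs conservatively: the smaller of opens and closes.
        pairs = min(line.count("("), line.count(")"))
        if pairs >= max_paren_pairs:
            flagged.append((number, line.strip()))
    return flagged


sample = """x = a + b
y = ((a + b) * (c - d)) / e
z = f(x)
"""
print(flag_risky_lines(sample))
```

A check like this only looks at the surface of one line. It has no notion of whether the code is actually right or wrong, which is the missing objective signal described above.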

Starting point is 00:14:47 So there are things that you can put in to like sort of, and so then you start to think, wow, it would be really interesting to have a tool that is able to look at code and sort of just, very different than previously, which was just literally looking at the syntax, but doing millions of examples of generating code and finding bad examples, would it be able to do a better job at finding bad examples in the next piece of code that flows into it?

Starting point is 00:15:09 I mean, one of the points you guys made last time in our last podcast is that at the end of the day, these things aren't working in isolation. It's not like there's one magic approach. You know, there's always a combination of techniques that come together to actually build real products. How would this sort of fit into that? Because one of the thoughts that I had is that clearly this kind of approach, even if you don't have clearly defined rules, will always be more beneficial in places where we don't have any data, like any big data, just like humans, like kids learning from n equals two, like their parents or n equals one, if there's a single parent.

Starting point is 00:15:38 Yeah, I think if we think about what humans do, they have many different types of intelligences. Yeah. And we have many different types of strategies for solving problems. So my guess is that the artificial intelligences that we create will be similar: lots of different strategies. And one of the interesting research items is, which strategy should I employ to solve this problem? Because this is something the brain does effortlessly. The strategies it employs to understand conversation are different than planning a trip, than, you know, making sure you don't fall down, versus like long range planning. Like how do I choose the best career? Right. So all of these are very different problems and somehow your brain kind of picks a good strategy for each one. Pedro Domingos talks about this, you know, in his book, The Master Algorithm, which is we know that there's all of these different techniques, but kind of what we're missing is sort of the synthesizer, the thing that will know, what strategy should I pick to solve this problem? And it's something that the brain seems to just

Starting point is 00:16:35 do. What's interesting about that observation is that's where we are with machine learning today to begin with. Which is? Which is just like there are all of these different networks that you can model with so many different layers and how many parameters you want to use. And there's like this art to it right now. And I find that particularly interesting because anytime there's an art to something, there's an opportunity to start a company around that art and to build out a product that surpasses the best practices in whatever field you're going after, whether it's analyzing traffic or figuring out how to drive a car. And so that is the opportunity because I don't think there's like some path where in the next two years there's the meta algorithm that

Starting point is 00:17:17 knows which way to pick. So the best thing to do now is to become well versed in all of them. And my gut just tells me that anytime you're at a point like this, the most interesting solutions are actually going to end up being a hybrid of, like, the thing that used to work that everybody said doesn't work well enough, plus the new thing that everybody says is going to replace the old thing. And that's basically been the entire history of AI. Computing in general. Well, computing for sure, but AI specifically, which is, like, everybody always says the new thing, that's it. We finally got it.

Starting point is 00:17:51 Huge step. The old stuff is all done. And the best example for me of that is how everybody said machine learning was going to replace all of natural language processing. But if you dig into any of the work that's been going on, even the most state-of-the-art translation, which goes any language pair to any language pair, well, the input and the output all rely on the old school, like from the 1970s, natural language stuff, just to do some very basic bookkeeping. It's very basic stuff. You talk about like building a spell checker or whatever. Parts of speech finder. Right. And all the image, all the image stuff, like, wow, you know, if you want to figure

Starting point is 00:18:27 out features, it turns out when you start doing features of image recognition, you're using a whole bunch of old school edge detection and contrast and finding objects and all of the stuff. And so you can't just like show up and say, now we're going to understand, we're going to be the unsupervised learning company because the question everyone's going to ask is, well, how are you going to make whatever you're doing practical? What I do remember about hearing the stories of the early days of NLP and observing parts of this firsthand as well is how the

Starting point is 00:18:53 entire field and community, a lot of them had very strong opinions about, you know, there's this whole phase of like expert domain knowledge building and really that's the only way to actually make NLP work at scale. There were all these things they had to do because it was before the days of big data. They couldn't even conceive of the Google scale big data. And then they went to this world where, oh my God, we don't even have to have these kinds of constraints and way of doing things because we have all this data. And now it's sexy to me that you can flip that model again and almost say you don't even need big data with the results like this kind of paper because you don't have to have anything. Yeah. This is the revenge of the algorithms moment, which is it's always been

Starting point is 00:19:31 about the data. If we had more data, we'd have more accurate models. And what this experiment showed is, look, all I needed was the rules and a really good loss function, some very clever programming and I get better performance than I had when I had the data set, right? So that's the tantalizing, oh, look, you could do it without data. And like, look, as an investor, that'd be awesome. If I could fund AI experiments without this step of collect data, annotate data, train model, we could run a ton more experiments. Exactly. Because it's an age of abundance. You brought up Pedro Domingos, and it was funny because one of the comments he made on the heels of this announcement, which I thought was quite interesting, is like, well, you know, AlphaGo Zero learned after

Starting point is 00:20:09 like five million games. Humans took only so many thousands of years. And I'm sort of like, that's the whole point of computers is to learn it in three days. You know, time is money. I don't care if it took like, you know, five million games. It took three days. Time is more important than this amount. You know, again, it's an amazing accomplishment. It's just, I always try to look at these in the context of how these innovations tend to happen. You know, if you just look, the same thing happened with search, like search itself. When people were inventing all the techniques of search, they were working on these tiny machines with all of these physics constraints about how much they could compute. And like, it was a whole thesis just to be able to go get like the archives of the New York Times to search through it.

Starting point is 00:20:47 And the same thing happened with spell checkers. Like, wow, we'd love to have 50,000 words in the spell checker, but we don't have that much memory. And then there was a whole, well, now we have a lot of memory. And so now we're spending all this money to try to compile all the words. And then someone said, why don't we just use all of the internet as the spelling dictionary? Then there's no spelling dictionary or the IME if it's an Asian language. And so this curve just keeps repeating itself. And I think that neither is going to win out because data is always going to be valuable

Starting point is 00:21:14 because it's an input that you can't escape. Whereas the flip side is not everything has data. So if there's this opportunity, like, you know, I look at like a great example is like how many of your Lyft drivers also have Waze turned on just because they know the maps are entirely accurate, but they don't know about accidents or emergency closures or weather or whatever. And so having that data combined with the data plus algorithms, that hybrid is going to continue to win.

Starting point is 00:21:44 Right. Because only if you really want to be practical and actually solve the problem. But I have to say one thing about this, which is so fascinating that you use the example of search and you describe this curve that we're on. And part of startups is sort of, you know, getting at the right place and point in time along that curve and where you are in the moment. Because sometimes I think a lot of academics or even people who are really in love with their own ideas, sometimes lose the big picture.

Starting point is 00:22:03 Like they've built up this expertise in one area and they don't realize, like, practically speaking, the world has changed around you. Because the point that I think is fascinating as well in the innovation story using the search example is that Google was like the 15th search company to come around before it hit success. And that is kind of relevant to think about. Right. And it's super important, too, from the company building perspective, which is they had this

Starting point is 00:22:23 algorithm, which we all talked about back then, PageRank, you know, appropriately named but not. That's fair, Larry. But the, or web page ranking as well. But the interesting thing was very rarely is, like, an algorithm like this the secret sauce for a company. Because you could look at what goes on from the outside and pretty much reverse engineer an algorithm. So again, back to, but if you have like a data source that you can actually. And a monetization model. And it turns out one of the things

Starting point is 00:22:48 that Google built out was like they were crawling the web faster than AltaVista. They made this bet because they were machine learning people before machine learning was cool. They made the bet that having the data was going to be. And it turns out that was what the barrier to entering the search market was, even for Microsoft and Bing, like sucking in the entire internet fast enough. Step one. Crawl the internet

Starting point is 00:23:09 while it's growing at exponential rates. And so, you know, back to, but I actually want to bring up one more thing that I just think is kind of

Starting point is 00:23:17 really interesting from a practical point of view. I love it. You're like the practical person in this podcast. I feel particularly practical today because, well, I can't play Go.

Starting point is 00:23:24 So I got to add something. I'm just kidding. The thing that I find the most fascinating about all of these solutions in the space is the engineering of a product

Starting point is 00:23:33 that you can make a commitment to customers that works, and then when it doesn't work, you can figure out why. And so one of the things that's so interesting about all of this is debugging. Interesting. And how does that really fit in? And sure, with Go, you lost the game. And, of course, while they're building all of this, they're figuring out, whoa, what did we do wrong to make this move repeatedly?

Starting point is 00:23:55 They're doing all of that debugging over the past n months. But in, you know, if you just all of a sudden apply this to the enterprise space or to adjusting news feeds or a zillion other things that you can think of. Like, figuring out where it goes wrong, like, that's actually really critical to a business. Like, you can't put a product out there, pick our sales forecasting example. And then it's wrong.

Starting point is 00:24:18 Like, you can't just go, ooh, machines make mistakes just like people. Your VP of sales could have messed up too. Because nobody pays money to, like, a computer for it to be wrong. Yeah. And so how do we think about that? Yeah, this whole idea of sort of transparency behind these models. In other words, do we know why they're behaving the way they're behaving? Demystifying the black box.

Starting point is 00:24:38 You know, it's a super active area of research right now, right? Which is how do I make the deep learning models more transparent so that I can debug them? I can verify them. I can make sure there's no systematic bias in them, right? Because until that, you couldn't do important things like, hey, can this person have a loan or not? Because the government will say, you cannot make that decision unless we understand why it is that you're making that decision. But you've made the argument, Frank, that when it comes to say self-driving cars, we kind of it's no better or different than what the human

Starting point is 00:25:06 mind does. We don't know. We can't interrogate the black box of the human mind that's driving that car. So the counterpoint of that is that, well, you're right, Steven, that you know you're paying for this computer to be smarter, but the reality also is the stuff's not that much smarter than humans anyway, so who cares? Right, but

Starting point is 00:25:22 the problem is first, that's everybody else, not me. Yeah. Like, I'm the best driver on the road. It's all the other people. But this is, I mean, you know, I like to always go back to this, the wonderful, wonderful research on computers and society that the Stanford professors Nass and Reeves did, because one of the things that they really realized really back in the early 80s. Oh, Clifford Nass.

Starting point is 00:25:44 Oh, Byron Reeves. Which was that there's something about a mechanical device that produces answers that makes the human brain ascribe way more authority to it than there necessarily should be. In fact, they even apply this to the world of voice recognition systems. They applied it to voice recognition. They applied it to chatbots before there were chatbots. I mean, it's so the interesting thing is that I don't know how it's going to take a major change in society for people collectively to just go, ah, people make mistakes and that's okay, but machines can't. Like we have, especially in the United States, a very,

Starting point is 00:26:21 very low tolerance for devices making mistakes. Except when machines are doing a kind of alien intelligence that humans cannot do. Because again, the most fascinating thing about our last talk about AlphaGo, this current talk is that at the end of the day, even though it opened, the AlphaGo Zero opened and closed with similar moves to what humans would do, it converged very quickly. There was a whole set of things in the middle that it did that were just things that humans would never have done. It's actually then augmenting us in a very different way because it's adding a completely

Starting point is 00:26:53 foreign intelligence. So it's not even comparable to our own. Yeah. It's a different type of intelligence, in the same way that animals have a different type of intelligence. And then you make all kinds of category errors when you say giraffes aren't intelligent or bats

Starting point is 00:27:06 aren't intelligent. They're just intelligent in a different way. Exactly. And we can't use the human yardstick to compare them. It's the reason we are more forgiving when a dog pees all over your couch and your five-year-old kid does the same freaking thing. Right. That's definitely the case. The challenge that we have in technology

Starting point is 00:27:21 is just the perception. Like you, I mean, you can, you just see it in the discussion of news feeds and algorithms and how people are like, these should just be right. I mean, like, yes. And people should just have debugged this before it all happened. And actually, it's not even all that crazy, sophisticated what's going on.

Starting point is 00:27:39 And what's weird is, of course, at the same time, people are critical of what's on the front page of newspapers or what's in the first five minutes of a TV newscast, which is literally the same decision made by a human being who's just deciding what should we show first on CNN versus Fox News. Some human just made it up with some black box, their brain, which is augmented by their title of in charge of production. Yeah. And yet infinitely forgiving of those choices.

Starting point is 00:28:06 And so I actually think that there's a lot here. And I don't have an answer, but I think it's no different than any other software, which is if you're going to make something and offer it to people in a commerce situation, it better damn work and it better not be wrong. It better be clear, like, how it does it to you. I mean, like, Excel was really great at doing math quickly, and then one day I find myself at the Naval War College having to explain myself to a bunch of generals, like, how do we know it's right? I literally just had

Starting point is 00:28:36 to sit there going, I mean, it's just right, it just works. And those are the moments that contribute to me feeling a lot of empathy for what's going on in the marketplace now about this, and why I'm so alarmed, not alarmed, that's totally the wrong word, so focused on this, you know, the outcome. Because until you've just sat there with a bunch of people who control nuclear weapons telling you we're using a spreadsheet to calculate it, the weight of being right doesn't really hit you. Right.

Starting point is 00:29:03 Because before that, we were just like, oh, my God, it doesn't seem to make mistakes. And we were like super ecstatic. And it may, look, the charts are cool. Yay, ship it. And then one day they're like, is it going to work or not work? And you're like, you can't, we couldn't prove it. Like, that was the essence of it. Like, I couldn't go to people at Boeing or people at the Navy or wherever in Wall Street and prove that Excel worked.

Starting point is 00:29:23 And then, of course, 30 years later, like, every time there's a mistake in Excel, like it's a mistake in the human that typed it. Like Ken Rogoff did a bunch of economic predictions that forecast the recession in 2008, and then all of a sudden you find out his model was wrong, but the recession happened anyway. Did he make the right prediction or not the right prediction? And like, how does that work?

Starting point is 00:29:44 And I think that's what's going on here, too, with these things. I have a philosophical question then for you guys. That wasn't philosophical enough. Well, one more philosophy. We're being very philosophical here. You know, you said something about how people make judgment calls for what news to show on television, et cetera, and we have these expectations of all algorithms.

Starting point is 00:30:00 And one of the topics we've discussed on this podcast, Frank, you and I discussed it with Fei-Fei, is this idea of bias in algorithms and how algorithms by definition can be biased. Is one possibility of this type of work, because they kept using the phrase tabula rasa, which of course I find so fascinating because in human development,

Starting point is 00:30:16 there's an analogous world of this where there was this theory that the human brain was also a tabula rasa or blank slate, and then they quickly learned, like, no, we have millions of years of evolution and DNA, that we're actually coming in inheriting things. Is there now a possibility that

Starting point is 00:30:32 algorithms can write themselves in a true tabula rasa-like way given this type of work? Is that just way out there? That one's above my pay grade. These guys are throwing up their hands, for our listeners, by the way. I will give an example of sort of the things

Starting point is 00:30:48 that are hardwired in your DNA. So they've done a bunch of experiments that if you watch somebody's hand getting pricked, your muscles in your own hand will involuntarily contract. Now, the big caveat is if that person has a different skin color than you, your hand doesn't contract. Really? I didn't know that. In other words, there's something going on in your lizard brain that says, that's an other's hand. Therefore, it's not relevant to me. Therefore, whatever reflexes caused your hands to contract in sympathy when somebody

Starting point is 00:31:17 jammed a needle into that hand, that's an example of this sort of pre-wiring. You know, the question is, can algorithms be pre-wired that way? Or unprewired, even? Or unprewired. Because that, to me, is where the opportunity lies. I mean, you could definitely write a set of rules that said, you know, treat other people in a different way, right? That would be top down.

Starting point is 00:31:34 The peril that most people talk about today is not so much that the algorithms are biased, but you've fed it not enough data so that your prediction is biased. So the classic example of this is in the early days of vision recognition, some of the image classification algorithms were categorizing people with dark skins as gorillas. That happened because they didn't feed it enough data of dark skin people. So when people talk about bias in algorithms, they're mostly talking about this phenomenon, which is the human researcher or the human programmer selected an incomplete data set, and therefore you got biased results, as opposed to somehow the architecture of the neural

Starting point is 00:32:12 network is biased inherently. And I think that that's a very important point that is being well studied, it's well articulated, but particularly in the supervised learning case, the data that's input, at least today, almost every data set that you have is going to have some inherent bias. Because you weren't aware of these factors when it was being collected in the first place, because I think it's fair to say the awareness of all of these issues is at an all-time high relative to that. But then again, you look at all the medical studies and you're like, well, there haven't been very many women in most of these studies,

Starting point is 00:32:45 or there have been only women but studying a drug developed by. Is it true in biology and genetics research too? There's a lot of limitations. Right. And so, but I do think that for anybody today embarking on using supervised learning, whether for all or part of the solution, that data set is implicitly challenged. And in particular, it's the labels. Because even which labels you pick or which labels you omit is going to create some bias in the model that you're unaware of. That's exactly right.
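
One way to make the data-set point above concrete is to audit a labeled data set by group before trusting a model trained on it. What follows is a minimal Python sketch, not something described in the episode: the function name `audit_by_group`, the record layout, and the toy records are all hypothetical, chosen only to show how a thin slice of the data can hide a much higher error rate.

```python
from collections import Counter, defaultdict


def audit_by_group(records):
    """Summarize representation and error rate per group.

    records: iterable of (group, true_label, predicted_label) tuples.
    The layout is an assumption made for this sketch.
    """
    counts = Counter()
    errors = defaultdict(int)
    for group, truth, predicted in records:
        counts[group] += 1
        if truth != predicted:
            errors[group] += 1
    total = sum(counts.values())
    return {
        group: {
            "share": counts[group] / total,               # fraction of the data set
            "error_rate": errors[group] / counts[group],  # mistakes within the group
        }
        for group in counts
    }


# Hypothetical toy data: group "A" is well represented, group "B" is not.
sample = [("A", 1, 1)] * 7 + [("A", 1, 0)] + [("B", 1, 1)] + [("B", 1, 0)]
report = audit_by_group(sample)
for group, stats in sorted(report.items()):
    print(group, stats)
```

In this toy data the overall accuracy looks fine, yet the under-represented group carries four times the error rate, which is the kind of gap an incomplete data set tends to produce. An audit like this only sees the groups someone thought to record, so it does nothing about the label-choice problem raised above.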

Starting point is 00:33:13 Your cultural background, your history, the way you grew up will lead you to label an image a certain way. That may be different than somebody who, right? So what is the ground truth? The best example of this are just looking at images and like sentiment analysis is such a big thing. But images, like facial expressions have been studied all around the world for decades, multiple decades, about like what is happy, what is sad, what is questionable. The same with speech and intonation. Like sometimes, you know, ending on a high note sounds like you're asking a question unless you're in another culture where that's making an exclamation. Or it can be vocal fry and people are complaining that women shouldn't speak that way.

Starting point is 00:33:51 The vocal fry thing is a perfect example of that going on right now. And actually, the high note was the sort of the 80s version of that same thing. And so it goes in vogue too, right. Be super careful about the labels. And to Frank's point, that's a great place for transparency. Because if you can go to a future or a potential customer and talk about these, this is what our model is based on when we're forecasting your sales or telling you to optimize your assembly line. That could really matter to that process. So high level takeaways.

Starting point is 00:34:23 Hybrid works always. This is a thing. This is like a refrain. I feel like I should get you a T-shirt that says hybrid works, God damn it. Well, be careful because I'm really against a lot of hybrids. Oh, yeah. Like hybrid cloud computing. Right, right.

Starting point is 00:34:33 Hybrid in terms of solutions between old and new for coding. The old stuff that works definitely does work. Old and new, academic and industrial, like exactly, all the things that make it practical versus pure, so to speak, purist. I think another big takeaway is this revenge of the algorithms moment. Yes. There is so much momentum right now that says, basically, we're one labeled data set away from glory, right? And this result basically shows you, wow, there's a lot of mileage that you can get out of reinforcement learning where there's no data, no labels. One takeaway for me is just the element of surprise, because I've been thinking,

Starting point is 00:35:08 a lot, you know, just how, I mean, I used to work in the world of how humans learn, but what's fascinating to me is that the system played itself at a level calibrated to itself at every level. And in human learning, that doesn't happen. You're taught by your parents. You learn from, you know, adults. You learn from people who have more experience. There's all these different things that happen. And the thing that's so fascinating to me is that, to me, it is very evolution-like. It's like a big bang moment. And while I wouldn't hype and say that means we're going to end up here, I do think that's very amazing, especially when you think about the surprises. Like, in the paper, one of the things they talk about is that the system learned something that's

Starting point is 00:35:41 actually very easy for humans to learn way later in the game. It's actually, I think, like, I forgot, shicho or some specific move, like, the ladder. The fact that the system took forever to learn something that's very first for humans, I just, I'm endlessly fascinated by this relationship between humans and computers and what we can learn from computers and vice versa, and also what it means for the field when you can actually add how people learn to the field of artificial intelligence and machine learning. Yeah, with reinforcement learning, one of the challenges has been, sometimes you get into this training epoch where there's no improvement, right? Because you're just playing these games over and over again.

Starting point is 00:36:16 And like, what is it that causes the next game to be smarter? And this one was like, in three days, it got incredibly smart. So hats off to these guys. And let's also add the other takeaway, which is that you can do amazing things with simple architectures because we're at a point in a moment right now where you can have four TPUs instead of 48. For me, one of the takeaways is if you can frame your problem with a set of constraints and a set of rules and a fixed set of operations, that's a very powerful concept that changes, because we've been so data-centric, people have stopped trying to think of their solutions algorithmically, and it's entirely possible that there is an algorithm. Which is the same thing that I think we would have said a year ago, which is everybody was so focused on machine learning, the traditional algorithmic approaches might have been overlooked. And here we are again with proof, but still, now we have to go back and say, well, you know, there's some basic machine learning that can work. There's some basic algorithmic stuff.
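
The "no data, no labels" idea discussed above can be illustrated on a game small enough to fit in a table. The sketch below is an assumption-laden toy, not AlphaGo Zero's method (the paper pairs a deep network with tree search): it uses plain tabular Q-learning on a Nim-like game where players alternately take one to three stones and whoever takes the last stone wins. Every name and parameter value here (`train_self_play`, `best_move`, the pile size, the learning rate) is hypothetical. The only inputs are the rules; the table starts at zero and the same table plays both sides.

```python
import random


def legal_moves(pile, max_take=3):
    """Rules of the toy game: take 1..max_take stones, never more than remain."""
    return range(1, min(max_take, pile) + 1)


def train_self_play(start_pile=10, episodes=20000, alpha=0.5, epsilon=0.3, seed=0):
    """Learn move values purely from self-play, starting from a blank table."""
    rng = random.Random(seed)
    # q[pile][take]: estimated outcome (+1 win, -1 loss) for the player to move.
    q = {pile: {take: 0.0 for take in legal_moves(pile)}
         for pile in range(1, start_pile + 1)}
    for _ in range(episodes):
        pile = start_pile
        while pile > 0:
            moves = list(legal_moves(pile))
            if rng.random() < epsilon:
                take = rng.choice(moves)                      # explore
            else:
                take = max(moves, key=lambda m: q[pile][m])   # exploit
            remaining = pile - take
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # The opponent moves next, so its best result is our worst.
                target = -max(q[remaining].values())
            q[pile][take] += alpha * (target - q[pile][take])
            pile = remaining
    return q


def best_move(q, pile):
    """Greedy move under the learned table."""
    return max(q[pile], key=q[pile].get)


q = train_self_play()
for pile in (5, 6, 7):
    print(pile, "->", best_move(q, pile))
```

If training converges, the greedy move from a pile of 5, 6, or 7 should leave the opponent with four stones, the known winning strategy for this game, even though no example games were ever supplied. The design point is the one made in the conversation: a fixed set of rules and operations is what makes self-play possible at all, and real-world problems rarely come with rules this clean.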

Starting point is 00:37:12 But that key is really the set of constraints. Yeah, for people who are interested in this, I'd highly recommend Andrej Karpathy's YC talk on this. You can search for it. It's on Google Slides. And the whole talk is basically about where will artificial general intelligence come from? And he basically compares and contrasts this sort of rule-based world of Go to something messy, like how would you build a pick and pack robot for Amazon? And so if you're interested in this topic of what's the difference between rule-based board games

Starting point is 00:37:41 versus the messy real world, it's a great presentation. That's great for the generalized intelligence side. So any parting messages for entrepreneurs building companies that have very specialized things? Because one thing we have argued in this podcast, including you, Steven, when we did a podcast on building product with machine learning, is that sometimes the best places to play are very specific domain focus, whether or not they have a clearly explicit set of rules. So any thoughts on that? Yeah. So one takeaway is one of the ways we evaluate startups from an investor's point of view is have you picked the best techniques

Starting point is 00:38:11 to solve the business problem that you're trying to solve. And I think this paper basically opens up a frontier of exploration that you might not have thought about before. Because I think if you started an AI-centric company today, you would definitely be on the get data, get labels, train model, make prediction path. And this opens up another area you can try and figure out am I solving a business problem where this supervised learning approach is going to lead me to glory, this reinforcement learning approach is going to lead me to glory as opposed to supervised learning or unsupervised learning? I think is the most interesting thing for me in looking at different companies and just

Starting point is 00:38:48 and talking to the founders is just that there's this world going on of advances and new things all the time. And you could get on the treadmill of always trying to be the newest thing. And we knew that the next generation of companies, you know, two years ago, were just going to be, you know, whatever you were doing before plus AI. But the thing was it wasn't clear that that was always the best solution. And then you replaced AI with, no, it has to be data and machine learning and labeling. And so what Frank was saying is just super important to internalize,

Starting point is 00:39:17 which is that part of being a founder and building a new product is knowing the reason why you're choosing the technologies you're choosing, and not just because you think that's what investors are looking for. Our jobs on that side are to actually ferret out, like, who actually has a handle on the problem they're solving and a line that they can draw from that problem to their chosen solution path. And that is often, too often, overlooked. That's exactly right.

Starting point is 00:39:45 My favorite joke that I tell is, when I ask entrepreneurs what machine learning algorithms they're using in their product when they claim to be a machine learning based startup, the answer I am not looking for is, I don't know, I'm sure we're using the good ones. Well, on that note, you guys, thank you for joining the a16z Podcast. Thank you.
