The Knowledge Project with Shane Parrish - #13 Pedro Domingos: The Rise of The Machines

Episode Date: August 30, 2016

In this interview with AI expert Pedro Domingos, you’ll learn about self-driving cars, where knowledge comes from, and the 5 schools of machine learning.

Transcript
Starting point is 00:00:00 Welcome to the Knowledge Project. I'm your host, Shane Parrish. I'm the curator behind Farnam Street, which is an online intellectual hub of interestingness covering topics like human misjudgment, decision-making, strategy, and philosophy. But today we're going to be talking about artificial intelligence and machine learning. The Knowledge Project allows me to interview amazing people from around the world to deconstruct why they're good at what they do. More conversation than prescription, it's about them, not about me.
Starting point is 00:00:38 On this episode I'm so happy to have Pedro Domingos, who's a professor at the University of Washington. He's a leading researcher in machine learning and recently wrote an amazing book called The Master Algorithm. I was fortunate enough to have a long and fascinating conversation with him over dinner one night, which I hoped would never end, but that ended up leading to this episode, which I think you will love. We're going to explore the sources of knowledge, the five major schools of thought of machine learning, why white-collar jobs are easier to replace than blue-collar jobs, machine wars, self-driving cars, and so much more. I hope you enjoy this conversation as much
Starting point is 00:01:17 as I did. So maybe you can, just for the sake of our audience, can you give an overview of what is artificial intelligence? Sure. Artificial intelligence or AI for short is the subfield of computer science that deals with getting computers to do those things that require human intelligence to do as opposed to just routine processing. So things like reasoning, common sense knowledge, understanding language, vision, manipulating things, navigating in the world, and learning. These are all subfields of AI, and if you have them all together, what you have is an intelligent entity, which in this
Starting point is 00:02:02 case would be artificial instead of natural. And so normally we're used to natural, and this kind of begs the question that we had talked about over dinner, which is where does knowledge come from? Yeah, so the knowledge that we human beings have that makes us so intelligent comes from a number of different sources. The first one which people often don't realize is just evolution, right? We actually have a lot of knowledge encoded in our DNA that makes us what we are. That is the result of a very long process of weeding out the things that don't work and, you know,
Starting point is 00:02:34 building on the things that do work. And then there's knowledge that just comes from experience. That's the knowledge that you and I acquire by living in the world and that's encoded in our neurons. And then equally important, there's the knowledge that the kind of knowledge that only human beings have, which is the knowledge that comes from culture, from talking with other people. people from reading books and so on. So these are the sources of knowledge in natural intelligence. The thing that's exciting today is that there's actually a new source of knowledge on the planet, and that's computers, computers discovering knowledge from data. And I think this
Starting point is 00:03:11 emergence of computers as a source of knowledge is going to be every bit as momentous as the previous three were. And also notice that each one of these sources of knowledge produces far greater quantities of knowledge far faster than all the previous ones. So for example, you learn a lot faster from experience than you do from evolution and so on. And it's going to be the same thing with computers. So in the not too distant future, the vast majority of the knowledge on Earth will be discovered and will be stored in computers. That's fascinating to think about in the sense of knowledge and application. Do you think that computers will be applying that knowledge or they'll be discovering it on their own?
Starting point is 00:03:52 They will be both discovering it and applying it. And in fact, both of those things will generally be done in collaboration with human beings. In some cases, it will be the computers doing it all by themselves. So, for example, these days there are hedge funds that are completely run by machine learning algorithms. For the most part, you know, hedge fund will use machine learning as one of its inputs. But there are some where the machine learning algorithms, they look at the data, they make predictions and they make buy and sell decisions based on those predictions. So there's going to be the full spectrum.
Starting point is 00:04:23 So we've delegated the actual decision making to the algorithm. Yes, in many cases. For example, there's this venture fund recently that announced that one of their directors is now going to be an algorithm. Oh, wow. There's seven directors on the board, and one of them's an algorithm. So their algorithm doesn't decide anything all by itself, but it doesn't. have a vote as much as any one of the humans.
Starting point is 00:04:50 And of course, you can easily imagine if this tends to work well, then maybe two more there'll be two votes that are algorithms, and maybe then there'll be a majority, and maybe eventually it'll be all algorithms, or it'll be some mix of the two. So this is fundamentally different, machine learning than traditional computer science, which is you have an input, you give it to an algorithm, which generates an output. And now we have, I think, the output and the data going into the algorithm, which is creating another algorithm or am I misunderstanding that?
Starting point is 00:05:20 Exactly. So what happens in traditional computer science and really everything that we know about the information age was created that way is that somebody has to write down an algorithm that turns the input into the desired output. So for example, if I want to, I don't know, diagnose x-rays of people's chest
Starting point is 00:05:38 to decide whether they have lung cancer or not, I have to write an algorithm that takes in the pixels of that image and outputs a prediction saying, you know, here's where the tumor is or there's no tumor. And this is very, very hard to do. And in fact, for some things, we don't even know how to teach the computer to do them. The difference with machine learning is that the computer doesn't have to be programmed by us anymore. The computer actually programs itself. You give it the examples of the input and the output, like, for example, a lot of pairs of here's the x-ray, here's the diagnosis, here's the
Starting point is 00:06:10 diagnosis. And by looking at that data, the computer figures out what is the algorithm that would turn one into the other. And the thing that's amazing is that often just by taking a basic machine learning algorithm and applying on a database of, for example, X-Rs and diagnosis, you actually wind up with something that is better at, for example, pathology, than a highly trained human being would be. And the other thing that's remarkable a lot of machine learning is that in traditional computer science, you need to write down a different algorithm for everything that you want to do. So if you want the computer to do diagnosis, you need to explain to it what are the of that diagnosis, if you wanted to play chess, you need to write a completely different
Starting point is 00:06:49 program. And if you wanted to drive a car or invest in the stock market, you need to write yet a completely different program. With machine learning, the same, the same learning algorithm, a single learning algorithm can learn to do all of these different things, just depending on the data that you give it. So if the data is chess games, it learns to play chess, if the data is x-rays, it learns to do diagnosis. that there is, you know, stocks, it learns to, you know, predict fluctuations and so on.
Starting point is 00:07:17 Is that the master algorithm? Yes. So, in essence, every major machine learning algorithm is a master algorithm, you know, in the same sense that, you know, a master key is a key that opens all doors. A master algorithm is an algorithm that works for all different problems. And that is very much the goal of all machine learning is to develop such master algorithms. And there are several major such algorithms today that have, mathematical proofs that if you give them enough data, they can learn any function.
Starting point is 00:07:47 Now, of course, the whole question is, can you do it with realistic amounts of data and computing power? And, you know, and then different algorithms tend to be better for some things than others. But what I and others believe is that we can develop a true master algorithm, meaning an algorithm that is able to solve all the different kinds of learning problems that these different algorithms can. In some sense, a grand unified theory, of machine learning in the same way that the standard model is a grand unified theory of physics or the central dogma is a grand unified theory of biology. So before we get into the different ways of machine learning, I believe there's a couple different schools of thoughts there that
Starting point is 00:08:27 I really want to hone in on and get some information from you on. I want to better understand how you see this kind of propagating in the sense of we don't understand how the algorithms are working anymore. At some point, they're just self-evolving. Are they not? Well, it's different from traditional programs, right? With traditional programs, we understand every little detail of how they work because we created it and we debug it until it did exactly what we wanted. And certainly with machine learning, things are very different because to some extent, we don't fully understand what the algorithm is doing. And in some way that it's power, right? It can actually know way more than any of us could.
Starting point is 00:09:09 Having said that, we, the machine learning researchers and the data scientists, we actually have a good understanding of how the learning algorithm itself works. What is it that it does to learn and how could you make it learn better? And then, you know, there's a different issue
Starting point is 00:09:26 which is the understanding of what the algorithm produces, right? If this algorithm has produced a model of how, you know, tumor cells work in cancer, can I understand what that algorithm is doing? And depending on the type of machine learning, this may be harder or easier. With some types of machine learning, like, for example, neural networks and deep learning, it's very opaque.
Starting point is 00:09:47 What is learned is this big jumble of lots of parameters and all-near functions, and nobody really understands what's going on, which, in fact, often precludes it from being used. But then there's other types of machine learning where what the algorithm produces is very easy to understand. It's a bunch of rules or it's a decision tree or it's some kind of graph, you know, connecting variables that you can actually look at and understand. So there's a spectrum of degrees to which the results of learning are understandable or not. Okay. Would you say that the information we learn from machine learning is more uncertain than the knowledge that we gain from, say, evolution or experience or culture, or would you put them all in the same kind of category?
Starting point is 00:10:29 Well, it's certainly quite uncertain. So any knowledge that you induce from data is necessarily uncertain because you never know if you generalized correctly or didn't. But sometimes you can actually machine learn knowledge that is actually quite certain because if you know well how the data was generated and you've seen enough data, you can say that with very high probability the knowledge that you've extracted is correct. Conversely, a lot of the knowledge that we have from evolution and from experience and from culture, we often tend to think of it as much more certain than it really is. We have this great tendency that's been well-studied by psychologists
Starting point is 00:11:06 to be overconfident in our knowledge. And a lot of the things that we take for granted, actually it turns out that they just ain't so. So evolution could have evolved into a local optimum when there's actually a much better one a little bit farther away, and you might have learned something from your mom that told you do things this way, but it actually turns out that that's wrong or it's outdated
Starting point is 00:11:26 and there's a better way to do it. So there's uncertainty on all sides of this, and it could be more or less depending on the problem. Do you think that the input that we have to make the decisions changes in the sense of the quantity of data, the reliability of the observations we're making from machine learning would be higher or? Yes, so all of those things are factors.
Starting point is 00:11:49 I think where machine learning has a big advantage over human intelligence is that it can take in vastly larger quantities of data. And as a result of which you can learn more, and it can also be more certain if that data, you know, is very consistent with this piece of knowledge. Where it has a disadvantage is that machine learning is very good. Machine learning today is very good at learning about one thing at a time. The thing that humans have is that they can bring to bear knowledge from all sorts of directions. So, you know, take, for example, the stock market, the traditional machine learning algorithms
Starting point is 00:12:25 and people started using neural networks to do this in the 80s, They just learned to predict the time series from the stock itself and maybe other related time series in a way that human beings couldn't, but human beings could know that, oh, you know, today, you know, there began a war between Russia and, you know, the Ukraine. And, you know, human beings can try to factor this in, whereas, you know, the algorithms couldn't, or, you know, the Fed just said that it's going to raise interest rates or something like that. So human beings can bring a lot of knowledge to bear that the algorithms don't have. Having said that, what we see even from the 80s to now is that the machine learning algorithms are starting to use a lot of these things. So, for example, there's hedge funds that trade based on things like, you know, what's being said on Twitter, right? If you can pick up certain themes on Twitter, then maybe this is a sign that something is going to happen or has happened or a recession has become more likely or whatever. And they can, you know, learn from things that wouldn't occur to people.
Starting point is 00:13:22 Like, for example, I know there's one company that they use. real-time traffic data and they use satellite photos, I'm not kidding, of parking lots to figure out how many people are shopping at Walmart, let's say, and other stores to decide whether their business is becoming better or worse. So I think as time goes forward, you know, the machine learning will get better using a broad spectrum of information. I think for a long time, there will still be types of common sense knowledge that people have.
Starting point is 00:13:53 So I don't think for most things, you know, the human element. element is going to become unnecessary very quickly, but maybe ultimately it will. That's a good segue into what the different ways of machine learning are, because I know there's multiple. Yeah, so there's five main ones. All of them quite interesting, because each one of them has its origins in a different field of science. So one that is very popular today is learning by emulating the brain.
Starting point is 00:14:23 So the greatest learning algorithm on Earth is the one. one inside your skull. By definition, has learned everything that you know and everything that you remember. So we can take inspiration from the neuroscience, see how brain circuits work, you know, how neurons work, how they're put together, how they learn, which is by, you know, strengthening synapses, and then develop algorithms that try to do the same thing in a simplified form. And indeed, there are some very, very successful applications today of this type of learning. like, you know, for example, you know, speech recognition on Android phones and, you know, the kind of simultaneous translation that Skype can do where you speak in English and somebody
Starting point is 00:15:03 might hear you in Chinese and vice versa to things like image recognition and whatnot. So this is one approach. Another approach is to emulate not the brain, but evolution. So, you know, your brain is great, but if you think about it, evolution made the brain in the first place and made body and made, you know, all creatures on Earth. So that's a heck of a learning algorithm. So maybe what we can do is simulate evolution on the computer and instead of evolving animals or plants evolve programs. But in the same general way of having, you know, population of individuals, you try them at the task and then the feeder ones get to reproduce and cross over
Starting point is 00:15:42 and mutate and produce the next generation. So that's another approach. And again, it's had many amazing successes like people have developed new types of radios and amplifiers and electronic circuits using this type of machine learning and they've actually gotten patents for them so they work better than the ones that were developed by human engineers they're typically completely different so the things that no one would ever think of but they're good enough that the US Patent Office actually granted them patents for these things so that's another approach both of these are in by biology in one way or another, most machine learning researchers actually think that
Starting point is 00:16:25 taking inspiration from biology is not a great idea, even if it's superficially appealing, because biology, you know, just is random, and who knows if it's actually doing the best thing. So most, you know, machine learning researchers, they believe in doing things more from first principles. And one way of doing things from first principles, which gets back to this theme of uncertainty, is Bayesian learning. So the idea in Bayesian learning is that, that you start out with a large number of hypotheses, and they're always uncertain, so you quantify how much you believe in each hypothesis using probability.
Starting point is 00:16:59 In the beginning, you have what's called your prior probability, which is how much you believe in each hypothesis before you see any evidence. And then as you see more evidence, your belief in the hypothesis evolves. So the hypotheses that are consistent with the evidence that you're seeing become more likely, and the ones that are inconsistent become less likely, and hopefully at the end of the day, some, you know, some one or a few hypotheses shine through, but even if they don't, you're always
Starting point is 00:17:23 in a position to make decisions by, you know, by letting those hypotheses vote with a weight that's proportional to how probable they are. So this is Bayesian learning. Another more first principles approach is symbolic learning. Deen symbolic learning is to learn in the same way that scientists and mathematicians and logicians do induction. So you have data, You look at the data, you formulate hypothesis to explain the data, and then you test it on your data, and then you either throw out those hypotheses or refine them, and you keep going like that. So it's very much the way the scientific method works, except it's being done by machines instead of by human scientists, and therefore it's much faster and can discover a lot more knowledge.
Starting point is 00:18:12 And in fact, one of the applications of this that's quite exciting is, you know, in the UK they developed this, complete robot scientist, you know, it's a robot biologist. It actually does this whole process, including carrying out the experiments using microarrays and gene sequences and whatnot. And it starts out with basic knowledge of biology, molecular biology, like DNA, proteins, the regulation, and so on. And then it develops models of the cells that it's looking at. And in fact, a couple years ago, the robot is called Eve. There was a previous one called Adam. a couple years ago, you've actually discovered a new malaria drug. Oh, wow.
Starting point is 00:18:52 Yeah, and once you have one robot scientist like this, there's nothing keeping you from making millions, and then science will progress correspondingly faster. And then finally, the last major school of machine learning is inspired by several fields, but probably most importantly by psychology. And this is the idea of learning by analogy. So there's a lot of evidence, and this to most people is actually quite intuitive, that we do a lot of learning and reasoning by analogy. When we're faced with a new situation,
Starting point is 00:19:20 what we do is we retrieve from memory similar situations that we experienced in the past, and then we try to extrapolate from one to the other. The solution that applied in the previous situation, we apply it or transform it to applying the new one. So, for example, if you want to do medical diagnosis in this way, what you would do when you have a new patient to diagnosis, you look for the patient in your file with the most similar symptoms,
Starting point is 00:19:43 and then you assume that the diagnosis will be the same. This sounds very naive, but it's actually quite powerful. Again, you can actually prove that if you give an approach like this enough data, it can learn anything. So those are the five main schools of machine learning. And then the master algorithm is one view of where they all come together? Exactly. So each of these schools has its own master algorithm.
Starting point is 00:20:06 For example, the master algorithm of the connection, this is called back propagation because it's based on propagating errors from the output back to the input. And the Bayesian is called probabilistic inference. The evolutionaries have genetic programming. The symbolists have inverse deduction. And the analogizers have what are called kernel machines. The mass of algorithm would actually be a single algorithm that unifies all of these into one.
Starting point is 00:20:33 Again, think of the analogy with physics. So Maxwell unified the electricity and magnetism and light into one set of equations. now the standard model has actually unified those with, you know, the strong and weak nuclear forces. So the idea here is we should be able to have a single machine learning algorithm that can actually do what each of these five can. And are they working together to kind of combine them, or is it going to be like a sixth one that is created in your mind that kind of supersedes these? Well, there's certainly a lot of people working on these things.
Starting point is 00:21:06 So there's a lot of people, for example, working on combining two of these paradigms. There's a lot of work, for example, on combining symbolic learning with Bayesian learning. There's a lot of work on combining, you know, connectionist learning and vision learning or connectionist and evolutionary. In essence, all of these combinations are things that people are working on. And these days we have, you know, Gandhastrian you find three, four, maybe even all five of them. So some people believe that we will solve the problem this way and that, in fact, we're very close to solving it this way. Others say that, yeah, no, none of these really has everything that it takes. It's going to take some new ideas.
Starting point is 00:21:43 It's going to take maybe some entirely new paradigm. And my gut feeling is that actually it's more the latter. I do believe that we have made a lot of progress, but I think we are still missing some important ideas. And in fact, part of my goal in writing the book was to try to get people from outside the field interested in these problems. because in some sense, they are more likely, ironically, to have these new ideas than the people who are already professional machine learning researchers and are thinking along a specific track, and then it's hard to jump out of that track. I want to come back before we move on to one of the things you said where there was a patent generated as the result of a machine learning algorithm. Yeah, not just a patent, but a whole series of them. I think they have dozens or more of patents for typically things like electronic devices.
Starting point is 00:22:32 at this point. What do you see as the implications to algorithms being able to patent the algorithms that they effectively have created almost without human intervention or human understanding? There's a couple of implications. One of them is great. Well, now because of that, we can have better radios and better amplifiers and better filters and whatnot. So it's a game. The other side of this is that, well, maybe we don't need all those engineers as much as we did before. which kind of touches on an interesting aspect of all this, which is, as much as there's a huge shortage of computer scientists and engineers and so on today, in the long run, things like this are easier to automate than things that are more, you know,
Starting point is 00:23:18 from the humanities and social science and so on. So people often think that the easiest jobs to automate are like the blue-collar ones, but actually our experience in AI is that it's actually more the opposite. It's often white-collar jobs that are easier to automate, For example, things like engineering and lawyers, doctors, et cetera, we've already talked about medical diagnosis as an example, where something like, for example, construction work is very hard to automate, because that type of work takes advantage of abilities that evolution took 500 million years to develop.
Starting point is 00:23:55 They seem easy because we take them for granted. But things like being a doctor or an engineer or a lawyer that you have to go to college to do. Well, you have to go to college precisely because they do not come naturally to human beings, but machines don't have that type of difficulty. So in some ways, the jobs that are easier to automate are different from the ones that people often think are. I think we're at the point when we were talking at dinner where machines can actually do a better job at identifying, you know, based on x-rays than the people can or with fewer errors. That's right? Yes. So, machines are remarkably better than human doctors that are doing all types of medical
Starting point is 00:24:37 diagnosis, not just from x-rays, but from, you know, symptoms, right? You have a patient, you have their symptoms. What is the diagnosis? And even very simple machine learning algorithms running on fairly small databases of patients, like maybe with only hundreds of thousands of patients, typically do better than human doctors. And part of the reason is that algorithms are very consistent, whereas human beings are very inconsistent. They might be given the same patient, you know, in the morning and the afternoon
Starting point is 00:25:04 and have different diagnosis, just because they're in a better mood or they forgot something. So human beings are very noisy in that regard. And, you know, and if you're the patient, that's actually not a good thing. So I think for things like this, machine learning is a very desirable thing to use. In the particular case of medicine, it's not used more already because, of course, the doctors are also the gatekeepers of the system, and they're not very interested in replacing themselves or their job that they like best by machines. But, you know, eventually it is going to happen and it is starting to happen, for example, in situations where doctors are not available, and so nurses can use this
Starting point is 00:25:39 or for patients that need, you know, constant monitoring or in low-resource situations where people can't afford the doctors and so on. There's a concept of freestyle chess. I think you're familiar with it. Mm-hmm. Our doctors, and that's where we're blending kind of the machines and the humans and the combination of the two actually makes a better decision than either one on their own. Is that sort of thing happening in your experience in the medical field?
Starting point is 00:26:05 Yeah, I think exactly. This is true in the medical field, and I think in most fields, and as you mentioned, chess is a great example, because it's not like, you know, when Deep Blue Beat Kasparov, well, now the world chess champion is a computer, and ever since then, computers have been the world chess champions. Actually, that's not the case. The best chess plays in the world today are what are called centaurs in the community. They're a team of a human and the computer. So a human and a computer can actually, together, can actually beat the computer. And this is precisely because the human and the computer have complementary strengths and weaknesses. And the same thing that I think
Starting point is 00:26:46 that is true of chess, I think is true of medical diagnosis. It's true of a lot of other things. So, for example, there's more, of course, to being a doctor than just doing diagnosis, right? There's interacting with the person. There's reading, how they're feeling from how, you know, they interact with you. All of these things computers are not yet able to do today. Maybe they will in the future. And certainly the boundary between what is best done by the machines and what is best done by humans will keep changing. But I think for the foreseeable future in most jobs, it will be a combination in human and computer that works best.
Starting point is 00:27:17 it seems like everybody's getting into artificial intelligence machine learning from Facebook and IBM to Amazon and Google do you see do you envision a world like 10 years out I have a bit of a mischievous mind so where people are trying to feed other people's algorithms false signals to change the machine learning or is that just crazy oh this is already happening and it's going to happen even more in the future. So what happens whenever you deploy a machine learning system is that the people who are being modeled change their behavior in response to the system. Sometimes in benign ways, but sometimes in adversarial ways. A classic example of this is spam filters. The first spam filters were extremely successful. They were, you know,
Starting point is 00:28:07 99% accurate. They were very good at tagging an email as being spam or being a legitimate email. But then guess what? Once those spam filters were deployed, the spammers figured out ways around them. They figured out how to exploit the weaknesses of the spam filters and do things that would get through. And there's been this ongoing arms race ever since then where the spammers come up with new tricks, the machine learning algorithms together with the data scientists come up with ways to defeat those tricks, and this just keeps going. And I think the same thing is going to be true in many other areas. In fact, two other areas where you can already see things like this very much happening.
Starting point is 00:28:44 One of them is actually the stock market. The stock market is largely a bunch of algorithms trading against each other. And in fact, what these algorithms are doing, whether or not they know it, is modeling each other. And what typically happens when somebody deploys a neural network to predict a certain stock, like, for example, you might have 3,000 networks each predicting one stock in the Russell 3,000, is that it works for a few weeks and then gradually stops working, because somebody else has started to model
Starting point is 00:29:15 what those algorithms were doing, and therefore now those people are making the money. And this is never-ending. And in fact, what some of these people are doing now is combining, you know, connectionist learning with evolutionary learning, in order to be able to learn
Starting point is 00:29:32 faster and more broadly than the neural networks could. And another area where you already see this happening is online ads, right? So the whole online ad market is, you know, there's these auctions among advertisers to put the ad in front of you when you see a page of results from Google or when you go to a, you know, a web page.
Starting point is 00:29:51 And Google has models of what people will click on. The advertisers have models. You know, there's companies like Rocket Fuel who basically work for the advertisers to model the users for them. The content providers, you know, have models of what will, you know, be clicked on and whatnot. So basically, everybody is modeling everybody. And, you know, all of these
Starting point is 00:30:11 models are evolving in tandem. So I think we're already starting to see this, but we will see it even more in the future. Another example is fraud detection, obviously. Another example is things like, you know, law enforcement and counterterrorism. So the examples are legion. What do you think the implications of that are on a country-to-country basis? Like, is the advantage that you would gain from being first in machine learning cumulative, or is it transitory, only lasting for a certain while? Like, is this something where you can just build up such a big lead that it would almost be impossible for anybody to compete with you?
Starting point is 00:30:47 Or is it something where it opens competition to everybody? I think some of the advantage is permanent, in the sense that there's this network effect of data where, if you have a good product and people start using it, then you have a lot of users. This is, for example, how Google has built up such an unassailable position in search. It's like you use their search engine, therefore they have a lot of data to learn from, therefore the search gets better, therefore more people use the search engine, and so you have even more data to learn from.
Starting point is 00:31:18 So someone coming in from scratch, trying to learn from initially no data or very little data, will have a very hard time competing with Google or Bing. So I think in some aspects, this first-mover advantage is extremely important, you know, because you have more data and also because you have, you know, you've hired the data scientists, you've developed the algorithms. There is a race to develop better machine learning algorithms, and there is certainly an advantage to coming first. Having said that, there are also lots of opportunities for those who are just starting. For example, because you could develop something that is enough of a major innovation that it outweighs those
Starting point is 00:31:56 things, in the same way that Google had enough of an innovation with PageRank and so on that, even though they were just starting, they actually did a lot better than the dominant search engines of the time, like, for example, AltaVista. So this is one aspect. And the other one is that, precisely because machine learning is something that can be used just about everywhere, right, in every single industry, in every single part of what a company does, so far it's only being used for a small fraction of the things that it could be used for. So you could do a startup that comes in and does machine learning for X, where nobody has really done machine learning for X before, and they could just run away with it. Even if initially their learning algorithms
Starting point is 00:32:35 are not the most advanced ones, they're just picking the low-hanging fruit, and you could actually get a lot of mileage that way. It's amazing to think of the implications, not only on people, but on how we go about making decisions. And often we're the gatekeepers, in a way, right? Like the doctors, who are the gatekeepers to bring it inside an organization. And I think that'll be an interesting way to bring it in: you bring it in at a low level, and then it builds confidence, and then it almost gets promoted, just like people. Yeah, and part of why people need to become aware of machine learning is that we should actually be our own gatekeepers. Ideally, we wouldn't
Starting point is 00:33:15 rely on third parties to be our gatekeepers. And often, the way a lot of change comes about is because people take on that role. So, for example, doctors initially were not very interested in the web or in computers. These days they have to be because patients will come to them and say, well, you said A, but I actually looked on the web and the web says that B, so what is it? And now this forces the doctors to start looking at the web. And once, for example, these machine learning systems become more widely available as they are becoming, people will start using them and the doctors will be forced to catch up. Same thing in a lot of large companies, right? The IT department says, no, we're going to use A. But then everybody starts using B, and after a while, you know, the IT
Starting point is 00:33:53 department just has to face reality and start using B as well. So I think the same kind of thing can and will happen with machine learning. What are the limits on the decisions that we'll let machines make? Do you think we'll get to a place soon where there are no pilots in airplanes, where a machine can sentence someone to death, where boardroom mergers and acquisitions are made solely based on algorithms? Yeah, that's a very interesting question. It's really not so much a technological question as a sociological one. I think over time we'll see more and more things being
Starting point is 00:34:34 done by machines, and as we get comfortable with it, we will have no problem handing control to machines. Airplanes are an example, right? Every commercial airliner is actually a drone; it's flying itself, and in fact it would be safer if it was completely flown by a computer. You know, pilots tend to take the controls at landing and takeoff, which are actually the more dangerous moments, and they make more errors than the computers do. But people feel comfortable having a pilot in the cockpit. We already have two people in the cockpit instead of three, and then we'll have one, and eventually we'll have zero. So I think there are a lot of decisions that we will gradually become more comfortable with. It's partly a matter
Starting point is 00:35:12 of just psychologically adapting ourselves to this notion that the machines are making these calls and trusting them that they are making the right calls and that they would do what we would do if we were making the calls ourselves. I think at the end of the day, there will be some things that we will always reserve the right to make our decisions about. And I think, you know, those are the highest level decisions.
Starting point is 00:35:38 I think the decisions on how to accomplish our goals don't have to be taken by me. Like, I want to get from here to New York. I made that decision. But how I get flown there, well, sure, I'm perfectly okay with the plane being flown by an algorithm, or maybe the car that drives me to the airport also being driven by an algorithm. And maybe, you know, I decided to go to New York
Starting point is 00:36:02 because of something that some computer advised me about. It said, like, oh, there's this great thing that you should do in New York; there's going to be this festival you should attend, or there are these people you need to meet. But that decision, even though it was partly made with recommendations from a computer, I probably will always want to make myself. I'm not just going to go to New York because the computer told me to. So I think what we see today is already this very intricate mesh of what's
Starting point is 00:36:28 decided by humans and by computers. So, you know, somebody wants to find a date; well, they may have a dating site to help them find one, but then they decide to go to dinner with them, so that's their decision. But then maybe they use Yelp to decide where to go to dinner, and then they drive the car to dinner, but it's GPS that's telling them where to turn, although it's still them driving, right? So there's this very intricate mesh of the human and the machine, and I think it's only going to get more intricate in the future. But ultimately, I think most things will be done by machines, except the really key decisions,
Starting point is 00:37:04 that people will always want to retain, even though they make them with advice from the machines. What is the singularity? Yeah, so the singularity is this notion that, if machines can learn, right, think of what we've been talking about. So you have an algorithm that makes another algorithm, right? My machine learning algorithm makes an algorithm to do medical diagnosis or play chess or whatever. But by the same token, we can actually have a learning algorithm make another learning algorithm. And if the learning algorithm is able to make a better learning algorithm,
Starting point is 00:37:36 then that learning algorithm makes an even better learning algorithm, and so on. So what happens is that we start with machines that are not very intelligent, but if each one of them can produce a machine that's more intelligent than the previous one, then maybe the intelligence will just take off and leave human intelligence in the dust. And the first people to speculate about this were John von Neumann, one of the founders of computer science, and I.J. Good, a statistician who worked with Turing on the Enigma project and so on, and they first conceived of this notion of the technology just getting better and better
Starting point is 00:38:11 until it completely leaves human capabilities in the dust. The person who actually coined the term singularity to refer to this was Vernor Vinge, a scientist and science fiction writer, back in the 80s. And then the person who really popularized it was Ray Kurzweil, right? He wrote a series of books about how this is going to happen. Now, the basic evidence that people like Kurzweil adduce in support of this is they show all these curves of exponential progress. You see, like, the progress is just getting faster and faster, and you're
Starting point is 00:38:44 extrapolating this into the future, and it's just going to go to infinity. A singularity, mathematically, is a point at which a function goes to infinity. And that's their argument. I think that argument is actually very dubious, because in reality no exponential goes on forever; there's always a limit, because the world is finite. So actually what happens with all of these technology curves is that in the beginning they look like exponentials, but then they flatten out. So they're what are called S curves.
Starting point is 00:39:11 Technology growth curves are always S curves. The first part of an S curve is actually mathematically indistinguishable from an exponential. So it's easy to look at that part and say, oh, exponential growth, we're headed for a singularity. Actually, what we have is one of these S curves, and we're headed for a phase transition. Once the phase transition is done and things flatten out again,
Starting point is 00:39:32 things could be very, very different from what they are now, and I think they will be in the case of AI, but I don't think we're going to see this infinite growth that goes completely beyond what humans can imagine. You think we'll just have a new baseline and then we'll adapt to it? Yeah, so, you know, this is, again, just extrapolating from the past, right? What happens is that there are these phase transitions, and the phase transitions build on each other, right? So one capability makes other capabilities possible.
Starting point is 00:39:59 So electricity, you know, makes computers possible, and then computers make AI possible, and whatnot. And these things do build on each other, but when they happen and how large they are is extremely hard to predict. So no one knows exactly how long the current surge of progress in AI is going to last before it flattens out. Will it be 10 years? Will it be 20? How far will it go?
Starting point is 00:40:22 We can't assume that it will just run away from here without any more plateaus or interruptions. That would be very unusual, actually. But just to be clear, do we have algorithms that are spinning out better versions of algorithms that are generating algorithms, or is that a futuristic kind of statement? Not clear. So we do have, for example, an area of machine learning called meta-learning,
Starting point is 00:40:46 which is precisely learning algorithms learning to make better learning algorithms. And this type of meta-learning, in certain basic forms, is actually already widely used today. For example, Netflix uses this type of thing to recommend movies. It doesn't just use one learning algorithm; it uses a whole bunch of them, and then another algorithm on top of that that learns how to combine their results. And, for example, the way IBM Watson won at Jeopardy was using this type of learning.
Starting point is 00:41:11 Having said that, this is still quite limited in what it can do. We don't have enough at this point for this thing to set up this loop where it just keeps getting better and better. That hasn't happened yet. But it will be very interesting to see if we can make it happen. You know, for all I know, some kid in a garage today has actually invented that algorithm, but we don't know. When we make decisions as humans, we need to simplify. There might be, you know, one million variables that affect the decision, and we try to determine the five that carry most of the weight. Computers don't have that limitation. They can take into account
Starting point is 00:41:50 all of these variables. How do you think that'll impact the way we try to process things in the future? We're trying to simplify, but machines are going to be more complex, and they're okay with that complexity. Do you think that instills trust in them, or do you think it sows the seeds of distrust? How do you look at that? Well, that is exactly the key advantage of machines: they can take an unlimited number of variables into account, very much unlike humans, who are much more limited. But our brains are very good at things like vision and motion, where we do take millions of variables into account. For other problems, we are very, very limited, and the machines aren't.
Starting point is 00:42:34 So what's going to happen is that the machines are going to be able to learn much more complex models of the phenomena than human beings ever could. And this is good, right? Because with those better models, we can make better decisions. With a better model of the cell, we can cure cancer, and so on and so forth. Having said that, it'll still be important for people to trust what the computers are saying, and if they don't understand it, they won't trust it. I think what's going to happen is that, partly, the learning algorithms are going to have to get better at explaining to people what they're doing; some of them are better at this than others, but there's no reason why they can't. Something that you hear a lot today is, oh, learning algorithms are
Starting point is 00:43:12 black boxes, we're just going to have to learn to live with them. Actually, the learning algorithms don't have to be black boxes. There's no reason why you shouldn't be able to say to the Amazon recommender system, why did you recommend that book to me? Or, you know, I just bought a watch. Please don't recommend more watches, because I don't want to buy a watch now; that's the last thing I want to buy. You should be able to have this type of richer interaction with the learning algorithms. And in fact, it's what we already do with other people. So when your brain decides to do something, and, you know, let's say you're the doctor and you tell someone,
Starting point is 00:43:43 you probably should get surgery because blah, blah, blah, and then somebody asks why. You're able to tell them why in natural language. The things that you tell them are not a complete explanation of what went on in your brain; in fact, you don't even know what the complete explanation is. But it's enough for people to understand and to trust the result. And you can do the same type of thing with machines. So to some degree, the machines are going to have to become better at doing this. To some extent, we humans are going to have to get more comfortable with the notion that even though we don't completely understand what's going on,
Starting point is 00:44:16 it's actually good that the machines have a handle on these complex phenomena that, without them, we wouldn't. I want to switch gears a little bit and talk about IBM's Watson and chess, and then AlphaGo. So I think it was '97, '98, when Watson beat Kasparov to claim the chess championship. You mean Deep Blue? Yeah, Deep Blue, sorry. And then we fast-forward to 2016, and we have AlphaGo, which won at Go. And those are two vastly different games, and the approaches taken by those two systems, I believe, are different.
Starting point is 00:44:54 Maybe you can walk us through the implications and the differences and why one feat was better than the other. Sure. So Deep Blue was very much classic AI; there was no machine learning involved. Deep Blue essentially was just doing a very clever and very extensive search for the best moves to make. The way classic game-playing programs work is that
Starting point is 00:45:21 they look at a board position and evaluate how good it is. And in chess and most other games, this is not that hard to do. For example, in chess, you know that having a piece advantage is good, and roughly how much each piece is worth; you know that having control of the center is good. And so what the learning is trying to do is come up with features like this and with weights for them. And then, based on those weights, the program decides what the best positions are and makes the moves that will lead the game into those best
Starting point is 00:45:47 and makes the moves that will lead the game into those best positions. positions, taking into account that the human opponent is doing the same, but in the opposite direction. So this is how chess works. Unfortunately, this type of approach does not work for Go for a couple of reasons. One is that the number of choices that you have at any moment of a Go match is way bigger than in chess. Maybe in chess at any given point, there's a few dozen different moves that you could make, but in Go there's hundreds. And for each of those moves, then there's hundreds of others that you could, you know, make in the next step. So searching algorithms would be really ineffective.
Starting point is 00:46:26 Exactly. It's completely infeasible to do that type of search for Go. And for a long time, people actually had no idea how to solve Go. And when you look at how human experts play Go, it's almost like what they're doing, instead of using a few explicit features of the board, is a kind of visual pattern recognition. They look at the board and they, again, have
Starting point is 00:46:52 a hard time explaining what it is that made them make that move, but somehow their visual system and their brain figured out the right thing to do. And so what DeepMind did with AlphaGo was to combine some of the classic AI game search with deep learning, with this type of neural network approach, to do the evaluation. The evaluation of board positions is effectively a bit like a visual pattern recognition problem, and deep learning is very good at visual pattern recognition; in fact, that's what it's best at. So maybe you can use deep learning as the pattern recognizer and then put that into the game search. And this is in essence what DeepMind did, and it worked amazingly well. And that's like light years ahead of Deep Blue. It's just... Or is it just different? Yeah. So number one, right, as I said, Deep Blue used no
Starting point is 00:47:42 machine learning. AlphaGo used machine learning extensively. The first thing that AlphaGo did was learn from the entire existing database of Go matches played by human masters; 30 million moves or something like that is the figure I think I heard. And then the next thing it did was learn by playing against itself: two versions of AlphaGo would play, and then AlphaGo would learn from which one had won. This is actually a very, very old idea in machine learning, one of the oldest. It goes all the way back to the '50s and a researcher at IBM called Arthur Samuel, who actually wrote the first
Starting point is 00:48:24 machine learning system to learn to play a game, and the system learned to play checkers by playing against itself. So AlphaGo was doing the same thing. So number one, AlphaGo used machine learning, whereas Deep Blue didn't, and without machine learning, I don't think we would ever have figured out how to win at Go. And the second one is, the amount of computing power that was used by AlphaGo completely dwarfed the amount of computing power that was used by Deep Blue. So first of all, that type of computing power was not available back then. But, you know, DeepMind used enormous amounts of computing, not just while it was playing
Starting point is 00:49:04 live, you know, in Seoul against Lee Sedol, but in the months that it spent learning by playing against itself. I don't know exactly how many servers Google used, but I've heard that it was, you know, a very large number,
Starting point is 00:49:33 So, you know, just like there are these, you know, robots contests, and, you know, these days they're starting to have drone concerts and on a week. I could imagine Facebook challenging, you know, AlphaGo for a game. I'm not sure that people will be interested in that. I think more likely what will happen is that people will still be playing each other, even though the best players are computers. In the same way that, you know, there's race cars that go way faster than people. Yeah.
Starting point is 00:50:02 But we still have people competing. In the Olympics, what you have is people competing against each other to see who can run, you know, whatever, the 100 meters fastest or the marathon fastest. And people are interested in races because they're driven by human pilots, right? If those race cars were self-driving, we probably would be less interested. But that kind of begs the question.
Starting point is 00:50:25 Why haven't we had a self-driving race car in, like, the Indy 500? Well, I think we could at this point, actually, and it might actually win. I think in the past, the technology wasn't ready. And then once the technology is ready, the people have to let it happen, right? So the Indy 500 will have to let a self-driving car compete. I actually wouldn't be surprised if that happened in the next few or several years. I think we're at the point where it could. But, you know, sometimes people don't want the computers, or the machines, to be competing with them on a field like this.
Starting point is 00:51:02 You know, there are actually games where the humans refuse to play against the computers. It used to be that humans wouldn't play computers because the humans were sure to win; there are also areas where the humans won't play the computers because the computers are sure to win. So it depends. With self-driving cars, are we there now, from a technological point of view, not a psychological one? Or do you think we're, you know, a couple of years away from that? It depends. So here's the crucial question: how uncontrolled can the environment in which you're driving be? Why was it that the first thing we had was self-flying planes? Long before there were self-driving cars, there were already autopilots.
Starting point is 00:51:45 It's because in the air, there's very little unexpected that can happen. So for that, you don't even need AI. Just, you know, classic control systems and software engineering will actually do that for you. And then the next step is, well, what about driving a car on the freeway? Driving a car on the freeway is something that I think the technology is there for. The freeway is less controlled than the air, but it's much more controlled than driving in the city. Right. Once you're driving in the city, boy, you know, all sorts of stuff can happen.
Starting point is 00:52:12 People will cross, you know, people will make strange maneuvers, so that's harder. And I think this is where the frontier is right now. The cars are trying to drive in the city, and I think we are going to get to the point where they are widely deployed, not necessarily because they have become as good as people at dealing with unexpected situations, but because we will also adapt to them, right? I think what's going to happen is that self-driving cars will have to be clearly identified as self-driving cars, and we human beings will know to deal with them differently than we deal with, you know, cars driven by people.
Starting point is 00:52:47 Again, we will expect different errors from them, and also different ways in which they're reliable, right? Self-driving cars probably won't do stupid things, you know, that people do because they're drunk, but they might do stupid things because they just don't recognize something that is happening in front of them. But I think the whole system will evolve towards accommodating self-driving cars. So it's going to be an adaptation on both sides. And then I think eventually, you know, the self-driving cars will be good enough that they can drive in any environment, and then they will eventually replace everybody. How far away do you think?
Starting point is 00:53:19 Yeah, exactly. That part is hard to predict because, again, it involves so many unpredictable things. Again, the cars don't have to be perfect before we start using them instead of people. They just have to get better than people. I think when that will happen is hard to predict. but my guess is that five years from now there will be a lot of self-driving cars around
Starting point is 00:53:42 and maybe 10, 15 years from now, most cars will be self-driving. And ironically, if we took everybody off the roads today and it was all just self-driving cars, it would be much easier.
Starting point is 00:53:58 Self-driving cars have no trouble at all dealing with each other; in fact, they can coordinate much better than we human beings can, as a result of which we can put many more of them on the road. The spacing can be smaller, we can have fewer traffic jams, and so on and so forth. What really makes life hard for self-driving cars is us, the humans. So, you know, it'll be interesting to see when this happens.
Starting point is 00:54:19 You know, like the first time that the Google car got into an accident, it was because it stopped at a red light and it got rear-ended. Right. So it was actually doing the right thing; it's just that nobody ever stopped at that red light. And those are the things that the cars have a little bit of trouble dealing with. So Tesla and Google are taking two radically different approaches when it comes to self-driving cars. It seems like Tesla is taking the, or maybe I'm wrong, but it seems like they're taking the incremental approach, where we'll have human drivers and eventually phase into self-driving, whereas Google has
Starting point is 00:54:51 removed the steering wheel effectively from their cars, or am I not? Yeah, that's right. And in fact, it's not just Tesla. If you look at the big automakers like, you know, Toyota and, you know, BMW and whatnot. I think they're all following this more incremental approach. And I think it makes sense for them because they're selling cars today. And they're not ready to deploy a fully self-driving car. So what they're going to do is they're going to introduce these things one by one. However, I think Google also has an important point, which is born of their experience, which is this notion of mixed control between the human and the car is actually very problematic. If someone is not driving the car and then suddenly the computer says like, oh, shoot, I'm confused, take over.
Starting point is 00:55:37 Then the person will not be very well able to take over. And in fact, they might well crash the car, because they're not in the frame of mind to drive it. So Google's reaction to this was to say, no, we just have to go all the way and have the car be completely self-driving. As it becomes less and less novel, we start to pay less and less attention to things. Yeah, exactly. And, you know, if you're not driving the car, you can't take over in a fraction of a second, because you need to build up context by driving the car over a certain amount of time. Also, and this happens with pilots, if you're not piloting very often, you start to get rusty. And then when the time comes for you to take over, you're actually not as good at doing it as you used to be, before it was a computer doing the driving or the piloting most of the time.
Starting point is 00:56:23 But I think that while in the short term the approach of the Teslas and the Toyotas and whatnot will be the prevalent one, in the longer run it will be the Google approach that prevails. There's another interesting thing in all this, which is, what is the business model of each of these companies, right? It's one thing to be the companies that are selling cars today. It's another thing to be Google, which actually doesn't have a business model for cars, right? Maybe they're going to sell their software to car companies.
Starting point is 00:56:50 It's yet another thing to be, for example, Uber, right? Uber is a company that has also invested heavily into self-driving cars. And they, of course, have a very different and very clear business model, which is they just want to take what they're doing and have the cars not need human drivers anymore. And, you know, all of these different models imply different ways to go about doing it. This is fascinating. I could go on for hours. I know we're coming up to our time limit here. So I just have two questions left, switching subjects a little bit here.
Starting point is 00:57:19 What books would you say had the most impact on you in your life, in terms of books you keep coming back to or that changed the way you see the world? Well, one book that influenced me a lot was just the first AI textbook that I ever saw. I saw a book in the bookstore called Artificial Intelligence, and I was very intrigued by what that might be. And reading that book is actually what set me on the path to AI. Another book that I've read that is related to AI, and that I know has influenced a lot of people into becoming AI researchers (in my case, I was already on that path before I read it), is Douglas Hofstadter's Gödel, Escher, Bach. It's an amazing book and, you know,
Starting point is 00:58:01 very thought-provoking, and it speculates about all sorts of things that have to do with AI and computers, including things that we've been talking about. So that has been another very influential book. Another book that I've read more recently, that I think is really amazing and really important, is Jared Diamond's Guns, Germs, and Steel. I think it gives this large picture of human history that is very absent from most of what history does today. And I think it's very important to have books like that. And I could go on, but I think those are three of the main ones. Thanks. What's on your nightstand right now? What is on my nightstand right now? Right now I'm reading another book by Douglas
Starting point is 00:58:42 Foshstad, which is a book about how analogy, you know, we were talking about the five tries of machine learning. He's actually written this book about how it's really all just analogy, right? So Douglas is the ultimate analogizing. In fact, he coined the term. And I have actually not completely read the book to date. So that's one book that I'm reading. I'm also reading a book on symmetry group theory because I think this is something that has not been exploited in machine learning and might be the origin of that sixth paradigm.
Starting point is 00:59:11 It's something that is used a lot in physics and in mathematics. In fact, some people say physics and mathematics. Thank you.
