In Our Time - P v NP

Starting point is 00:00:00 Thank you for downloading this episode of In Our Time. For more details about in our time, and for our terms of use, please go to BBC.co.com.uk slash radio four. I hope you enjoy the program. Hello, the problem of P versus NP is so difficult to solve. There's a $1 million prize on offer from the Clay Mathematical Institute for the first person to come up with a solution. At its heart is the question,

Starting point is 00:00:23 are there problems for which the answers can be checked by computers, but not found in a reasonable time, or can all answers be found easily as well as checked, if only we knew how? It's intrigued mathematicians and computer scientists since Alan Turing in the 1930s found that some programs, given a problem, could run indefinitely without finding the answer. That's as true of today's supercomputers as it was in his day. Resting on P versus NP is the safety of online banking and all online transactions which are currently secure.

Starting point is 00:00:53 If answers can be found as easily as checked, computers could crack passwords in moments. to solve thousands of practical problems is also in sight. With me to discuss, but not perhaps solve the problem of P versus NP, are Colver Roney-Dougall, reader in pure mathematics at the University of St Andrews, Timothy Gowers, Royal Society Research Professor in Mathematics at the University of Cambridge, and Leslie Anne Goldberg, Professor of Computer Science and Fellow of St. Edmund Hall, University of Oxford.

Starting point is 00:01:22 Colver Roney-Dougall. What was Alan Turing doing in 1936 and led him to this? Alan Turing was interested in trying to solve some problems in the foundations of mathematics and how that links in with this is he started thinking about how you could define whether or not it was possible to solve a problem and this led him at the bright young age of 22 to essentially invent our modern notion of a computer. It was a little while before computers were actually built but he defined an abstract machine that we now call a Turing machine and said that the properties it needed to have is that it should be able to read input

Starting point is 00:01:58 it should be able to write output and it should be able to decide what to do based only on what state it was in then and what symbol it was looking at. That ought to be enough to define what it's going to do at the next step. Now that model of a computer, the Turing machine... It just said...

Starting point is 00:02:14 Just an imaginary machine. Imaginary machine. He had just finished his undergraduate degree. He was looking around for problems to think about and he decided that he would start thinking about algorithms or methods in mathematics and for that he needed a machine to run his algebra. Now, one of the first things he realised given his imaginary machine, this is just a lemma in his

Starting point is 00:02:33 paper, was that one such machine can pretend to be any other machine. You can just give it the description of the other machine and it can simulate that machine. So that meant that we've only ever needed one mathematical model of a computer ever since, essentially. So when we start talking about the difficulty of solving problems, we're always, when it comes back to it, imagining one of Alan Turing's machines. You mentioned algorithms twice. You mentioned algorithms twice. What is an algorithm? One of the funny things is there's no good mathematical answer to this. So an algorithm is a collection of steps for solving a problem.

Starting point is 00:03:08 So, for example, a good recipe is an algorithm. If it's well written, it gives you a clear, unambiguous cooking something. Dice the onions, fry them for 10 minutes over medium heat. That's a set of instructions that's hopefully reasonably unambiguous. Instructions for how to build flatpack furniture. Why don't you not just call instructions then instead of an algorithm? The thing with an algorithm is it should be a sequence of instructions that might require you to go back to some previous step. You don't just work your way once through it.

Starting point is 00:03:37 You might need to go back to some earlier stage depending on what's happening. So it's a method is maybe one way of thinking about it rather than just a sequence of instructions. So for example, you learn how to add up numbers by adding up the digits and keeping track of the carries. That's an algorithm. it's going to work on any particular input. Why are you just doing maths? I mean, we did that at school, didn't we? Did carries and added numbers.

Starting point is 00:04:03 Exactly. So you were following an algorithm at exactly that point. Your teacher had taught you an algorithm for addition, involving knowing how to add single-digit numbers and remembering the carries, and you were following your teacher's instructions. I'm not trying to be prickly, but just finally, what is the advantage of calling it an algorithm, and why is algorithm so universally appropriate in the work you're doing? An advantage of calling it an algorithm rather than just a sequence of instructions

Starting point is 00:04:29 is the idea that within an algorithm you might want to loop. I mean, it's partly just a fancier word, right? But you might want to go back to the beginning and start again, depending on whether or not something's happened. So normally in a cooking recipe, if it turns out that you've burnt the onions, you might need to go back to the beginning and start again. Whereas with a computer algorithm,

Starting point is 00:04:49 I might be wanting to do something like find a number in a sorted list of numbers and the instruction might be look at the middle number is it bigger or smaller than the one you want if it's bigger than the one you want look at the middle of the next step down there's no it's a repeating thing there keep looking in the middle and going bigger or smaller depending on the answer you don't necessarily immediately know how many steps it's going to take it's not a word that you're using himself used i don't think so but i wouldn't swear to that maybe we get no nods around the table so we all don't think so one moment can i come back to you in a moment all right Tim, Tim Giles, to understand this P versus NP,

Starting point is 00:05:27 we need to understand two words, polynomial time and exponential time, the difference between them. Now, can you tell us about polynomial time and why that's important? That presumably is the P. It is indeed, yes, and NP, as it turns out, stands for non-deterministic polynomial. I'm not going to tell you why that's the case. When people actually started building computers, and in fact, they realised this before,

Starting point is 00:05:48 they came to realize the focus slightly changed from whether there is an algorithm or a systematic procedure for solving some problem to whether there is such an algorithm or systematic procedure that can solve the problem within a reasonable amount of time. So it turns out that there are quite a lot of problems for which it's very easy to write down an algorithm. The only drawback with the algorithm is that if you program your computer to run the algorithm, it would take a trillion years or something like that. And if you've got something like that, you might as well just not have an algorithm. So the focus shifted very much to whether you have practice. algorithms or not. And roughly speaking, P stands for the class of problems for which there is a

Starting point is 00:06:25 practical algorithm, an algorithm that can run in a reasonable time. It's not exactly the technical definition, but it's the main point. You could think of P as standing for practical instead of polynomial, and np is one, sorry, polynomial is practical, and then exponential is impractical. So with some algorithms, if you just, so a typical algorithm or a systematic procedure will take in some input, might be a couple of numbers that you want to multiply together or something like that, and the input can have different sizes. The numbers could have lots of digits

Starting point is 00:06:54 or not so many digits. And what you're really interested in is the length of time the algorithm takes and how that scales up as the size of the input increases. Let's deal with a polynomial, for a moment or two. This is the way you get results. You can put a problem in and you can get

Starting point is 00:07:12 result and you can check it in in a few seconds, nanoseconds, it is. So it's very useful. It is indeed. So an example of a polynomial time algorithm is long multiplication. If you take two numbers of 100 digits, then roughly speaking to multiply those two numbers together, the number of steps will be about 100 squared because each digit of one number has to be

Starting point is 00:07:34 multiplied by each digit of the other number. And because X squared is a polynomial, we say that it's, as I'm getting slightly more technical, we say that that would be a polynomial time algorithm. an exponential one would be something where as the input increases in size, the time it takes doubles, let's say, each time you increase the number of digits by one, if the time you take doubles, then very rapidly it's just going to blow up out of control. So when you've got a polynomial time algorithm, actually I could qualify this slightly, but broadly speaking,

Starting point is 00:08:08 if you have a polynomial time algorithm, it corresponds to something that you can actually program on your computer and get it to work in practice in a reasonable time. Is exponential always something that can be checked if it's solved, but it is very difficult sometime to solve, difficult sometimes in the sense of taking far too long, like you've mentioned, in some of the notes you mentioned billions of years to solve, but anyway, a long time.

Starting point is 00:08:29 Too long to live, too long to see your life out. Yeah, so exponential time algorithms will be ones that will take, the input size doesn't have to be very large before the algorithm would take billions of years. Checking, that's a slightly different thing. So that's where NP comes. into all this. There are certain problems which are search problems where you're trying to find something,

Starting point is 00:08:53 and you can think of it as a sort of a needle in a mathematical haystack. And sometimes it's easy to check an answer if someone tells you the answer. So somebody can tell you, they can say, I think this is the answer, and then you can go away and do a simple calculation and verify that it really is the answer. If that's the case, we say that the problem's in NP, but just because it's easy to verify that the answer is correct, that doesn't necessarily imply that it was easy to come up with the answer in the first place. So that's the distinction between P and NP.

Starting point is 00:09:23 P is things that are easy to do, full stop, and NP are things that are easy to check as long as somebody gives you a massive hint, basically. But if you can check it, does that mean you've also solved it? It does, once you've got the thing that you need to check, once somebody tells you the thing you need to check, and you go ahead and check it, you've solved the problem.

Starting point is 00:09:43 Well, how have they got there if you can't get there by... Well, usually, for many problems of interest, they just will not have managed to get there. We're sorting slightly in fantasy world here. So exponential explanations are defeating computers mostly. Yes, I think, broadly speaking, exponential means hopelessly and practical. Broadly hopeless, right. Leslie Goldberg, we're now moved to complexity theory

Starting point is 00:10:12 to work out which problems are quick out. Why is it so hard to distinguish between, let's call it, fast and slow? Okay, so I'm going to give you some examples to tell you about that. But the short answer is that most of the problems that we look at, the obvious algorithm is exponential. So there's exponentially many possibilities. It's just that for some problems, there's a clever way to narrow them down and find the right one. So what I'm going to do is I'm going to give you a few examples.

Starting point is 00:10:41 and the examples are all problems where the obvious algorithm is exponential. And the point of the examples will be to illustrate how subtle the difference is. And some of them, you know, they're these special clever algorithms and some there aren't. And the really hard thing is finding the difference. Okay, so let me start with a couple of problems that are actually easy. They're polynomial time, what Tim was talking about, polynomial practical good. Okay, so here's the first one. Let's say you have a group of N people

Starting point is 00:11:11 and you want to match them up into pairs. This is called the matching problem, okay? And you know compatibilities. So Colva Melvin are compatible, some people are incompatible. And all you want to do is take people and put them in teams to work together, but you want to know, can you do this in such a way that nobody's left out? Everybody's with somebody that's compatible. Okay.

Starting point is 00:11:32 If you just looked at how many different pairings there were, it's exponential. And just to give you a kind of scale of that, if you had even 100 people, the number of possibilities is more than the number of atoms in the universe. You just, you cannot look at them all. You can't pair up 100 people into compatible, into 50 compatible units without it taking more time than the universe. Yeah, sorry, I wasn't very clear. The point is that the number of different pairings that there are.

Starting point is 00:12:05 If you look at the number of possibilities, that's more than the number of atoms in the universe. Now, obviously, an exponential DOM algorithm might just consider every possibility and say, oh, is this possibility good? Are all the people that were paired compatible? No, let's try the next one. No, let's try the next one. That's ridiculous. But this is actually an easy problem because we do know a polynomial time algorithm. And in fact, we've even known it since the 1960s due to Jack Edmonds. So that's an easy problem. And now let me...

Starting point is 00:12:39 Just a second. Okay. First of all, it's going to take, it's very, very complicated. Or at least... And now it's very easy. Right. So let me try. So the point is the number of solutions is huge.

Starting point is 00:12:53 Yeah. So if you... We've all got that. Okay, we've all got that. So if you just blindly look at one solution, the next, the next, the next, the next. that would take forever. However, that's not a good idea. There are smarter ways to solve the problem.

Starting point is 00:13:08 And there is a clever algorithm which it would take more than 43 minutes to explain that does something else, okay, and manages to find a good pairing. It doesn't just look at every possibility. It cleverly constructs the best one. Right. Okay.

Starting point is 00:13:24 Now, this technique has applications in DNA sequencing in all sorts of areas of science, doesn't it? moves into that area. Ah. Can I defer the DNA sequencing? Because that's another... We've got to move on quite quickly, but if you can do that one,

Starting point is 00:13:39 it will be a help to show how it can penetrate what people will think of as serious signs and serious matters. Okay, so we're going to be discussing a class of problems called NP Complete problems, and that will be covering things such as the traveling salesman problem. We're coming to that, so this is based on what you've just said, yeah. Right, and DNA sequencing, will be an application of that.

Starting point is 00:14:03 And so it'll feed into that and all sorts of other work being done at the face of contemporary science. That's right. Okay, Colba, right. Now, we're talking about problems now. Are we where we should be in this conversation? Roughly. Right, let's talk about the factorization problem, Colbert, and over to you.

Starting point is 00:14:26 Right, so... What is it? Why is it important, and how can you sort of? solve it? What is it? Let's start with that. Let's begin there. So it's a lovely fact about whole numbers that's been known at least since the Greeks, if not before, that there's these special set of whole numbers called the prime numbers. And the thing with the prime numbers is that the only numbers that divide them are one and themselves. So two, for example, is prime because one divides it and two divides it, but nothing else does. 17 is prime because one divides it and

Starting point is 00:14:56 17 divides it, but nothing else does. But six isn't prime because two and three both divide it and they're not one or six. So it turns out that any whole number can be divided up into primes and that if I insist those primes come out in increasing order, then there's only one way to do it. So let me show you an example. If you think about, say, the number 12, or two divides it and two's prime. And once I've divided by two, I've got six left and two goes in again. So I've got two times two and then I've got three. And because I insisted on putting them in increasing order, there's only one way of breaking 12 down.

Starting point is 00:15:31 That's 2 times 2 times 3, and we're done. Now, for small numbers, it's quite easy to see how to factorise them into primes. You can probably do 12 in your head. And if I asked you a bigger number, like, I don't know, 143 or something, you would think about it for a bit, and then you might get to 11 times 13. But for big numbers, it's very, very hard.

Starting point is 00:15:51 There's a story of... What's a big number in your writing? Oh, 200 digits. Fine. There's a lovely story of, this is back in 1903, the mathematician Cole. There was a number that was believed to be prime, which was 2 to the 67 minus 1. And he walked into the lecture theatre, and without saying a word, he wrote 2 to the 67 minus 1. And then I'm not going to recite it because it's a 21 digit number, but he wrote out that 21 digit number.

Starting point is 00:16:19 And then he wrote equals. And then he wrote a 12 digit number. and a nine-digit number, and he proceeded to multiply them with long multiplication, as Leslie was talking about earlier. No questions. This was the end of that problem.

Starting point is 00:16:32 So that's what factorisation into primes is, and it's got this property that it's easy to check the answer. If I give you the answer, you can multiply them. Why is it important? Yeah, that's what I wanted to know. All modern cryptography is essentially based on the fact that if I multiply together two very big primes, that's easy to do.

Starting point is 00:16:52 If I just give you the answer, you can't find the two very big primes easily. So you can't go backwards in that sense? You can't go backwards. It's a one-way thing. Multiplication is easy. 3,000 billion centuries or something like that. We believe you can't go backwards. Let's be precise here.

Starting point is 00:17:05 We hope you can't go backwards. It takes an inordinate amount of time. You may get there in an in order to much faster. I mean, we don't know anything that's much faster than cleverly checking lots of possibilities at the moment, and that's what's slow. Okay. Tim McGarrows. Now, let's move on to NP-complete problems.

Starting point is 00:17:22 Right. So the NP problems are these ones where it's easy to check. So this factorisation is a very good example of an NP problem. If I actually, if so, suppose we've got a 200 digit number and the challenge is to find two 100 digit numbers that multiply together to give the 200 digit number, that is very, very hard, as Colver has just said. But if someone tells you two 100 digit numbers

Starting point is 00:17:46 and says, I think those might work, then it's much easier to check. You just go away and do a quick, I mean, a computer would need to do it for you. really because multiplying 100 digit numbers by hand is not so easy, but a computer can do it very easily. So that's an N.P problem. And a bit of a miracle

Starting point is 00:18:03 occurs, something that has no real right to be the case, which is that there are a lot of NP problems that turn out to be of equivalent difficulty in the following sense that if you've got a good method for solving one of them, then by a completely non-obvious process,

Starting point is 00:18:19 you can convert it into a good way of solving one of the other problems. The integer factorisation is not actually one of these NP-complete problems, but I'm sure we will at some stage discuss once at R. Can you just give us a hint now? It's too tantalizing to leave. Well, there's one which is the so-called travelling salesman problem, which we're coming to. Maybe I should just stop for a moment,

Starting point is 00:18:46 and we can have these examples, and we can talk about the sense in which they are NP-complete. but just the thing what I'm saying here is something that's really not obvious in the sense that if you look at you can get two problems that look completely different and it turns out that if you had a good way of solving one of them

Starting point is 00:19:05 then you could just use it for example to factorise integers when the problem itself looked as though it had absolutely nothing to do with integers there's a go-but give us the travelling salesman problem okay so the travelling salesman problem you've got a bunch of cities and you've got distances between the cities and now the question is you have some start city,

Starting point is 00:19:26 and what you have to do is work out the shortest path for starting at that city, visiting every other city exactly once, and coming back to start. So that is an NP-complete problem, or more formally the decision version of it is. But anyway, it's as hard as everything in NP. So the amazing thing is,

Starting point is 00:19:45 if you could solve that problem, you could solve factoring, as Kovah said, and as Tim was explaining, you could also solve every other problem in NP. And there are thousands. There are three thousands of them. So if you could find... Please, you're true for it. So if you could find a polynomial time algorithm for that, you would solve the whole of NP.

Starting point is 00:20:05 Now, people sitting around at the moment are saying, look, I'm going to start in London. I'm going to visit 10 different cities, and I'm not going to repeat any of them. I'm going to end up in London. It isn't all that hard. What makes it hard? You need the shortest route.

Starting point is 00:20:20 Ah, right. You didn't say that. Okay. Excellent. And here's the answer now to your DNA sequencing, is that if you could solve the traveling salesman problem, DNA sequencing is one of the applications of that. So, for example, I talk about cities, but you could say a city is a DNA fragment.

Starting point is 00:20:39 I talk about distance between the cities, but you could say the distance is some kind of similarity measure between DNA fragments. So if your goal was to take a bunch of fragments and put them in order, as biologists want to do, then you might well use an algorithm for the traveling salesman problem because you want to put all the fragments together and you want the big closest overlaps on each piece. I think I get, excuse me, I think I get that, but I still haven't got what makes it so very difficult.

Starting point is 00:21:11 Right. So what makes it so difficult, well, first of all, there are exponentially many possibilities. So a dumb algorithm certainly won't work. And the simple truth is that we don't know a clever algorithm. We don't even know whether one exists. So we don't know any... The number of possible routes between the cities is exponential. So if you just tried them, it would take billions of years to look at them all. And we don't know whether there's a clever way to find the shortest one without trying them all.

Starting point is 00:21:40 We just don't know. I think if you just think about sort of five cities or something, then it doesn't seem very hard. So maybe a way of making it sort of appreciating the difficulty is just to imagine you have two hundred cities and you want to find us more this route, then I think it becomes a bit more plausible that that would be a hard problem. The rule is you don't visit the same city twice and you end up where you started and there's a time limit.

Starting point is 00:22:01 You've got to just find out whether there is a way of doing it that's not too long. A distance limit. Maybe you can only travel at most 200 miles in total. The BBC has you an especially tight travel budget and you need to get round all of the capitals of all of the counties. How curious how this can, as you say, that goes into work on jeans. And I'd love to know more about this call.

Starting point is 00:22:26 But let's have another illustration. Because the interesting thing about this illustration is that they're very common or garden, aren't they? I mean, hundreds of travelling salesmen and hundreds of plays about travelling salesmen. And now we're going to talk about a wedding seating plan as being integral to this great mathematical problem. I mean, you think these two things are not supposed to go together. But there they are. There you are. Well, so yes, let me give you a very frivolous problem. You're hosting a wedding.

Starting point is 00:22:51 But hold on just a second, to be fair. It isn't frivolous if you get it right, because as Tim has said, if you get that right, all hundreds and hundreds of other things fall into place because by some terranious method, subterranean, it all clicks together. Absolutely. So this is dressed up to be a silly problem, but solving it would be just as important as solving travelling salesmen

Starting point is 00:23:13 because you'd have done the same thing. So I'd like you to imagine that your organisation, a very large wedding. What's large? Oh, let's say, 150 guests sitting down to dinner and that you've got a big, long table to sit them all down. And that you know that quite a few of your guests absolutely loathe each other.

Starting point is 00:23:32 They must not be sat next to each other. This is a typical wedding guest. There's various factions in the various families. And you're trying to come up with a seating plan such that nobody is sat next to anybody that they loathe, because that's going to make for a bad evening for everybody. This turns out to be another one of these problems that are very hard to solve but easy to check.

Starting point is 00:23:55 I mean, you check by looking to see whether anyone's sat next to anyone they hate. You just work your way around the table saying, no, that's okay, that's okay. I think they're on speaking terms, that one's fine. But if you imagine trying to do that from scratch, well, with 10 people, you'd be looking at about 900,000 different ways you could arrange them around 300,000 ways. You could arrange them around the table and the number gets bigger and bigger and bigger.

Starting point is 00:24:19 With 100 people, we're up to the more electrons than there are in the solar system until the sunburns out, working away, trying to solve the problem for you. Why is it so big these problems? I mean, I'm baffled that... Well, imagine you decide that you're going to be sat at the head of the table, and there's 100 people.

Starting point is 00:24:37 Well, there's 99 choices for the person on your left, and then there's 98 choices for the person on their left, and then there's 97 choices for the head. person on their left and you multiply all those numbers together and you get a massive, massive number. Now you can see that this problem's the same as Leslie's travelling salesman problem, because let's imagine that the cost of sitting two people next to each other that hate each other is a million pounds and the cost of sitting two people next to each other who get on is a penny, then working out if I could sit those people around the table such that nobody hates their neighbour

Starting point is 00:25:10 is exactly the same as saying, can I seat these people around the table at total cost? of a pound. Right. So if you could solve the travelling salesman problem, you could solve my problem. Can you take that on, Tim? I mean, as it's being said, it's clear,

Starting point is 00:25:25 I've no idea whether I remember it tomorrow at lunchtime, but still, can you take that on? That idea of these apparently ordinary problems which people get through every weekend seating people are so, when you think about it, so difficult.

Starting point is 00:25:42 And in that difficulty, the solving at that difficulty will open up a whole realm of solutions to problems which affect the deepest parts of experimental science. Yes, so there are a couple of things that it's important to understand. So one is that when we take one of these practical problems, the first thing you do as a mathematician is abstract out. You try and strip it of all its details, like what the wedding table looks like and that sort of thing,

Starting point is 00:26:05 and convert it into a purely abstract mathematical problem. So the mathematical problem in this case is you will have a network. The network consists of some nodes and maybe some links between some of the nodes. And these nodes and links can represent all sorts of things. So nodes could represent cities and links could represent roads between the cities. Or nodes could represent people and links could represent whether it's okay to put those two people next to each other at a wedding.

Starting point is 00:26:33 So once we've turned it into an abstract problem, we can then take it a little bit outside the realm of the practical. We can make these networks get larger and larger. and actually when we study it abstractly, we think we just got a network with N nodes where we think of N as just some very large number and then we're interested in how the difficulty of solving the abstract problem scales with N. So once you've sort of slightly left the real world behind,

Starting point is 00:26:57 I think it then becomes more plausible that these problems should be very hard. If you present it in the abstract form and look at it, there's just no reason to suppose that it would be an easy thing to do. And in many cases, as far as we know, it isn't an easy thing to do. Leslie, let's it go back. How final line is there between the problems that can be solved and those which become N-P-complete unsolvable?

Starting point is 00:27:21 Often, you only have to change a very small thing. So let me go back to the problem that I introduced first about taking N people and pairing them up into pairs. So the way I describe that to you, we have N people, we have compatibility. N meaning any number of people. Any number, so N is the number of people. And we have compatibilities between them. We know who loathes who.

Starting point is 00:27:42 And we want to pair them up so everybody gets matched to one other person with whom here she is compatible. That's in P. That's easy. Now, suppose I change it just slightly. And I say we've still got N people and we've still got compatibilities. But now what I want to do is split them into groups of three. So that within each group of three, there's complete compatibility. That's NP complete. So when we move from two to three, it goes from easy to NP.

Starting point is 00:28:10 completion. Why is that? Well, it simply is. What if a gang of people like each other? They all like each other. If they all like each other, if everybody likes each other, it's easy. And that's actually a really good point. So these problems, why they're hard, and maybe actually we should have explained this, why they're hard is because you have to solve every single input. So what we want is an algorithm, which, if you give me a and people, and you give me the set of compatibilities between them, which could be anything,

Starting point is 00:28:45 I've got to give you the pairing into threes. Now, that's hard. If you happen to give me a compatibility that says everybody's compatible, I'm going to have an easy job. I'm just going to say, fine, pair them up how you like. And so the hard thing is that the algorithm has to work in polynomial time, no matter what instance you give. Has to work quickly. Yeah, very practical. Yeah, I mean, coming back to the wedding table thing, If almost everybody got on with everybody and there was just a couple of known hatreds within your extended family, it would be quite easy. But if each of them hated roughly half of the other people, then it would become considerably more difficult.

Starting point is 00:29:22 So, Tim, are we talking here about these being examples of things not working out, which are sort of universally applicable to the way great problems in, or even greater problems in a way great problems in science, are not working out because of similar or other. I'm intrigued by what you said earlier in the programme, where you said, well, if we could solve that, we could solve thousands of other problems. The one is linked to the other. We don't know how. Well, we do know how the problems that are NP-complete are linked together. So I think maybe this doesn't have ramifications

Starting point is 00:30:03 for the difficulties that we're having when we do science as opposed to mathematics. So what we have here is a very precisely defined class of problems. called the NP problems. And inside though, there's another very precisely defined subclass of problems, the NP complete problems. And there's a well understood, actually, method of reducing any NP problem to any given NP complete problem, by which I mean,

Starting point is 00:30:26 if you can solve the NP complete problem, you can solve the NP problem. And it also is the case that a very large number of problems that actually come in practice turn out to be NP problems. So the ramifications if one could prove that P does equal NP, that would be saying that things that are easy to check are actually easy to find would be enormous because he'd suddenly be able to solve a vast number of problems that at present we think you can't solve. Actually, people, by and large, I think, don't think that's going to be the case.

Starting point is 00:30:56 But at least if we can't rule it out, there is still this tantalizing possibility that somebody could just come up with an algorithm for the travelling salesman problem that would just revolutionize the way we do find algorithms in general. So would you agree with that consensus, Colvin? Yeah, there was a poll done recently in 2012, and about 80% of the people who answered the poll reckoned that it was not the case that P's equal to NP,

Starting point is 00:31:20 reckon that P's not equal to MP. A smallish number think that they are equal. And some people consider the intriguing possibility that it might actually be independent. We might be able to set up different systems of mathematics in one of which P is equal to NP and another of which P is not equal to MP, but that's a definitely minority view.

Starting point is 00:31:38 Does that mean you've hit a brick wall? in this area then, Leslie? I wouldn't say a brick wall. I mean, there's a lot of progress that's made. It's just very slow progress. So what I'd like to do is maybe put it in perspective. So this problem, it's a mathematical problem. It's been around since about 1970.

Starting point is 00:31:54 Of course. In 1970, people probably thought it would be, you know, in 1971 after Cook's paper, people probably thought it'd be resolved in a year or two. However, if we look at the history of mathematics, some problems take a lot longer to solve. So, for example, if you look at Fermat's theorem, right? I mean, that was conjectured or, you know, by Fermat, the famous thing about the margin, that was 1637.

Starting point is 00:32:18 Okay. When was it solved by Andrew Wiles? 1994. So that's giving an example that sometimes in mathematics problems stay around for a very long time, but that doesn't mean they can't be solved. They might be solved. And indeed, there are lots of ideas in complexity theory. We learn incrementally small things. we do need new ideas, but there's no inherent reason to believe that we can't solve it.

Starting point is 00:32:43 And coming back to Colva's point about independence, some people have tried to actually prove, for example, that P versus NP is independent maybe of the axioms of natural numbers. And actually there was a lot of trouble with that kind of, that didn't sort of work out. There are kind of mathematical reasons why that seems not to be any easier than actually resolving the problem. We're talking about who you're talking about.

Starting point is 00:33:09 P and NP as if it's Australia and Britain. But actually there's gradations between, aren't there? So we haven't gone into that. There must be ways in which they are a bit closer, a bit closer, a bit closer, but not quite. And there's a cutoff point where P, you're looking at each other. I'm asking the right question. There's a very interesting examples.

Starting point is 00:33:29 So the factorisation problem is an interesting problem of something that's thought to be of intermediate difficulty. We don't know how to solve it. but if somebody could find a clever way of solving it, we wouldn't be able to translate that into an algorithm for solving the travelling salesman problem because it seems to be an easier problem. There's another problem actually,

Starting point is 00:33:46 this is a rather sort of timely discussion called the graph isomorphism problem. I won't say exactly what that is, but I'll just say it's a very important problem that was thought of as intermediate and there's a little rumour going around as of yesterday or so that somebody may have not completely solved it

Starting point is 00:34:03 but made an absolutely massive, advance in our understanding of this. So this is breaking news, is it? Yes, yesterday. Wait till next Tuesday and this person is giving a talk at the University of Chicago. And so what's the... Tell us about it. What's good about this? Well, the graph isomorphism problem, roughly speaking, is the following. You'll give them two networks, as I was talking about before, you have nodes and links. And you want to tell whether they're basically the same network. So whether if you can just rearrange the nodes on one side, you can produce a network that's identical to the one on the other side.

Starting point is 00:34:32 and for various reasons that doesn't seem to be NP complete, but it is NP. If someone gives you a candidate rearrangement of the nodes, you can easily check whether the two networks are the same. And what this person may or may not have done is a famous mathematician, and so the rumour seems quite reliable, is come up with an algorithm that is what they call quasi-polynomial time, which we could think of as being almost very efficient. And for many intents and purposes, it's actually almost as good.

Starting point is 00:35:01 time, don't you? It sounds very enjoyable, really. Colver, you want to... If it transpires, the possibility that P is found to equal NP after three or four hundred years, which, let us point in the right direction, these things take a bit of time.

Starting point is 00:35:21 And everything that can be checked out, can also be found. What are the practical applications? Implications, sorry. Implications. I'm going to hedge my bets slightly here. because it depends on how it's found that P is equal to NP, but let's imagine for the sake of mental argument

Starting point is 00:35:37 that we find a nice constructive way of showing that P is equal to MP and we find fast algorithms to solve all of these problems. I'll give you one good thing and one bad thing. So the one bad thing, we've already mentioned this, all internet security breaks instantly. Why? Because we can now factorise numbers that are products of two very big primes and that's the basis for all of our current cryptography.

Starting point is 00:36:00 so we would need a different way of exchanging information secretly. That's the bad thing. The good thing is that almost every single thing you buy becomes cheaper because some of the problems that are in NP that are difficult to deal with at the moment are scheduling tasks and factories, transporting goods between factories. So imagine, say, a mobile phone.

Starting point is 00:36:21 There's going to be raw materials brought in from all over the world. Doing that transportation becomes more efficient. The factory needs to maintain stock levels. Doing that becomes more efficient. chips need to be designed, taking up as little room as possible, as powerful as possible. That becomes more efficient. So at every single step of the production chain of a complicated object, we could be saving significant amounts of money.

Starting point is 00:36:42 So that would be the good trade-off for the fact that you could no longer shop on the internet so easily. Would you like to add to that, nothing? Sure, I could tell you a little bit more about the cryptography thing, because I think that perhaps people would like to know a little bit how that works. So, you know, what's the connection between that and factoring? and let me just give you a little bit more information about it. So when you buy something on Amazon or something like that, all of these applications use what's called public key cryptography.

Starting point is 00:37:09 So you wouldn't like it if you had to phone up Amazon and make a secret code and then send them your credit card. So what happens instead is that Amazon produces something called a public key and something called a private key. And your credit card gets encrypted with the public key, but they've got the private key and can decrypt it. There's lots of eavesdropping. So people do see the encrypted message.

Starting point is 00:37:31 But because we think you can't factor, they aren't able to do the decryption. So if we had P and NP different, factoring is easy. It means that anybody who eavesdrops would be able to get your credit card number. So that's, okay, so that's basically. But you don't see it as a real possibility,

Starting point is 00:37:49 due to that P will equals NP. I myself think it's incredibly unlikely. I just wanted actually to introduce one more application that is practical for the three of us, although maybe not necessarily for the... So one NP problem is, here is a mathematical problem you'd like to solve. Find a proof that is not too long and complicated.

Starting point is 00:38:14 Now, if somebody gives you a proof that's not too long and complicated, it is an easy matter to check whether it is a correct proof of the problem. So that means that the problem of finding proof, which is what we mathematicians do for us, living is in NP. So if P turned out to equal NP with the qualifications that Colver talked about, basically we'd be out of a job because computers could just very, very easily do all the work that we spent at the moment sweat away doing in our offices. So I'd like to point out that, yeah, I mean, if P was equal to NP and you could do so effectively,

Starting point is 00:38:46 you could claim all of the clay mathematics prizes because you'd be able to run your little bit of code and generate short proofs for all of the remaining ones. Donald Knuth, the most famous computer scientist arguably in the last few decades has said he thinks it's so unlikely that P's equal to NP that the first person who shows that he will award them the prize of one live turkey in addition to the million

Starting point is 00:39:07 dollars. One more thing is if it turned out that P was not equal to NP, that would be excellent for our understanding, but it would nearly be the first step because in fact what we do know is that if they turned out to be different, actually

Starting point is 00:39:23 we know that there's an infinite number of difficulties between them. And we haven't even gotten there, but there's actually loads of difficulties of problems above NP and lots of problems that are even harder that we want to understand. So if we resolved this and found that P is not equal to NB, that would be a good first step in our understanding of the difficulty of problems. What would computers have to do to resolve this problem? I mean, is there any... Just as Turing imagined the whole caboosh, and so others were working at that time too. We must be too chauvinistic about it, but we're using him as a clear...

Starting point is 00:39:55 There's another American chapels. A lot of a church. But never, he imagined it. Can you imagine, or do you imagine, a solution? Do you sit down and say, right, we can't get there? But I imagine it's this solution, just like he imagined this computer. I think...

Starting point is 00:40:11 We don't have some sort of model of an imaginary computer that solves NP problems with ease. I think, in a sense, from a mathematical point of view, whether or not you actually have instantiations of the computer concept as real computers is not so important, actually. So I don't think there's really an analogy to be drawn there exactly. Now we've got our model of computation and we've got computers. So people did briefly become optimistic that quantum computers might be able to solve quantum computers. So quantum computers are beginning to exist that they might be able to solve the NP-complete problems, but that's looking increasingly unlikely.

Starting point is 00:40:54 They can solve problems like integer factorisation, but as has already been mentioned, we don't think that's NP-complete. So it doesn't look like madame quantum spookiness is going to fix anything for us soon. I'm quite used to a world where you were mathematicians, and say, oh, we'll get there, we will solve these things. And now I'm confronted by a mountain of stuff that could be so much better if you could solve this, but not. Well, I mean, we may solve it, but probably we'll solve it by showing that P is not equal to NP,

Starting point is 00:41:26 rather than showing that they are the same. Let's confirm our existing belief that these problems are genuinely hard. Well, I think I've got, you've got through that. Thank you very much for getting through it and I'll take the egg and you'll take the ground. Thank you very much. Thank you to Leslie Ann Goldberg, Timothy Gowers and Culver, Rony Dougal. Next week we'll be talking about the great sea battle of Lepanto,

Starting point is 00:41:54 1571 between the Ottomans and the Holy League. And thank you for listening. And the In Our Time podcast gets some extra time now with a few minutes of bonus material from Melvin and his guests. The word algorithm. I got completely frozen on that. I'll tell you where it comes from, but I actually can't pronounce it. Well, it comes from, it comes from, Gorizmi, I know that bit.

Starting point is 00:42:15 And the reason is because he was... It's his surname. It's his surname. It's his surname. Al-Gorismi. But why he was named after him is he was doing equation solving and he was giving step-by-step instructions for solving these sets of equations. And so that's why algorithm is named after his name.

Starting point is 00:42:32 That would have been nice, wouldn't it? That word does go back. Yeah, but I'm not sure if Turing would actually. We've used it. Actually, I know one more thing that they might like to know. And that is that we've talked about solving NP complete problems, as in solving them perfectly. But there's a whole field of study about how difficult they are to approximately solve.

Starting point is 00:42:52 So instead of getting the traveling salesman solution optimally, instead getting within some fraction, and there's lots of results about that. And that's a whole other field. Sorry. No, go to. Another sort of interesting aspect of this is that unlike with many mass, problems. The P versus NP problem is one where people have a very precise idea of why it is

Starting point is 00:43:14 that we are finding it so hard to decide whether P doesn't equal NP. So some mathematicians have actually produced a very precise paper. It's called natural proofs in which they've shown, roughly speaking, that any proof that P doesn't equal NP would have to be incredibly strange. And this is a, you can give precise meaning to the words incredibly strange here. So we sort of know it deserves its million-dollar price tax somehow. But mathematics does move into different dimensions of thought now and then, doesn't it? Yes, sometimes somebody will come up with an off-the-wall sort of idea that nobody expected. It really changes the way we do.

Starting point is 00:43:51 This whole thing came out of one of those, because this whole thing came out of girdle showing that Hilbert's problem wasn't solvable. And this big 20th century shift in mathematics to realize that it's not just the case that all problems have solutions, some problems can't be dealt with within your current system of axioms. That was girdle working and then Turing picked up on it for his Turing machines because he was trying to go from proofs to algorithms and processes. So it was one of those big paradigm shifts that set this whole process off and it might well be that we need another big paradigm shift to bring it to completion.

Starting point is 00:44:23 There have been some attempts. Perhaps a few years ago a very exciting thing was this thing called geometric complexity theory where people trying to reduce this problem to algebraic geometry. And I guess the consensus is that, that it hasn't worked at least yet. But there's lots of different kind of angles on it. The natural proof is about circuit complexity, and then there's some kind of people doing proof complexity.

Starting point is 00:44:47 There are lots of different attacks. So there's no sense in which you're saying, look, we can't solve this. We're going to give up trying to solve it. No. I think actually some of the experts, they do say, well, I can't afford to spend too much of my time on this problem because it's such a hard problem,

Starting point is 00:45:03 and I don't want to end up with no publications. So I think most theoretical computer scientists sort of ration the time they spend on the problem and spend some of their time working on slightly Yes, I'm sort of. By the time I finish the novel

Starting point is 00:45:19 instead of sitting trying to write the greatest novel ever written and never write anything. Exactly. It's a little bit like that. Oh, yes, Simon. I think a double brandy and something. Tea, tea will be fine. There are many more.

Starting point is 00:45:35 science and discussion programs from Radio 4 to download for free. Find these on the website at BBC.co.com.uk slash radio 4.

In Our Time - P v NP

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.