The Science of Everything Podcast - Episode 100: Unsolved Problems in Science

Starting point is 00:00:34 You're listening to The Science of Everything Podcast, episode 100, Unsolved Problems in Science. I'm your host, James Fodor. So this is obviously a very important milestone in the history of the Science of Everything podcast, 100 episodes. And I wanted to use this opportunity to announce some very exciting and important changes and new developments in the podcast, which I hope will really get you guys excited and basically be a way to help produce more content and ensure the future of the podcast. But before we get to that, and I'll talk a bit more about that at the end of the show, I wanted to make this a special episode for you guys, something a little bit different and hopefully

Starting point is 00:01:17 exciting. After a lot of thought, I decided that in basically all of my other episodes, I talk about things that we know, things that we can say on the basis of scientific findings. And to spice things up a bit, I would instead use this episode to talk about some things that we don't know, or specifically unsolved problems in science, outstanding questions or ongoing debates that haven't yet been resolved. So these are, if you like, some of the mysteries of science. But of course, it's more than just an issue of saying, well, here's something we don't know, isn't that mysterious, but there's a lot to unpack about the nature of the problem, what can be

Starting point is 00:01:52 said about it, what the existing theories are, and how scientists have tried to go about solving it, because really that's the key characteristic of science: it's not a set of facts that you memorize. It's a process for learning and understanding and discovering new knowledge and improving our understanding of how the world works. So that's what I want to showcase in this episode, and I'll do that in the guise of talking about six important unsolved problems in science, and I've picked one from some of the main disciplines in science. So we've got, first, from computer science, the problem of P versus NP. Second, from physics, dark matter. Third, from chemistry, the island of stability.

Starting point is 00:02:36 Fourth, from earth science, the snowball earth hypothesis. Fifth, from biochemistry, the protein folding problem. And sixth from biology, the Cambrian explosion. But you may have heard of some of these. You may not have heard of others. Perhaps you haven't heard of any of them. Don't worry, I'm going to explain them. And hopefully you'll find this interesting and enjoyable.

Starting point is 00:02:54 So strap in, we've got a lot to get through, and some of these ideas, obviously, because they're at the cutting edge of science, might be a bit more difficult. So don't worry if you don't quite get the details of everything. I want to give you an introduction and a kind of an overview, so you have some idea about what these problems are about and how we can go about trying to resolve them. Also, many of the previous episodes of the podcast have relevance to understanding these, but I'm not going to give specific prerequisites for this, because I want it to be kind of accessible to everyone, and there would be too many specific episodes to mention anyway. So don't worry too much about that. I'll try to give the requisite background when I can. So all that being said, let's make a start and start with our first problem from computer science, the problem of P versus NP. So this problem is a major unsolved problem in computer science and mathematics. Indeed, it's probably

Starting point is 00:03:43 the single biggest outstanding problem in computer science. This problem is a bit technical, so I'll do my best to explain what it means. So first we have to understand what a decision problem is. A decision problem is just a problem, a formal theoretical problem that can be posed as a yes or no question. There are many of these sorts of problems, and computer science is in the business of trying to answer these problems. A simple example of a decision problem would be to take an example like integer factorization. An integer is just a number, like a whole number, like 1,073, and factorizing a number is decomposing that number into two numbers, well, like two or more, but we'll just think about two numbers that multiply together to give the first number.

Starting point is 00:04:25 So, for example, if I take 10, the factorization of 10 is 5 and 2, because 2 times 5 is equal to 10. Factorization is very easy to do for small numbers, but as the numbers get bigger and bigger, it's much harder to do. So the point is, I can pose a yes or no question. I can start with a number, like 1,074, whatever the example was, and then I can give a pair of two numbers and ask, Is this a factorization of 1,074? And the answer will be yes or no. So that's a decision problem. Another example of a decision problem would be,

Starting point is 00:04:56 take this set of integers, is there a subset of these integers that sums to zero? That's called the subset sum problem. Or give me a big long set of logical expressions. Is there an assignment of true or false values to the primitives in this set of logical expressions that satisfies all the expressions without giving rise to any contradictions?
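To make the idea of a decision problem concrete, here is a minimal sketch of the subset sum question written as a yes or no function. The function name and the example lists are illustrative assumptions, not something from the episode, and the brute-force search shown is only one naive way to answer the question.

```python
from itertools import combinations


def has_zero_subset(numbers: list[int]) -> bool:
    """Subset sum as a decision problem: is there a non-empty subset
    of `numbers` that sums to zero? The answer is just True or False.

    This brute-force version tries every non-empty subset, so the
    number of subsets it may examine doubles with each extra number.
    """
    for size in range(1, len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == 0:
                return True
    return False


print(has_zero_subset([3, -7, 4, 9]))  # True, because 3 + (-7) + 4 == 0
print(has_zero_subset([1, 2, 5]))      # False, all numbers are positive
```

Note that the output is only ever yes or no; the function does not have to report which subset it found.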

Starting point is 00:05:16 That's called the Boolean satisfiability problem. And there are many, many types of these decision problems in computer science or mathematics. So that's what we're talking about here. P and NP refer to types or classes of problems. They're complexity classes because they relate to how difficult the problems are to solve. Now, when describing how difficult different decision problems are to solve, there's a common kind of language or notation that is used. And to understand that, we need to give a representation

Starting point is 00:05:46 of how large the input is. So when we talk about the input, it could be, say, the number of logical expressions that I have for Boolean satisfiability, or it could be the size of my integer, if I'm doing factorization, or it could be, if it's a graph problem, the number of vertices in the graph,

Starting point is 00:06:03 if it's some sort of operation on a matrix, it could be the number of elements in a matrix, and so on. Again, don't worry if you don't know what a graph or a matrix is; they're just different mathematical objects. The point is we can specify the size of the input by n, where n could be the number of elements in a list or the number of vertices in a graph and so forth. So whatever the specific object is that we're dealing with, we can find a way of specifying its size.

Starting point is 00:06:26 And obviously, the bigger N gets, the bigger the matrix is or the list is or whatever, the harder the problem is going to be to solve on average. And so we talk about the size of the input as N, lower case N, and then we ask, how difficult is the problem going to be to solve in terms of that N? And usually we think of it as a function of N. So when we talk about the different complexity classes, there are two important classes that we're going to focus on here, P and NP, unsurprisingly, because that's what the problem is named after, these two classes. And we can think of them loosely as P being the easy problems and NP being the hard problems. Now, of course, it's more complicated than that, but we need to understand it from a basic point of view first.

Starting point is 00:07:05 So P stands for polynomial. A polynomial is just a type of function. I won't describe exactly what it is, but it's functions like, you know, x squared or x cubed or x to the four, or something like that. You take X and you put it to the power of some integer, and you can add or subtract constants or whatever, but that's what a polynomial is, right? You would have seen them in high school maths. So the point is, polynomial functions grow relatively slowly compared to other types of functions, like an exponential function. An exponential function describes how rapidly interest accrues on the loan you've taken out from the bank, and as you probably know, interest grows quite

Starting point is 00:07:39 quickly, especially if you leave it for long enough. So exponential functions grow very, very quickly. So, polynomial functions grow slower than exponential functions, at least if N gets big enough. So, again, the point here is P stands for polynomial, and kind of think of it as slow-growing functions, which means that as N gets bigger, the problems don't get too hard to solve. I mean, they may still be kind of hard to solve, but they're relatively easy to solve compared to many other types of problems that get exponentially harder as N gets bigger. So P, polynomial: easy problems, easy decision problems to solve. Now the other class, the complexity class called NP, that stands for non-deterministic polynomial.
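As a rough illustration of "polynomial grows slower than exponential, at least if N gets big enough", the sketch below compares n cubed with 2 to the power n. The choice of these two particular functions, and the function names, are assumptions made for the example only.

```python
def polynomial_steps(n: int) -> int:
    """A polynomial cost: n cubed."""
    return n ** 3


def exponential_steps(n: int) -> int:
    """An exponential cost: 2 to the power n."""
    return 2 ** n


# For small n the polynomial can actually be larger, but once n is
# big enough the exponential overtakes it and never looks back.
for n in (5, 10, 20, 40, 80):
    print(n, polynomial_steps(n), exponential_steps(n))
```

At n = 5 the polynomial is still the larger of the two (125 against 32), which is why the "if N gets big enough" qualifier matters.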

Starting point is 00:08:18 Importantly, it doesn't stand for not polynomial. That wouldn't make any sense. It stands for non-deterministic. But here we can just think of it as the hard problems. Now, to explain what non-deterministic means: it's not exactly hard problems, but what it means is that for problems in the NP class, the solution to these problems can be checked easily, or in polynomial time. That's what non-deterministic polynomial means. It means that basically if you somehow had a machine that could just guess the right answer, then it could check that answer in polynomial time. That might seem a strange way of phrasing it. That's what the non-deterministic means. It's like if it can randomly just somehow pick the right answer, it could check it quickly. The point, though, is not that you have a

Starting point is 00:08:58 machine that can just know the right answer, but the point is just how quick the checking is. If the checking can be done pretty quickly, in polynomial time, then that problem is in the NP class. So integer factorization is a good example of something that's obviously in the class NP, because if someone gives you two numbers, you can easily just multiply them together. Multiplication is a pretty easy operation to perform. We can do it pretty quickly. And so then you can just check, do these two numbers multiply and give the answer that they're supposed to? If yes, then you've found the solution. Now, there are some types of problems for which the answer cannot be verified in polynomial time. There aren't any trivial examples of this,

Starting point is 00:09:37 because the things you learn in high school mathematics, well, they're the easy things. But there are some types of operations. For example, there's something called a matrix permanent, which is an operation performed on a matrix, which is kind of like the matrix determinant, but a little bit different, not exactly the same. Don't worry if you don't know what a matrix or a matrix determinant is. It's just an example of a mathematical operation that can be performed,

Starting point is 00:10:01 and a matrix determinant can actually be easily calculated, but a matrix permanent, it turns out, is very difficult. And the important point I want to make here is that a matrix permanent, not only is it hard to calculate, but you can't even really check if the answer is correct, because unlike, say, factorization, the checking is kind of just as hard as the finding in the first place. Maybe not just as hard, but at least it's very hard. And so finding matrix permanents is believed to be, this hasn't been proved, but it's believed to be even harder than NP. So it's believed to not be in the NP class, meaning you can't even check it very quickly for the correct solution.
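For readers who want to see how close the two operations are, here is a small sketch that computes both from the same sum over permutations. The function names are illustrative assumptions. Both versions below deliberately use the naive sum, whose number of terms is n factorial; the difference is that the determinant also has fast methods (such as Gaussian elimination, not shown here), whereas no comparably fast method is known for the permanent.

```python
from itertools import permutations
from math import prod


def permutation_sign(p: tuple[int, ...]) -> int:
    """+1 for an even permutation, -1 for an odd one (counted by inversions)."""
    inversions = sum(
        1
        for i in range(len(p))
        for j in range(i + 1, len(p))
        if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def permanent(matrix: list[list[int]]) -> int:
    """Sum over all permutations of the product of the chosen entries."""
    n = len(matrix)
    return sum(
        prod(matrix[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )


def determinant(matrix: list[list[int]]) -> int:
    """Same sum as the permanent, except each term carries a sign."""
    n = len(matrix)
    return sum(
        permutation_sign(p) * prod(matrix[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )


m = [[1, 2], [3, 4]]
print(permanent(m))    # 1*4 + 2*3 = 10
print(determinant(m))  # 1*4 - 2*3 = -2
```

The only difference between the two functions is the sign attached to each term, yet that difference is what makes the determinant tractable and the permanent apparently not.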

Starting point is 00:10:34 If you know the correct solution, you can't even check that quickly. There are other things as well, like the halting problem, which you may have heard of, that are not in NP. But, again, don't worry about the details there. The point I'm trying to make is that some problems are easy, and we can find the solution, and obviously then check it in polynomial time, so relatively quickly. Some things are harder, and we can't find the solution quickly, but we can at least check the solution fairly quickly if we have one, and those are things that are in NP.

Starting point is 00:11:01 And then there are some things that are not even in NP, because they're so hard that we can't even check the solution quickly if we had it. It takes a long time to even check it. So we're not going to worry about that last type of problem, the ones that aren't even in NP. I just wanted to mention it so that we can understand the importance of P and NP, because not everything's in those, but let's ignore the really hard stuff for the moment and just focus on P and NP. So again, P, things we can find the solution to quickly; NP, things you can't necessarily find the solution to quickly, but you can at least check the solution quickly, or again, in polynomial time.
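The gap between checking and finding can be sketched with the factorization example from earlier. The function names are assumptions for illustration, and trial division is just the simplest possible search, not the best known factoring method.

```python
def check_factors(n: int, a: int, b: int) -> bool:
    """Checking: one multiplication tells us whether (a, b) factorizes n."""
    return 1 < a < n and 1 < b < n and a * b == n


def find_factors(n: int):
    """Finding: naive trial division, trying 2, 3, 4, ... in turn.

    The number of candidates grows with the square root of n, which is
    exponential in the number of digits of n.
    Returns a pair (a, b) with a * b == n, or None if n has no such pair.
    """
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None


print(check_factors(1073, 29, 37))  # True: 29 * 37 == 1073
print(find_factors(1073))           # (29, 37), found after trying 2..29
print(find_factors(13))             # None, 13 is prime
```

Checking costs one multiplication however the pair was obtained, while the search has to work through candidates one at a time.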

Starting point is 00:11:33 So hopefully from what I've said so far, you should be able to gather that if something's in P, it must also be in NP. Because remember, if a problem is in P, that means there's a polynomial time algorithm for solving it. And obviously, if there's a polynomial time algorithm to solve it, then there must be a polynomial time algorithm to check it, because checking is no harder than solving, right? So obviously everything in P is also in NP. But are there things in NP that aren't in P? Well, there is another important subset of NP.

Starting point is 00:12:02 So if you picture the whole of NP as like a circle, and kind of hanging around at the bottom of the circle is P, the easy stuff, then hanging around at the top of the circle is stuff called NP complete. They're still in NP, so that means there is a way to quickly check, a polynomial time algorithm to quickly check a solution, but they're the hardest things that are in NP. Specifically what that means is there are ways of showing that all the other problems in NP can be reduced in polynomial time

Starting point is 00:12:29 to things in NP complete. And what that means just in layman's terms is that you can prove that if you can solve any one problem that's NP complete, you can actually solve all of the other problems in NP. That's what reduced to means. You can solve the first problem in terms of the thing you reduce it to. Again, just to make things concrete, when I talk about problems, I'm talking about formal decision problems that can be expressed in mathematical computer science terms. So some examples of known NP complete problems are, as I mentioned before, Boolean satisfiability: given a set of logical expressions, can I assign true false values to the primitives so that they satisfy all of these expressions? Another one is the graph coloring problem. Is there a way of coloring

Starting point is 00:13:10 all of the vertices of a graph, using a given number of colors, such that no two adjacent vertices are of the same color? This is effectively what happens when you're coloring the countries on a map and you want to color them so that neighboring countries are different colors. That's the graph coloring problem. Another one is the traveling salesman problem. If I give you a list of cities and the distances between them, is there a route shorter than some given length that visits each city once and then returns to the original city? So these are examples of decision problems that are known to be NP-complete, which means that they're the hardest problems in NP. And if you can solve any of these problems, you can use that to solve any other problem in NP.
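Graph coloring shows the same "easy to check, hard to find" pattern. In the sketch below, the graph is represented as a list of edges and a coloring as a dictionary from vertex to color; those representations, the function names, and the tiny triangle example are all assumptions chosen for illustration.

```python
from itertools import product


def is_valid_coloring(edges, coloring) -> bool:
    """Checking: look at each edge once and compare the two endpoint colors."""
    return all(coloring[u] != coloring[v] for u, v in edges)


def find_coloring(vertices, edges, colors):
    """Finding: brute force over every possible assignment of colors.

    With k colors and n vertices there are k ** n assignments to try.
    Returns a valid coloring as a dict, or None if none exists.
    """
    for assignment in product(colors, repeat=len(vertices)):
        coloring = dict(zip(vertices, assignment))
        if is_valid_coloring(edges, coloring):
            return coloring
    return None


# Three mutually neighboring countries: a triangle.
vertices = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C"), ("A", "C")]

print(find_coloring(vertices, edges, ["red", "green"]))          # None
print(find_coloring(vertices, edges, ["red", "green", "blue"]))  # a valid 3-coloring
```

The checker visits each edge once, while the search space for the finder multiplies by the number of colors for every extra vertex.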

Starting point is 00:13:45 Okay, so we've got our P, which is kind of the easy stuff, and NP complete, which is the hardest stuff that's still in NP. Remember there's things outside of NP, but we're not worried about that. Within NP, NP complete is the hardest stuff. Is there anything in between? Is there anything that's intermediate between the NP-complete, the hard stuff, and the P, the easy stuff? Well, this would be NP-intermediate, things that are not NP-complete because they're not quite that hard, but also not P, because they're harder than that. This is sort of in the middle of our circle, if you like.

Starting point is 00:14:15 Not NP-complete, but not P either. Turns out, and this is the real crux of it, that no one knows if there are any NP-intermediate problems. There are some problems that are suspected to be NP-intermediate, including integer factorization, because integer factorization is not known to be NP complete, but there's also no known polynomial time algorithm to factorize an integer. So therefore, we don't know that it's NP complete, and we don't know that it's in P, so people think that it's probably NP intermediate, because we do know that it's in class NP, but it's not been proven to be in either the top bit or the bottom bit. And there are other things as well, like graph isomorphism, are two graphs the same.

Starting point is 00:14:53 That's also thought to be NP intermediate. But no one's been able to prove this. And this is, and finally we get to the point of it, this is the crux of what the P versus NP problem is. The P versus NP hypothesis just says something very simple. P equals NP. Let me say that again, P equals NP, meaning that all of the problems in NP

Starting point is 00:15:17 can actually be solved with a polynomial time algorithm, which means effectively that all of these problems, even if they seemed hard are actually kind of easy, because there's a polynomial time, relatively quick algorithm, to solve them. Now, this might sound crazy. This is like saying that if I have a way to quickly check the answer to a problem,

Starting point is 00:15:36 like integer factorization, for example, that's easy to check a solution, then there must also be a fairly quick way to find that answer to the problem. And that just seems nuts, because in everyday life, there are so many examples of things that it's easy to check, but hard to find.

Starting point is 00:15:49 But that's what P equals NP says, that these things are, in fact, the same, that all problems that fall inside the NP circle have a polynomial time algorithm. Now, why would anyone think this? As I said, this seems totally nuts. Well, as I said, the reason some people think that it at least might be true or could be true is because there are these problems known to be in NP, so we know we can quickly check, like integer factorization, for example, but we haven't proven them to be NP-complete, and we also haven't proven them to have a polynomial time solution.

Starting point is 00:16:20 So they might be in between. And that means that if there's something that's not NP complete and also has no polynomial time solution, then P does not equal NP. Obviously, because if P equals NP, everything that's in NP, the whole circle, has to also be in the little bit at the bottom. That just means that there's a polynomial time solution to everything. So if you could prove something was in NP but not in P, that is, if you could prove that there's something that's NP intermediate, you'd prove that P is not equal to NP. And no one's actually been able to prove that yet. So therefore it remains an open question. Does P equal NP or does it not?

Starting point is 00:16:52 To prove that P was equal to NP, what you would have to do is take one of these NP complete problems. Remember, that's the problems that are in NP, but kind of at the top bit, the hardest problems that are still in NP. You'd have to take one of these and find a polynomial time solution to it, a polynomial time algorithm that solves that problem. Obviously, this would prove that P is equal to NP, because remember, if you can solve one NP complete problem in polynomial time, you can solve any NP problem in polynomial time, because all of the others reduce to that. NP complete are the hardest ones. If you can solve the very hardest ones, you can solve any of them. So to prove that P equals NP, you'd have to find a polynomial time solution to an NP complete problem. And no one's been able to do that yet. So no one's been able to prove that P equals NP. But conversely, to prove that P does not equal NP, you'd have to prove that some problem in NP has no polynomial time solution, for example by finding an NP intermediate problem.

Starting point is 00:17:41 That is, you'd have to prove that something did not have a polynomial time solution, but also was not NP complete. And no one's been able to do that either. So at this stage, it's not known whether P equals NP. It's not known whether being able to easily verify the solution to a problem means that you can also easily find the solution to that problem. Most computer scientists and mathematicians, as far as I can tell, think that P does not equal NP. But again, no one's been able to prove that, and so until the proof is forthcoming, it's still an open question. Now, you might be thinking, well, what does any of this matter? This all seems very abstract. But in a sense, it's actually not that abstract. I just gave a fairly concrete example of integer

Starting point is 00:18:17 factorization. That's something that you would have seen before, and I think it's a good example to really get to grips with the problem. If someone could prove that P was equal to NP, it would mean that there's a polynomial time, that is, a relatively quick, algorithm for factorizing an integer. And if that was true, it actually has huge implications for things that we rely on all the time, namely public key cryptography. Now this is a technology, particularly RSA cryptography as it's called, that's used widely over the internet to send financial information and, well, all sorts of information really. And basically, without going into the details, it involves being able to take advantage of the fact that some things are really hard to calculate the answer to, but easy to check. So what happens is, effectively, you can send information across the internet by sending one number, and then the other person has another number, and they multiply them together, and they should give, well, some integer, right? And you know the answer you're expecting.

Starting point is 00:19:13 And so if the person sends you the wrong number, you can easily check it, or if they send you the right number, you can easily check that, and you can verify that they are who they say they are, loosely speaking. That's only secure because basically the person at the other end can send a really long number, and the only way to know what the factorization of that is is either to know your secret number

Starting point is 00:19:32 or to have a fast way of factorizing really long integers. Now, no one has that because there's no known method to do it, and therefore this method is regarded as quite safe, because unless they know what your secret number is, there's no way for them to decompose that big, long integer into its factor components. So only you can do that because you're the one who has that number, and that's sort of stored on your computer.
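For anyone curious how the dependence on factoring plays out, here is a toy sketch of textbook RSA. It is not the simplified picture described above, and it is certainly not how real systems should be implemented: the primes 61 and 53, the exponent 17, and all function names are assumptions chosen so the numbers stay small. It relies on Python 3.8 or later for `pow(e, -1, phi)`, which computes a modular inverse.

```python
# Toy key generation with deliberately tiny primes (insecure by design).
p, q = 61, 53
n = p * q                  # public modulus, 3233
phi = (p - 1) * (q - 1)    # only computable if you know p and q
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent: inverse of e modulo phi


def encrypt(message: int) -> int:
    """Anyone can do this using only the public numbers (e, n)."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private exponent d can do this directly."""
    return pow(ciphertext, d, n)


def crack_private_exponent(modulus: int, public_exponent: int) -> int:
    """What an attacker would do: factor the modulus by trial division,
    then rebuild the private exponent. Feasible here only because the
    modulus is tiny; real moduli have hundreds of digits."""
    factor = 2
    while modulus % factor != 0:
        factor += 1
    other = modulus // factor
    return pow(public_exponent, -1, (factor - 1) * (other - 1))


message = 65
ciphertext = encrypt(message)
print(decrypt(ciphertext) == message)        # True: the round trip works
print(crack_private_exponent(n, e) == d)     # True: factoring n reveals d
```

The last line is the whole point: the secret is safe only as long as factoring the public modulus is impractical.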

Starting point is 00:19:53 Again, that's a really simplified version of how it works. The important point is simply to understand that public key cryptography relies on the fact that some things are easy to check, but hard to solve. And this is precisely what the P versus NP problem is about. Before I leave this problem, I should emphasize that I've been talking a lot about things that are easy to solve, that are in P, where there's a polynomial time algorithm, and things that are hard to solve, which are NP complete, or things that are in NP but not in P. Strictly speaking, that's not quite correct, because even if we found a polynomial time algorithm for something, it might be a very high order polynomial, like N to the power of 100 or something, which would still get really, really hard to solve if N gets even slightly large. However, it would still be a lot easier to solve than an exponentially increasing function.

Starting point is 00:20:39 So it's in that sense that it would be easy. It would be a lot easier than many types of functions or many ways that difficulty could increase. But it still might not be easy enough to solve for practical purposes, and no one knows that. So public key cryptography might still be safe in practice, even if in theory there's a polynomial time way of, say, factorizing integers. So the implications of P equals NP are unclear, but most people seem to think that if there was a way of showing that P is equal to NP, that would be a huge result and a positive result in some sense, because it would prove at least that in theory it would be easier to solve some problems than we thought. But most mathematicians and computer scientists think that probably P does not equal NP, and sooner or later someone's going to be able to prove that there's something in that middle space between NP complete and P, in that NP intermediate space, and that would prove that P does not equal NP.

Starting point is 00:21:29 But until that proof is forthcoming, the question of whether P equals NP remains an open problem in computer science. Let's move on now to talk about a major outstanding problem in physics, which you may have heard of before called dark matter. So dark matter is a form of matter that's thought to account for about 85% of all the matter in the universe. 85%, you say that sounds like rather a lot. Well, it is rather a lot. What is dark matter? Why haven't I heard of it? or why don't I see it if it's 85% of all the matter that's out there?

Starting point is 00:22:01 Well, good question. And the answer is because no one knows what dark matter is. We can't see dark matter. We can't interact with it in any known way. We don't know where it came from or how it got to where it is. So if that's the case, you might be wondering, well, then how do we know it exists at all? Well, the answer is because it has gravitational effects, and it's mostly these gravitational effects that we've been able to observe

Starting point is 00:22:25 through a number of observations that have been made in recent decades, and I'll go through some of these now. So at this point, there is very strong evidence that dark matter exists. That is that there is lots of this matter out there that we can't see. That's what dark means. It just means we can't see it or detect it or observe it in any direct way. So we can see stars, for example, because they emit light. We can observe nebulae, which are gas clouds.

Starting point is 00:22:47 Some nebulae emit light, others don't, but they'll block light from things that are behind them because they still interact with electromagnetic radiation. Dark matter doesn't interact with electromagnetic radiation. and so we can't detect it with light in any way, but it does interact with gravity, and so we can observe its effect on gravitationally bound structures, and so this is where the evidence comes from.

Starting point is 00:23:07 So I'll just go through the main lines of evidence for dark matter. So one is the rate of rotation of the spiral arms of a galaxy. So you may be aware that many galaxies are spiral galaxies, so they have the kind of a globular core, and then these arms that rotate around. I'm sure you've seen a picture of that, so I won't try to describe it any better. The point is that the stars in those arms are rotating around the center of the galaxy,

Starting point is 00:23:31 kind of like the planets in our solar system rotate around the Sun. And what we would expect to see, based on the observable mass that we can see, like the stars and the gas, in a galaxy, we would expect to see that the rotation velocity, the rate at which the stars are orbiting the center, decreases, specifically in accordance with Kepler's third law, which describes the orbit of the planets about the Sun, but we don't need that level of detail. The point is that the rotation velocity should decrease with distance from the center. But we don't actually see that.

Starting point is 00:23:59 Instead, the galaxy rotation curve is flat as the distance increases. So, for stars that are close to the center of a galaxy, the orbital velocity should be much faster than for stars that are very far out, that are just on the edge of the galaxy. But that's not what we see. They go around at about the same rate. That's what we mean when we say the rotation curve is flat. Now, that's very strange because it means there must be much more mass around the edges, around the arms of a spiral galaxy, not like the solar system where most of the mass is concentrated around the centre, where the sun is. But we don't see all that mass. Most of the mass we see is in the stars and the gas,
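The expectation being described can be put into numbers with the standard Newtonian result for a circular orbit around a central mass, v = sqrt(G M / r). The sketch below is only an illustration under that assumption; the function names and the Sun and Earth figures used as a sanity check are not from the episode.

```python
from math import sqrt

G = 6.674e-11  # Newton's gravitational constant, in m^3 kg^-1 s^-2


def keplerian_speed(central_mass_kg: float, radius_m: float) -> float:
    """Circular orbital speed when essentially all the mass sits at the center."""
    return sqrt(G * central_mass_kg / radius_m)


def enclosed_mass(speed_m_s: float, radius_m: float) -> float:
    """Mass that must lie inside radius r to hold a circular orbit at speed v."""
    return speed_m_s ** 2 * radius_m / G


# Sanity check with approximate solar system values: Earth's orbital speed.
sun_mass = 1.989e30       # kg
earth_orbit = 1.496e11    # m
print(keplerian_speed(sun_mass, earth_orbit))  # roughly 3e4 m/s, about 30 km/s

# With the mass concentrated at the center, four times the distance
# means half the speed.
ratio = keplerian_speed(sun_mass, 4 * earth_orbit) / keplerian_speed(sun_mass, earth_orbit)
print(ratio)

# A flat rotation curve (same speed at every radius) instead requires
# the enclosed mass to keep growing in proportion to the radius.
v_flat = 2.0e5  # m/s, an illustrative value
print(enclosed_mass(v_flat, 2.0e20) / enclosed_mass(v_flat, 1.0e20))
```

So a speed that stays constant out to large distances implies that the enclosed mass keeps rising with radius, which is the mismatch with the visible stars and gas.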

Starting point is 00:24:30 and that's mostly near the centre, where all the stars clump up. So where is all this extra mass? That's a big open question. I think this was the first line of evidence for the existence of dark matter. Now, people at the time, I think this was back in the 70s or 80s, said, well, maybe there's types of matter that we just haven't detected properly, or perhaps our theories of gravity are wrong. And so people tried to develop things that are called modified Newtonian dynamics, or MOND, which were modifications of Newtonian dynamics, which would explain why at small scales Kepler's law works. Kepler's law is derived from Newton's gravitational law.

Starting point is 00:25:04 But at a much larger scale, Newton's gravitational law breaks down, and we have to change it. This is pretty radical, you know, modifying Newton. I mean, Einstein did it, but still, it was radical then, and it's still radical now. We know that Newton breaks down at very small scales and in terms of general relativity, but this is not a general relativity effect. And so it was thought that additional modifications were needed. And so people debated these modified Newtonian dynamics models. But then different forms of evidence for the existence of dark matter were produced. People began to observe galaxy clusters.

Starting point is 00:25:35 So this is, well, clusters of galaxies, so not a single galaxy, but a bunch of galaxies clumped together. using x-ray emissions from hot gas and gravitational lensing to observe how the light from distant galaxies was bent as it traveled past massive objects. So that's a general relativity effect. And basically what they're able to do is measure the mass directly using the amount of gravitational lensing. So that's again the amount by which space is bent by the existence of mass. That's a general relativity effect. It allows direct measurement of the amount of mass. Don't worry if you don't understand how general relativity relates to

Starting point is 00:26:10 bending space. Just understand that it does. General relativity means that space bends when there are massive objects in it, and it allows you to know how much mass is there. So we can see how much mass there is by the amount of bending of light, and we can see how much light there is by looking at the x-ray emissions. And we can compute the mass to light ratio against other objects that we know. And basically the point is that there's a deficiency. In a lot of these galaxy clusters, there's just not enough mass that we can see. We can measure that there must be much more mass there, but we can't see it. There's not enough mass accounted for by all the stars and the gas that we can see. So again, there's missing mass.

Starting point is 00:26:42 Maybe you could use modifying Newtonian dynamics to explain this as well, but then there's more evidence for the existence of dark matter. The cosmic microwave background radiation is type of light, so it's electromagnetic radiation, but it's in the microwave spectrum. So it's not light that you can see with the naked eye. The wavelength is much, much longer. And it's very close to a perfect black body distribution,

Starting point is 00:27:04 which means that it's close to the theoretically perfect emissions that you would expect to see from an object at a certain temperature. I think it's 3 kelvin or something like that. Don't worry if you don't know what that is. It's just a particular distribution of the frequencies that you would see because objects at different temperatures emit light at different frequencies and kind of the universe, if you like, emits light at 3 kelvin in this nearly perfect black body curve, but it's not quite perfect. There are small anomalies of a few parts in 100,000. Tiny variations from one bit of the sky to another. And it's these small variations that are thought to account for the formation of structure

Starting point is 00:27:39 in the universe, basically because there were areas that were slightly hotter and therefore were slightly denser and eventually galaxy clusters formed around those. Anyway, the point here is that the map of these small imperfections in the temperature of the background radiation can be analyzed in what's called an angular power spectrum, which is just a way of analyzing how much the light varies from one bit of the sky to another. Don't worry about the details there. But when we do this analysis, we see a series of peaks, basically corresponding to different sized fluctuations of the density. So there's some smaller ones, some slightly larger ones. I mean, they're all very small, but some are bigger than others, right? Now, the point to all this is that angular power spectrum of the microwave

Starting point is 00:28:20 background radiation that you expect to see depends on what stuff is in the universe, what stuff you think is in the universe. We can make one prediction on the basis of just all the matter that we see in the universe. And then we can make another prediction based on the matter that we see plus a whole bunch of extra dark matter. And they make different predictions about what you're going to see in this angular power spectrum, where the peaks are on this spectrum. And it turns out that the power spectrum that we see from the actual cosmic background radiation is very well fitted by something called the Lambda CDM model, which is a type of dark matter model. And it's very difficult to reproduce this by competing models, such as the modified

Starting point is 00:28:58 Newtonian dynamics models that don't have any dark matter. So basically, the imperfections or the temperature fluctuations in the cosmic microwave background radiation, those are only really explicable in terms of dark matter cosmological models, and they don't really make sense if you take the dark matter out of those models. So this is really strong evidence for the existence of dark matter, because you can't explain this away with modified Newtonian dynamics. It just doesn't work. And so at this point, the evidence for the existence of dark matter is overwhelming, even if it does have kind of a science fictiony name. But the question remains, what on earth is dark matter?

Starting point is 00:29:35 People have been debating about this ever since dark matter was discovered or hypothesized, and at this point we can say discovered. So what is it? Well, one proposal is that dark matter consists of small particles that don't interact with any of the forces that we know, apart from gravity, obviously, and maybe the weak nuclear force, or perhaps even something else that's even weaker than that. But in particular, it does not interact through the electromagnetic force, otherwise we'd be able to see it, and it doesn't interact

Starting point is 00:30:04 with the strong nuclear force, because then we'd also be able to see it, most likely. So these are called weakly interacting massive particles, so they're elementary particles, kind of like an electron, but probably more massive than that. And because they're weakly interacting massive particles, they're called WIMPs. So a new type of elementary particle that has not been discovered yet, but is hypothesized to exist, that is massive, so it can account for the gravitational effects that are observed, but that doesn't interact with other things, which is why we can't see it. There are many candidates for what these might be, including something called supersymmetric versions of existing particles. These are derived from models of theoretical physics that I won't even

Starting point is 00:30:41 attempt to explain because I don't really understand them. Or perhaps there could be a wholly new type of particle altogether, so not things that have been previously theorized, like supersymmetric particles or axions and other things, but perhaps something else altogether. There's something called a sterile neutrino that's a hypothetical particle that interacts only via gravity, not even via the weak nuclear force. So ordinary neutrinos, which you may not have heard of, but they are known, they've been discovered, they interact with the weak nuclear force, but sterile neutrinos might not even interact with that. So good luck trying to measure them in a laboratory because we do rely on them interacting at least occasionally to detect any of these particles. So this is an

Starting point is 00:31:19 interesting hypothesis. I don't know how we'd observe such things. So that's one hypothesis. They could be sterile neutrinos. As I said, there could be other types of particles that have been hypothesized, or there could be something altogether different. So the WIMPs are probably the leading hypothesis at this point, because basically there are lots of types of particles that theoreticians would like to exist because, you know, they would fit into their theories, and that could be massive and also wouldn't necessarily interact with any of the other forces, and that would explain why we can't see them. So these things are kind of the leading candidates for dark matter. But they're not the only possibilities. Another possibility is that

Starting point is 00:31:53 of primordial black holes. So this is a hypothetical type of black hole. I won't explain in detail what a black hole is. It's a very dense region of space and time that's black because no light can escape it. If that doesn't make sense, don't worry. I'll do an episode on them at some point after I get around to doing relativity. Just if you know what a black hole is, most black holes are formed when a star collapses in on itself, but it's hypothesized that some black holes may have been produced in the very early universe when there were high enough densities for that to have happened. So these primordial black holes wouldn't necessarily have a cloud of dust surrounding them, which is what allows us to detect many of the black holes that formed from former stars. So that's one hypothesis.

Starting point is 00:32:33 I understand that this is not widely held these days for complicated reasons, but it's still a possibility. Another possible identity of dark matter is something called massive compact halo objects. So these are, I think, the most interesting. And as I understand it, the name was coined after people started talking about WIMPs, weakly interacting massive particles. And so, of course, they had to come up with a competing hypothesis, which they called massive compact halo objects, or MACHOs for short. So it's MACHOs versus WIMPs. Obviously, theoretical physicists have a great sense of humour.

Starting point is 00:33:06 But anyways, a MACHO is supposed to be a body that's composed of normal baryonic matter. So that's the ordinary stuff that the sun and the Earth and so on are made of. But it emits little or no radiation, and it just sort of drifts through space unassociated with a planetary system, or maybe not even associated with a galaxy. So because they're not luminous, they'd be very hard to detect. So these could be things like black holes. So a primordial black hole would technically be a MACHO.

Starting point is 00:33:29 Or neutron stars or brown dwarfs, which are like failed stars, or even unassociated planets. Unassociated planets are a super cool idea, by the way. An unassociated planet is a planet that doesn't have a star. It just drifts through interstellar space by itself. And I so want to see a science fiction story set on an unassociated planet because that would be super interesting. But anyways, it's thought that MACHOs

Starting point is 00:33:50 cannot account for the overwhelming majority of dark matter, just because basically there couldn't be enough of them for us to not see at least some of them. I mean, there's no question that neutron stars and brown dwarfs and unassociated planets exist. It's just a question of how many of them there would have to be in order to account for 85% of the matter in the universe. There'd have to be so many that there would be ways of detecting them, basically. And for this and other sorts of reasons, it's thought that MACHOs can't make up much of dark matter. They're certainly there, but there's probably not that many of them. And so weakly interacting massive particles are the leading candidate, but you'll still find physicists defending other views. At this point, the question

Starting point is 00:34:28 is still open as to what is dark matter, and therefore it remains one of the big unsolved problems in science. Let's move now from physics to chemistry and talk about the island of stability. It's got rather a whimsical name, as a lot of these do, actually. So the island of stability is not a physical island you can row to in the ocean, right? The island of stability is a hypothetical set of isotopes of super-heavy elements that may have much longer half-lives than known isotopes of these elements. So an isotope is just like a variety or a variant of an element that has the same number of protons, but a different number of neutrons. Different elements and also different isotopes of a given element have different half-lives. So the half-life

Starting point is 00:35:10 is basically how long that isotope lasts for. If an isotope is what's called stable, then it kind of lasts forever. So most things that we interact with are stable isotopes, like the carbon in your body, for example, most of that is stable isotopes, like carbon 12. It doesn't decay. It just sort of sticks around. Some types of elements and particular isotopes of those elements are radioactive, which means they are unstable, and so they decay. They fall apart over time. They literally break up into smaller pieces. And probably the most famous radioactive element is uranium, which is used to power some nuclear power plants. It's used in some atomic bombs, and we can utilize that power because it's unstable. It breaks down over a certain

Starting point is 00:35:47 period of time. Now, different isotopes of different elements have different half-lives. Some decay quite slowly, some decay quite rapidly. And it turns out that as you go higher and higher on the periodic table, so uranium doesn't have any stable isotopes, and then you go higher than that on the periodic table, so you keep adding protons, the elements, on average, get more and more unstable. Until you get to a point, by the time you get past a hundred, where the elements are very, very unstable. So they will only last a fraction of a second. The highest numbered element that's been created so far is, I believe, 118, so that's the number of protons it has, and then it has a bunch more neutrons on top of that. But another interesting

Starting point is 00:36:29 thing that happens as the elements get heavier, as you get more protons, is that you need more and more neutrons to stabilize the nucleus, because remember, protons are positively charged, and in the nucleus, they're all clumped together. Now, positive charges repel each other. So how do they all stick together in the nucleus? Why don't they all fly apart? Well, actually, the protons in the nucleus are, in a sense, trying to fly apart. What keeps them together is neutrons, because neutrons are not positively charged, they're neutral, but they are made of quarks. Protons are also made of quarks, and quarks interact through the strong nuclear force, and that's an attractive force. It's a lot more complicated than this, but basically

Starting point is 00:37:08 you can think of it as if neutrons provide strong force glue that keeps the nucleus together, whereas protons provide a repulsive force that pushes the nucleus apart. And so you need the right balance of protons and neutrons in order to have a stable nucleus. If you have too few protons, the nucleus will be unstable, basically because if you have too many neutrons grouping together, the neutrons become unstable and they decay into protons. So you can't have too many neutrons because they'll just convert back to protons. But if you have too many protons, the repulsive forces outweigh the attractive forces and the nucleus splits apart. So it's this fine balance between the number of protons and neutrons,

Starting point is 00:37:47 the number of protons determining which element it is and the number of neutrons determining which isotope of that element that it is, that determines the stability of that particular isotope. On average, as you increase the atomic number, as you move up and up the periodic table, you need more and more neutrons to stabilize that nucleus, not just proportionally more, but disproportionately more, so like more than a one-for-one increase, but also even with that increase in the number of neutrons, the nuclei still get more and more unstable, which is why the isotopes have such short half-lives. They just don't last very long. They just fall apart very quickly. But that's on average. There are some exceptions to this, and in nuclear physics, there's this concept known as

Starting point is 00:38:24 a magic number. A magic number is a number of protons or neutrons, or indeed both, that are arranged in complete shells within the atomic nucleus. So you probably learned in high school, or maybe from my earlier podcast, that around the nucleus there are different shells that electrons sit in, and electrons like to sit in full shells because that's how the energy levels work out basically, and electrons like to go to the lowest energy state. What you probably didn't learn, but nevertheless is the case, is that there are also these shells inside the nucleus. And like anything else, nucleons, that is, protons and neutrons, like to be in the lowest energy state possible. So the nucleus prefers to be in lower energy states than in higher energy states

Starting point is 00:39:05 and there's sort of shells for those as well. And if you can fill up those shells, then that makes it a more stable state. And it's quite difficult to predict exactly what these magic numbers are. So there seems to be, from what I can gather, some disagreement on exactly what the magic numbers are, depending on precisely how you model the complicated forces that exist inside the nucleus. However, current theories predict that 114 is a magic number of protons, and we have created the element with atomic number 114. Again, it has a very short half-life.

Starting point is 00:39:36 It's called flerovium. However, it's not just the number of protons that matters, but also the number of neutrons needed to stabilize that nucleus. And the magic number of neutrons is thought to be 184. So 114 protons, 184 neutrons. And we haven't yet been able to create an isotope of element 114 with 184 neutrons. From what I could gather, the most neutrons we've been able to get in is 177, so quite a few short of the 184 goal. The point of this, though, is that the half-life of element 114 was about 30 seconds, which was quite a lot longer than had been predicted by other methods

Starting point is 00:40:14 and quite a lot longer than some of the sort of surrounding isotopes that have similar numbers of nucleons, but don't have those magic numbers of protons to stabilize that nucleus. So this is thought to be an example of something that is kind of near or getting close to that island of stability. Just to clarify, because I don't think I fully explained that name, if you plot on a graph with one axis being the neutron number and the other axis being the proton number, you can plot the half-lives or how stable all the different elements are, and what you'll find is as you get higher and higher proton numbers,

Starting point is 00:40:48 the elements tend to become less stable, but you also have to have the right balance of protons and neutrons. So they sort of taper off in stability, and you get very unstable nuclei. But it's hypothesized that if we get some really super-heavy elements with numbers of protons and neutrons around these magic numbers, there might be another island that pops up in sort of the sea of instability, that represents a much more stable isotope with a much longer half-life. And these would be the super-heavy elements that would be in our island of stability.
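As a small sketch of the bookkeeping behind magic numbers, the Python below checks whether a nucleus has closed proton and neutron shells. The numbers 2, 8, 20, 28, 50 and 82 (plus 126 for neutrons) are the established magic numbers; 114 and 184 are only the predicted values mentioned in the episode, and the 177-neutron figure for flerovium is likewise the one quoted in the episode. The function names are hypothetical, and this says nothing about actual half-lives.

```python
# Established magic numbers, plus the predicted ones discussed in the episode.
MAGIC_PROTONS = {2, 8, 20, 28, 50, 82, 114}         # 114 is predicted, not confirmed
MAGIC_NEUTRONS = {2, 8, 20, 28, 50, 82, 126, 184}   # 184 is predicted, not confirmed


def shell_status(protons, neutrons):
    """Return (proton_shell_closed, neutron_shell_closed, mass_number)."""
    return (protons in MAGIC_PROTONS,
            neutrons in MAGIC_NEUTRONS,
            protons + neutrons)


def neutrons_short_of_next_shell(neutrons):
    """How many neutrons are missing before the next closed neutron shell."""
    candidates = [m for m in sorted(MAGIC_NEUTRONS) if m >= neutrons]
    return candidates[0] - neutrons if candidates else None


# Lead-208 is the classic example where both shells are closed.
print("lead-208:", shell_status(82, 126))

# Flerovium with the neutron count quoted in the episode.
print("flerovium, 177 neutrons:", shell_status(114, 177))
print("neutrons short of 184:", neutrons_short_of_next_shell(177))
```

On these assumptions, the 177-neutron flerovium isotope has the predicted proton shell closed but is still seven neutrons short of the predicted neutron shell.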

Starting point is 00:41:16 As I mentioned, the 177 neutron isotope of element 114 is thought to be kind of on the way towards the island of stability, but because it doesn't have the full shell of neutrons, it's still not there yet. A couple of important things to understand about the island of stability. Because we don't observe these elements in nature, we know that they are radioactive, and they must have half-lives that are shorter than billions of years, because otherwise there would still be these elements around in detectable quantities. We would find them on Earth, but we don't. So they can't be that stable, but they could still be stable enough to have half-lives

Starting point is 00:41:50 even in the hundreds of millions of years, because even an element with a half-life that long would still have decayed to nothing, basically, given how old the Earth is. It's billions of years old. So it's not known how long the half-lives of these super-heavy elements in the island of stability could be. They might be minutes, hours, days, could be millions of years. No one really knows. It depends on exactly how you model the stability of the nucleus, and that's very difficult to do, and there are different models of exactly how that works. So the only way we're likely to find these is to be able to synthesize them in a lab.
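The reasoning about what would survive over the age of the Earth comes down to the standard half-life formula: after a time t, the fraction remaining is one half raised to the power of t divided by the half-life. Here is a minimal Python sketch of that. The half-lives tried are arbitrary illustrative values, not predictions for any real isotope, and the Earth's age is rounded to 4.5 billion years.

```python
EARTH_AGE_YEARS = 4.5e9  # rounded age of the Earth


def fraction_remaining(half_life_years, elapsed_years):
    """Fraction of an isotope left after elapsed_years of radioactive decay."""
    return 0.5 ** (elapsed_years / half_life_years)


# Arbitrary illustrative half-lives, not predictions for any real isotope.
for half_life in (1.0e8, 1.0e9, 4.5e9):
    frac = fraction_remaining(half_life, EARTH_AGE_YEARS)
    print(f"half-life {half_life:.1e} years -> fraction left: {frac:.3g}")
```

The thing to notice is how steeply the surviving fraction drops as the half-life shrinks relative to the age of the Earth, which is why something could last a very long time by human standards and still be absent from nature today.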

Starting point is 00:42:20 But that's very difficult to do, because basically the way these are made is by smashing smaller nuclei together and hoping that they stick. And obviously there's more to it than that, but also that kind of is the gist of it. And as you might gather, that's quite a crude method. And as you want to make bigger and bigger things, you need to smash bigger projectiles together, you need to have beams that are higher intensity, so you need to have more of them, and it just becomes more and more difficult to control these beams. So experimentally, it's very difficult to do. That's why we're able to make the 177 neutron flerovium, but not the 184 neutron version, which is the version that's hypothesized to be really stable. And again, as I said,

Starting point is 00:42:56 although it's thought that this is probably going to be in the island of stability, we're not exactly sure that this is where the island of stability is. It could be at a slightly different location. And it's also unclear as to exactly how stable it will be. Maybe it's just a little bit more stable. Maybe it's heaps more stable. And if this island of stability does exist, what chemical properties might these elements have? We can't really measure the chemical properties of many very heavy elements because they don't last long enough, so you can't, you know, do chemical reactions with them. But if things are sufficiently stable, that means they would be, they'd last long enough, and also they would potentially be not too radioactive, so they wouldn't be

Starting point is 00:43:30 too dangerous, we'd be able to perform chemical reactions on them, and they may have very interesting chemical properties, but that's still all unknown. So, despite the tantalizing progress that's been made recently with some very heavy elements, it's still not known whether the island of stability exists, and it's still not known if we can make the elements there, if so, how long they might last, and what properties they might have. So until that happens, the island of stability remains a major unsolved problem in chemistry. Let's move now from chemistry to Earth science and talk about the Snowball Earth Hypothesis. Again, a very kind of intriguing name. Now, unlike Island of Stability or P versus NP or even Dark Matter, the Snowball Earth hypothesis

Starting point is 00:44:12 has a very descriptive name because the idea of the hypothesis is that hundreds of millions of years ago, not just before the dinosaurs, but before there was any life on the surface of the Earth, that is, all life was in the oceans at this point, and it was only very small life at that. So, hundreds of millions of years ago, the Earth, perhaps once or perhaps multiple times, became basically a giant snowball. That is, the entire surface of the Earth, including the surface of all the oceans and all the continents, was covered in ice and snow. Currently, of course, we know that, you know, there's parts of Earth that are covered in ice

Starting point is 00:44:43 and snow, including mountain glaciers, and there's the Antarctic, which is a continent that's covered under ice and snow. Then there's the Arctic Ocean, where there's no continent, but the ocean is permanently frozen above a certain point. Then there's, of course, Greenland, which has a massive ice sheet on top of it. The hypothesis of Snowball Earth was that hundreds of millions of years ago, the whole Earth was like that. Now, you might wonder, well, how could that possibly be the case?

Starting point is 00:45:06 I mean, wouldn't it melt, especially near the equator that gets a lot of sun? How can there be snow and ice covering the entire equator? Well, this hypothesis was first proposed by a Russian climatologist, whose name I won't try to pronounce, in the 1960s, who was doing simple climatological models. And what he found was that one possible stable state for Earth, like one thing that could happen, was that the entire Earth could be covered with ice and snow.

Starting point is 00:45:33 And the reason this was possible is because if you put more ice on Earth, ice is basically white, ice and snow are white, so they reflect a lot of light, whereas the oceans are much darker, the continents are much darker, they absorb a lot of light. And absorbing light heats up the Earth, whereas reflecting it cools down the Earth. So as you put more ice on the earth, it cools down the earth, which allows more ice to form.

Starting point is 00:45:54 And so basically the earth, if this sort of is allowed to take hold, can freeze from the poles down to the equator. And he showed that it's at least hypothetically possible that enough light could be reflected by all of this ice and snow, such that the earth will be kept cool enough to keep it all frozen. Now, he rejected this possibility because he looked around and said, well, although it's cold in Russia, it's not that cold everywhere. So clearly this isn't the case now, and therefore it couldn't have ever been the case in the past, because according to his model, there'd be no way of getting out of this. Once frozen, always frozen. Now, since then, it's been discovered that there are many triggering mechanisms that could get the Earth into this snowball state and potentially get it out again.
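The ice and reflectivity feedback described above can be sketched with a very simple energy balance model in Python. To be clear, this is not the Russian climatologist's actual model. It is a toy version under stated assumptions: a single global temperature, a reflectivity (albedo) that switches from 0.3 for a mostly ice-free Earth to 0.6 for an ice-covered one between 280 and 260 kelvin, an effective emissivity of 0.61 standing in for the greenhouse effect, and a heat capacity roughly that of a 100 metre layer of ocean. All of those parameter values and names are my own illustrative choices.

```python
SOLAR_CONSTANT = 1361.0    # W/m^2, sunlight arriving at Earth's distance
SIGMA = 5.670e-8           # Stefan-Boltzmann constant, W/m^2/K^4
EMISSIVITY = 0.61          # assumed effective value, stands in for the greenhouse effect
HEAT_CAPACITY = 4.0e8      # assumed J/m^2/K, roughly a 100 m layer of ocean
SECONDS_PER_YEAR = 3.15e7


def albedo(temp_k):
    """Assumed reflectivity: icy and bright when cold, darker when warm."""
    if temp_k <= 260.0:
        return 0.6
    if temp_k >= 280.0:
        return 0.3
    # Linear ramp between the fully frozen and mostly ice-free values.
    return 0.6 - 0.3 * (temp_k - 260.0) / 20.0


def settle(start_temp_k, years=500):
    """Step the global temperature forward one year at a time until it settles."""
    temp = start_temp_k
    for _ in range(years):
        absorbed = SOLAR_CONSTANT / 4.0 * (1.0 - albedo(temp))
        emitted = EMISSIVITY * SIGMA * temp ** 4
        temp += SECONDS_PER_YEAR / HEAT_CAPACITY * (absorbed - emitted)
    return temp


warm = settle(275.0)   # start slightly on the warm side
cold = settle(265.0)   # start slightly on the cold side

print(f"started at 275 K, settled near {warm:.0f} K")
print(f"started at 265 K, settled near {cold:.0f} K")
```

With exactly the same sunlight, the toy planet ends up in one of two very different states depending on where it starts: a warm one above freezing and a frozen one well below it. That is the basic point of the argument, that a fully ice-covered Earth can keep itself cold.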

Starting point is 00:46:34 So one example that could get the Earth into this state is the eruption of a supervolcano that threw up a whole heap of greenhouse gases into the atmosphere, and you might think greenhouse gases cause global warming. Well, that's actually true, but supervolcanoes, also throw up dust and other types of gas in the atmosphere, which can block out the sunlight and cause global cooling. It very much depends on exactly what gases are put up there. So this could potentially cause cooling or warming,

Starting point is 00:47:00 depending on exactly the situation. And if enough of these super volcanoes went off, they could actually melt ice themselves and potentially, therefore, begin the process of thawing of the Earth. But there's other possibilities as well, such as changes in solar energy output, because they fluctuate over time, perhaps perturbations in the Earth's orbit.

Starting point is 00:47:16 So it's hypothesized that at some points in the Earth's past, the Earth totally froze over, up to and including the equator, and that maybe it lasted like that for tens of millions of years, and then eventually it thawed out again. So that's the Snowball Earth hypothesis. And more specifically, it's thought that this happened, perhaps a couple of times, in a period,

Starting point is 00:47:38 a geologic period of the Earth's past that was around 700 million years ago, called the Cryogenian, named because it was cold then. There were two glaciation events that occurred during the Cryogenian, and they are thought to be the greatest ice ages known in the history of Earth. So we know the Earth was very cold during these times, but we don't know quite how cold. Before going on, I should explain what a glaciation is.

Starting point is 00:48:01 A glaciation is just an ice age, so it's a period in the Earth's history when there are large-scale glaciers, particularly in mountain areas, but also at the poles, because that's where it gets coldest. Now, you might be thinking, well, hang on, there are glaciers at the poles and on mountains today. Granted, they're melting, but they are still there. So doesn't that mean we're in an ice age? And the answer is, yes, we are in an ice age. Most people don't realize this. I think I've talked about it on the podcast before, but bears

Starting point is 00:48:25 repeating. The Earth's been in an ice age for about two and a half million years now. Currently, we're in an interglacial phase of that ice age, which means it's a relatively warm part of an ice age. If we don't do anything about global warming and all of the ice caps completely melt, then I guess we won't be in an ice age anymore. But that's not going to happen any time immediately because there's quite a lot of ice still to melt. Nevertheless, the point is that an ice age is just when there's a large amount of ice somewhere on Earth, especially at the poles. And most of Earth's history has not actually been during an ice age. For most of Earth's history, there was no large-scale ice on the surface of the Earth. So we're living in an unusual time now,

Starting point is 00:49:00 but there were two big glaciations, or ice ages, back in the Cryogenian. And these two events are still highly contested. And the debate is not whether they happened, but how cold did it get? There are two camps here. The Snowball Earth camp, which says the whole Earth was frozen completely, and the other camp, called Slushball Earth, where a large part of it froze, but there was a band of open sea that survived near the equator.

Starting point is 00:49:25 It's thought that a tropical distribution of the continents is necessary to initiate a snowball Earth. That means the continents have to be clustered around the tropical areas of the Earth and away from the poles. You presumably know that the continents move around in a process called continental drift. I will

Starting point is 00:49:40 get around to doing an episode on that, I haven't done it yet. But throughout the history of Earth, they've moved around a lot. Currently, we have a relatively mixed distribution of the continents. So we're not in a full snowball Earth, but at times throughout the Earth's history, it's thought that there was a large concentration of the continents around the equator. Because the equator gets the most sun, and it turns out continents actually reflect a bit more sunlight than the ocean does, that can trigger, or is thought to have triggered, global cooling. Also, when the continents are around the tropics, they're subject to more rainfall,

Starting point is 00:50:10 because it rains more in the tropics, and this leads to more erosion. And when rocks are eroding, they undergo a chemical reaction which pulls carbon dioxide out of the atmosphere, at least for silicate rocks, and most rocks are silicate rocks. So more erosion means pulling more carbon dioxide out of the atmosphere, which means less greenhouse gas, which means a reduced greenhouse warming effect. So there's thought to be a number of these types of mechanisms that, with the right arrangement of the continents, and perhaps with a bit of extra help of the sun cooling down just a little bit because it does fluctuate over time.

Starting point is 00:50:41 The earth was triggered to begin to cool, and if it gets over a certain threshold, it becomes a runaway process whereby, as it cools down, a lower latitude freezes. That reflects more sunlight, which leads the next lower latitude to freeze and so forth, until you get, according to the full version of the hypothesis, right down to the equator, and the whole Earth is frozen over. It's thought that moving into and out of the Snowball Earth state would not have been rapid, so it's not something that happens over a few years or even a few days like you see,

Starting point is 00:51:08 in that movie The Day After Tomorrow, although it was a fun movie. The idea of the earth cooling that rapidly, like in a few days, is completely ridiculous. Even rapid versions of the Snowball Earth hypothesis involve thousands of years, but nowadays it's thought that glaciation and deglaciation occurred over a period of hundreds of thousands to a million years. So it's not a rapid thing, but nevertheless, did it occur? Scientists still can't agree as to whether we had a full Snowball Earth, or whether there was this region of open water around the equator. One argument for there being a warm region around the equator

Starting point is 00:51:44 is that, well, life survived somehow, and it's hard to see how you would have had enough energy reaching the oceans and access to sufficient nutrients and so forth if the oceans were completely enclosed under ice. But people have countered that maybe it was hydrothermal vents, maybe volcanoes, perhaps. There were very small regions here and there that allowed just enough nutrient exchange.

Starting point is 00:52:06 But it's still a little hard to see exactly how life survived if the whole earth was covered, but you know, life finds a way, as they say. But there are other issues that are subject to debate. In particular, we have paleomagnetic measurements taken from old rocks, which indicate where on earth the rocks were located when they formed. And these rocks can give us an indication as to whether the continents were in the right location for the snowball earth to happen, and where they were at a particular time. For example, can we find clear evidence of rocks that were formed in a glacial environment, because that leaves traces, but that were nevertheless formed at or near the equator?

Starting point is 00:52:42 That would be smoking gun evidence. There's some evidence being proposed for this, but paleomagnetic measurements aren't super reliable, and there are questions about how the Earth's magnetic field might have changed over time. Some scientists have put forward evidence of sedimentary features that would only form in open water, like ripples formed from waves, found in rock that dates to the so-called Snowball Earth time periods, and this seems to rule out the pure snowball scenario.

Starting point is 00:53:07 But again, you have to firmly date that to within the time period, and dating is often not super reliable. I mean, we can get a ballpark, but it needs to be more precise so that it gets it to within the narrow range that the snowball period was supposed to cover. Climatological models of this scenario are still poorly understood because we don't understand all of the feedback mechanisms. I mentioned one of them relating to the erosion of silicate rocks, which changes the amount of carbon dioxide in the atmosphere, which affects the amount of global warming. But there are many other complicated interactions as well, including from volcanoes, but also including the amount of accumulation of carbon dioxide in the ocean as a result of decay of organic matter and photosynthesis and other things. So we don't really understand exactly how to model all that correctly. And so there are still questions about those models. Another question relates to the amount of an isotope of carbon called carbon 13. Carbon 13 is a heavier and much rarer isotope than carbon 12, and both of them are stable,

Starting point is 00:54:05 The ratio of carbon 13 to carbon 12 that we observe in certain types of rocks that are formed from organic matter varies depending on the amount of life that existed at that time, basically because living organisms and biological processes preferentially take up carbon 12 over carbon 13, and so they change the isotope ratio. So depending on the observed isotope ratio, we can estimate how much stuff was alive at that time and what sort of processes were occurring. So a lot of the debates around this hinge on these variations in what's called the delta 13C number, which just refers to the change in the ratio of carbon 13 to carbon 12 relative to a reference standard,

Starting point is 00:54:45 and how we can explain this. So it's thought that a negative value of this is associated with formation of glaciers and the snowball earth scenario, basically because that kills off a large part of life and reduces the amount of photosynthesis and things that can occur, so reduces the amount of organic matter. But the timing doesn't necessarily match up with the evidence that we have from other measurements like paleomagnetic ones, so how do we reconcile that? As you can see, it's quite difficult because each source of evidence that we have,

Starting point is 00:55:12 like paleomagnetic, sedimentary features, climate models, delta 13C, fossils, and other evidence, it's all quite uncertain, and they generally have big uncertainty ranges around their dates. And so at this point in time, we still have scientists on either side as to whether we had that slushball scenario of the open ocean around the equator, or whether it was a true snowball earth scenario with the whole planet just being like the planet Hoth. And until that can be resolved, the snowball earth hypothesis will remain one of the unsolved problems in science. Let's move on now to a big unsolved problem, probably the big unsolved problem in biochemistry, which is called the protein folding problem. Proteins are big molecules

Starting point is 00:55:56 that make up many of the important structures in our body at a microscopic level. So they make up our bones, they make up our organs, they make up the structure of the key enzymes that operate in basically all of the cells of our body, so they're pretty important. A protein's function depends on its shape, or what's called its three-dimensional conformation, which just means the shape. And proteins are basically big long strings of balls, and those balls are called amino acids, which are made up of different atoms put together in a particular way. They're a type of organic molecule, but we needn't worry about that. I have talked about it in previous episodes, but the point is a protein is defined by its amino acid sequence.

Starting point is 00:56:34 So it might be a few hundred amino acids long. Some of the big ones might be a few thousand. And the exact order of those determines its conformation, which is its structure. So this is the mantra, structure determines function, and the sequence determines the structure. So it's sequence, structure, function. All well and good. But there's a problem.

Starting point is 00:56:52 When our cells produce proteins, they produce them based on a DNA template, which again I have discussed in previous episodes as to how this works. But the point is the proteins come out in a big long string, in a linear string, they literally come out in a straight line, kind of literally off an assembly line, off a molecular assembly line. But that's not how the protein stays. It doesn't stay in this unfolded state.

Starting point is 00:57:14 It folds up and arrives at what's called its native conformation, which just means like the way it should be folded to fulfill its task. Sometimes proteins have different conformations that they can fold up into, and conformations are not rigid. They do move around and flop around to some degree. But for simplicity, we'll just talk about one protein as if it has a single native conformation, which is like the one true conformation that it's supposed to get to. Again, just for simplicity here.

Starting point is 00:57:37 The problem is, how does the protein find this correct conformation? Because if it just tested all the possibilities randomly, even if you only had a few hundred amino acids, the number of possible configurations that it could form is enormous, and even assuming very rapid trialing, it would take longer than the age of the universe to find its correct conformation if it was just checking randomly. So obviously proteins can't just fold up randomly and form into the correct conformation because proteins

Starting point is 00:58:06 generally fold up quite accurately and quite quickly. We're talking in terms of microseconds to milliseconds, so fractions of a second, and although some proteins do misfold, they still fold with remarkable accuracy, otherwise you would die. So the protein folding problem is just asking how does the protein do this? In particular there's sort of two questions here. One, how is the folding mechanism encoded in the amino acid sequence? And two, how does the protein know sort of how to follow those instructions? So what is the mechanism behind the folding? These are related, but they're distinct because one is where is the information

Starting point is 00:58:44 about folding and the other is how does the protein actually sort of utilize this information to fold up? We know these answers when it comes to DNA being converted to protein. So with DNA, we know the information is encoded in the sequence of nucleotides. We have the alphabet, if you like, for that, the codon sequences that correspond to amino acids, so we know that, and we know the mechanism because we know about how DNA is converted to RNA, which is converted to proteins using the ribosome system, which is in all cells. So we know those answers when it comes to nucleic acids, but we don't know them when it comes to protein folding.
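
To put rough numbers on the random-search argument from a moment ago, here is a back-of-the-envelope sketch in Python. This argument is often called Levinthal's paradox. Every figure in the sketch is an illustrative assumption rather than a measurement: a fairly short chain of 100 amino acids, only 3 possible states per amino acid, and an optimistic 10 trillion trials per second.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_S = 13.8e9 * SECONDS_PER_YEAR  # roughly 13.8 billion years, in seconds

def random_search_seconds(n_residues, states_per_residue=3, trials_per_second=1e13):
    """Time needed to try every conformation once, one after another."""
    # Each residue independently picks one of its states, so the counts multiply.
    n_conformations = states_per_residue ** n_residues
    return n_conformations / trials_per_second

# Assumed inputs: 100 residues, 3 states each, 1e13 trials per second.
search_s = random_search_seconds(100)
print(f"exhaustive search time: {search_s:.2e} seconds")
print(f"that is {search_s / AGE_OF_UNIVERSE_S:.2e} times the age of the universe")
```

Even with these generous assumptions the search time exceeds the age of the universe by many orders of magnitude, and longer chains only make it worse.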

Starting point is 00:59:18 There's also sort of a third question, which is, how can we predict the native structure of a protein from only its amino acid sequence? So that's not a question about how the protein does it. That's a question of how we can make a prediction. That would be very useful for pharmaceutical purposes, although less interesting in terms of understanding the underlying mechanism. Of course, we might use knowledge of the underlying mechanisms to solve the protein folding problem,

Starting point is 00:59:40 but current methods don't really do that very much. They use protein homology, which basically means using a similar protein that we've solved the structure of to predict the structure of one that we haven't solved. And plus you have a bunch of fancy machine learning algorithms that kind of extrapolate from that, and of course there's a lot of detail to it, but at any rate, it is a separate question from just asking how does the protein do protein folding. So what do we know about this question?

Starting point is 01:00:03 Well, we know that proteins have what's called a funnel-shaped energy landscape. You can visualize a funnel, you know what a funnel looks like. Now, it's not a smooth funnel. Imagine lots of bumps, some very large bumps on a funnel. This represents the energy landscape of a protein. What this means is that the vertical direction, from like the top of the funnel down to its nozzle, represents the energy from high to low energy. We know that proteins, like everything else in the universe, want to achieve a low energy state. The sort of x and y axes along the surface of the funnel represent different conformations that

Starting point is 01:00:33 the protein can be in, so different physical rearrangements of its structure. And the idea of the funnel-shaped energy landscape is just that there are more different conformations that can exist at the high energy states and fewer and fewer as you go further down the funnel and get to lower energy states. So that's one mechanism that's used, a sort of constraining mechanism, which is that you don't just jump straight to the low energy, but you're sort of guided down the funnel to get lower and lower energy, and as you go, you get closer and closer to the sort of final answer at the bottom, which is the low energy state. But there are, of course, bumps along the way, and the protein has to find ways of moving around these bumps. The funnel model doesn't explain

Starting point is 01:01:12 everything, but it does give us a framework for thinking about how it kind of works. So it's a gradual bit-by-bit process. It's not an all-at-once thing, and it doesn't work by randomly trying each conformation, which we know couldn't work. So we know the unfolded protein starts at the top of the funnel and gradually works its way downwards. And this means that if there's a bumpy funnel, there are going to be many intermediate states along the way, which are not fully stable, but are partly stable. And so in theory, we should be able to experimentally measure these intermediate states, and that would help us to understand the process of folding. In practice, it's very difficult to measure those states reliably. We know some of the underlying physical mechanisms that mediate folding. These

Starting point is 01:01:51 are ways that the protein kind of sticks together on itself, and include hydrogen bonds, van der Waals interactions, electrostatic interactions, and hydrophobic interactions. I won't go into the details here. We know about these. They're just ways that the different amino acids can stick to each other, but exactly how that all fits together in terms of specifying one specific conformation out of the huge number of possibilities, which is different for different types of proteins, is still very unclear. We can model protein folding on a computer using a force field. A force field is just a computational model that allows us to simulate how proteins move in a computer. And I've actually published some papers on this for work that I used to do.

Starting point is 01:02:29 If anyone's interested, you can shoot me an email, and I can send you some of the papers that I wrote. These force fields are not the best. I mean, they're okay, but also they take a lot of computing power to use. So we can't just stick a big long, linear protein into one of the simulations and just watch it and see, oh, how does it fold? A, the simulations are not really accurate enough to be sure that the result would be right. And B, it would take too long for the big proteins to fold. We have used supercomputers that are specially built for this to test whether small proteins can fold up in the way that we know they do, because we can compare what happens in the simulation to what we know happens experimentally, and they do a reasonable job, but these are only small proteins.

Starting point is 01:03:08 It'd be much more interesting to see what happens for much larger ones, and that's just outside of our capabilities at this point. Also, even if you can watch something in a simulation, it doesn't necessarily tell you what's happening mechanistically because you kind of have to work out what's... I mean, you just see all the atoms and they're kind of moving all over the place. You've got to work out from that, well, what's the underlying mechanism here? So computer models can't solve everything, although they can be helpful.
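
The difference between random search and being guided down a funnel can be seen even in a toy model far simpler than any real force field. The Python sketch below is only an illustration under stated assumptions: each amino acid is either in its native state or not, every native one lowers the energy by a fixed amount (the hypothetical `energy_per_native` parameter, in units of thermal energy), and moves are accepted using the standard Metropolis rule. It has no bumps and no real physics, so it says nothing about how actual proteins fold.

```python
import math
import random

def steps_to_fold(n_residues, energy_per_native, rng, max_steps=1_000_000):
    """Count single-residue moves until every residue is in its native state."""
    native = [False] * n_residues
    n_native = 0
    for step in range(1, max_steps + 1):
        i = rng.randrange(n_residues)
        # Flipping a native residue raises the energy; flipping a non-native one lowers it.
        delta_e = energy_per_native if native[i] else -energy_per_native
        # Metropolis rule: always accept downhill moves, uphill ones with probability exp(-delta_e).
        if delta_e <= 0 or rng.random() < math.exp(-delta_e):
            native[i] = not native[i]
            n_native += 1 if native[i] else -1
            if n_native == n_residues:
                return step
    return max_steps  # gave up without folding

def mean_steps(n_residues, energy_per_native, runs=20, seed=0):
    """Average number of moves to fold over several seeded runs."""
    rng = random.Random(seed)
    total = sum(steps_to_fold(n_residues, energy_per_native, rng) for _ in range(runs))
    return total / runs

flat = mean_steps(12, 0.0)    # flat landscape: every move accepted, pure random search
funnel = mean_steps(12, 2.0)  # funnel: each native residue lowers the energy
print(f"flat landscape:   {flat:.0f} moves on average")
print(f"funnel landscape: {funnel:.0f} moves on average")
```

With the energy bias set to zero the chain wanders at random, and the number of moves grows roughly exponentially with chain length. With even a modest bias toward native states it gets to the bottom in far fewer moves.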

Starting point is 01:03:30 Structural prediction methods, as I mentioned before, have improved a lot over the last few decades, but they still rely heavily on sequence homology, so basically comparing it to similar proteins. And these are getting better, but it's partly just because we know more and more structures now. So we kind of know the answer more often, we know a similar answer. So I'm not sure if these improved algorithms are really telling us that much about how the

Starting point is 01:03:53 folding actually happens. These are some things that we can say about the protein folding problem, but as I said, there are still many, many open questions. In particular, we don't have very much experimental knowledge of the protein folding energy landscapes because we can't really accurately determine these intermediate structures that we would need to know about to figure out the steps on the way down the funnel. That's very difficult to do experimentally. We still don't really know why proteins don't aggregate, which means like clump together

Starting point is 01:04:20 in a big mass more often than they do. Proteins do this. It's responsible for certain age-related and other diseases, but we're not sure why they do it sometimes and not other times, and we can't really predict when they will do that. That relates to misfolding, so if we could understand why that happens, it would probably be a breakthrough in terms of understanding protein folding. We still don't have very good algorithms to predict the binding affinities of drugs and small molecules to proteins.

Starting point is 01:04:43 That's a real problem because it shows that we don't really understand the interaction forces like hydrogen bonds, van der Waals, and so forth that I mentioned before, which should be responsible for the actual physical mechanism of protein folding. We know that they exist, but we don't really understand how they work very well. And we don't know how good our computer models are for simulating these sorts of things, particularly for the trajectory of the folding states. Again, we know that we can model it accurately with small proteins, but it's still unclear how good it's going to be for the bigger ones, which are more complicated. So, until some of these questions, experimental and theoretical, are addressed, the protein folding problem is going to remain a big unsolved problem in biochemistry.

Starting point is 01:05:27 Moving on finally to our last unsolved problem, the Cambrian explosion. Now, this is probably the best known of the unsolved problems that I mentioned, although people like Neil deGrasse Tyson and Stephen Hawking, I think, have probably popularised dark matter a fair bit, so perhaps that's fairly well known also, I'm not sure. But anyway, a lot of people have heard of the Cambrian explosion, if you're into science. The Cambrian explosion was an event or period of time, about 540 million years ago, that occurred at the start of the Cambrian period,

Starting point is 01:05:59 in which most major animal phyla appear for the first time in the fossil record. So it's an explosion not of gas, but an explosion in terms of the number of phyla from animal fossils that we observe. In particular, it's a time when we first see hard-bodied animals with skeletal fossil remains in the fossil record. And the problem is, well, how do you explain it? Evolutionary theory predicts a gradual, not necessarily at a constant rate, but nevertheless a fairly gradual development of different forms of life over time. And this idea that we sort of suddenly see this appearance of huge diversity of different phyla of animal life in the fossil record in a relatively short period of time,

Starting point is 01:06:40 seems contrary to that. Now, this problem has been around for over a century. I think Darwin was aware of it, and I think it's become significantly less mysterious over the last century or so than it used to be, but there are still many unsolved issues. So let's make a few points about this. First of all, there's a known principle,

Starting point is 01:07:00 which is that we're never going to see the first or last organism in any given taxon in the fossil record, because we only have a very small fraction of organisms that are preserved in the fossil record, and we're just never going to see the first or the last one, meaning that when we see organisms appear in the fossil record is always after they actually appeared, right,

Starting point is 01:07:21 because we're never going to see the first ones. So it's thought particularly for some phyla, some classes of organisms, that their appearance may have been quite a bit before we first see the fossil evidence of them, including before the Cambrian explosion. So that's one factor, although it doesn't account for everything. So poor preservation is not believed to fully account for the Cambrian explosion,

Starting point is 01:07:43 although it accounts for a part of it, particularly because around this period we start to see the evolution of more organisms that have hard skeletons or exoskeletons that are easier to fossilize than the soft-bodied creatures that mostly came before. But it's not thought that that accounts for the whole of the effect. But that's one thing to bear in mind. Another thing to bear in mind is that the Cambrian explosion is a relative concept. It's fairly quick in terms of geological time, but it's not quick overall.

Starting point is 01:08:07 It's thought that the period of time that we're talking about is something on the order of 20 to 25 or so million years. That's not exactly an explosion in any kind of human time span. Also, it's not the case that animal phyla just appeared in the early Cambrian period when there were just no antecedents beforehand. There is a whole range of different types of organisms, or different types of fossils, I should say, that have been discovered that predate the Cambrian, and these are called Ediacaran life forms, and it's often not known really where they fit in in terms of the tree of life.

Starting point is 01:08:44 They're probably intermediates, many of them, between animals and fungi or algae or something. This region of evolutionary history is very poorly understood because we have so few fossils, but they do exist. And so these Ediacara biota include many large and multicellular organisms that really don't resemble any modern organism. So they seem to predate the splitting of many important animal phyla and are therefore probably the ancestors, or some of them are the ancestors, of multiple

Starting point is 01:09:12 branches of animal phyla, which means they're going to look very different from anything we know today, because an animal phylum is a very big group of animals. So Chordata, for example, which is all vertebrates plus a few other things, is one phylum, and echinoderms, which are like starfish and similar things, represent another phylum. So if you try to imagine something that's a cross between like a lizard and a starfish, you're sort of getting at the level of what on earth would this thing look like. And so these Ediacara biota are often very strange and we don't really understand how to classify them. Also, results from genetic molecular clock estimates, which basically look at the rate of accumulated mutations and sort of project that backwards in time, indicate that many of these different animal phyla have common ancestors, some of which date to the Ediacaran period,

Starting point is 01:09:56 which is the roughly 100 million years before the Cambrian explosion. But some of them predate that even, looking at the Cryogenian period, which was the 100 million years or so even before that, which I talked about before in terms of the snowball earth period. So we don't have many fossils from the Cryogenian, but it's thought that many important splitting events in terms of the last common ancestor of two different types of animals occurred during this time. So the point is that there's fossil and molecular evidence

Starting point is 01:10:24 to indicate that there was a lot of evolution happening in the 200 million years before the Cambrian explosion, in the Cryogenian and in the Ediacaran. And so it wasn't just an all-of-a-sudden thing that happened at the start of the Cambrian. It was something that had been happening for a long time, and seems to have reached some sort of climax or an acceleration, where we see many more fossil forms observed in the Cambrian period.
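
The molecular clock estimates mentioned above rest on simple arithmetic, at least in their most basic form. The Python sketch below assumes a strict clock, meaning a constant substitution rate in both lineages, and uses a made-up genetic distance and rate purely for illustration. Real analyses typically allow rates to vary between lineages and calibrate against dated fossils.

```python
def divergence_time_myr(distance_subs_per_site, rate_subs_per_site_per_myr):
    """Time since two lineages split, in millions of years, under a strict clock.

    The distance is divided by twice the rate because mutations accumulate
    independently along both lineages after the split.
    """
    return distance_subs_per_site / (2.0 * rate_subs_per_site_per_myr)

# Hypothetical inputs, not real data.
distance = 0.6   # substitutions per site separating the two sequences
rate = 0.0005    # substitutions per site per million years, in each lineage

estimate = divergence_time_myr(distance, rate)
print(f"estimated split: about {estimate:.0f} million years ago")
```

Because the estimate scales inversely with the assumed rate, halving the rate doubles the inferred age, which is one reason these dates carry wide uncertainty.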

Starting point is 01:10:47 Furthermore, it's not just before the Cambrian period that we begin to see things that start to look like animals, although I should say the first fossil evidence of many different animal phyla, including chordates, does date to the Cambrian period. So that's sort of when animal life starts, but there are, as I said, precursors over the previous 200 million years. But it's not just during the Cambrian period, it's also afterwards, in the late Cambrian and also the Ordovician period, that we see many different types of animal

Starting point is 01:11:12 genera and species being observed as well. So it's not like everything poofed into existence in the early Cambrian period and then kind of everything just stayed constant afterwards. Diversity of life continued to increase very rapidly even after the Cambrian explosion, even if most of the phyla had already been observed by that point. Indeed, models have been developed to predict how we would expect the diversity of life forms on the earth to change, particularly of animals, over time as a result of evolution. And the basic idea is what we'd expect to see is an exponential increase until all of the available

Starting point is 01:11:49 niches and resource constraints begin to become binding and then a leveling off at some fairly high but roughly constant level. Of course there'll be extinctions and other variations around that. So this is one explanation of what happened in the Cambrian explosion, that this period of geological time just represents kind of the peak of the exponential increase in the diversity of biota, which continued into the Ordovician, but then leveled off afterwards as many of the niches became occupied. There are, however, other explanations of the Cambrian explosion, which try to account for why it's a particularly special time in evolutionary history, something different about

Starting point is 01:12:24 that time. One is actually the snowball earth hypothesis that I just talked about, so the idea being that during the Cryogenian period, the Earth was a snowball covered in ice and it was very difficult for new forms of life to evolve during that time. During the Ediacaran period, the Earth thawed, and that opened up new evolutionary opportunities for new forms of life to proliferate. One problem with this is that the snowball episodes occurred over 100 million years before the start of the Cambrian, so it's hard to see how what's called an evolutionary bottleneck could have really been the explanation for the Cambrian

Starting point is 01:12:56 explosion because it's just such a long lead time. But nevertheless, some people still argue that this might have been a contributing factor. Another explanation is that around 600 million years ago, there was an increase in oxygen levels in the atmosphere. The reasons for that are kind of complicated, but it happened relatively rapidly over tens of millions of years. And this is thought to have perhaps contributed to the ability of more complex, larger forms of animal life to evolve,

Starting point is 01:13:20 because obviously animal life requires oxygen. The main problem with this explanation is, again, that the timing doesn't quite work out. There's at least a 50 million-year lag time, maybe more, and we do see evidence of Ediacaran biota much earlier, but nevertheless, it's still a possibility. Another explanation that's been put forward is an evolutionary arms race, so the idea that some organism evolves, and then another organism evolves to eat that organism, and so the first organism evolves defense mechanisms that require it being more sophisticated and maybe more intelligent and larger and whatever,

Starting point is 01:13:52 then the predator evolves mechanisms to get around that, and so you have this sort of arms race between predator and prey, which can ratchet up the complexity of organisms, which didn't happen before predators evolved. This may have contributed, but there's no real correlation between the rate of evolutionary development after the Cambrian explosion, including during the later Cambrian and the Ordovician periods, and the ratio of predators to prey.

Starting point is 01:14:19 And also we see predators much earlier, during the Ediacaran period as well. There's evidence of predation. So it's not clear that this was a key explanation, although it may have played a role as the initial trigger, even if it didn't sort of continue to accelerate the rate of diversification afterwards. And a final explanation is just that at some point a complexity threshold was reached in which life became sufficiently complex so that many new evolutionary possibilities opened up. Because evolution tends to kind of work by adapting what it already has, and so maybe it

Starting point is 01:14:51 required certain basic morphologies to be developed before those could then be adapted into a wide range of environments. Perhaps more oxygen in the atmosphere and a predator-prey arms race may have also contributed to these things. The real difficulty is that these factors are very difficult to model because life is so complicated, and we don't really understand very well how these organisms lived or what they looked like in a lot of cases, in terms of the level of detail necessary to model this stuff. So I don't know if we'll ever really know the answer in terms of why the Cambrian explosion occurred, but perhaps we will get better information about the climate at the time, and also hopefully new fossils will become available from the Ediacaran and maybe also Cryogenian periods,

Starting point is 01:15:30 along with perhaps better genetic, phylogenetic reconstructions of the diversification of different life forms that may clarify some of the issues here. So I think there are possibilities for this being resolved or at least clarified at some point in the future. And until these can be resolved through scientific investigation, the Cambrian explosion will remain an unsolved problem in science. So that brings me to the conclusion of the series of six unsolved problems in science. I hope you found that interesting. Before finishing up, I wanted to use the last bit of this episode

Starting point is 01:16:02 to mention a few of the exciting developments that are happening with the podcast. I'm thinking that I might produce a special episode separately from this to go into a little bit more detail. Also a bit of background about the history of the podcast and a little bit about me personally because a number of listeners have asked about this and I thought it might be interesting, a bit of behind the scenes as a bonus for the 100th episode. But to just mention a few things here. As you know, over the past couple of years, the podcast has not been released as frequently as it had been in the past. And I have been upset about this, especially over the last year. There have been a number of reasons for this. One of them is that my life circumstances changed. I moved out

Starting point is 01:16:38 of home nearly two years ago now, and that involved a lot of stress, and obviously time spent moving, and I've moved a number of times since then. As I also mentioned, a couple of episodes ago, I had a number of personal things occur in my life that triggered a pretty bad depressive episode, which made it a lot harder for me to work on things like the podcast, and it just meant that I had to take a bit more time for myself. Then last year, I started a new degree, and that took up a lot of my time into last year, and then on top of that, I have to work part-time to support myself. I also do some volunteer work, so unfortunately the podcast in the last couple of years has kind of got squeezed out a little bit.

Starting point is 01:17:18 And I feel quite bad about that. So I've been thinking about this, and I've come up with a way that I can hopefully provide more quality content to you guys, whilst also being able to fulfill my other obligations. So the idea is this. I have a Patreon page for the podcast that I did set up a couple of years ago now, but I haven't really done much with it. And partly that's because I haven't produced that much content in this time. But also I felt a bit bad asking for donations. But I think that if I could get even a few donations of a few dollars per episode, and that's the model I'll go on, in terms of each time I release an episode, I'll put the thing up on Patreon, and then if you

Starting point is 01:17:53 are a subscriber there, you'll be deducted that amount you sign up for, and that will contribute to the time that I spend working on the podcast. I do spend quite a bit of time researching episodes. I estimate at least 10 or 12 hours an episode, including research, recording, and editing on average, and as I said, it can vary. Also, there are other costs associated with hosting the podcast, including the website and the hosting for that, because I have to pay for the amount of bandwidth that I need for everyone to download it. And I've always just paid that out of pocket. But it would be nice to have a bit of funds to recoup those costs.

Starting point is 01:18:26 But the most exciting thing is that if I can get just a little bit of money coming in from the podcast, I can actually reduce the amount of work that I have to do. I actually do private tutoring to earn some money. And I was thinking, well, I really like teaching, but in private tutoring, I generally only get to teach one person at a time. Whereas with the podcast, I can teach thousands of people at a time. So it would be really exciting if some of you guys would like to get on board with that. I want to reiterate a promise that I've made before on the show that the podcast will always be free

Starting point is 01:18:55 and I never intend to put ads on it. And so any donation that you make would be purely voluntary. And I'm not big into gimmicks and things. So maybe there'll be some perk that you get from being at a different donation level. I'll have to think about that. But mostly this is just a voluntary contribution if you think that this work I do is valuable and you benefit from it. if you'd want to contribute a few dollars every time I release an episode, that would be really much appreciated.

Starting point is 01:19:20 And if I can get a certain amount of traction with this, then I can actually spend a lot more time working on podcast episodes. So I know I haven't been the most regular recently, and people don't want to sign up for something if, you know, then you don't produce the content. So there's a commitment that I want to make going along with this, that I've already released, this would be the second episode this year. I've got two episodes that I've actually pre-recorded from ages ago and hadn't released. So those are already lined up ready to go. Then there's also a series of eight episodes, the longest series I've ever done, that I've been researching for actually a long time now. It's something I've been working on for well over a year about economic growth and development,

Starting point is 01:20:00 which is really exciting. And it does require some more work, but I hope that I'll be able to finish that by the end of this month, or maybe the start of February from when I'm recording this. Anyway, once I've got that, that will be 10 episodes that I'll release at roughly two-weekly intervals over the first few months of this year. I plan to have episodes released roughly every two weeks for at least the first half of the year.

Starting point is 01:20:22 But I'm hoping that if enough of you are able to be generous and contribute to some of the costs of the podcast by signing up on Patreon, then I'll be able to continue to produce podcasts at roughly that rate once every two weeks. and that's how often you would pay the subscription. And I would like to continue that indefinitely, basically. But to do that, I need your guys' help.

Starting point is 01:20:43 I need to be able to get enough to support myself and cover my expenses while studying, as well as some incidental fees with the podcast. But most of all, I'd love to be able to produce more content, and I could do that if I could cut down the amount I privately tutor and instead spend more time on my podcast. I prefer doing that, and it gets more content out to you guys, so it's a win-win.

Starting point is 01:21:02 So I'm really hoping that this can work out. But I need your guys help on this. And if you're not in a financial position to help, by all means, I perfectly understand. But if you are and if you really value the podcast, including the work I've done in the past and the work that I'm going to continue to put in over the course of this year, then I'd really appreciate your support. And I'm hoping to make this something that I can sustain longer term and just get much more quality content out for you guys.

Starting point is 01:21:27 So if you would like to support the show, visit patreon.com and search for the Science of Everything podcast, and you can become a subscriber there. Other ways you can support the show is also to go to the Facebook page and give the page a like, and you can follow updates and news about upcoming episodes. You can also send me an email. My address is FOD12 at gmail.com. That's FODS12 at gmail.com.

Starting point is 01:21:52 Please feel free to send questions, suggestions, or other feedback. I always like hearing from my listeners. So thanks for listening, and I'll talk to you next time.
