The Joy of Why - How Is AI Changing the Science of Prediction?

Episode Date: November 7, 2024

Scientists routinely build quantitative models — of, say, the weather or an epidemic — and then use them to make predictions, which they can then test against the real thing. This work can reveal how well we understand complex phenomena, and also dictate where research should go next. In recent years, the remarkable successes of “black box” systems such as large language models suggest that it is sometimes possible to make successful predictions without knowing how something works at all. In this episode, noted statistician Emmanuel Candès and host Steven Strogatz discuss using statistics, data science and AI in the study of everything from college admissions to election forecasting to drug discovery.

Transcript
Starting point is 00:00:00 Making predictions is a challenge woven into every part of our lives, often in ways we don't even think about. Will it rain this afternoon? How will the stock market respond to the latest news? What would mom like for her birthday?
Starting point is 00:00:50 Typically, we build up a knowledge base and a theoretical understanding, at least in science, and apply what we know to predict future outcomes. But that approach faces sharp limitations, especially when the systems to be analyzed are profoundly complex and poorly understood. I'm Steve Strogatz and this is The Joy of Why, a podcast from Quanta Magazine where I take turns at the mic with my co-host, Janna Levin, exploring the biggest unanswered questions in math and science today. For this episode, we're joined by mathematician and statistician Emmanuel Candès to ask, how are data science and machine learning helping us approach complex prediction problems like never before? And how confident or
Starting point is 00:01:35 skeptical should we be in their predictions? Can we figure out ways to quantify that uncertainty? Emmanuel Candes is a chair and professor of mathematics and statistics at Stanford University. His work lies at the interface of math, statistics, information theory, signal processing, and scientific computing. He's a member of the US National Academy of Sciences and has received a MacArthur Fellowship, a Kolotts Prize,
Starting point is 00:02:03 and a Lagrange Prize. Emmanuel, welcome to The Joy of Why. Thank you very much for having me. And since you mentioned the National Academy, let me start by congratulating you on your election. This is truly awesome. Oh, you're too kind. Thank you. I'm honored to be joining you and all of our other esteemed colleagues.
Starting point is 00:02:21 Well, so let us begin here by talking about something on the mind of just about everybody nowadays, machine learning models, we keep hearing so much about them. We know that they can pore through massive data sets and often pick up patterns that no human being could detect. But these models, people refer to them often as black boxes. And I'm just wondering, would you yourself use this phrase?
Starting point is 00:02:45 And if so, what do we mean by a black box? As you said, a machine learning algorithm takes as input data collected in the past and given a set of features, tries to make a prediction about an unknown label. So I have to say that the predictive modeling culture is as old as the field of statistics itself. Statisticians, starting with Galton and Pearson and Fisher, have been very focused on making predictions from data, but they use relatively simple models, models that could be analyzed mathematically, models that we teach at college,
Starting point is 00:03:25 for which you can sometimes provide reliable inference. But I don't think I need to tell you that now we're past this simple regression, that we're using deep learning, gradient boosting, random forests, a lot of techniques that have become very popular, sometimes in combination. And now this becomes so complicated that it's very difficult to analyze.
Starting point is 00:03:47 And we use the term black box to refer to algorithms that are so complex that they will resist analysis. There are, of course, a lot of theoreticians who try to understand what's happening in the black box. Thank you. Wonderful explanation. It's a whole new universe of statistics, it seems like. Absolutely, but it doesn't mean that we have to trash what we've done so far.
Starting point is 00:04:10 What my research group has been doing and what a lot of groups are doing worldwide at the moment is to try to get the output of these black boxes and treat them as statistical objects. And so we see a whole branch of statistics that is reasoning about the output of these black boxes without making any modeling assumption so that the results of analysis can be trusted, so that we can quantify uncertainty, so we can make reliable decisions. And so all the stuff like the p-values and the confidence intervals, they are present in one way or the other. The concept of p-values is essentially a measure
Starting point is 00:04:50 that quantifies how surprised you should be by a certain experimental outcome. And in the context of black boxes, if a black box makes a prediction, I can still ask how surprised I should be by this prediction. And so I need to be able to quantify the element of surprise. So I would like to be able to transform the prediction into what you refer
Starting point is 00:05:11 to as a p-value so that I can actually calibrate what comes out of the black box. So surprisingly, we do not have to abandon what we've been doing. Mostly we're moving towards a world where there are fewer parametric models, but the concepts of having outcomes that are well calibrated, of quantifying uncertainties, are still there. It's so interesting. I really like the way you put that, that it's sort of like the black box can stay black. We don't have to look underneath or inside the model to make sense of what's going on analytically. So it's like we are taking the old methodology, the old desires, of traditional statistics to quantify uncertainty and rebuilding the theory for this new world of these black box
Starting point is 00:05:59 models. Absolutely. And actually this rebuilding of the new world comes in many different flavors, but I'll give you an example. Let's imagine a world not too far in the future where people apply to colleges, for example, and colleges say, we see so many applications, we're going to outsource at least part of the decision process to a black box. And so let's say that now students apply to Cornell, your home university, and that somehow you decide to predict how well they will do at Cornell using a black box, right? And so the question is how calibrated are
Starting point is 00:06:33 these predictions? But what you can do is you can say, well, I've trained my model, and now I have reserved a set of students for which I know the outcome, and I can actually see how the black box predicts these outcomes. And now I can try to understand for what kind of students is the error large, for what kind of students is it small, what kind of accuracy do I get from this black box. And from that you can calibrate when you're going to use a black box to perhaps screen a few candidates. Because you have observed the mismatch between the black box predictions and the true outcomes on the test set,
Starting point is 00:07:12 then you can understand a bit the accuracy of the black box and what you can actually conclude. By observing the output of the black box on a group of students for which you have the outcomes or labels, you're able to actually produce not a point prediction of how the students will do, but a prediction interval that contains the true performance a prescribed fraction of the time. And when I say this, I say you do not have a model, you do not have a Gaussian distribution anywhere in sight, you're only using the fact that you draw students at random, you look at what the black box
Starting point is 00:07:50 does on the random subsets, and you use these observations to actually generalize to unseen students. And it's very much a statistical spirit, which is you collect the data of features of students applying and what the black box says about these students. You're learning from this to be able to say things that are valid into the future. Good. I really want to unpack this example. It's so provocative.
Starting point is 00:08:18 The language of features and labels, I think, is a little abstract maybe. So let me see if I get what you're saying. But so if I imagine a cohort of high school seniors applying to Cornell or Stanford, your institution either way, features might be things like their high school GPA, whether they played on a varsity sport, whether they are African-American or Latino or male or female, all kinds of things like this. You would call those features? Yeah, these are features. These are essentially what's in your application file. So these are what you know about the applicant that can be sort of digitized.
Starting point is 00:09:01 But I think we live in a modern world now and so a feature might be your essay. Because your essay will become a string of numbers; that's the revolution around large language models. And so that is also a numerical feature that you can use to predict how well you write English. What is the richness of your vocabulary? You know, there are lots of things you could use. Yes, but in terms of what the college might want to predict, just to make it simple, what if we said we want to predict the GPA upon graduation of the student? Or it could be even more simple, will the student graduate in four years? And so in this case, let's look at your first example, you want to predict the GPA after two years of undergraduate education.
Starting point is 00:09:43 And I can say, what does the black box say about these students? And so by looking at sort of the distribution of errors, that is, the difference between the true GPA of the students and the black box predictions, I might have a sense of the typical errors that the black box makes on a random student. And so when a new student comes in, I have a sense of the errors that I'm going to suffer. And I can sort of, instead of giving you just a point prediction,
Starting point is 00:10:15 I might give you an interval that likely contains the true outcome of the student. And to our surprise, it might be that for some applications or some students, this interval is short, so we're fairly confident of how well they'll do. And for others, it might be wide. What would the interval be in this example? One interval might be, I'm predicting 2.9 to 3.9, so the center is 3.4. And the other one, I'm predicting 3.3 to 3.5, and the interval is much shorter.
Starting point is 00:10:46 They have the same center, the prediction is the same, but the range is very different. And so if I'm an admission officer, this is something I'd like to know about my prediction engine. How accurate are you? What level of uncertainty is associated with your point prediction? If we're doing finance and I have an investment strategy and I say, Stephen, I can promise you a 6% rate of return. There's a very different scenario between 6% plus or minus 1% and 6% plus or minus 10%. You might lose a lot of money and a lot of customers if you're in the second category.
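(A minimal sketch of the holdout-calibration idea described here, known as split conformal prediction. The synthetic data, the gradient-boosting regressor and the 90% coverage level are illustrative assumptions, not details from the episode.)

```python
# Split conformal prediction: turn a black-box point predictor into prediction
# intervals with a coverage guarantee, using only a held-out calibration set
# and no modeling assumptions.  Data, model and the 90% level are assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d = 2000, 5
X = rng.normal(size=(n, d))                       # "features" (application data)
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.5, size=n)  # "label" (e.g., GPA)

# Split into training, calibration (students with known outcomes) and test sets.
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# 1) Fit the black box on the training set.
model = GradientBoostingRegressor().fit(X_tr, y_tr)

# 2) Look at how wrong it is on the calibration students.
resid = np.abs(y_cal - model.predict(X_cal))

# 3) Take the appropriate empirical quantile of those errors.
alpha = 0.10                                      # target 90% coverage
k = int(np.ceil((1 - alpha) * (len(resid) + 1)))
q = np.sort(resid)[min(k, len(resid)) - 1]

# 4) Report point prediction +/- q for unseen students.
pred = model.predict(X_te)
covered = (y_te >= pred - q) & (y_te <= pred + q)
print(f"empirical coverage on held-out students: {covered.mean():.2f}")
```

(This basic recipe gives intervals of the same width for every student; adaptive variants such as conformalized quantile regression produce the student-by-student widths described in the conversation.)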
Starting point is 00:11:22 All right, good. So this example that you've given, either in the context of finance or GPA, really does help underscore why we care, not just about means or what we might call point estimates, but also intervals within which we might have high confidence. I mean, anyone can see, I hope, how valuable it is to be able to make predictions of intervals, not just numbers. So if we could, I'd like to move now to another real world example outside of the collegiate setting having to do with election forecasting. Just to be clear for our listeners, we are recording this podcast a few months before the 2024 US elections, but this episode we predict, if you'll pardon the pun, will air sometime right
Starting point is 00:12:07 in the aftermath of the election. So I'm sure this is something very much on the minds of our listeners. I know you have worked in this area and your students also. The question is, what insights can you give us into some of the complex models that are being used to forecast our elections? So, perhaps first I should be clear, I don't have really firsthand experience with forecasts of elections.
Starting point is 00:12:29 I'm working with students at the Washington Post, where there's a data science desk, and they actually do the work. And I'm just going to be a messenger for this part of the conversation, if that's all right. I would like to give some credit to the young people who are involved in this. And I also feel like you may be a little bit modest, as is a nice quality that you have.
Starting point is 00:12:49 But isn't it true that Lenny Bronner and Stanford undergraduates who were working, I mean, at least in Lenny's case for the Washington Post, didn't they build on some of these techniques that you helped develop? That is true. But as you know, when you actually work in the trenches on something of consequence, such as predicting the outcome of an election, even though the general principles are in some of the papers we wrote, there's still an enormous amount of work that they've done to make it all work. Okay, good. Thank you. So what a news organization
Starting point is 00:13:24 will try to do, essentially, is this: some polls close, some precincts are reporting, and some counties begin to report. In fact this is a very cool problem, because the ballots are already in the box, so to speak, and you have not opened the box yet, and you'd like to know what's in there. And a lot of the statistical work that is ongoing, for example, at the Washington Post, which is the organization I know best, is that they're trying to predict unreported counties. And so instead of giving their viewership a point estimate of, well, Santa Clara will vote this way, you can tally up the forecasts for unreported counties, aggregate them at the state level, and have a very nuanced picture of how California will vote.
Starting point is 00:14:11 Now, how is this done? So, obviously, we're going to need to predict how counties are going to vote, and this is going to be based on a lot of features. Is it a predominantly urban county? Is it a rural county? What's the level of education? What are the socio-economic variables associated with the counties? And most importantly, how did the county vote last time? And so you're using all these features, you're trying to learn a model that can predict accurately how counties are going to vote. And that's a black box, if you will, except that they use models that are not too complicated, from what I've seen, that
Starting point is 00:14:48 are fairly simple. But then the second part is calibration, because you cannot just go on air and say, oh, California will vote this way, when in fact it's just a point estimate. This has enormous consequences if you get it wrong. And so what they will do is they will report a range of possible outcomes for the state of California that is dynamically updated as the election goes along, that truly reflects their knowledge about what they think will happen when the vote has been completely tallied. And so it's very cool what they're doing, because they're really projecting errors, they're projecting uncertainty. And you can see that their uncertainty band, of course, narrows as more and more counties become reported, and they're fairly
Starting point is 00:15:39 faithful. They are backtesting them, as we say in the field: they are saying, okay, let's see how this model would have worked in 2020, and they want to make sure that the intervals that they project contain the true label, the true votes, the prescribed fraction of the time. And so it's all engineered very well, and I think kudos to the Washington Post for being so respectful of their readership in not just giving you point estimates, but a real sense of the accuracy of their point estimates. Now, just to be dead clear about this, we're not talking about forecasting the election based on polls a year in advance or anything like that.
Starting point is 00:16:21 This is election night forecasting based on the results that are coming in. Exactly. So the reader has to imagine that basically there are ballots in a box somewhere and the only thing is that the box has not been opened yet. But I've seen similar boxes open elsewhere in other counties, other precincts, and I'm going to use this knowledge to make a prediction about what's in this box. And it's going to be a very well calibrated prediction following the principles we laid out earlier. And you do have the right to use polls as features, as predictive variables in your
Starting point is 00:16:59 model. I suppose you could. I think a lot of people out there may be skeptical of polls. We've seen how difficult it is to do polling, but then again, the model may take that into account. Maybe it doesn't assign much weight. Exactly. The model will take this into account. Now, what's kind of tricky about polls is that polls might be different in different counties. Typically, when you fit a statistical model, you'd like the features to be the same for all units in your data set.
Starting point is 00:17:25 So going back to the example we had about college admission earlier, right? Everybody has a high school GPA. Everybody has a yes-no answer to, are you on a varsity team? And so what might be tricky about using a poll as a feature is that some counties might have it, others may not. And so you have to be careful about this. Good. All right.
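(A toy sketch of the general recipe just described: fit a simple model on counties that have already reported, set an interval width from its residuals on those counties, and aggregate to a statewide projection. This is not the Washington Post's actual methodology; the features, the numbers and the conservative worst-case band are invented for illustration.)

```python
# Toy election-night sketch (NOT the Washington Post's model): predict
# unreported counties from county-level features, calibrate an interval width
# on counties that have already reported, and aggregate to a statewide total.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n_counties = 500
# Illustrative features: share urban, education index, last election's vote share.
X = rng.uniform(size=(n_counties, 3))
share = 0.25 + 0.2 * X[:, 0] + 0.3 * X[:, 2] + rng.normal(scale=0.03, size=n_counties)
votes = rng.integers(5_000, 200_000, size=n_counties)

reported = rng.random(n_counties) < 0.4            # 40% of counties have reported so far
model = LinearRegression().fit(X[reported], share[reported])

# Calibrate: the spread of errors on reported counties sets the interval half-width.
resid = share[reported] - model.predict(X[reported])
q = np.quantile(np.abs(resid), 0.95)

pred = model.predict(X[~reported])
known = np.sum(share[reported] * votes[reported])   # votes already tallied
total = votes.sum()
point = (known + np.sum(pred * votes[~reported])) / total
lo = (known + np.sum((pred - q) * votes[~reported])) / total   # conservative band
hi = (known + np.sum((pred + q) * votes[~reported])) / total
print(f"projected statewide share: {point:.3f}  (roughly {lo:.3f} to {hi:.3f})")
```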
Starting point is 00:17:47 Let us take a little break here and we will be right back. Welcome back. We've been speaking with Emmanuel Candes about statistics, prediction models, and the inherent uncertainties in them. So let's move along to another real world example. I'm thinking here in the context of medical applications of prediction models, drug discovery, that are of course very important with life and death consequences.
Starting point is 00:18:13 So for example, there's a move to generate artificial data using artificial intelligence to increase our sample size. That sounds kind of hard to imagine that that could work, but apparently it can be a helpful strategy. So what you're asking is very, very interesting, and I think you're touching again on the future of statistical science as a discipline. Statistics has always been an empirical science that tries to make sense of the world around it, and so now we're dealing with generative AI, or extremely fancy machine learning algorithms. So to understand drugs, we started in vivo, like we would just inject people
Starting point is 00:18:51 with stuff. Then we did this in vitro. And now we're moving in silico, as you point out, right? Which is that now we want to use algorithms to make predictions about what drugs will do. And so let's say you're a big pharma company, then you're sitting on a huge library of compounds, and it can be 400 million, 500 million, and you would like to know which of these compounds will actually bind to a target. So what do you do? Well, you could take your compounds one by one and test experimentally whether they bind to your target. But as you can imagine,
Starting point is 00:19:37 this takes an enormous amount of time and money. So now people are using machine learning to guess whether they will bind. And in the past few years, we've seen things like AlphaFold. We've seen a lot of models that try to predict the shape of a compound from just sequences of amino acids, for example. Now that will not replace physical experiment, but what machine learning does in this instance
Starting point is 00:20:02 is going to prioritize the compounds that you should try first. One of the things we do in this area is to say, okay, we're going to train some extraordinarily fancy models. And they're really black boxes. I mean, they're so complicated, I have no idea what they do really. But they produce an affinity score, an affinity of a compound for a target. And I say, can I trust this? And so without any statistical model, just looking at what the algorithm predicts on molecules on which it wasn't trained, we were able to select a data-adaptive threshold, if you will, that says that if you select all these
Starting point is 00:20:41 molecules whose predicted affinity is above this threshold, you're guaranteed that 80% of what I'm giving you is actually of interest to you. Downstream, you will do some real experiments on some real things, but here what's very exciting is that AI can really speed up the prioritization of drugs that should be passed on to a lab. Oh, it gives a whole new meaning to the concept of an educated guess. These are now brilliantly educated guesses. They still have to be tested, as you say.
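(A deliberately simplified sketch of the screening idea: on held-out molecules with known binding labels, pick the most permissive score threshold whose empirical precision is still at least 80%, then apply it to new compounds. The actual methodology, conformal selection with false-discovery-rate control, is more careful than this; the scores and labels below are synthetic.)

```python
# Simplified screening sketch: find the loosest affinity-score threshold at
# which, on labeled calibration molecules, at least 80% of the selected
# compounds truly bind, then flag everything above it for lab follow-up.
import numpy as np

rng = np.random.default_rng(2)
n = 5_000
binds = rng.random(n) < 0.1                          # ~10% of compounds truly bind
score = rng.normal(loc=np.where(binds, 1.5, 0.0))    # black-box affinity scores

def precision(threshold):
    sel = score >= threshold
    return binds[sel].mean() if sel.any() else 1.0

# Scan thresholds from strictest to loosest; keep the loosest one that still
# gives at least 80% true binders among the selected calibration molecules.
chosen = score.max()
for t in np.sort(score)[::-1]:
    if precision(t) >= 0.8:
        chosen = t
    else:
        break

selected = int(np.sum(score >= chosen))
print(f"threshold {chosen:.2f} flags {selected} compounds for lab follow-up")
```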
Starting point is 00:21:10 They still have to be tested. Now, there's another thing which is perhaps this time a bit more scary, which is, what if we use generative AI to build what people might call digital twins, things that are not physical but can be generated by generative AI. And so here there's a new line of research. So for example, suppose I want to study statistical properties of some drugs, right, and the problem is I have too few samples. Let's say I want to estimate which fraction of drugs have a certain property, and the problem is I have a lot of sequences of amino acids for which I have not measured the property. And as you can
Starting point is 00:21:51 imagine, the tendency is to use a predictive model, a black box, and replace the real measurement with a prediction, and then pretend that it's real data. And then average these predictions and say that's the overall fraction of drugs that have the property, and that's wrong, because this method introduces biases. We want to use this predictive model. We want to use GenAI to fill in missing data, to possibly create new data sets. But at the same time, we need to understand how we can remove the biases to draw conclusions that are scientifically valid. Let me give you an example. Let's say I just want to estimate the mean of a random variable. Let's call this Y. And I have some features, let's call them X.
Starting point is 00:22:44 And so what I could do is fit a model to predict Y from X, and now I can replace the true label, the true amount, by the prediction when I don't have it, and I could average those. But they're going to be biased. But guess what, I can remove the bias, because I have an estimate of the bias from the labeled data you gave me. And so if I do this correctly, I can effectively augment the sample size enormously. If my prediction has reasonable accuracy, then it's as if I had a sample size which is much bigger. And so the level of accuracy of what I can tell you is much higher.
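(A minimal sketch of the debiasing idea just described, in the spirit of what is known as prediction-powered inference: average the black-box predictions over everything, then subtract the bias estimated on the small labeled subset. The numbers are synthetic assumptions.)

```python
# Debiasing sketch: estimate a mean from many machine predictions, then
# correct it with the bias measured on a small labeled subset.
import numpy as np

rng = np.random.default_rng(3)
N, n = 100_000, 500                    # many unlabeled samples, few labeled ones
y = rng.normal(loc=2.0, scale=1.0, size=N)            # true, mostly unmeasured property
f = y + 0.3 + rng.normal(scale=0.5, size=N)           # predictions, biased upward by 0.3

labeled = rng.choice(N, size=n, replace=False)        # the few samples we actually measure

naive = f.mean()                                      # pretends predictions are real data
bias_hat = (f[labeled] - y[labeled]).mean()           # bias estimated from labeled pairs
debiased = f.mean() - bias_hat                        # corrected estimate of the mean

print(f"naive mean of predictions: {naive:.3f}")
print(f"debiased estimate:         {debiased:.3f}")
print(f"truth:                     {y.mean():.3f}")
```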
Starting point is 00:23:29 Well, I can't resist asking you since it's such a rare treat for us. You're very well known for contributions to an area that people call compressed sensing. And I don't know if it exactly fits into our discussion today, but I feel like I want to ask you to tell us how does compressed sensing and maybe its applications to medical imaging, to MRI or other things, does that fit into the framework we're talking about? And even if it doesn't, could you tell us a little about it? It doesn't fit directly. I think compressed sensing is the fact that sparsity is an important phenomenon. So what we're seeing at the moment is people measure everything under the Sun because we don't know ultimately what will matter, right? And so
Starting point is 00:24:16 we need people like you and me to sift through what matters. What compressed sensing says is that if we measured a lot of things but if only a few things matter and if we use the right algorithm of the kind suggested by compressed sensing theory then we should be able to build a very accurate predictive model. Like we will understand that a lot of variables have no business in predicting the outcome and it will quickly focus on variables that have something to say about the outcome and then build a good predictive model from then on. So you've been using the word sparsity.
Starting point is 00:24:53 In this context, does it mean all those variables that don't matter, we can effectively set their contribution to zero? Exactly. So it's saying that in this case, just for our audience, it might say that even though I measured a million genetic variants, the distribution of the phenotype does not depend on all million of them. It may depend on 20, on 30. That's sparsity. And so the question that compressed sensing asks is, when something depends on a few but unknown variables from a long list, how do you go about finding them? So the technique or the method will identify which are the key 20 or
Starting point is 00:25:33 whatever small number it is? Exactly, exactly. Let's think about this as almost a matrix problem, right? So I have a matrix, it has a million columns because these are all the genetic variations. And then I have a response, Y, and these are the rows of this matrix. If I want to solve a system Y equals AX, like which genetic variations matter to predict Y, well, classical theory will say that, well, I need as many people as I have unknowns. But compressed sensing theory says, no, that's not true. Because if you know ahead of time
Starting point is 00:26:09 that only a few of these genetic variations matter, then you can deal with fewer people. And that's why we can develop predictive models for phenotypes that do not need a million patients. It's wonderful. One of the big issues that seems to be everywhere in science these days is a crisis of reproducibility. And I just wonder if you have statistical comments for us about that.
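(A minimal sketch of the sparse-regression idea Candès describes above: with far more candidate variables than samples and only a handful that matter, an L1-penalized regression, the Lasso, one of the estimators compressed-sensing theory points to, can recover them. The data below are synthetic; none of the specifics come from the episode.)

```python
# Sparsity sketch: 2,000 candidate "variants", only 10 of which matter, and
# just 200 "patients".  The Lasso still finds most of the relevant ones.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
n, p, k = 200, 2_000, 10
X = rng.normal(size=(n, p))
beta = np.zeros(p)
truth = rng.choice(p, size=k, replace=False)          # the few variants that matter
beta[truth] = rng.normal(scale=2.0, size=k)
y = X @ beta + rng.normal(scale=0.5, size=n)

fit = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(fit.coef_)
hits = len(np.intersect1d(selected, truth))
print(f"selected {len(selected)} variants; recovered {hits} of the {k} that matter")
```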
Starting point is 00:26:34 Yeah, it's very interesting that you asked this. I think, first of all, I will make an observation about the reproducibility crisis. It occurs at a moment where people have enormous data sets at their disposal, usually prior to the formulation of scientific hypotheses, and access to extremely fancy models that depend on billions of parameters. And so I would say to start with that it's not a
Starting point is 00:27:01 coincidence that this crisis occurs at this time because I give you a data set, you believe it's gold, you're going to try a model, it doesn't pan out and you're going to try something else. And so you're fine-tuning parameters, you're fine-tuning a lot of things until something clicks and there's nothing wrong with that. But I think what we need to do as statisticians, and there's a lot of us that are working on things like this, is how can we build safeguards around the freedom you have in selecting models, parameters such that at the end of the day, the discoveries you claim have a chance of being reproduced by, let's say, an independent experiment.
Starting point is 00:27:43 The statistical community is developing a lot of methods so that when you think you have something, you really do have something. And so this is a very exciting moment for the field to develop methods that are not really quantifying the uncertainty in your prediction but actually calibrating in such a way that when you report findings, we make sure that a good fraction of what you're reporting is correct. Well, I would like to sort of now back out to a broader, like even societal scale, to think about education just for a minute. Every learned or educated citizen
Starting point is 00:28:21 should know something about the ideas of probability and statistics, including in their modern incarnation that we've been talking about. And I wonder if you have thoughts about this, what we could be doing as either educators or communicators to promote greater statistical savvy. That's a good question. I think what I see at lower levels of statistics teaching is a reliance on formulas, you know, which formula should I apply when, and I think that's not helpful. As a student, I learned, of course, mathematical reasoning, and that was important. And then a bit through high school and college, I learned physical reasoning, and that's distinct from mathematical reasoning, and it's extremely powerful.
Starting point is 00:29:09 But in grad school at Stanford, I learned about this new thing called inductive reasoning, which is neither of the first two. And I think we need to be doing a good job at teaching this at an early stage. What is inductive reasoning? It's the ability of making generalization out of particular observations. And how do we do this? Okay, so I would promote an approach which is not too mathematical in nature, which is trying to make kids understand how it's possible to generalize from a sample to a population to individuals we haven't seen
Starting point is 00:29:41 yet. And what makes this possible? There's a bit of a tension between fields. Should we go towards more mathematics, or should we go more towards CS and where AI is mostly taking place? I think there's a danger of losing the ability to reason statistically if we go either too much towards math and too much towards CS. It might be a bit abstract what this is but I find a statistical reasoning extremely powerful, extremely beautiful because I don't want to talk about it in general. I'll give you one problem and it's a famous thing that happened in the 30s, I think. I think Corbett was studying butterflies and he went to Malaysia
Starting point is 00:30:28 for a year. And he was a very conscientious man, so every day he would try to observe species of butterfly. And he wrote in a notebook, these species I've seen once and these species I've seen twice and these species I've seen three times and so on and so forth. And he came back to England and he approached one of the founding fathers of the field, R.A. Fisher, and he asked, if I go back to Malaysia for six months, how many new species am I going to see? This is the kind of question, it's different from math. The answer is not in the question. And I don't think deep learning can be very helpful. And that's what statisticians do.
Starting point is 00:31:14 And this is a very modern question, which is that you have a lab and they're looking at cancer cells. And they're going to do exactly the same thing. This is how many cancer cells I've seen once. This is how many cancer cells I've seen twice. And they ask, how many cancer cells have I not seen yet? And if I continue looking for cancer cells for six months or a year or two years, how many new types am I going to expect to see? So this is what you learn when you study statistics. And I find it fascinating. Oh, well, that's just great.
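(For the curious, a sketch of one classical answer to this unseen-species question. Fisher's original reply to Corbet used a parametric log-series model; the nonparametric Good-Toulmin estimator shown here is a later descendant of the same idea. The frequency counts below are illustrative, not Corbet's actual notebook.)

```python
# Good-Toulmin sketch for the "how many new species will I see?" question.

def good_toulmin(freq_counts, t):
    """Expected number of NEW species if sampling effort is extended by a
    factor t (t=0.5 means half again as much collecting).
    freq_counts[k] is the number of species seen exactly k+1 times so far."""
    return sum((-1) ** k * t ** (k + 1) * n_k
               for k, n_k in enumerate(freq_counts))

# e.g. 118 species seen exactly once, 74 seen twice, 44 three times, ...
counts = [118, 74, 44, 24, 29, 22, 20, 19, 20, 15]
print(f"expected new species from six more months: {good_toulmin(counts, t=0.5):.0f}")
```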
Starting point is 00:31:47 It's really interesting to hear about the culture of statistics, how it's distinct from that of math or computer science. Because nowadays with the rise of what people are calling data science, there's a kind of muddying of the waters. Who owns statistics? Why are we calling it data science? Why isn't it statistics? I'm sure you have an opinion about this.
Starting point is 00:32:07 Of course, because there are lots of activities in data science that you will not find traditionally represented in the stats department. So I have a colleague, Jure Leskovec, and he's a very recognized data scientist. Cornell PhD. Exactly, exactly. He's a brilliant person.
Starting point is 00:32:24 And so when COVID hit, people were calculating this R number, like the model where you're susceptible, exposed, infected, recovered, and you have these differential equations, and you know, and if the R number is greater than one, we have a problem, stuff like that, right? And so this is a very macro model. And what Jure Leskovec did was create an enormous digital data set. He tracked 100 million Americans in all major US cities. And so he would see where they would go during the day, and where they'd come home at night. And so instead of fitting the epidemiological model everybody knows at the global scale,
Starting point is 00:33:05 which doesn't really make sense, because the behavior in California and the behavior in Florida were very different, you can fit it at the level of nodes on a graph. And so you're going to fit a model which is adapted to the mobility of the people where you are. And that is data science, because what Jure did, which you will not see in a stats department, is that he basically tracked 100 million people for a few weeks. I would like to claim that I have some colleagues in the stats department who do something like this, but I cannot name any. And that is modern data science. This is not something I typically see in a stats department. So my position is quite clear on this. Data
Starting point is 00:33:46 science is much bigger than the traditional field of statistics, but statistics is one of its intellectual pillars. Oh, I'm so glad I asked you about that. I hit a gold mine with that one. But all right, you've already expressed your fascination with statistical thinking. Is there something in your research that brings you particular joy? Yeah, I think so. My job at Stanford is unique in the sense that the students I get to work with are phenomenal. I feel that it keeps me young.
Starting point is 00:34:20 It keeps me alert. I don't fall asleep because I just have to catch up with them all the time. And I feel that it's strange to say this on air, but I'm going to age better because of this, because like mentally, physically, they keep me fit. And it's a joy to see them develop, become great scientists. Last year, I had two former students who received the MacArthur Fellowship in the same year, so the students I've got to work with are tremendously accomplished, and so it's just a privilege. It's a privilege to feel so much energy, so much enthusiasm for the subject, and selfishly I would say that it's good for my health.
Starting point is 00:35:06 Well, thank you very much. It's been really fun to talk to you. We've been speaking with mathematician and statistician Emmanuel Candès. Thanks again for joining us on The Joy of Why. Thank you for your time. It's been a pleasure. Thanks for listening. If you're enjoying The Joy of Why and you're not already subscribed, hit the subscribe or follow button where you're listening. You can also leave a review for the show. It helps people find this podcast. The Joy of Why is a podcast from Quanta Magazine, an editorially independent publication supported by the Simons Foundation.
Starting point is 00:35:53 Funding decisions by the Simons Foundation have no influence on the selection of topics, guests, or other editorial decisions in this podcast or in Quanta Magazine. The Joy of Why is produced by PRX Productions. The production team is Caitlin Fault, Livia Barak, Genevieve Sponsler, and Merritt Jacob. The executive producer of PRX Productions is Jocelyn Gonzalez. Morgan Church and Edwin Ochoa provided additional assistance.
Starting point is 00:36:21 From Quanta Magazine, John Rennie and Thomas Lynn provided editorial guidance with support from Matt Carlstrom, Samuel Velasco, Arlene Santana, and Megan Wilcoxon. Samir Patel is Quanta's editor-in-chief. Our theme music is from APM Music. Julian Lynn came up with the podcast name. The episode art is by Peter Greenwood, and our logo is by Jackie King and Christina Armitage. Special thanks to the Columbia Journalism School and Bert Odom-Reed at the Cornell Broadcast Studios. I'm your host, Steve Strogatz.
Starting point is 00:36:58 If you have any questions or comments for us, please email us at quanta at simonsfoundation.org. Thanks for listening. From PRX.
