Decoding the Gurus - Open Science, Psychology, and the Art of Not Quite Claiming Causality with Julia Rohrer

Episode Date: January 30, 2026

In a rare departure from our usual diet of online weirdos, this episode features an academic who is very much not a guru. We're joined by Julia Rohrer, a psychologist at Leipzig University whose work straddles the disciplinary boundaries of open science, research transparency, and causal inference. Julia is also an editor at Psychological Science and has spent much of the last decade politely pointing out that psychologists often don't quite know what they're estimating, why, or under which assumptions.

We talk about the state of psychology after the replication crisis, whether open science reforms have genuinely improved research practice (or just added new boxes to tick), and why causal thinking is unavoidable even when researchers insist they are "only describing associations." Julia explains why the standard dance of imply causality → deny causality → add boilerplate disclaimer is unhelpful, and argues instead for being explicit about the causal questions researchers actually care about and the assumptions required to answer them.

Along the way we discuss images of scientists in the public and amongst the gurus, how post-treatment bias sneaks into even well-intentioned experimental designs, why specifying the estimand matters more than running ever-fancier models, and how psychology's current norms can potentially punish honesty about uncertainty. We also touch on her work on birth-order effects and offer some possible reasons for optimism.

With all the guru talk, people sometimes ask us to recommend things that we like, and Julia's work is one such example!

Links

Julia Rohrer's website
The 100% CI blog
Rohrer, J. M. (2024). Causal inference for psychologists who think that causal inference is not for them. Social and Personality Psychology Compass, 18(3), e12948.
Rohrer, J. M., Tierney, W., Uhlmann, E. L., DeBruine, L. M., Heyman, T., Jones, B., ... & Yarkoni, T. (2021). Putting the self in self-correction: Findings from the loss-of-confidence project. Perspectives on Psychological Science, 16(6), 1255-1269.
Rohrer, J. M., Egloff, B., & Schmukle, S. C. (2015). Examining the effects of birth order on personality. Proceedings of the National Academy of Sciences, 112(46), 14224-14229.
BEMC May 2024 - Julia Rohrer - "Causal confusions correlate with casual conclusions"
Dr. Tobias Dienlin - Less casual causal inference for experiments and longitudinal data: Research talk by Julia Rohrer

Transcript
Starting point is 00:00:29 A podcast where a psychologist and an anthropologist of sorts look at online gurus and assorted weirdos. But occasionally we don't do that, and we sometimes try to speak to people who we think are actually doing good things and have interesting approaches to topics. And so we have an interview today and a guest joining us,
Starting point is 00:00:56 who is Julia Rohrer, an academic Matt and I are both fans of, who works in the Wilhelm Wundt Institute for Psychology at Leipzig University and has been active in the open science movement for as long as I have paid attention to it, and now more recently does a lot of work on causal inference. So we are having you on, Julia, not as an intervention for emerging guruness, but rather to demonstrate that not all academics are the terrible bastards the gurus keep saying they are. So yeah, thank you. Thank you for having me.
Starting point is 00:01:34 And I was a bit concerned when you talked about online weirdos that this introduction would go in another direction. I know, it's a perfectly accurate introduction. Thank you. So, Julia, you've been working in causal inference and the whole area of, I guess, reforming psychology, educating psychologists, trying to get us to do better methods. Do you still feel optimistic and excited about the future of the field?
Starting point is 00:01:58 or are we wallowing and beyond salvation? You're already starting with a mean question. I think maybe like 10 years ago, when I was very much into open science, I wrote a blog post about how the field wasn't doing great and I was still very optimistic about the future. I'm not sure how optimistic I am now. I think we are making good progress.
Starting point is 00:02:27 But as I'm getting older, I think I see more fundamental conceptual issues, which is also related to the causal inference stuff, and that does make me sometimes wonder, right? I see all these manuscripts that get submitted to Psychological Science. I'm not entirely sure we all know what we are doing here and why. So I'd say it's mixed, but it also depends on the angle whether I come off as optimistic or pessimistic. I guess my takeaway there is that the causal inference
Starting point is 00:02:58 stuff can be slightly black-pilling. But whenever I've heard you, Julia, talk about the open science movement, you know, like you say, it depends on the framing and the audience, whether you want to frame it positively or negatively. But for those in our audience, I think most people would understand the replication crisis and the concern about methodological reform in psychology and other sciences and social sciences. It's been now, like, 14 years since Daryl Bem's Feeling the Future paper, and the open science movement is now a standard fixture. Badges are attached to lots of journals and stuff. So in terms of open science, setting aside causal inference for the moment,
Starting point is 00:03:46 how about that? Is there cause for optimism there in the spread of open science and methodological reform? Or is it similarly a story of half measures? I would be more positive on that side, I guess. So it has been like 14 years. So actually I started my undergraduate degree when all of this started in 2011. And I think there has actually been a lot of change to training, to practices, and also to what's just the default norm. So if you're now like, I'm not going to show you my data,
Starting point is 00:04:19 that's none of your business, that is going to raise eyebrows now. And I think that is huge progress. And sometimes it's easy to forget that, because you would think, oh, by now all data get shared by default if it's possible, and so on. And that is not the case. However, we now have a norm that people understand that data ought to be shared, and it's normal to request that, and it's okay to check and so on.
Starting point is 00:04:42 And if you compare that to other fees, I think, where it's still like, oh, why would I give my data to anybody else, right? Like, this is not how it works. So I think there have been, like, huge shifts in norms, and it's easy to forget about them because I think it's also kind of easy to forget how bad things used to be. And so sometimes there's like a really bad paper where you can just tell, okay, they tortured the data until they found something across 20 studies. And the hypothesis is why to begin with, right? And then we're all like, oh yeah, let's party like it's 2010.
Starting point is 00:05:12 Because we know that this doesn't like fly anymore at the good outlets at least. And so I do think there has been like huge progress and it's hard to donate. It has taken 14 years. So the question is whether that's quick or slow. I don't know. I mean, it has been a large chunk of my life, but I think it's fairly like fast because scientific norms are incredibly sticky and slow to change. I think there's one small cause for optimism there, and it's a very personal one, which is that I think for me, one of the obstacles to sharing the data and also making the research paper fully replicable in sharing all of the code that produces the key results and that triggers and so on, was just that it was just a lot of extra work to.
Starting point is 00:05:57 to document it and make sure it's all presentable rather than the sort of hacked together thing that most scientists do. But I've actually found that a little bit of help from the old AIs there actually takes a huge burden off. They can quite quickly create, you know, very presentable scripts from your things. You can check for yourself that, yes, it's produced the same results. And boom, done. It doesn't have to be a huge job, right? Yeah, I think the technology has gotten a lot more accessible. So the open science framework is part of that as well. But I think also like training has shifted a lot. So I think like a famous Leipzig anthropologist Richard McAryth sometimes likes to talk about how when he was like a student, you know, you could maybe like run a T-test
Starting point is 00:06:43 or something like that. And now I think the next generation really does have a much higher skill level also when it comes to all things, computational stuff. And so and I can see that like during my own career. Like I learned no art during my studies and now, and then at some point we started introducing it and the master students we had to teach them. Now they are taught by default in statistics. And so the level just keeps rising and a lot of things where maybe some years ago I would have said, well, now students don't need to bother with stuff like version controls, that your code is fully documented in all steps. And other students are asking me, like, how does this good thing work? Can you explain that to us and so on? So I think that is really accelerated. And now, of course,
Starting point is 00:07:25 them. So I struggle a lot, for example, with version control, which ensures that you can reconstruct how you changed your code over time and so on. And Git is kind of very confusing, but it's the industry standard thing to do that. And that is made so much more easy by LLMs that can just like answer your stupid questions in a manner that online forums just can't because people tend to presume more knowledge than I tend to have. And so it's become so much easy. And you can see that is, of course, I'm like a big part of it, making it just easy. And if it's easy, people, we'll do it. It's a big challenge, though, isn't it?
Starting point is 00:08:03 Like, in terms of teaching psychologists and social scientists, even, God forbid, anthropologists, how to do the technical stuff, notably statistics. Because what happens still where I am is that students get taught a, you know, a grab bag of recipes, including tea tests and multiple regression and so on, that are kind of, you know, turn the handle and get the answer. Assumption testing is taught like a checklist. And you know the complaint. It's not taught as it would be taught to a mathematical statistician.
Starting point is 00:08:36 But of course, it can't be taught properly because these are social scientists, right? Their main field of study is psychology or something similar. And the statistics and the technical stuff is like an extra thing. And it's probably not, in many cases, it's not the skill set. that they may, you know, have, or it's not the passion that drew them to the field of study. So I feel like we've got a challenge there in terms of how much people can be taught. Well, what's your take on that? So, first of all, I'm not sure whether mathematical statisticians are trained in a manner
Starting point is 00:09:12 that they are better prepared to actually apply statistics to answer substantive research questions. Oh, that's true. So sometimes there are those, like, studies with completely wacky model, fitting, whatever. And then they actually had a mathematician, right? And it's like, yeah, I mean, the mathematician is like the one responsible for making the mathematics, like, work, but it doesn't mean that they make sense. True.
Starting point is 00:09:34 And but, yeah, in any case, I still hear about that a lot. And in particular, so it's early career research are starting to teach and they learned all these things. Oh, wait, it's all a linear model. So we don't need to teach these 100 things and flowcharts and so on. But then they are in a context where you have to stick to a certain curriculum. and it's like set up in a certain way. That being said, I think this is changing quite a bit.
Starting point is 00:10:00 At least in Germany for psychology, I can observe it that a lot of the courses have shifted the structure. So it's less like, I mean, we're still teaching T-tests and so on here in Leipzig, but there's always like an eye on, okay, now that's the broader model and that's the broader model. And this is how to think about it. And this is what P-values really mean and so on. So I think it has improved a lot. I also think there is, for example, I'm teaching. research methods for undergraduates right now.
Starting point is 00:10:26 And this is a distinct lecture from statistics, but it's the same course. And there aren't great materials actually to teach that, I would say. So it's a lot of like, okay, so here is a list of features of experiment, and here is a list of this, and here is a list of that, and so on. And I kind of feel like we are like missing like a cohesive perspective on that. So for example, when I teach my lecture, it's essentially all causal inference. I mean, in the beginning, there is some, there is some, like, theory of science and so on, which is a fun part.
Starting point is 00:11:00 But then it's a lot of, okay, so we need a causal angle on the world for everything, not just if you want to make causal claims, but for example, if you just want to document, oh, how many percent of people are depressed, you still need to kind of understand, like, what generated the data, like, what affects how people respond to the questionnaire, what affects whether they end up in your survey in the first place, right? like depressed people are maybe less likely to participate in service and so on. And so I am personally trying to build something where it's like kind of like it's all like that causal perspective and you draw like a little graph and you write down the things in the world that you think that matter and how they affect each other and try to reason with that. But it's not been like spelled out anywhere in a textbook. So this is just like making it up as I go. And I think if there was a very nice textbook that taught this approach,
Starting point is 00:11:50 I think people would start using it because people like. using things that just work out of the box. So there is some stuff missing there. I also do see a lot of good statistics teaching, actually, and I actually think that statistics teaching is on average way ahead of the practices of the papers that I see submitted to journalists. So I actually think like the other parts may be lagging behind more than teaching. I find that when I went through like statistics instruction,
Starting point is 00:12:21 as a graduate student because I transferred from anthropology. I did a second master specifically to do like basic quantitative analysis. And the course that I took there, they were fine. They were what Matt's describing the kind of old approach, right? Here's how to do a novas, here's how to do T-tests. And then after I finished my PhD or during it, I got interested in, you know, open science and so on and took Daniel Lekins online and other resources that were online, like online material. And probably if I didn't have the foundation of being forced to sit the statistics lectures,
Starting point is 00:13:00 it would have been more of a struggle. But I did kind of go back then and understand what I had been told because I was putting it in the context of like here is basically like causal entrance and like scientific theory around things. And I teach research methods and intro stats now as well. And I know this is getting to the causal inference part a little bit. But I find that in teaching students about papers, not design, but like reading the papers, in almost all psychology papers, you can write like X to Y as the that's what they're doing in the paper. Whether they say it's causal or not, like that is what.
Starting point is 00:13:47 there and students learning to identify what is the X to the Y being claimed. And then they've operationalized that in like a very specific, sometimes in seeing way. But in my case, I'm teaching psychology students and sometimes they haven't done that, even though they've been through foundation courses. So it's kind of interesting because they have all this knowledge about psychology, but not very much about the building block approach, I guess I would put it out. And this is in Japan. I'm in Japan, not the Frosciate on the rest of the courses in Japan. It could just be the students of high courses. But yeah. I think there's a very funny thing happening there. So I teach first
Starting point is 00:14:34 year undergraduates, right? They are not even learning inferential statistics yet. It's like next semester and so on. And for them, when I'm like, yeah, and usually people are interested in how things affect each other that is like the bulk of research. It's not everything. It's maybe not even the most important part, but it is a large of part of the studies in psychology. And so, of course, you're interested in how this affects that and that is your research question. Yeah, yeah, sure, of course, of course. Of course, that's what we're doing here. It makes perfect sense to them. But then if you have like master students who have spent like years immersed in that literature where you're like, oh, no, we are not interested with the X effects. Why? We are interested
Starting point is 00:15:11 and whether there's contingentries between subject changes in X on Y and so on, they get really confused. And I mean, they mostly just get confused. But some of them go so far as to absorb that this is how you talk about these things and presume that these are like meaningful research, questions about intra-individual contingencies and so on. And it's kind of funny.
Starting point is 00:15:35 So I think it's something that comes very naturally to you if you haven't been trained for a very long time that you do not talk about causal effects unless you have an experiment, then it's fine, right? But it also leads to funny things. So there was recently that paper about the effects of multilingualism, and it was a bit of a mess. So it supposedly showed that multilingualism
Starting point is 00:16:00 led to slower biological aging, and there were many, many things wrong with that paper, among of them that they didn't even know whether people spoke multiple languages. So in the end, it was just the country. It was a paper that found that people in Luxembourg are healthy, because that is the data point that drove everything. But in any case, so they even said like something, right, so everything was obviously interpreted causally, you know,
Starting point is 00:16:27 subsequent reporting as well. And then they had a sentence in there, oh, that future randomized controlled trials are needed. And if you just pause for a second, like, What is the randomized clinical trial to investigate the effects of country level multilingualism? Are we going to randomized countries and some of them become Luxembourg's and others don't? And it's just like if you think about it for a second, just as like a naive person, you're like, wait, how does that? Well, what precisely do you have in mind here, right?
Starting point is 00:16:58 Like, is that multilingualism what you would randomize here? Like, how's that supposed to work? But if you spend a lot of time in psychology, you will just be. But yeah, of course, you know, randomized clinicals, like the randomized control trials will solve that. And they just read it, like they added to every paper and I see this so often. And as a review, I always have to say, like, could you please spell out what you have in mind here? Because usually the paper is not a randomized control trial because the thing they are interested in it cannot be randomized. Yeah, yeah.
Starting point is 00:17:26 Part of what you're speaking to there is, of course, you know, about causal inference and so on. But also part of it is the inculturation that happens in a discipline like psychology. Like we learn the correct phrases. We learn that this is the way you normally lead an introduction. These are the caveats you make in the limitations. And the danger, I guess, is that people just at some point stop thinking and say all the right things like a catechism, like a religious riot. And the whole system sort of encourages people to do that, right?
Starting point is 00:17:56 Because if you don't make those obligatory sort of statements in some shape or form, generally reviewer two will smack you over the wrist. Yeah, and so I'm trying to kind of work against that as a reviewer and as an editor, right? So I will be the one pushing back. But then sometimes I give introductory talks on causal influence and people are like, okay, so how am I supposed to deal with that in my paper? Right. And what are the reviewers going to say? And then I can just say, okay, so it really depends on who that reviewer will be.
Starting point is 00:18:31 And so some reviewer just wants you to add that one sentence. This was not a randomized experiment, so we cannot interpret it causally. Here's our causal interpretation anyway. And then there's reviewers like, Mejou, will be like, well, this is just inconsistent, right? Please instead spell out your assumptions. I think there are ways to balance that and leave everybody happy, including the one who's just going through the checklist. But it is more effortful than just working off the checklist and adding that sentence, that it wasn't randomized. so future longitudinal and experimental studies are needed.
Starting point is 00:19:08 And then nowadays it's also like you need to say that your findings won't generalize to weird populations and so on, which everybody knows, of course, but you still need to edit. And so I think it's a bit like different people are pushing into different directions and it's always the easy way out to do what, like just at the boilerplate and so on. I'm trying to pushing against it and to trying, like I'm trying to normalize, like write the papers as if you mean it. and actually stand behind what you're saying,
Starting point is 00:19:36 but it is a bit of a process, and I always feel sorry if people then end up having bad experiences. So this is like a systemic problem, I think, what the peer review process pushes people to do, even if they actually would like to do it better. Yeah, they get caught in the middle with the kind of, those are the standards of the field. They didn't make them,
Starting point is 00:19:58 and they're just like kind of trying to get along, right? Yeah. Yeah. And actually, I thought, Julia, when I've been talking about, like, people who haven't ingrained bad habits yet, whenever talking about basic causal inference, and a lot of people have heard, like, two things, even if it's their introductory course, is that causality doesn't equal causation, right? And then the second is you can't infer causality without experiments, right? Controlled experiments and randomization. And in those cases, I've asked people, people first that, you know, can you infer causality without experiments?
Starting point is 00:20:39 And they're like, no. And then how did we find out that smoking caused cancer? I mean, I know there are experiments in that literature, but the primary finding is the like longitudinal data of people who were smoking like epithymological studies, right? And then animal studies and whatnot. But you can't randomize people into like heavy smoking. conditions or not and see if they develop cancer, right, for ethical. Or similarly, the fact that the majority of scientists agree that a large asteroid collided with Earth, and that's why there
Starting point is 00:21:15 aren't so many dinosaurs except for birds around. But there's like a causal historical, like fact that is part of science, right? But it doesn't rely on randomized experiment. And most people seem to accept, oh, yeah, yeah, that is right. Like, you can talk about causal, things that have happened and you can discover things that are like causally related. But it's like social sciences and psychology has created a rule, which is, but not here. Not here. You won't be allowed to infer anything unless you have a controlled experiment. And, you know, there are some good reasons behind that.
Starting point is 00:21:54 But yeah, it seems like people are with examples able to override that heuristic or at least realize there's exceptions. So yeah, I think this is really like a training issue that what sticks with people is that correlation does not equal causation, which is technically correct, right? And that randomized controlled studies are a great thing. And I agree. I think they are the closest thing to magic we have in causal inference, which is amazing for many purposes, unfortunately not for everything.
Starting point is 00:22:29 And I think one problem with teaching that is so, I mean, obviously this causes issues for people who do non-experimental research because then it's like, okay, you're not allowed to make any causal claims and causal claims are what is actually interesting. So yeah, it's just tough luck. Your research is not going to be interesting or you have to come up with some workaround to pretend you're doing something else. But I actually think it's also bad on the experimentalist side of things. And that's because I do think, so to some extent, people say, start to believe that randomization will be sufficient to warrant any type of causal claim because the thing that matters for causality is the randomization. And this happens to that I think leads to all sorts of bad stuff.
Starting point is 00:23:16 So, for example, people won't understand that you can only get precisely the causally fact of the thing that you randomized. And the randomization might not necessarily do what you wanted to do. some people might not adhere to the instructions and so on. And that causes all sorts of issues. And then, for example, in psychology, there is a thing where you just kick out the people that do not follow your instructions. So you assign them to do something.
Starting point is 00:23:42 Some people don't do it. And you're like, yeah, okay, I'm just excluding them from my study. Because they didn't do what they are supposed to do. And there is very little awareness that the second you do that, you are actually no longer operating with an experiment. because you are essentially doing something depending on what happens after randomization. And that actually means it's no longer a randomized experiment.
Starting point is 00:24:06 So the one, I think the simplest example I can come up with. So you might be interested in whether reading an article about causal inference increases your well-being at the end of the day or maybe even decreases it. And so you randomize students to either read that article or do something else. And then you see that, like, I don't know, half of the students don't actually read the article. And then you just exclude those who did not do that. But excluding those who didn't read the article will probably selectively exclude those who lacked motivation and maybe had other stuff going on. So suddenly you're looking at, like, the most motivated students who had a good day who read the article to the control group that did something else.
Starting point is 00:24:49 And suddenly it looks like, oh, yeah, actually it makes people like super motivated and happy to read about causal infants. And that's just because you kicked everybody out who wasn't, right? And so this happens like a lot. This is called post-treatment bias because you introduce information about what happened after the treatment, like whether they actually followed or not. And that's a huge issue. And I think it's very hard for psychologists to wrap their heads around it. So sometimes the experimentalist like get some inkling and then they find out,
Starting point is 00:25:18 oh my God, this has huge implications for mediation analysis. But they lack the terminology because they've never been. trained in like systematic causal inference. So they notice that something is all. They can't quite put their finger on it and often they will just not notice actually. So I often have experimentalists who are at some point like, oh, but wait, so what does it mean if I exclude people who did not like pass the manipulation shake where we assume all the manipulation didn't work, but maybe these people are also different and so on. And so usually like people have some ideas. I mean, experimentalists are smart people, they work out stuff, it would be so much easier for them to work
Starting point is 00:25:58 it all out if we had ever taught them about causal inference in the first place. Now, this is probably an obvious question, but clearly there are other sources of evidence for causality other than randomized control trials or experiments. They may not be, you know, no single piece of evidence will probably be the smoking gun. It won't be definitive. but what are some other ways in which a researcher could gather evidence for causality? So yeah, that will strongly depend on the specific topics. But for example, since you already brought up smoking,
Starting point is 00:26:41 so when it comes to smoking, right, by now we have a good mechanistic understanding on what the cigarettes contain, what that stuff can do, and your body how it accumulates and so on. So this actually corresponds to a thing that is formalized in causal inference. So you can identify causal effects if you know all the mechanism. It's called front door identification. If you want to do it like fully worked out on paper, the assumptions are very strong and so on.
Starting point is 00:27:12 But every day we're life, we take a much weaker form of it. And this is that, okay, we know that this thing affects this other thing and this is part of a mechanism that leads to this and so on. And I mean, for a lot of medication, it is essentially like, okay, it seems kind of plausible that it could work here because we know how the body works. And I think one great example for that. And so there is a beauty YouTuber, like Labmuffin Beauty. And she has like full videos about like how industry influences and then there's peahicking and so on.
Starting point is 00:27:48 But why she still believes in certain things. like retinoles for anti-aging. And this is like no one. Understanding is very good of what these things do in the skin and so on. So it makes sense to assume that even if there are no randomized control studies for economic reasons, because there's no incentive to do that for like the general thing that works because then other companies can just snatch it up and so on.
Starting point is 00:28:11 So mechanisms are of course like one huge part there. And then it might even just be associations. and that is the thing that throws people up, but it's just like associations that are hard to explain a way otherwise. Right? So if there is something that you cannot explain in your current worldview, but this one like theory fully accounts for it, then you're like in the realm of, okay, so let's do hypothesis testing,
Starting point is 00:28:40 and then this is actually strong evidence. And this works as well, and I don't think it is actually distinct from causal inference because the important part is like you can't explain the association otherwise, which is the same as saying, okay, we don't have any like alternative confounders and so on that can explain this. So my take is always a bit of causal inference. I think if you like go really deep into the formalization and the different ways to identify the effects and so on,
Starting point is 00:29:08 it is kind of all the same. The things that intuitively work also work formally if you spell out why you think they work, right, and their assumptions, like, we can't explain this in any other way, and so on. And clearly people do have an intuitive grasp of how these things work in certain circumstances. And then the problem is that it's always fallible. So you can always come up with an example where, oh, everybody thought the COVID vaccine was going to do this and that, but instead it did this.
Starting point is 00:29:40 Isn't this counterintuitive? And you can only find that out through randomized controlled trials. And that is the part about causal inference that is generally fallible. Things go wrong because you put in assumptions, and assumptions can always be wrong, and so on. So randomization still plays a very important role. I know that some causal inference people go all the way to the other side and are like, oh, no, randomized controlled studies are horrible, they lack external validity and so on. And to me it's more like, no, it is close to magic.
Starting point is 00:30:09 It is one very important part of the toolkit that does a lot of heavy lifting. Unfortunately, it can never do all of the heavy lifting, and you kind of need the whole toolkit to really understand how you can also work across scientific fields and so on. So if you're a psychologist, many of the causes we're interested in, we can't manipulate them directly. This would be different if we were interested in the effects of certain pills. And that is just the nature of the field. It's just going to make the causal inference a bit harder. Yeah.
Starting point is 00:30:41 I mean, it's a topic dear to my heart because I work in addiction, and you cannot give people addictions. It's not possible. Can't be done. Not yet. You can. You can give them addictions. You're just not allowed to.
Starting point is 00:30:57 Yeah. Companies do it all the time. Trust me. I've asked my ethics board. They said I can't. Trust me, Chris. And, you know, what you've also spoken about at length and written about is that these heuristics we have around causality and just not.
Starting point is 00:31:13 taking, I guess, a more nuanced approach, means that you have a weird sort of schizophrenic approach to many papers, where researchers will write the entire motivation for the paper, and the theoretical treatment in the introduction, in causal thinking, interpret their results largely, implicitly, causally, and then provide a disclaimer that, oh, by the way, we can't infer this,
Starting point is 00:31:38 someday somebody should do an experiment. So, yeah, what is the way to fix this? I mean, what is the way to not erroneously infer causality when we shouldn't, but on the other hand to be honest about what it is that is motivating us to do the research, and, I guess, hit the right tone? So I think the important question here is: there will be a lot of uncertainty when you do causal inference, with observational data in particular, but even in experimental setups. And the question is where you put that uncertainty, like, how do you cope with it? And so I think the current paradigm is something I do believe has evolved to cope
Starting point is 00:32:20 with contradictory requirements. You need to tell a nice causal story, but you must never claim causality based on observational data. So you write that weird paper that kind of tries to have it both ways. I think the focus here is always on plausible deniability. You get away with the causal storytelling, and when you're challenged, you can say, oh no, I just said it predicts it. And I added that paragraph that everybody adds.
Starting point is 00:32:45 So clearly I didn't do anything wrong. And I think this is the adaptive response to the contradictory requirements. And it puts all the uncertainty in the research question. Like, what the heck are you even doing here? Oh, I'm interested in whether this predicts that longitudinally. I'm not quite sure. Maybe I'm also interested in causality. And I think the way to move forward here is to take all that uncertainty and
Starting point is 00:33:09 push it into the conclusions. So it would be like: I am interested in this causal effect. And this is easy to justify, because usually the causal effect is the interesting thing that matters for science. And then you say, okay, so how can I get at this causal effect? And this will always involve some assumptions, in any context. Sometimes these assumptions might be quite lightweight in a randomized experiment.
Starting point is 00:33:35 It might be, okay, the manipulation actually works the way we think it works and does not affect anything else. Maybe we're even just interested in the pill that we can randomize as is, and so on. But often it will involve more assumptions. And even in a randomized study, there will be assumptions that your manipulation actually does what you think it does, which you cannot always empirically confirm. In an observational study, it will usually be assumptions of the type: okay, we did not forget any unobserved confounders here.
Starting point is 00:34:05 And this is a super strong assumption. So I tell people to think of it more in degrees. Like, we didn't forget any huge confounders here. I think this is the important part. We did not miss anything obvious that everybody in the literature is talking about. We also did not accidentally control for the wrong variables, because that can also make things go really wrong, and so on.
Starting point is 00:34:27 And these are the assumptions. And under these assumptions, you have conclusions. And you can draw these conclusions, but then the question is, should you trust them? And you should trust them depending on the degree to which you trust your assumptions. So now the uncertainty is not, oh, is this a correlation or does this reflect a causal effect? Rather, it's: this could be an estimate of the causal effect,
Starting point is 00:34:48 but its credibility hinges on these assumptions, and there will be uncertainty about whether these are true or not. And people might even disagree, right? And when I edit papers or review them, I try to be maximally permissive. So I'm like, I'm fine with this heroic mediation analysis, as long as you tell me the assumptions, right? Like, you're saying here that negative affect and depression are not confounded by any common causes. That's okay, please spell that out, right?
Starting point is 00:35:16 And then just say, okay, this is the estimate of the indirect effect under the assumption that negative affect and depression, which are conceptually overlapping, do not share any common causes apart from this one variable in the study. Just spell that out, right? And I think this way you are being honest about what you're trying to do, you are also fully transparent about what needs to go into it, and the reader can evaluate on their own whether they go with the assumptions or not, and you can actually even pinpoint where people disagree.
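The kind of "heroic" mediation assumption Julia describes can be shown in a small simulation (all coefficients are invented for illustration): even when the treatment X is perfectly randomized, an unobserved common cause of the mediator and the outcome inflates the standard two-regression indirect-effect estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

x = rng.integers(0, 2, n).astype(float)   # treatment, genuinely randomized
u = rng.normal(size=n)                     # unobserved common cause of M and Y
m = 0.5 * x + u + rng.normal(size=n)       # true a-path = 0.5
y = 0.4 * m + u + rng.normal(size=n)       # true b-path = 0.4, no direct X -> Y effect

def ols_coef(design_cols, target, index):
    # Ordinary least squares with an intercept; return one coefficient.
    design = np.column_stack([np.ones(n)] + design_cols)
    beta, *_ = np.linalg.lstsq(design, target, rcond=None)
    return beta[index]

a_hat = ols_coef([x], m, 1)     # fine: X is randomized, so a_hat is about 0.5
b_hat = ols_coef([x, m], y, 2)  # biased: M is NOT randomized, so b_hat lands near 0.9, not 0.4
print(f"estimated indirect effect: {a_hat * b_hat:.2f} (true value: {0.5 * 0.4:.2f})")
```

Randomizing X buys you the a-path, but the b-path silently assumes no M-Y confounding, which is exactly the assumption she wants authors to state out loud.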
Starting point is 00:35:48 Now, there is a huge issue with that. So I try to write my papers like that, and I try to encourage others to write papers like that, which involves being a bit more lenient when a paper does spell out assumptions. That sounds ridiculous, but everybody's making them anyway, so let's not punish people for talking about it. The problem, as put in a paper by Bareinboim and Pearl, is that assumptions are self-defeating in their honesty.
Starting point is 00:36:14 If you say, oh, I'm assuming this and that, then somebody will always be like, oh, no, but that's not plausible, right? And I think it is just how we train people: that you should always find the weak spot and then go at it. And so if people offer up their weak spots, their assumptions, then reviewers will go at them. And this is a problem. It's a problem of training, a problem of culture, that people don't notice the assumptions when they are hidden, but notice them when they are put out in the open. But I still firmly believe, okay, this is the way forward.
Starting point is 00:36:49 The assumptions need to be out in the open. And I do think it's possible to get there. And I'm mainly saying this because economics has that. I mean, it has its own problems, but it does have that very strong norm that you focus on the plausibility of the causal conclusions, and then you spell out your identification strategy, which essentially is the set of assumptions that you need for the causal inferences. And that works because if you don't do that, the reviewers will just pounce on you, right? And I mean, economists like doing that and calling each other out and so on. And so if the reviewers
Starting point is 00:37:25 are able to spot the assumptions when they are still hidden, then they can tell people to spell them out, and they won't just punish people for being transparent, because they can tell when something is being hidden. But that is obviously a huge training task that we are facing right now. I think we are making some progress. I sometimes see other reviewers pointing out these points, or saying things like,
Starting point is 00:37:50 oh, you know, there's this paper that says that statistical control requires causal justification, could you spell out your justification here, and so on. So I think this is slowly changing. It's clearly not changing for researchers who have played the game for a very long time and are very used to not handling things that way. But this is also something where I might be mildly optimistic
Starting point is 00:38:11 that we could get there eventually. I think I can give you a note of optimism, Julia. So one thing, for any of our listeners who aren't familiar with that academic style of writing: this is a common thing, where people will say X causes Y, or they're claiming a causal relationship, and then the reviewer requests them to remove the causal language. So they go and replace it with X is associated with Y, but the rest of the paper stays the exact same. So it's like they've just done find-and-replace for any time causal language is there, but the whole logic of the
Starting point is 00:38:46 paper implies causality. So you might think that's strange, like, wouldn't that not actually do anything? And the answer is yes, but it is a common request. Hey, Chris, I have to interrupt you because, Julia, you might find this funny, because I read in your blog that you're interested in, you know, so-called Granger causality and longitudinal models, cross-lagged models generally. And for everyone listening, it's basically: you've got a couple of time series, and if X happens before Y regularly, then you can say you've got a certain prediction in time. It doesn't really provide strong evidence for causality, because there are lots of reasons for
Starting point is 00:39:22 that temporal relationship. I actually published a paper last year which was trying to improve on that a little bit. And then I saw the same concept in your blog just today, which is that as well as doing that, you can check for inverse Granger causality: not just X predicting Y in the future, but whether Y also predicts X in the future, in which case it's kind of a wash, right? There's no evidence that X precedes Y more than the converse. So you can improve on it a little bit to get temporal asymmetry. So anyway, this was the whole concept of the paper, which
Starting point is 00:39:55 was a methods paper, all about causal inference, with improved causal inferences in the title, and the reviewer made me do a find-and-replace to remove all the places where I was talking causally. I literally had to remove the word causal in certain spots and replace it with directed prediction or some stupid... oh, directed association.
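The temporal-asymmetry check Matt describes can be sketched in a few lines (simulated data, purely illustrative, not from his actual paper): X genuinely drives Y one step later, so the cross-lagged coefficient is large in one direction and near zero in the other.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 50_000

# Simulate two autocorrelated series where X drives Y one step later.
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.5 * x[t - 1] + rng.normal()

def cross_lag_coef(target, other):
    # OLS of target[t] on its own lag and the other series' lag;
    # return the coefficient on the other series' lag.
    design = np.column_stack([np.ones(T - 1), target[:-1], other[:-1]])
    beta, *_ = np.linalg.lstsq(design, target[1:], rcond=None)
    return beta[2]

x_to_y = cross_lag_coef(y, x)  # large: X's past helps predict Y
y_to_x = cross_lag_coef(x, y)  # near zero: Y's past adds nothing for X
print(f"X[t-1] -> Y[t] coefficient: {x_to_y:.2f}")
print(f"Y[t-1] -> X[t] coefficient: {y_to_x:.2f}")
```

The asymmetry is the point: if both lagged coefficients came out similar, the temporal-precedence argument would be a wash, as described above. A confounder that hits X earlier than Y could still produce this pattern, which is why it sharpens rather than settles the causal question.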
Starting point is 00:40:17 It's directed association. It only goes one way, not the other. I thought it was funny. That's pretty good. But I'm still going to attempt to give a silver lining, because we listen to generally terrible people on the internet, and often people that have some form of academic background, like Andrew Huberman, or there's a psychiatrist, Dr. K, and so on.
Starting point is 00:40:48 Or Jordan Peterson, like, you know, in that vein. And these are people that invoke academic terminology and academic expertise a lot. And oftentimes what Matt and I are doing on the podcast is literally just trying to work out what they are claiming, like, what they are saying caused this to happen. And they live in the realm of dream associations. It's kind of like a Jungian world, where they will throw out connections between multiple concepts. And they're very clearly, and sometimes very explicitly, describing causal relationships, or, you know, saying that this is caused, usually, by the woke mind virus or whatever the case might be. But the thing for me is that when I was listening to your talks in preparation for the interview, you know, you're talking to academics who all agree: we want to get it right,
Starting point is 00:41:47 and we know we're doing things wrong, and there are bad practices, and we've got to reform it, and so on. But it made me think that in the wider world out there, there are people making terrible, terrible causal inferences constantly, and they don't care at all about the quality of evidence. It's all kind of vibes-based. And in that respect, I feel like sometimes academics might be too negative about themselves, because all the stuff that you're talking about now is a legitimate concern, but it's just so much better than what passes for causal inference, you know, in the world. And in those cases, those people often have a much bigger influence, and influence on grant-getting. And, you know, with the US and the current
Starting point is 00:42:38 administration, they are actually selecting research priorities based on RFK Jr.'s causal inference. So this is just to say, I'm not giving academics a pass, I'm just saying it could be so much worse. To be honest, that is a bit of a... I mean, first of all, I'm glad you did your homework and listened to some talks of mine. But it is quite a low bar.
Starting point is 00:43:08 Yeah, because I guess if I were an outsider, I would presume that psychologists had better training in causal inference, given all the causal claims they make. And sometimes people will just presume that psychologists had good reasons to believe something. And people in other fields will too; economists will sometimes be like, how do you psychologists actually know that? Because these variables, they are so endogenous.
Starting point is 00:43:30 And it's like, oh, no, it's just a correlation, you know. And they're like, what? They chalk it up to different norms and so on. But yeah, of course, in academic research, there are also people who are very motivated to support certain claims, right? And that strong prior will also affect how they employ causal inference.
Starting point is 00:43:51 And so initially I used to make the mistake of looking at who cites my primer on causal graphs. And there were a lot of people who just wanted to make the claim that something affects something. And so they will just say, oh, you should not control for these variables because they are all mediators of the effect, and then they're citing Rohrer 2017. I'm like, oh, no, just stop looking at people citing me. Now, this is bound to happen. So when people have strong motivations either way, they are going to use all their smarts in a targeted manner
Starting point is 00:44:28 to come out one way or the other. Now, the thing is, for the general public, to some extent I guess I do have some hope, in that I think many people are actually interested in how to draw causal inferences. And I think of one example, something that I really routinely notice: people notice they have some digestive issue, and it's like, okay, what is causing that, right? And this is something where causal inference can go so wrong, and people will just assume they can't have some food, and they will stop eating it for 50 years or something,
Starting point is 00:45:10 when it wouldn't have been necessary. And actually, I think there would be space to provide those people with the right toolset. And there are already nerds that rabbit-hole into, oh, I do an elimination diet where I remove everything and just eat plain rice for some time to establish a baseline, and then add ingredients back one by one. And this is essentially trying to do controlled causal inference. And I mean, I think the most extreme nerds even try to randomize themselves, right? It is kind of hard to blind yourself, but it is possible. And so I think
Starting point is 00:45:42 people do, like, people are naturally motivated to find out about cause and effect, because it is really relevant for your life, right, to figure out how you can affect your outcomes and so on. So I think there is interest in that. But then if the scientists are already
Starting point is 00:45:58 so bad at it, I guess we shouldn't expect people to figure out how to do proper causal inference in their own lives. And I think one area where that is very obvious is people with infants and babies and toddlers, which is like the
Starting point is 00:46:15 noisiest situation you could imagine, because kids are just very noisy in the sense of being kind of unpredictable, with so many variables affecting them and so on. And people come up with the most bizarre things, right? People will be like, oh no, I'm breastfeeding, I can't eat gummy bears, because my kid will scream all night long, and so on. And I think there would be a lot of space for public education to teach people so that they don't mislead
Starting point is 00:46:40 themselves as well, because I think people generally crave accurate information about how the world works, and we are just not teaching them how to do that. I'm going to agree, but I'm also going to say that my experience online, looking at the people that cultivate the huge audiences, is that the message that academics have is, look, we're bad at it, but we're trying, okay? And we know there are ways that work, right, to get it better. And there is good research out there. And you're right about people having a hunger to work out, you know, to hear about science
Starting point is 00:47:21 or hear about how to make inferences, right? But the problem is that the people you're competing with for attention are telling them that the approach which feels good, all the things that you're saying are wrong, is actually the correct way to do it. And also that the people who will tell you it's wrong are liars, or they're kind of elitists in ivory towers. So I'm just thinking, Matt, about the case where we were listening to an influencer who has a genuine health thing.
Starting point is 00:47:56 They've had sudden hearing loss, and they want to work out how to fix it. And so they're doing multiple experimental treatments at the same time. And they released a video to their audience explaining all the treatments they're going to do. They talked about possible links to COVID vaccines; they weren't saying it outright, they were just talking about how there might be an association, they don't know.
Starting point is 00:48:18 But then they were quite confident that whatever they do, if it works, they'll feed it back to their audience, and this will, you know, help people that are in the same situation. So for tons of reasons, that's a terrible thing: taking multiple experimental treatments, and then, if you return to baseline, which most people with sudden hearing loss do, their hearing does come back, you will infer that it's because of the things you did. But in the comment section, and I know people caution around that, there are so many people that are like, thank you so much, you're doing the real kind of investigations that we need, this is what scientists won't tell you.
Starting point is 00:48:58 And then when I see scientists talk about it, they're kind of like, yeah, look at the crap the journals are pumping out. And it's true, I have the same feeling. But the influencers' message is: we are empowering our audience to find all these things that scientists are lying to you about. And then the scientists tell them, well, scientists are pretty shit at their job, and it's true. So I'm just wondering whether this is why we are sometimes accused of being establishment
Starting point is 00:49:32 shills, because we're saying, but it's better than what they're doing. Yeah. I see the dilemma here. And I also see that you're in a different spot, right? Because you are paying so much attention to the worst of what's going on out there. I try to be more focused on the positive. So, for example, I have a side thing where I'm actually doing substantive research on birth order effects.
Starting point is 00:50:00 And I quite often talk to journalists, because this topic is just like catnip. Like, oh, what does it mean to be a big sister, and so on? And in my experience, and there's probably some selection bias going on there, the journalists I usually talk to are super interested in getting it right. They are paying a lot of attention. If the paper's open access, they will even read it.
Starting point is 00:50:22 I've even had people challenge me about the methodology of one classroom experiment we did, and so on. And so my focus is more on the, oh, there are so many people who actually do want to understand things and get it right. But then you are focusing on the gurus, right? And that is kind of conditioning on the worst of it. So I think there are multiple issues here. One thing is, if it's all just storytelling,
Starting point is 00:50:45 then you just always do better when you're not constrained by reality, right? Because you can craft really nice narratives about, I don't know, the evils of modernity and so on, if you're not in any way constrained by reality. And I think if this collides
Starting point is 00:51:07 with people who are desperate, which I think describes many people with health issues and so on who follow these figures, then they might just want to buy the hope, right? I mean, this is actually more your topic than my topic, right? And this is an unfortunate combination. I have no idea what's the average dynamic of how these processes play out.
Starting point is 00:51:26 I don't even know what the average attempt of a non-scientist to find out something about something looks like, right? I don't know what would be representative, but there's definitely huge heterogeneity in there. Right. And I understand that the public health people in particular are often quite insistent on, no, you should tell people to trust the science. And I think that might have positive effects, to just
Starting point is 00:51:52 be like, trust the science. And there was actually a huge issue initially with the open science movement, where people were complaining, right? Like, psychologists are airing their dirty laundry in public, and everybody will be like, oh, we can't trust science anymore, and then they will stop getting vaccinated. I really hope this didn't happen, because I really hope people are capable of mentally separating academic psychology, which is very different from medical research and so on. But yes, it's always a real risk: if you criticize science to push scientists to do better, that might be perceived by the public as, oh, they don't have their house in order.
Starting point is 00:52:31 And I wouldn't disagree with that. The problem is that there's a lot of nuance there, right? There are things that you should believe, and ought to believe for very good reasons and so on, and other things that you shouldn't. And we often have this with our secretary, who's super interested in all of psychology. Sometimes she will be like, oh, man, that all sounds really horrible, I don't know what to believe anymore. And then I'm trying to provide her with some heuristics.
Starting point is 00:52:57 Like, you shouldn't believe this type of psychological research without asking me first, because then I'll look at the study for you. But then there are other things where I'm like, I'm not going to research that, because I trust the mechanisms in that field, and I trust all the regulatory processes and so on. So I will just take any vaccine I can get, actually. Right. And I do see how it's hard to get that nuance right, in particular in a polarized environment, right,
Starting point is 00:53:26 where people are either like, oh yeah, science, that's the best thing in the world, or, oh, science is horrible, those are all shills and so on. So it's hard. I'm not that negative about it, but this might also be biased because I'm not in the US. The situation in the US might feel different.
Starting point is 00:53:43 I'm not paying much attention to the online gurus, although I sometimes enjoy listening to your episodes just to get some impression of the amazing human variety that is out there. I'm just saying, Julia, you could make a lot of money if you just changed your tone. You would be like the greatest weapon,
Starting point is 00:54:03 somebody that could legitimately criticize. But I completely agree that it doesn't mean that you therefore have to present an overly rosy picture, that you shouldn't talk about the problems, or that you shouldn't argue for reforms. I don't think that's the solution. And I think that generally
Starting point is 00:54:20 that all of the efforts that people make and the criticisms, the very public criticisms, are necessary and important. It's just, I kind of have sympathy in a way for the people in the middle. I don't always think they're just unwilling receivers of this
Starting point is 00:54:39 because I think it's flattering to be told that, despite never doing any research or spending a day learning about methodological things, you can now completely criticize entire fields on just your intuitions. Like, I get that, because I have the same, you know, drives. But I do think that it is hard to explain to people that sometimes, like you said with
Starting point is 00:55:06 your secretary, if you want to explain to someone how to identify good and bad research, there's statistics. But in general, it actually would take a little bit of time to learn how to separate: this is a good, very well-conducted study with reasonable inferences, versus this is a low-quality, quite hyperbolic study, right? Because they can look almost identical in structure if you don't have the training to distinguish them.
Starting point is 00:55:40 That is absolutely correct. And I mean, even scientists sometimes seem to lack the training to do this reliably. I guess my hope would be, right, I mean, I am training psychologists, and my hope would be that we can ramp up the norms and people's ability to discern, that the field as a whole becomes more credible,
Starting point is 00:55:59 so that it is more like, okay, now if the experts actually agree on something, you can believe it, because we do have these mechanisms, we do have all these controls. And that's a bit aligned with the take of Simine Vazire, who always says, okay, we want to be credible, but the credibility needs to be earned. And then there is always the
Starting point is 00:56:17 issue of how you know whether a field is credible to begin with, and so on and so on. And these are legit concerns. I guess I'm just less prone to believe that we will all be outcompeted by the online gurus. I don't know. From my perspective, they are a fascinating niche phenomenon. But my social sample might be biased, because I don't befriend those people. I hope you're right.
Starting point is 00:56:43 I wish you were right. I hope that we're just over-negatively biased by our sample. But, yeah, I fear. Hey, Julia, just to get back to your birth order paper, just to double-check there: you did analyze a lot of data, and I think you found essentially no real effects from birth order. Is that correct? Yeah, so I can give you the short summary of the birth order literature.
Starting point is 00:57:14 So it's a mess. First of all, this is part of the reason why I think the open science stuff was so relevant to me, because the old studies are just a mess from many, many different angles. But then there is a legit research interest in whether there are systematic birth order differences. And then there's actually a different question of whether these are effects of birth order. But I think the descriptive differences are already interesting enough. The one thing that we do find that is fairly reliable, in Western samples, I should say, is that firstborns tend to be a bit smarter. So it's maybe an average difference of two to three IQ points.
Starting point is 00:57:58 So what I always say when I talk to journalists... So what you're saying is, all else being equal, if you knew nothing else about us, you would assume that I had an IQ two or three points greater than Chris. That's what you're saying. I'm the second-born. And I probably as a physicist and I know a lawyer. So, yeah. We're not making erroneous causal inferences here, Julia.
Starting point is 00:58:19 This is science. We're just doing science. The thing is, what I always tell people: first of all, if you've ever talked to somebody, you will probably have more information than the birth order position provides. And then this is a very small average difference. If I tested anybody twice on consecutive days, the measurement error alone would exceed the average difference that you get over thousands of sibling
Starting point is 00:58:40 pairs. And then it's not deterministic. I always have to say this, because I'm spoiling Matt's fun, right? Because I'm a firstborn, right? And so when I published the first paper, and it got picked up a lot by the media, the then-boyfriend of my sister was like, so your sister scientifically demonstrated that she's smarter than you, and sent out a press release to all major German news outlets. And so in my defense, even in the paper,
Starting point is 00:59:22 so we just checked: we have data from Germany, and we just look at dyads, so people who have one sibling, and where we actually have data for both siblings. And if there was no association, it would be a coin flip who's smarter, right? It would be 50-50. But it's not 50-50. In our data, it's more like 60 to 40. But that still means that for around 40% of the sibling pairs, the later-born is the smarter one.
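Julia's roughly 60-40 split is about what a two-to-three-point mean difference implies under a simple normal model. Here is the back-of-envelope version (the sibling IQ correlation of 0.5 is an assumed, illustrative value, not a figure from her data):

```python
from math import erf, sqrt

def p_firstborn_higher(mean_diff, sd=15.0, sibling_r=0.5):
    # Each sibling's score ~ Normal(mu, sd^2), correlated at sibling_r;
    # their difference is then Normal(mean_diff, 2 * sd^2 * (1 - sibling_r)).
    sd_diff = sd * sqrt(2 * (1 - sibling_r))
    z = mean_diff / sd_diff
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

for d in (2, 3):
    print(f"mean gap of {d} IQ points -> P(firstborn scores higher) ~ "
          f"{p_firstborn_higher(d):.2f}")
```

Under these assumptions, the probabilities come out in the mid-to-high fifties, i.e. in the same ballpark as the 60-40 she quotes, which is exactly why knowing anything else about a specific pair swamps the birth order signal.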
Starting point is 00:59:48 So this is what I always mention, also in defense of my little sister. But just in general, this isn't a huge story. And this is, I think, the most reliable effect that we have. And then we usually don't find much going on for personality, in particular once you take into account that firstborns are by definition older. So if you compare somebody who's a bit older to somebody who's a bit younger, you will find the effects of personality
Starting point is 01:00:23 maturation: that personality, at least in self-reports, gets on average a bit more mature, so people will appear more conscientious and so on. And so this is another story for my sister, where one time she was sitting in my living room and just looking around, and she was like, do you think in three years I will also have a dining table and plants in my room? You just need three more years, and then we can see, right? It would be unfair to compare us now. You have to compare us at the same ages. So there is not much going on. So this was the topic I started working on.
Starting point is 01:00:43 That was just by coincidence. So I applied to become a student research assistant, and the professor, who's still my current boss, was like, yeah, I have that one thing I always wanted to look into, birth order data. And I'm like, yeah, sure, why not? And I found nothing. Except for the intelligence effect. We published that.
Starting point is 01:01:00 So this was my bachelor's thesis, but we later published it in PNAS. And then people were like, oh, wow, you published in PNAS. You need to do this for the rest of your life. I'm like, well, we didn't find anything, right? And so this is part of the reason why I was like, okay, I need to do something else like substantively because I can't be doing this for the rest of my life if I don't believe that there's actually much of interest
Starting point is 01:01:23 or even of practical relevance going on there. But in terms of people preferring a nice story and, you know, and this is something which is true in the gurusphere and in the public discourse, and it's also a problem in academia as well. you know, we like a sexy story. It's easier to publish and it's tempting to, you know, bend things towards that direction. I mean, did you find, for instance,
Starting point is 01:01:48 that when the journalists were interviewing you about the sexy birth order topic and there was sort of this sort of disappointment came over their faces when you told them, actually, there's not much to it. There's nothing much happening with personality. So actually, I think, and this might be, again, me like the journalists who are willing to report about these things, but they were really into the, oh, there's actually nothing going on. I think it's just like, maybe because it is
Starting point is 01:02:15 controversial to say that it doesn't matter systematically. So I had the impression they were really into that. And that is maybe also part why I think that the journalists are great. And that was even, so you might know like, what's her name, Catherine, like the Princess of Wales. and she had a third child, right? And at that time, like, I got, like, journalists contacting me whether I could talk about what it means to be a thirdborn, right? And I just gave interviews, yes, like, birth owner doesn't seem to have any effects and so on. And then later on, when that boy started school, I got contacted again, so six years later, and gave an interview. And it was really like the questions were like, yeah, what does it mean to be, like, a thirdborn?
Starting point is 01:03:00 I was like, I mean, usually we don't find any effects, but in a royal family, it will of course probably be different, right, because the first one plays a special role. And then, of course, they are like in the media and people are speculating and whatever. So surely this is going to be different in the individual case, but I can't make any statements about that. Right. I need thousands of people to say anything about birth or effects because if there's anything, it's tiny, I can't say anything about that. And they were really like lapping it up and they printed that in a newspaper. It's like a small. And so I think it is also like a cool narrative that people enjoy being a bit like, oh, we are debunking things or even like there are interesting aspects to it, right?
Starting point is 01:03:40 Because there will be, like you will have a different role in childhood if you're the eldest daughter, right? So this will make a difference for your family relationships. And that might even like, so I also get interviewed like when Christmas season approaches because during Christmas people meet their families. and then it's like, oh, it's just like 20 years ago, right? And the oldest daughter will, like, do all the cleaning and so on. And then I can be like, yeah, no, that is probably like an accurate observation in many families. We just don't see that it affects personality, which might speak to the fact that we need to distinguish between personality and how people act in certain contexts, which might as well be like hugely shaped by their birth order.
Starting point is 01:04:20 It might strongly depend on the family. They think people do also think of that as a narrative. right? It's just like more nuance than, oh, because you're a first one, you're smarter now, which is something I think the Ladd Bible said about my study. So there are different, I think, narratives you can eag out there. And I don't think the simplest, sexiest one is the one that people like best, because I think people also like some degree of sophistication. And again, I might be biased because I'm just like being contacted by journalists
Starting point is 01:04:52 who think that the finding that there's mostly nothing going on is useworthy. But in general, I have the feeling that this has an appeal to people, and I was asked to write a book about it. And I was like, I'm not the right person to write a book about siblings because I can only tell you that where they don't matter, right? But people are interested in that as well. Yeah. And I think also, you know, there's been plenty of coverage of like the replication crisis or like the Stanford prison experiment re-evaluations and so on. So there is an interest in those kind of counter-narrative. maybe perhaps a very strong interest in some quarters.
Starting point is 01:05:31 But Julia, I had a question that I was hoping you could help me resolve because I can put the two things together in my head about it. So in the open science movement, which I've been generally, like very supportive and in favor of, even with the various little controversies that have covered about within the open science movement arguing with itself, did you pre-register your own? study well enough and so on. But I have seen the papers, you know, where people, and I've done it myself with students, where you get them to compare the pre-registrations to the actual finished paper and you reveal, lo and behold, there are usually deviations that aren't mentioned in the paper,
Starting point is 01:06:12 right, depending on the person, or that things that are in the open data set are completely uninterpretable because nothing's been labeled and so on, right? It's just there and you get the bad. And now, you. The thing is that I know there's been legitimate complaints about this because this is an issue, right? This is kind of people focusing on the badge or receiving the accolade without doing the... Open washing? Yeah. Oh, is that what it's called? I've not heard that. But at the same time, I see a lot of papers, the famous one, which is kind of pre-any of this, is the one looking at null effects in NHLBI.
Starting point is 01:06:56 clinical trials, right? This one where it was actually introduced in 2000, the requirement to register your trial, right? In the UK for, I think it was cancer trials or something like that. But, and then you see before it, there's all successful results. And then after it's basically all null except for like one or two studies. And similarly, your blog colleague and Daniel Lekins were looking at registered report versus standard reports and so on. And the, pattern is always the same that like the effect size, decrease, the amount of null results shoot up, like basically researchers become much better. They're much more similar to me in being able to collect null results. I don't impress them. But if it were the case that everybody or like a
Starting point is 01:07:45 large amount of people are just kind of going through the motions to get the badge, how can those two things, you know, kind of come together where we're saying people are doing that. And, imperfectly and not well. But yeah, when it happened. Yeah, so I think I can give you fast allowance already. So I mean the bed and the pre-registration thing, this is usually just pre-registered studies. So you have a pre-registration, you collect your data,
Starting point is 01:08:13 you write it up, you publish it. What Anishel and Daniel Larkens looked at were specifically registered reports. Registered reports. And I think these are a lot more potent. So this is where you write up your plan and then the plan gets reviewed and you get issued in principle acceptance
Starting point is 01:08:29 once the reviewers are happy with your plan and then you implement that and then you have zero incentive to spin anything. Whereas if you pre-register, you still have all the incentives in the world to spin your story to make a nice narrative and then because we know that pre-registrations aren't routinely checked by reviewers and so on
Starting point is 01:08:48 it's actually you can do the spin and then you even get a bet. So this is the open washing aspect And this is where people are like, oh, everything is going to get game. And I think this is true. Like everybody is trying to game everything. Like people are also trying to gain rigorous causal inference now and so on. And so I think this is just something that happens.
Starting point is 01:09:07 It's also something where I think we know in principle how to react. And this is actually checking the pre-registration during the review process. This is something that we do now at psychological science. Where we also check whether the data and the code can actually reproduce the findings. So when a paper gets accepted, it gets like accepted pending the reproducibility check. And if you made it at that point, like psychological science is still a prestigious outlet. So it's like, okay, now this paper is accepted, but only if they can actually get the numbers, right, like out of the data. So there's a lot of incentive to make this as smoothly working as
Starting point is 01:09:46 possible. It is still a lot of work. Like it's a huge amount of effort. People will be like, oh, who's going to do that? And so we have a huge. huge team of star editor, so that is like statistics, transparency and rigor. But actually even that is not enough. So we need like a volunteer network because it is so much work to check somebody else's results. So it is like a huge investment. I also think it actually should just be the norm for like it's just normative reasons.
Starting point is 01:10:13 Like if you read a paper in a peer reviewed journal, I think as a layperson, it would be justified for you to assume that somebody checked the numbers and did the math. And at the very least, like, if you take the code and the data, you get the results that are reported. I think this is a low bar, actually, right? And that this is still like, when I tell people about psych science, they're like, oh, what, who's supposed to do that, right? Like, it can't be the reviewers. We can't expect that from reviewers. But I'm like, why, why actually not, right?
Starting point is 01:10:44 Or like, from journals because it's like, it's about quality control, but we don't even do that quality control. We are just, like, used to churning out papers. And this is something I think where we are, like it's a hard shift. But it's also something where I think we might finally catch up with public perception. So there is a fun anecdote there about my boss when he had his very first paper. And then he asked his advisor, like, okay, so I mean, this was back in the day. So he's like, how do I send them the data for the peer review process? Is that like a floppy disk?
Starting point is 01:11:18 Like what format do I need? And I was like, no, no, no, you don't do that. and say, how do I do not do that, right? How would they check my findings if they don't have my data? And it was like a no-brainer for them. And I think it should be a no-brainer. Like, this is how we should want to do science. This is not how we are currently doing it.
Starting point is 01:11:35 And he was like, I don't know, 30 years ahead of his time maybe. But I think we are getting there. I know that some political science journals are doing that. It is a lot of like upfront investment. I also see how the cost could just, go down over time because if you want to get into those journals, you will make sure that it runs, like the code just runs and outputs the results in a manner that you can easily look at them. And I did one of those reproducibility checks. And it was actually, I mean,
Starting point is 01:12:06 there was some R package issues, but when that was done, I clicked like one button and all the results were reproduced. It's absolutely possible to do that. It is hard initially. It's much easier now because you can use so many tools to make coding easier and so on. And if we turn that into the norm, then also checking the numbers gets easier. You can automate parts of the process, right? There are also people using LLM tools now to like check whether the pre-registration and the published study match and so on. So there's so much we can gain there and make it smoother. We just need to like make it clear that, okay, actually we want this. Like we want to be sure that published studies have numbers that can actually be reproduced from the data and so on. It's
Starting point is 01:12:46 not a high bar. I think everybody would agree that this would be the ideal state. So we just need to put in the effort and maybe also reward people who put in the effort and so on and just maybe try. So this is what we're doing at psychological size. Like we're trying all the radical stuff. Let's see how it will go. It is a lot of work. But I think it's a worthwhile attempt. But speaking about people, you know, the incentives and people gaming any system, you know, what you're describing sounds great, but it is all kind of pro bono. volunteer hard work by yourself and others. So, you know, I'm also thinking of the, you know, the way the bigger publishers work in terms of their monetization model and, and also the
Starting point is 01:13:33 pressures on academics to publish, publish, publish, publish, publish, or teach courses. Some of us teach courses. Some of us teach courses. I don't know. That's too big a question, I suppose. But, but do you feel like it's rewarded? Like, are you rewarded for doing this work? I'm a postdoc. So, I mean, I'm a postdoc, so I'm probably not getting rewarded for it. But I do think, yes, I do think the incentives matter. I also think that a lot of the incentives are social in nature.
Starting point is 01:14:07 And that also means they can shift over time, right, as can expectations. And so sometimes I think people have that like, oh, it's a race to the bottom. everybody churns out as many papers as possible and that will get like the maximum reward and so on. But that's clearly not what we are observing on average. So I think there is like, you have these people who just have like 200 papers and predatory journals, but they are not outperforming others for like the actually prestigious positions. And there is like some people out competing others by cutting corners. So I just think it needs to be like shifted on multiple levels. There needs to be soft pressure. So I had a PhD mentor who was an economist, and the economists, I think, are more aware of, like, reputational costs and how you can use these to enforce norms. So there could be a thing.
Starting point is 01:14:59 We're very far away from that in psychology. But I think in economics, if you publish too much, it looks suspicious. And people will look at the papers and they're like, it's not even in a top five outlet. What are you doing with your life there, right? Like publishing two papers a year. I suspect, Julia, that you're probably not aware of a fellow called Gary Stevenson from Gary's Economics, because our listeners will have heard recently, like, his presentation of the economics field.
Starting point is 01:15:28 And the economics field, as you said, you know, it has its own issues. But his presentation is not of a robust, like empirical science, but a pure voodoo science in things. So when you say, you know, I don't know. No, no, no, no, I don't think... I know that you're not endorsing like economists as the model to hold up, but you're also not presenting the picture he is rich, is that a single model that controls all economists, and it is the single agent model, Matt, right?
Starting point is 01:16:01 That's the only model, and you're not allowed to model inequality. That's the main thing he says about it, and that all economists are rich, and that's why they refuse to report any results that say there's any... inequality. And I think this is like absolutely fair. So I'm thinking less of it that we should, I think we should emulate like the best parts of what economists are doing right. As well as the economists should emulate some parts of psychology because they have huge issues as well.
Starting point is 01:16:30 Right. But I'm thinking more of in terms like, okay, so there is a counterfactual science where actually having your name on more stuff can impose costs if those. papers are bad. It is possible to build a community that operates more like that as opposed to the more papers, the better.
Starting point is 01:16:52 And I think like just seeing that there is variation between the fields means it's okay, you can find an equilibrium elsewhere where it's not just quantity. Now I do think we have like lots of issues also with the prestige outlets. And I think they are now just
Starting point is 01:17:08 cashing in on their names, right? So all the like kind of bad nature and science brand journals that really are publishing horrible research and researchers pay a lot of money to get in there and say, yeah, okay, you're cashing in on the reputation that has accumulated over 100 years with this, like, usually tightly guarded outlet.
Starting point is 01:17:29 And I think that is an issue. I kind of hope that this fixes itself by just, like, destroying the brand name over time. It will take some time. So it still works, right? I do remember, so there was, there one journal that started, Nature, Human Behavior. And I mean, literally what they had initially was the Nature brand name, right? And it can only take off if people submit work that is potentially high impact and so on.
Starting point is 01:17:57 And it totally worked out. And I was kind of observing that. It's kind of curious, right? And people were like, oh, it's a nature journal. Yeah, but it doesn't have an impact factor yet. But yeah, but maybe it's a nature journal. Everybody will submit there, right? And so on.
Starting point is 01:18:09 and so it still kind of works. I'm really just hoping that it will collapse at some point where you exploited your reputation to an extent that the reputation just fades. And right then I know that, for example, psychological science had a very good reputation, but it was also the outlet for the sexy, cool findings. And then that kind of gave it a bad name. And then they came around and had like editors who were like strong reformers and so on.
Starting point is 01:18:39 and is slowly rebuilding. It's still not back to where it was. And now we have all these additional requirements, if you want to get in there. But I think these things do change over time. Maybe I sometimes wish they changed faster. Maybe sometimes wish people would realize, like to which extent journals are also like scamming them
Starting point is 01:18:58 without providing added value, apart from career progression, right? So, yeah, I think it's mixed. I try not to judge people too much if they are just like trying to play along to maybe eke out a living, but I do judge people harshly who are at the point where they have a tenure position and still play along. Right.
Starting point is 01:19:19 And then they're like, oh, what my PhD is doing? And it's like, yeah, well, you're still like just hacking your way through. So I cannot respect that. But most importantly, I think for me it's more like, okay, I wouldn't want to get into the position where I look back onto like an academic career. And it was all kind of bogus. I think they would be frustrating. So I'm trying to like focus on the positive stuff.
Starting point is 01:19:41 I feel there are a lot of people in that position and they're quite quite protective and upset about it. But you know, I have something that I think is like possibly a positive note towards the end of not holding you hostage to much longer, Julia. But you know, Matt, I'm not usually the one offering like sunny TX, right? But you were asking Julia about who's going to do all this work. And I think that's a legitimate question. I do think part of the answer is AI's. But the other answer, and I've seen you give talks about this, Julia, talking about examples of it, that now we have a bunch of new journals that are very friendly to open science
Starting point is 01:20:26 and kind of other improvements, registered reports and so on and like open peer review. and I'm always amazed at the amount of initiatives by academics where they are doing free labor. They've like come up with a little thing where they're going to review papers for journals, and they'll give them a badge from their collaboration network to outsource the work for journals. And not just that, but the example you give which I hadn't heard of recently was people setting bounties on their papers. to look for statistical errors. And so far it's like four out of four, or at least it was at the time that you talked about it.
Starting point is 01:21:09 So not a huge incentive to do that. But I thought that's the kind of impressive thing that is, like, I know I really do have an issue that I'm looking at the worst parts of the internet talking about it. But it's so counter to the image of academics that's presented in the guru material we have, which is that they're all out just for themselves, they're all constantly, you know, lying, producing low quality research, and they don't care if things are true. And I see so much around the open science movement, around like reforms,
Starting point is 01:21:43 that it's clear academics do care and they do unpaid labor and they set up blogs and they, you know, they go on podcasts about being paid. And yeah, it just, it's so counter to the image of like the reprecious ideologically invested researchers that I think it's worth noting and you, Julia, you know, you're a self-deprecating academic, but you are somebody that really walks to walk in that regard. So, yeah, I think you are part of the solution, even if you can't say so yourself. That is very kind, but it's also my impression like them. In particular, in the early open science movement, so like in the beginning of the society for the improvement of psychological science and so on, there's so many people who do care. And this also includes,
Starting point is 01:22:31 senior academics who were essentially like blowing the whistle on their own work as well, right? And so I do think there are a lot of people like I'm interested in doing the right thing. Of course, at the same time, people don't want to hurt themselves. This is also very, I think, just natural, right? And so I don't, because if it were the case that everybody was just ruthlessly optimizing the metrics to like have a career, and they are individuals that do that. Right. I am a personality psychologist. So I always see like the range of behaviors.
Starting point is 01:23:04 And I think we need to make sure that we don't just like select the very worst cases. But I think we are already not doing that because at some point people are saying, that person is just a scam, right? And everybody can see it. And so I see a lot of initiatives that try to do something differently. And it's a lot of just trying out stuff. And sometimes I'm like, no, no, that's not going to work. And then some other stuff works.
Starting point is 01:23:25 It's hard to predict, right? Just because you brought up the bug bounty program. So this is run by my co-blogger Malte Ilson. So he got a huge grant and they are doing essentially reviews of published papers searching for arrows. And I can tell you by now five reviews have been completed. And there were five papers with errors. So this is absolutely the norm. And I think it's good that we are talking about.
Starting point is 01:23:50 It is also something that the norms are shifting. So I think now you can talk to academics and say, oh, actually there are like errors in all papers. And most would be, yeah, probably. I mean, maybe not mine. But yeah, no, there are probably like issues that we are working on. So overall, yeah, I know. Maybe I'm still optimistic. I see a lot of potential, a lot of people doing good stuff.
Starting point is 01:24:10 A lot of problems also, like problems in how we like select for publications, select for people who get into positions of power and so on. But overall, I think the last 14 years, I think, well, positive development. I don't have any comparator because I started my career in 2011. But there's a positive association, it's not a causation. It's just a relationship with the positive direction. And by the way, Mike, you muted yourself. I was going to propose just placing an incredibly small bounty on finding error in my papers.
Starting point is 01:24:43 Like I'll pay like $5. But no more. Don't look that hard. Just have a quick skit. No, I second all that. I think it's a positive vibe. Yeah, Julia, we didn't really have time to get into it. But actually, I think for people listening, some of this stuff around causality,
Starting point is 01:25:01 in particular that directed us to like graphs and just essentially making a visual diagram of what you think causes what, how things are related. It is intuitively how most people think about everyday life and things that matter to them. And I think it comes very naturally to us if we just apply some of these tools. So we'll link people to some of your materials there, Julia, and maybe one day we'll have a more focused chat about it. You know, not just academics, you know, but you're kidding, MacArthur academics.
Starting point is 01:25:36 I actually never thought of that as a like suggestion that originates from me. So I think this is just coming from the causal inference literature where you usually have like a well-defined intervention, right? You have like control and treatment. And then you look at the difference in the outcome and you're trying to make sense of the unit there and whether that is a lot or not. And the outcome measure is just the outcome measure.
Starting point is 01:25:59 And I think this is a particular weakness of psychology that we always jump to the latent construct. So instead of saying like, oh, being later born, right, means that on average your IQ is two to three IQ points smaller or something like that. It would be like, oh, there's an association between birth order and intelligence. And I've even seen paper calculate correlation coefficients for that. It's like, what is a correlation between like birth order and intelligence? Like, what does it even mean, right?
Starting point is 01:26:29 Like, we're comparing people here. And so this is just naturally, like, something that arises from the causal inference perspective where you're thinking of even just like hypothetical intervention, right? Like, what if I gave you five points more grid? What would that do to your educational attainment and so on? And trying to tie it to more specific metrics. And I think this, I mean, this is partly because it's like, feel like maybe public health and so on, where you do want to have a tangible.
Starting point is 01:26:54 outcome metric. It's not just people are healthier, but it's like, oh, we are like saving that many lives or that many years and so on. And this is important because it's also important for policy. And I think it would be healthy for psychology to go into that direction. Maybe
Starting point is 01:27:10 also to sometimes realize, yeah, maybe this isn't that important, even if it's there. I mean, it's already big if if it's there, it's not important. And I mean, this is also something I've seen in the open science movement that people, right, like you start caring about P-hacking and all these studies are P-hick, but then you also notice, but even if they weren't
Starting point is 01:27:28 peahed, right? Like, what would they be good for? What would we learn from them? Would they have any applicability, right? And so this is part of the birth order research. And I think one economist even asked me, I mean, what does it even mean? Like, you cannot, like, surely you wouldn't adjust the number of children to tweak their birth order position to reap the benefits, right? So, yeah, it's a good point. Yeah. And so I'm more moving into that direction, I think, that you do need to quantify effects that actually matter in some way and that might also be on a theoretical level but even then you need to like consider
Starting point is 01:28:00 okay how much does this account for in the outcome is this important? It's practically irrelevant can we do anything with it and I think that's a helpful angle I think a lot of open science people have developed that right because they see so many bad studies where they're like even if it were true it wouldn't be interesting right but also from the causal inference angle where you're like well it's impossible to identify that effect
Starting point is 01:28:21 but then why precisely are we interested in it? And sometimes there are basic science reasons why we might be interested, but sometimes it's like, okay, we wouldn't be able to do anything with that knowledge because we cannot intervene on that thing anyway. Then I think it's fine. I mean, I always respected if people like really like rabbit tone and are like, no, but this is the thing I'm interested in. I can respect that if you do it well.
Starting point is 01:28:42 And I think it's also like a process of realizing, okay, maybe birth order is not the most exciting topic in the world. And then I did well-being research. I was like, oh, that field is a bit of a mess. And they do have measurement issues, right? And, like, trying to find something where you feel like you are actually contributing something. I know that this is also maybe a privilege of being a bit early career that you're not yet fully booked. Or, like, that one thing you're going to do for the rest of your lives because everybody wants you to do that. But I've also seen, like, senior people, right?
Starting point is 01:29:11 Like switching, moving fields or maybe turning more to, like, meta research, which is maybe something that is, like, you know, just like at a certain point, you're like, looking down on what I did. Oh my God. What are we going to do now? And I think these are all healthy developments. I recommend being an applied statistician, that way you can change what you're working on every few years, and it really doesn't matter.
Starting point is 01:29:32 The topic is beside the point. But honestly, I just like, there's so many of the issues that you focus on, like measurement theory, you know, having a clear concept of what you're measuring versus the construct, the concept of what you hope you're measuring, having a clear model in mind, not just throwing statistics at it, but thinking about a model and by all means, a dag, a graphical one is the best way to start there, I think. These are the sorts of issues.
Starting point is 01:30:03 Like I consult with PhD students all the time. People send them to me when they don't know what to do. And the problems they're having all revolve around the sorts of stuff that you focus on. And so I think you should, yeah, you should definitely write that textbook that you were talking about. Just doing your spare time. It's on my to-do list. It's on my to-do list. And I also ward psychologists.
Starting point is 01:30:31 I went to a couple of conferences with cross-cultural psychologists. And because of these issues, there were people giving talks like, what we need to do is to extend the field work with a small community for multiple years and, like, not trying to generalize out. It's like, you're describing anthropology. That's not the solution. I know that thing. There's dragons that way too. So I just say, be careful for looking at this locius in the wrong places.
Starting point is 01:31:00 There's space for ethnographic work, but it's not always the greatest science producer. But Julia, thanks so much for coming on and talking to us and listening to us Waffle about gurus and so on as well. it has genuinely been a pleasure and we'll link like Matt said out to your blog and your work and some of the talks that we referenced but yeah
Starting point is 01:31:26 I do think, for people listening, they often say, oh, you guys hear all this stuff, what sort of stuff do you like? Stuff like this. Yeah, thanks Julia.
Starting point is 01:31:40 Thanks, Julia. Yeah, thank you for having me. It was very fun.
