Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 1793: Measuring the Unmeasurable

Episode Date: January 5, 2022

Ben Lindbergh and Meg Rowley banter about Fanatics purchasing Topps and MLB Network reportedly parting ways with Ken Rosenthal because of his criticism of Rob Manfred. Then (27:36) they kick off a series of episodes about measuring difficult-to-quantify aspects of the sport by talking to Cameron Grove about translating his study of astrophysics into baseball […]

Transcript
Starting point is 00:00:00 Hello and welcome to episode 1793 of Effectively Wild, a FanGraphs baseball podcast brought to you by our Patreon supporters. I'm Meg Rowley of FanGraphs and I'm joined as always by Ben Lindbergh of The Ringer. Ben, how are you? I'm alright. Happy New Year. Happy New Year. How was your New Year's? You know, I stayed up till midnight. I was very proud of myself. I am not prone to midnight. It is not an hour I often see on purpose as a person who is an early riser.
Starting point is 00:00:49 And I don't say that as if there's any moral superiority to rising early or one way to be. That is just the way that I am wired. There's been, you know, sometimes people get very fussy about their sleeping habits. I only get fussy about yours because I don't think you sleep enough, but now you have a newborn.
Starting point is 00:01:04 So what am I going to do? Tell you to sleep more? No, I'm not going to do that. That'd be crazy. But so anyway, I don't normally stay up till midnight and I did and I was proud of myself. I survived the sound of fireworks
Starting point is 00:01:18 for many hours after that and here we are in 2022. I had to write down the year and I got it right on the first try. And so I think my year might be downhill from there. I don't know if I can do better than writing 2022 on purpose correctly the first time around. It might decline. Yeah. I don't have to write the date all that often these days. Back when I used to, I would always screw that up for a while. Now, I probably still will, but have fewer opportunities to. But yeah, I was up past midnight, which was not because it was New Year have been, but also that's just probably what I would have been doing anyway. It turns out that a lot of lockdown pandemic precautions just mirror
Starting point is 00:02:11 what I would normally do under regular circumstances. So I just watched some drunken Andy and Anderson and then went back to regular life. Yeah. I watched When Harry Met Sally, which one could argue that that's a New Year's film, right? Yeah, definitely. Oh, very much so. Yeah, it's also a Christmas film and a summer film and a fall in New York film. It is a film that reminds us that fashion is cyclical
Starting point is 00:02:36 because Meg Ryan looks fantastic throughout the whole thing, and I'm like, I would wear most of these outfits now, again, if I ever left the house. But yeah, that was quite nice and satisfying, funny. You know, Nora Ephron, she was a talented gal. Miss her. Wish she were still around. There you go. Well, Happy New Year, everyone. It is a new year, but same slow news week generally. No development in the lockout as of yet. No meetings that we have heard of. Just a couple of news items I guess we could touch on briefly before we get to our topics and guests today. So we talked last year, late August, episode 1736, we did an interview about the surprising, some might say shocking, news that Fanatics had swooped in and secured exclusive card licenses with both MLB and the MLBPA, which seemed to leave Topps, which had been making baseball cards for 70 years, in a pretty precarious position. And I think there was a lot of consternation about what this meant for baseball cards and
Starting point is 00:03:50 for sports cards in general and what it would mean for Topps. And now we know, at least the latter, what it means for Topps is that Fanatics is reportedly purchasing Topps. So Topps has lost some of its valuable licenses, had lost the baseball license, although that was not scheduled to kick in until 2025. But long term, there were a lot of questions about where Topps would go. And I suppose it made sense for both parties to combine forces at this point. And it reportedly came as a surprise to Topps that they were going to lose the license, and they were a bit miffed about that maybe, but, certainly the one that they've been the longest associated with. And now they will be working with Fanatics.
Starting point is 00:04:49 And so Fanatics gets to start making baseball cards immediately instead of waiting for a few years and just gets presumably to keep using the Topps name if they want to and bring that tradition and lineage and the back catalog and the experience along with them. But it sounds like Fanatics is really consolidating control of this industry. It's not a monopoly, but they certainly seem to have a commanding grasp on sports cards in general and certainly on baseball cards for the foreseeable future now. Yeah. I mean, it's weird to say you feel for Topps for, like, only getting 500 million dollars. I don't know. I guess this is making the best of a bad situation. I will say I am heartened that Topps will remain involved in the card market like this, because I think on that episode we had raised the concern that, like, you know, some of Fanatics' own products have not been the best. The quality of them, just as pieces of merchandise, has sometimes been wanting, and
Starting point is 00:05:51 baseball cards are, you know, they're a lot of things to a lot of people. They're like a speculative investment vehicle for some folks, but they're also a thing that people really enjoy and treasure, and collecting them means something to them, and it's a connection that they have to the game. And, you know, I think there's been some fluctuation and variability in the sort of aesthetic quality of Topps sets over the years, and I know that enthusiasts probably have their favorites and ones that they think were not quite as up to snuff, but in general I think that we would acknowledge that, like, Topps makes good baseball cards. So it's good that they will keep making them, because people who care about baseball cards want their baseball cards to be good, and this suggests that they will be familiar in a way that is nice. So, yeah. And one would imagine that
Starting point is 00:06:34 Topps probably had to swallow its pride, lick its wounds a little bit after having Fanatics kind of come out of nowhere and take away its stranglehold on baseball cards. But hopefully this means that the people in Topps will be in a better position than they would have been otherwise. And Topps has a lot of other licenses and businesses as well. They do things other than make cards and they also have licenses with like MLS and Formula One and UEFA and other leagues. So now it's just all under the Fanatics umbrella and we will see what happens there. But I wanted to follow up on that because it was kind of an open question after our previous discussion.
Starting point is 00:07:13 And now there is a little more clarity about that situation. Yeah. The other bit of news is that MLB Commissioner Rob Manfred has stepped in it once again. He has not proved to be a master of public relations, I would say, during his stint as commissioner of Major League Baseball. And that run continues this week with a report by Andrew Marchand, the New York Post media reporter, who broke the story that apparently Ken Rosenthal,
Starting point is 00:07:44 who had been an MLB Network insider essentially since the inception of the network or thereabouts for 12 years or more, he got into some troublesome hot water with the league with Rob Manfred over a column or columns that he wrote last year for The Athletic. Ken Rosenthal is a man of many jobs. He, of course, has been on camera at MLB Network a lot and also at Fox, and he is one of the lead writers at The Athletic. And at The Athletic, he wrote some things last year that were kind of critical of Rob Manfred's handling of the lead-up to the start of the pandemic delayed season.
Starting point is 00:08:30 And according to Marchand's reporting, Rob Manfred was not happy with the tenor of those reports. And again, according to the New York Post report, Rosenthal was sidelined for a few months at MLB Network. He continued to be paid, but was not appearing on screen. And then he was eventually allowed to return to the network, but his contract expired at the end of the year and was not renewed. And again, according to this report, seemingly that may have had something to do with this continued bitterness on Rob Manfred's part about his prior reports. Rosenthal has confirmed on Twitter that he will not be returning to MLB
Starting point is 00:09:06 Network. He said, I'm grateful for the more than 12 years I spent there and my enduring friendships with on-air personalities, producers, and staff. He did not confirm or deny all of the details of Marchand's report, but he did end one tweet by saying, I always strove
Starting point is 00:09:22 to maintain my journalistic integrity and my work reflects that. Yeah, I mean, I don't think that anyone who has watched MLB Network over the years will be shocked to learn that there is, you know, there's a league- and owner-sympathetic tone that often pervades its air. And I think that there are times when that is more obvious than others. That isn't to say that there aren't people there who do good work and that there
Starting point is 00:09:51 isn't, you know, sort of straightforward analysis that takes place. But it is, you know, it's league owned media and league owned media is going to have pitfalls. And I think that one of them is if you are critical, even in spaces away from MLB Network's air of the commissioner, and that makes its way back to him, whether it's being directed by him personally, or as I find more likely sort of is an understood mandate of the network and its senior leadership, you're probably not going to be asked back. And that's disappointing because they are at the center of a lot of important media rights and they are in a position to help foster an appreciation for the game. But the game thrives best when it is assessed critically and honestly, and this is antithetical
Starting point is 00:10:39 to doing that. So yeah, it's a real bummer. I mean, Ken's like a pro's pro. I don't think we need to list Ken's bona fides. You know, I think the part of this that I find the most disconcerting is that he's a reporter who takes sort of seriously, and I think puts a premium on, being measured and thoughtful and trying to be fair to the subjects of his reporting even as he is critical of them, right? He is not someone who is prone to, like, florid language, really. And if you read the piece that is linked here that supposedly got him benched last year, like, it's critical, but it's fair.
Starting point is 00:11:19 it's even keeled you know it is i think an honest assessment of where we were in that stretch where it looked like despite the ability of baseball to come back, that it might not because, you know, the league couldn't get out of its way. And if that's enough to get you benched and then, you know, to, I guess, not get your contract renewed, like, you know, what kind of hope can we have that they're going to have honest critical uncomfortable conversations about other failings of the league or the sport or the people who are involved in the league in the sport so not good as I said on Twitter and then was worried that people would think I was making like a joke about Ken's height. I was very nervous. I
Starting point is 00:12:06 tweeted, I tweeted and then I got concerned that people would read extremely large yikes as a knock at Ken and I didn't mean it like that at all. But it's it's not good. And in the midst of a moment when I think having sort of clear eyed reporting about the state of the sport is particularly important. It doesn't send a great message. Now, I think given the sort of media terms of the lockout, no one was going to MLB Network expecting that they were going to hear, you know, clear eyed analysis of the, you know, what ownership should do to meet the players halfway or anything like that. But it's really too bad.
Starting point is 00:12:42 And I, you know, I think one of the other potential casualties of something like this is that it does kind of call into question and make readers potentially nervous about the sort of objectivity of other aspects of that media operation and that's a real shame because there are a lot of really good writers and reporters who work for MLB network and work for MLB.com. And, you know, when you put your thumb on the scale like this, you, you're in danger of undermining all of that good work too. So, you know, part of the job of being the commissioner is to take public criticism. That is part of your job. It is not the only thing that you have to do, but to demonstrate that you, you know, are so, you demonstrate that you view yourself as being entitled
Starting point is 00:13:27 to be sort of immune from that kind of criticism such that you will bench one of the most popular reporters to cover the sport, suggest someone who perhaps needs to think critically about whether they have the requisite skill set to do the job. I don't know, just as a thing that we might say very casually on our baseball podcast. Yeah, I mean, virtually all commissioners of major sports leagues are unpopular. It goes with the territory very much,
Starting point is 00:13:54 and it does seem as if Manfred is a bit more sensitive to that than perhaps some of his counterparts in other sports or his predecessors in MLB. And yeah, in one sense, it's not that surprising because again, we're all aware this is league-owned media, MLB.com and MLB Network are not currently mentioning active MLB players while the lockout is going on. So clearly, these are not just any media outlets. And of course, in any other walk of life, I mean, if many of us were to say something critical publicly about the people who sign our paychecks, that would probably not go over well, and perhaps it would do damage to our job security. But I think when it's a media
Starting point is 00:14:38 entity, and when you're bringing on someone like Ken, you are having him appear on your network partly because of the credibility that he has as a journalist, right? And so what's the use of an MLB insider if you think that all of these insiders are just puppets of the league and they're only saying the things that they're allowed to say, right? I mean, you're just not going to put much stock in it at that point. And so you do have to kind of be skeptical, I guess, about anything that is produced by someone in that role, because this isn't even something that Ken said on MLB Network, right? This is in one of his other guises with The Athletic, and it still had some bearing on his employment status seemingly at MLB Network.
Starting point is 00:15:22 And so if you're kind of taking everything that is said in that vein, and you're thinking, well, is this person pulling their punch because they think Rob Manfred might be watching and it might impact them in some way? Well, then you're going to take everything that you hear with an even bigger grain of salt, right? So I think that kind of undercuts the position. You want to at least maintain some semblance of a divide between the editorial operation and the business. Even if you know that they can't be completely separate, you still want to have at least some illusion that that is the case. And that is kind of undercut by this report, really. So I think that's the upshot of it, that if Rob Manfred didn't like Ken Rosenthal's comments because he thought they made him look bad, well, they probably didn't make him look as bad as this in this report that comes out that suggests that he can't handle the
Starting point is 00:16:16 criticism. Yeah, it doesn't speak well of the media apparatus that, you know, undergirds the commissioner that this is the outcome that you get and you know i appreciate that like these are weird moments because you're like do i give someone credit for not getting renewed but i you know i do appreciate that i'm sure there was a conversation about the reason for the benching and that you know ken decided right i'm not gonna i'm not gonna change my tune on this stuff. So that's to his credit and speaks to the failing over at Network. Yeah.
Starting point is 00:16:50 If anything, this makes him look better. Not that he necessarily needed a PR boost. He has a great reputation as it is. But I saw a lot of people who were maybe less familiar with his contributions across all media saying, oh, he'll catch on somewhere else. I think ken will be fine i mean yeah i haven't spoken to him about this but uh he has only two big jobs now instead of three like maybe he'll get some sleep every now and then or something probably not but yeah
Starting point is 00:17:16 i thought that was funny too i was like no no like the athletic didn't fire ken he got fired from one of his many jobs yeah and uh it actually led to an outpouring of appreciation and affection for Ken by all of his colleagues at The Athletic and others as well. I mean, just in my limited experience with Ken, I've always found him to be very gracious and considerate and helpful and diligent. I mean, he's blurbed both of my books and he didn't even really know who I was at the time or I didn't know he knew who I was at the time. So I've appreciated that, too. And even though he's not known as, like, I guess one of the leading critics of the league, he's not necessarily like a rabble rouser or a bomb thrower. He has certainly contributed to a lot of reports and investigations that teams and the league probably would rather not have surfaced. So he has collaborated with a lot of colleagues at The Athletic on reports of that
Starting point is 00:18:11 nature. I mean, the sign stealing report, of course, which he co-authored or other, you know, sexual harassment and assault reports. I mean, his name has been all over many of those reports. So, I mean, he's been doing this for decades on air and in print, and he has made a deserved name for himself. And I think this also sort of illuminates some of the almost unavoidable conflicts that can come about in the media industry. I mean, it's hard to find a lot of baseball media members who have been working for a while who haven't had some kind of relationship or connection to the league in some kind of capacity. I mean, even including us, right? We have both appeared on MLB Network. And I've done occasional appearances on MLB Network dating back probably almost a decade at this point and have been paid for some of those
Starting point is 00:19:05 appearances. And I know FanGraphs has some slight loose relationship with MLB, right, when it comes to data sharing and, you know, Statcast stats appearing on FanGraphs player pages, etc. So it's hard to find someone who has no connection to, you know, either MLB or to an MLB rights holder. And you kind of have to be aware of that. I mean, my relationship with MLB Network has always been pretty informal. I've never had a contract or anything there and I've never had regular
Starting point is 00:19:35 or lucrative enough appearances there for it to matter all that much to me financially. It's just the thing to do every now and then when they invite me to. And generally they're bringing me on to talk about more X's and O's stuff, right? Like player evaluation or sabermetric stuff more so than like the labor situation. So I think you kind of have to consider the source and consider the subject matter when you're watching that stuff. But it can get tricky because uh there are all these ties and and that's something that we've talked about with gambling for instance and the
Starting point is 00:20:10 fact that there are such big investments by gambling companies and sports betting in media operations, and you always kind of have to question, well, is this person being told what to say or what not to say? And in my appearances at MLB Network, I've certainly never been told what to say or what not to say or been critiqued for something I said. But when you're one of the most prominent voices and you have a more formal relationship and you're talking about things that maybe touch on more sensitive areas for the league, it is not shocking but still somewhat disappointing that things came to a head there. So even though the Ken Rosenthal rules might not apply to me, an occasional contractor,
Starting point is 00:20:50 and someone who Rob Manfred might not read or listen to as regularly, I'd still have to reevaluate whether I would want to go on because I wouldn't want to give anyone the impression that I was compromised in some way and wouldn't want to condone essentially censorship on some level of someone I really respect, legally sanctioned censorship, obviously. This isn't some sort of First Amendment issue, but that precedent would still give me pause even if it didn't directly apply to me. Yeah. I think that individual media members have to navigate potential conflict all the time, as you said, whether it's, do I appear on this show? What is the relationship of my site to the league? You know, what do I we we both have, you know, not just Jeff, like we both have friends who work for teams where, you know, we and try to be aware of them. I think that you are often aided in that pursuit by being able to rely on colleagues to help you assess potential conflicts and understand when disclosure might be necessary or when something is just too close a relationship for you to reasonably you know offer an opinion on it without either actual bias or an unavoidable appearance of bias i think that that challenge just automatically gets harder when you're working for the league which isn't to say
Starting point is 00:22:16 that it can't be done, but it does get harder just on its face because your direct editorial line goes up to an MLB-owned entity, in a way that is not true for you at The Ringer, for me at FanGraphs, even though we do have, you know, a relationship with the league around data. So it's just a really tricky thing to navigate. And I think that it's something that, you know, reporters should be cognizant of. It's something that readers should try to assess and be mindful of. And I don't think that those are sort of unresolvable issues, right? People navigate conflict all the time and can do it, I think, successfully without being compromised. But it's a thing that you
Starting point is 00:22:56 have to work at regularly and be mindful of. So it's a tricky thing. And it's made all the harder when your editorial line is potentially putting its thumb on the scale or getting a directive from the business side to present things in a particular way. channel, even if it's mostly showing Kevin Costner movies at this point. I think the production values are typically very good, especially on game broadcasts. And if you're someone who's into studio shows, then it's nice to have some that are devoted to baseball as opposed to, say, turning on ESPN and rarely hearing about baseball at times. So you just have to be aware of what you're watching and where it's coming from and if you're looking at it through that lens i think it can still be good and hopefully effectively wild is a place where people are getting the straight dope as you see it and we're not beholden
Starting point is 00:23:55 to a league we're not beholden to sponsors our sponsors are our listeners so we say what we think and we try to be fair and we definitely say some things and have some people on to talk about things that all told probably the league would rather not have people talking about. Really? Like this segment for instance. Anyway. Anyway. So what we wanted to do this week because it's a slow news period, we're doing a little theme week here and the theme is measuring the unmeasurable. And I think one of the things that really got me in deep to baseball
Starting point is 00:24:32 analysis and caring about baseball and working in baseball in some capacity is the idea that there were things that were misunderstood or still undiscovered about the sport despite its long history. And it was really intoxicating to find out about things like catcher framing or even earlier sabermetric innovations and to think that, wow, the things that I thought about the sport were wrong and no one knew this. And often it confirmed things we thought, but it was still kind of cool to see it in objective quantified way. And I think in recent years that has become a bit rarer. I find myself a little
Starting point is 00:25:06 less excited by discoveries and research than I used to be just because, you know, there's less low-hanging fruit, I suppose, than there once was. And also, there's been a bit of brain drain, perhaps, and teams have hired a lot of really talented public analysts. And also, there is a divide in terms of the availability of statistics and you have certain StatCast stats and other data sources that teams have access to that public researchers don't. So there was a time where public researchers were way ahead of where teams were, and then they were kind of neck and neck for a while. And now in some ways, I think that private proprietary analysis is ahead, but there is still a lot of really
Starting point is 00:25:45 interesting and valuable research being done in the public sphere. And there are several studies that have caught our attention recently. And so we wanted to devote this week's episodes to talking about their authors and trying to plumb some of those depths and explore some of those unexplored areas. So we've got two conversations lined up today. Later on, we will be talking to Eric Chalek about his Negro Leagues Major League equivalencies, where he has tried to determine what Negro Leagues players might have done, statistically speaking, if they had been plopped down into the AL or NL and allowed to play in those leagues at the time, just trying to come up with some baseline to compare in a more accurate way between players
Starting point is 00:26:33 on one side of the color barrier and those on the other. And in our first segment, we will be talking to Cameron Grove, who is an astrophysicist in England, who has also been doing a lot of really cutting-edge baseball research on things like stuff and quantifying pitcher repertoires and catcher game calling and tracking pitcher deliveries and a lot of other exciting areas of research. So we will be back in just a moment with Cameron. We are joined now by Cameron Grove, who tweets about baseball at pitching underscore bot, writes about baseball at his site, Ahead in the Count, and has contributed to Baseball Prospectus. Also, he's an astrophysicist on the side, though maybe in most contexts he would describe himself as an astrophysicist who studies baseball on the side. Cameron, welcome to the show. Thank you for having me. There's a pretty rich tradition of baseball analysis by scientists and thinkers who are
Starting point is 00:28:03 way overqualified to be researching sports, and you seem to fit right into that lineage. So we're lucky that you've decided to dabble in baseball while you're not busy trying to answer some existential questions about the universe, because I think I might be more interested in your day job than I am in baseball. So let's start there. Where do you work and what do you do? Yeah, sure. So as you might be able to tell by my voice, I'm British and I'm a PhD student or a grad student, I guess as Americans say, at Durham University, which is in the northeast of the UK, kind of near Newcastle,
Starting point is 00:28:44 if any of you know where that is. But yeah, so I've been there for a few years now, and I have one year left on my PhD, basically working on simulations of the universe at kind of the very largest scales. So we have big supercomputers here, and we kind of run universes on them, see what changes when we tweak the cosmology and that kind of stuff. So it's interesting work, but, uh, very computationally expensive and heavy. So yeah, the baseball stuff is more interesting to me. Is it really? Wow. It sounds simple by comparison. I mean, you're studying, what, the expansion of the universe and dark energy
Starting point is 00:29:22 and how galaxies form? Or, I like to nerd out about that stuff in my spare time. So is there a particular research interest of yours? I know you've studied gravitational waves. So, um, my kind of main project is to do with a telescope called DESI, which is the Dark Energy Spectroscopic Instrument, or spectrographic, I can never remember, but it's mounted on a telescope in Arizona, and it started taking data this year. And what it's doing is it's measuring the positions of millions of galaxies going all the way back to the early universe, and with that data we can learn a lot about kind of what kind of universe we live in and kind of what the expansion history of the universe is like. But my specific job,
Starting point is 00:30:10 because this is a collaboration with hundreds of people in it, I run these simulations that allow us to kind of work out that the instrument is working correctly. So we only have kind of one version of the sky that we end up seeing through the telescope, but with simulations we can create hundreds and thousands of kind of fake catalogues of galaxies that we might see. And so by comparing what we really see to all these simulated versions, we can kind of match them up and see, okay, which simulated universes have the most similar properties to the one that we actually see. So yeah, my part is basically doing these simulations and then making sure that the simulations are doing what we want them to do. Because, you know,
Starting point is 00:30:54 it's computer code and there's bugs everywhere and we don't really know what the right answer is. So we have to make sure that that all works out okay. So what was the path that brought you to baseball and to baseball research? What question did you come across where you thought, aha, I answer far harder questions than this at work all the time. I'll give it a shot to see if I can sort out this baseball industry instead. So I didn't really follow baseball at all until maybe midway through 2019. So I'm a relatively recent follower of the sport, but I'd followed other American sports like football or American football and basketball. But then I actually
Starting point is 00:31:34 started doing baseball statistics research to learn a new programming language. So I was doing a placement at the Department for Education here in the United Kingdom, and they use a language called R. And I use something else called Python for all my physics work. So to force myself to use R and to force myself to learn this new language, I thought, OK, what am I interested in that will, you know, use coding and force me to learn this new language? And so I tried doing some kind of baseball stats in it. force me to learn this new language and so I tried doing some kind of baseball stats in it and then after the project has ended I just kind of kept on going with it and kind of found more interesting questions to answer so the first kind of major question that I kind of had that I thought was relatively maybe new in the public space was looking at kind of what makes a good pitch and can we quantitatively
Starting point is 00:32:29 measure that so often you'll hear announcers say oh that was a good pitch and or that was a bad pitch and he got away with it and you know they're seeing it so surely there's some way of measuring that and that's where the kind of name for my Twitter account came from, Pitching Bot. It was this attempt to make a kind of a model to evaluate pitches without necessarily looking at what happened after the pitch, but just looking at the velocity, the movement, where the pitch was in the strike zone, that kind of stuff, and put that into kind of an overall score so that was the first kind of major project that I undertook and the main question I wanted to answer but that spawned a whole load of different stuff and yeah I've done a lot of varied projects since yes yes you have you uh your research has touched on a lot of areas that are kind of close to our theme
Starting point is 00:33:23 this week of measuring the unmeasurable and it must be challenging if you're just joining baseball and sabermetrics midstream decades into that field to figure out, okay, what has been done already and what has been studied and what's the past research that exists? You know, am I answering a question that has been answered previously? I guess you're joining at a time when the availability of data has really skyrocketed recently with StatCast and other data sources. So some of those questions just weren't answerable really at all for most of the history of sabermetrics even. But still, you kind of have to familiarize yourself with the landscape. And I'm sure that given where you grew up, you were probably more inherently familiar with cricket or with non-American
Starting point is 00:34:09 football and various other sports. But I guess even though there are some rich traditions of analytics in those sports, maybe the availability of data for you to practice on and bring the methods that you had already mastered in your other work maybe weren't quite as plentiful. I guess baseball just really lends itself to that, which is why so many people have gravitated toward baseball analysis, even if they don't initially know that much about baseball. And have you grown to like baseball as a fan, as a spectator, or is it mainly still sort of as a scientist and as a researcher it's a bit of both i would say um i try to catch games when they're on but being in the uk you know i don't want to stay up past 2 a.m at the latest so that limits me to kind of daytime east coast games um when
Starting point is 00:35:00 they're on sometimes i'll stay a bit later for the playoffs but um that's the only exception i don't really have a favorite team i would say i kind of i like them all i like a good game you know no favoritism there but no yeah as you said the availability of data is something that i haven't seen in any other sports and it's mainly why i chose baseball as the sport that I was doing my analysis on because with StatCast especially and just so much public data that when I found it I was like oh my god this is amazing and it it feels so underutilized in the public space I guess I mean people do loads with it but there's still so much more you can do with it. And because it's so complex and, you know, there's like a hundred different features you can have on each pitch and there's loads of different variables to look at. There's
Starting point is 00:35:54 a lot of interesting stuff you can do with it, kind of beyond what you could do maybe five years ago with normal stats. Yeah. So the way that you started with quantifying the stuff of a pitch and predicting its results just based on the pitch characteristics, I think maybe the first person I'm aware of who did work like that is Jeremy Greenhouse years ago at Baseball Analysts, and he works for the Cubs now. And there were some early models like that, and now some people will see Eno Sarris often cite one called Stuff Plus in his work at The Athletic that he developed with Max Bay, who works for the Astros now. But I know that you have improved that model and added more variables to it. And the main application
Starting point is 00:36:37 usually is just figuring out how good is this pitcher? How good should he be just based on his stuff? But I'm also really interested in the applications you found of that model to broader questions, more general questions about how baseball works. For instance, I've seen you research the diminishment in stuff that pitchers have after they work a certain amount of innings or maybe a few days of rest, do they have worse stuff? And you've been able to show that that does seem to be the case. I think one of the most interesting applications of that that I've seen is that you studied the times through the order effect. And historically, there's been a bit of disagreement. Why do batters get better against pitchers as they see them more times within the same game? And there are two schools of thought there. One is that stuff is getting worse. Pitchers are just
Starting point is 00:37:30 running out of gas as the game goes on. The other is that it's a familiarity effect and that the more times the batter sees the pitcher's pitches, the better he's able to anticipate and predict and hit them. So you applied your model to trying to answer that question. What did you find? Yes. So you have the pitch quality on one side and then you have the results on the other side. And if you kind of group those by the times through the order, then on the first time through, they're kind of in pretty good agreement with each other. But then as you go to more and more times through the order, the pitch quality metric stays roughly equal.
Starting point is 00:38:09 There's no significant decrease in stuff. Or even in the quality of pitch locations, it's not like pitchers lose their feel and start, you know, missing the zone or hitting the centre of it. That stays very consistent. But the results that the pitchers get on those pitches show a clear decrease. So hitters get a lot better, even though the pitchers are throwing the same stuff that they did in the first inning. So it's pretty clear to me that that was evidence for a familiarity effect.
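The comparison described here can be sketched in a few lines of Python. This is only an illustration of the grouping idea, not the actual model: the records below are hypothetical, the function name `summarize_by_tto` is made up, and run values are taken from the pitcher's point of view (lower is better for the pitcher).

```python
from collections import defaultdict
from statistics import mean

# HYPOTHETICAL per-pitch records, invented for illustration:
# (times_through_order, expected_run_value, actual_run_value).
# Expected run value stands in for the pitch quality metric.
pitches = [
    (1, -0.010, -0.012),
    (1, -0.008, -0.006),
    (2, -0.009, 0.002),
    (2, -0.011, 0.004),
    (3, -0.010, 0.015),
    (3, -0.009, 0.011),
]


def summarize_by_tto(records):
    """Return {tto: (mean expected RV, mean actual RV, actual minus expected)}."""
    groups = defaultdict(list)
    for tto, expected, actual in records:
        groups[tto].append((expected, actual))
    summary = {}
    for tto, rows in sorted(groups.items()):
        exp_mean = mean(e for e, _ in rows)
        act_mean = mean(a for _, a in rows)
        summary[tto] = (exp_mean, act_mean, act_mean - exp_mean)
    return summary


summary = summarize_by_tto(pitches)
for tto, (exp_mean, act_mean, gap) in summary.items():
    print(f"TTO {tto}: expected {exp_mean:+.4f}, actual {act_mean:+.4f}, gap {gap:+.4f}")
```

If pitch quality stays flat across groups while the gap between actual and expected results widens, that pattern points toward familiarity rather than declining stuff.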
Starting point is 00:38:39 And you can also split it up by pitch type and look at okay so if an individual batter has seen you know four change-ups from this pitcher in this game then you can also see an even more stark decrease in the results while pitch quality remains pretty stable so it's definitely familiarity as far as i can tell yeah that's i was excited to see that because i've been kind of on team familiarity effect when it comes to that question because there have been good studies, I think, that have come to different conclusions. But I've found the ones that seem to indicate that it's familiarity more convincing on the whole. then there are implications for strategy and pitch selection and how quickly do you get through a plate appearance that affects how big the times through the order effect is for that particular hitter so i think analytically that's kind of the more interesting answer not that that matters but
Starting point is 00:39:36 i think that makes it more compelling to me and i guess there are ways that you could extend that theoretically and maybe you can look at across games instead of within games. I have looked in the past at what happens when a starting pitcher faces the same team in the playoffs within the same round. And it seems like as long as you're on regular rest, there's no drop off in performance. Although pitching on short rest can be bad. One question that always comes up during the playoffs is with relievers seeing the same opponent within the same series, you know, two or three or more times. And I think there's some suggestion that maybe they might get a little less effective there. So if at some point you have time, I guess you could potentially apply it to that question too. But are there any other applications of your stuff model that you are interested in examining
Starting point is 00:40:27 i try to keep a big list of possible questions to answer somewhere on my computer i've lost it at the moment but i'll add to that question about playoff familiarity in there yes and maybe you'll see a tweet about it sometime in the next few weeks we'll see good but i mean there's so many other questions you could try to answer this you essentially when with these kind of pitch quality models you have a whole alternate universe of possible outcomes that don't actually happen but are you know probabilistic so you can look at kind of expected strikeouts expected walks instead of actually what actually happened. And now you can build a whole new set of statistics from that. So yeah, there's lots
Starting point is 00:41:11 of interesting stuff that can be done. One topic that seems to have been on your to-do list that you took a pass at answering and which has been a topic of great interest to Ben and I over the years is measuring catcher game calling, right? This has been sort of the next frontier in trying to quantify the value that catchers bring to their teams. And unlike pitch framing has sort of stymied researchers in their ability to answer definitively the value that catchers are bringing to their teams by their game calling. And so I wonder if you might share your approach to trying to answer this question and then what conclusions you were able to draw, because I thought you took a really interesting tack when it came to this. Sure. So the way I tried to look at catcher game calling
Starting point is 00:41:57 was by looking at the run values that pitchers get, depending on who's catching them. So on most teams you have a catcher and a backup catcher, and pitchers will kind of maybe get slightly better results with one than the other, but using something like ERA it just takes far too long to stabilize to get any kind of good pattern out of that. So I essentially looked at something called run values, which is on every single pitch you can have either like a ball or a strike or a ball in play, and those can be assigned, you know, positive or negative run values based on whether it's more likely for the team to score after that. So you can use these and see, okay, with one catcher does one pitcher have a better set of run values than
Starting point is 00:42:46 a different catcher and you can also go beyond that and look at okay how about not run values but expected run values so the expected run value is kind of okay if we kind of do all this modelling and predictions of pitch quality, do some pitchers actually throw better pitches when one catcher is calling the game for them than another? And that's something that seemed to have an effect. So the most clear kind of pair that I found it for was on the Red Sox. Sandy Leon and Christian Vazquez. Basically, every pitcher on the Red Sox was throwing better pitches when Leon was catching. And that was kind of surprising to me because, I mean, I know that catchers kind of, they choose
Starting point is 00:43:40 where the pitcher is supposed to be throwing the ball. But it seemed to be that they were locating better. I can't remember if I looked at the stuff numbers, if they were throwing better stuff or not, but certainly they were able to just locate their pitches better when Leon was catching. I don't know whether that's because he was calling the pitches in a better place, or whether he was calling pitches that they were more comfortable with using on that day and they had more command over but yeah so it's really interesting how that effect
Starting point is 00:44:12 arises yeah and that's probably an encouraging result because leon is reputed to be a good game caller and also he's been just about the worst hitter in baseball over the past three or four years so you'd have to think that there is a reason why teams keep playing him and whether they have actually studied his game calling or not. Certainly, they've talked to pitchers and heard that about him. And I think the only hitter who's been worse than Leon over that same period is Jeff Mathis, who is also supposed to be a great game caller. And I think your method found him to be above average not nearly as much as Leon but sometimes you want your results to be surprising but sometimes you also want them to match I guess the accepted wisdom because that can kind of lend
Starting point is 00:44:59 credence to your method so it could be any number of things, I guess, that we're calling game calling here, and it could be managing pitchers in some way, or it could be pitch sequencing, let's say, but it's kind of encouraging that you came up with catchers mostly who have that reputation, I suppose. And pitch sequencing is actually something I wanted to ask you about because that's something where maybe you could apply findings from both of these methods or models is that something that you've looked into at all what makes a pitch more effective given what comes before it or after it let's say because that's still sort of a black box that a lot of researchers have tried to tackle over the years it's something that I've considered, but I've never dove into it just yet.
Starting point is 00:45:47 It's certainly a very complex problem because there's just so many sequences that it gets out of hand so quickly. Trying to make any sort of general statement about it, there's like, I don't know, eight different types of pitches that you could throw, and then so many different orders, different locations. I suppose one way of doing it would be to look for so as i did for the times through the order effect is to look at okay we have these pitch quality metrics are there certain sequences that cause right pitches to overperform their pitch quality and some that cause them to underperform but it's not something that i've considered yet just because i think it's a really hard problem to solve. And it's probably very batter specific as well. There's
Starting point is 00:46:29 probably different patterns that different batters can be weak to, that catchers might know about, and some might not. So lots of interconnected effects there that are hard to isolate and find which sequences are objectively better. I have sort of a broader question across all of your inquiries here, which is, you know, you said that there is a lot of publicly available data and it's somewhat underutilized in terms of people trying to answer interesting baseball questions. But I wonder if you might share, you know, if you could access anything on the team side, is there like a particular bit of information we just don't have on the public side that you're particularly keen to get your hands
Starting point is 00:47:10 on? Is there a data set sitting out there that you think if I could just grab that from across the divide between the public and the team side, I'd be able to answer this interesting question? I think player positioning is definitely one of those. So I'm really intrigued by StatCast's out above average model. And so having, being able to kind of make my own version of that, I think would be a really interesting project, but that's all kind of behind closed doors. So I can't quite get to that yet.
Starting point is 00:47:39 And then another thing is also all the minor league data that's there. And teams obviously keep that under wraps for good reason. But being able to look at kind of how players develop and at the major league level, we basically see the finished products. But there's probably so much more kind of interesting tweaking that could be done in the minor leagues if we had, you know, full stat cast data for all of that. So it does exist for one small set of leagues, and I have done a small bit of looking at that, kind of feeding that through my pitch quality models. But yeah, if there was like full minor league data for all players going back through time, that could be really interesting to see the kind of tweaks
Starting point is 00:48:25 that get made and how it improves players well it seems like one gap that you're trying to close is the motion tracking gap and pose tracking and a lot of your tweets of late have been devoted to your efforts to capture pitcher deliveries from video footage so if you work for a team and you subscribe to certain data providers, whether it's StatCast or others, there are teams that get full feeds of, let's say, a pitcher's delivery based on many points of articulation and very rapidly captured points in space and time, and they're able to put together these models. It's StatCast. There's kind of a feed that teams can subscribe to over and above the standard StatCast that gives them that information. And then there are other third-party providers like Simi Motion and
Starting point is 00:49:18 KinaTrax that have systems set up in ballparks that capture this movement data remotely just as players are playing. And we don't have any of that. And if we did, it would be tough to parse because it's an enormous amount of data and you'd have to figure out what to do with it, which is a challenge that some teams are facing now. But you are doing your best to make some version of that publicly available. So what is your method? how have you captured player poses and what do you hope to accomplish with that sure so the main reason i actually got into looking at
Starting point is 00:49:53 pose detection of players was because i was making a gif and i wanted to align some of the i was making a gif of some batters and I wanted to align them. But because the camera angles are slightly different on different days, the players were always just slightly blurry. And this was annoying me quite a lot. And so eventually I was just like, oh, I'll find something that can, you know, measure where a person is in a video or a photo. And so there's something called KAPAO. I'm not sure how it's supposed to be pronounced. It's K-A-P-A-O. But it's essentially a machine learning based pose detection algorithm that works quite fast. 10 seconds of video, which is pretty fast when you consider the complexity of the task it has to do. And then applying that to pitchers' deliveries, for example, I just thought that's a kind of a data set that the public doesn't have access to. But the potential value in that for analysis of
Starting point is 00:50:59 players is there's just so much you can do with it. So I've just been trying to kind of build up a kind of small database of lots of different pitchers' deliveries, and then just open that up to other people to see if they can find anything interesting to do with it. So one example, it was one of the first things I looked at, was I was linked to a Twitter post that suggested that Luis Castillo was tipping his pitches early in the 2021 season. And so I downloaded some of the videos from his starts where he was supposedly tipping and isolated his pitching motion and split it by different pitch types. And you could see really clearly in that data a big delay when he threw his sinker,
Starting point is 00:51:52 as opposed to his other pitches. So I found that a really interesting conclusion just that you could get from the data itself. And you could maybe expand that to kind of look for pitchers who are tipping kind of more generally if you had a bigger database and you can do a lot more stuff to do with maybe looking at what correlates with injuries is there anything that happens kind of on a given start before a pitcher gets injured and yeah there's loads of stuff you can do with it so I kind of set up a pipeline to try and get as much of this data as possible which is quite hard to do because you've got to navigate the baseball savant website to find where the videos are hosted and then
Starting point is 00:52:31 download them and things and then find out when the pitch is happening and where in the video so it's quite a few steps but seems to be seems to work so far have you gotten any feedback from the team side on on efforts there? I'm curious if, because I imagine that, you know, while they have access to a lot more information, they're always looking for something that might be able to optimize pitching. So have you gotten any feedback from the team side? I've talked to quite a few people from teams, just more generally, rather than being super specific. I think a lot of them are quite interested in the work, but they have all the data already. So kind of how I get there isn't
Starting point is 00:53:12 as interesting to them. I think they're just more interested in me as a person rather than the stuff I've been doing necessarily. Yeah, I would be shocked if some teams weren't doing something similar with looking for pitch tipping, let's say, and, you know, in a legal way, not in a real time in-game video kind of way, but between starts, let's say that is allowed and you could look for things that you could then apply in games. And I guess there are limitations in the way that you're doing this and that you're getting it from a 2D image, right? So are you able to extrapolate that to figure out positions in space?
Starting point is 00:53:52 Or is that too complex? Or is it just kind of inherently limited because you just don't have access to the same data that teams would? Yeah, so at the moment, it's all 2D data. I'm not trying to kind to project it into 3D. I've thought of some ideas of how to do that, but given the frame rates of the video, it's only 60 FPS. So there's only so much detail you can get from that. There's probably something you could do looking at, I guess all the limbs stay the same length. So if you know how long someone's limb is, you can probably project kind of how far it is
Starting point is 00:54:29 in the depth of the video. But it's not something I've considered yet. And if I were to try it, I imagine there'd be some horrific creations made from that, horribly mutilated pitcher motions. Great. You could get some very good Twitter content out of that, though, I suppose. Yeah, I bet.
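The limb-length idea mentioned here can be illustrated with a little geometry. The sketch below is an assumption-laden toy, not anything from the actual project: it assumes an orthographic camera (no perspective), a rigid limb whose true length is already known in the same pixel units as the image, and a hypothetical helper named `depth_offset`. Under those assumptions, a limb of true length L that appears with projected length d has a depth difference of sqrt(L^2 - d^2) between its two joints.

```python
import math


def depth_offset(true_length, joint_a, joint_b):
    """Magnitude of the depth difference between two joints of a rigid limb.

    Assumes orthographic projection and that true_length is expressed in
    the same pixel units as the 2D keypoints. The sign (toward or away
    from the camera) cannot be recovered from a single frame.
    """
    dx = joint_b[0] - joint_a[0]
    dy = joint_b[1] - joint_a[1]
    projected = math.hypot(dx, dy)
    # Keypoint noise can make the projection slightly longer than the limb,
    # so clamp at zero instead of taking the square root of a negative number.
    squared = max(true_length ** 2 - projected ** 2, 0.0)
    return math.sqrt(squared)


# Hypothetical forearm that is 50 px long but appears only 30 px long.
print(depth_offset(50.0, (100.0, 200.0), (130.0, 200.0)))  # 40.0
```

The sign ambiguity and the clamping step are where the "horrific creations" would likely come from: noisy keypoints at 60 FPS can flip a limb toward or away from the camera from one frame to the next.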
Starting point is 00:54:48 Apart from the analytical value, it's just kind of cool to see, like, when you recognize the wireframe pitcher, you know, when it's Chris Sale or Tyler Rogers or someone like that, that it's like that little flicker of recognition, because even in the stick figure, you know who that is. It's kind of like when PitchFX first came about and we were actually seeing these things represented via data for the first time, it was like, oh yeah, that's how that pitch moves. And that was like before we figured out what to do with that information, there was just kind of that little just enjoyment of the
Starting point is 00:55:22 recognition of, yeah, here's the thing we've seen in real life that is now captured in this data. And it seems like there's a lot you could do with this in theory. I don't know if the data quality will be good enough to do it, but I know that teams are doing it with what they have. I wrote a feature last year about pitcher deception, which is a subject that has fascinated me for quite a while. And as you must know from your stuff model, there are pitchers who just have some ability to repeatedly exceed or underperform what the stuff says that they should do, right? Their stuff says they should be this good, and they're actually that good, whether that is better or worse.
Starting point is 00:56:02 And there are a whole range of reasons why that could be so. But one of them that really fascinates me is deception. And I know that there are some researchers and some team people who are trying to quantify that just based on this pose-tracking information and trying to figure out, okay, how long is the ball actually visible when it first becomes visible to the batter to the point when it's released? How long or how good a look do you get at it? Are you hiding it in some way behind your body that makes it tougher to pick up? Or is it just the release point is uncommon and therefore players aren't ready to hit it, etc.?
Starting point is 00:56:38 So I don't know whether you've looked at that yet or whether you think you can with the quality of the data that you have. But I would be very interested to see any research along those lines yeah that's definitely something that i can try looking at i mean it's something that's really a really hard problem to answer so i've kind of looked at okay who overperforms my kind of stuff models and who underperforms and trying to find some kind of coherent set of qualities about either of those populations is just almost impossible as far as i can tell so far so looking for kind of other data sources that might inform on that is always something that i would be interested in i haven't found any evidence yet but i mean it's only like a stick figure diagram
Starting point is 00:57:24 so you can't really see where in their hand are they holding the ball or is it actually picking up their hand or is it their wrist or you know if they're wearing a glove then their hand shakes all over the place because the the algorithm doesn't know where to look so it's definitely a tough problem and not something that i've come close to solving yet i think one of the other tough problems that we are trying to sort out collectively is how to both assess and potentially improve umpiring in Major League Baseball. And I know one of the other sort of areas of interest for you, and you published a piece about this at Baseball Prospectus, which we'll link to in the show notes, but was trying
Starting point is 00:58:03 to understand how some games are themselves harder to umpire than others, irrespective of the sort of base competency of the umpire involved. And I wonder if you might take us through some of some of that research, because Ben and I have long been of the of the mind that a human ump back there is probably for the best, we're a little skeptical of the robo ump revolution and i found this piece really interesting because it suggested to me that we are perhaps thinking about umpire calls a little bit incorrectly from a unit of analysis perspective right that we we get fussed about the the worst blown calls rather than thinking about umpire
Starting point is 00:58:43 performance sort of over the course of not just an entire game, but an entire season and sort of understanding that some games are themselves perhaps a little more difficult to call back there than others because of the pitches that are being thrown. So tell us about games that are harder to umpire than others. Sure. So this came about because of the umpire scorecards Twitter page, which often blew up when there'd be some game where the umpire had a massive run favor for one team rather than the other. Yes, we had the creator of that account on the show, actually. So it was definitely of interest to a lot of people. And so I was kind of looking at this and thinking, well, a lot of that run value can sometimes just be from one extremely borderline call.
Starting point is 00:59:28 Like if the bases are loaded and it's a 3-2 count and there's a pitch that grazes the edge of the zone, then it's probably a 50-50 call. If the umpire gets it wrong according to where StatCast said the ball was, then that could be a change in run value of plus one run for the team that gets the benefit of the call. And so I was kind of interested in, okay, are there a lot of games where that happens? And does the game itself actually tell us more about what the ump scorecard is likely to show than the umpire's quality. And so to do this I built a model for called strikes just based on where in the zone the ball lands. So for each pitch that was taken you can assign a probability of an average umpire calling it a strike. And one thing that's immediately obvious is that the shape of that zone isn't the same as the rulebook zone. It's a bit fatter, a bit shorter, and more rounded in the corners.
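As a rough illustration of how per-pitch strike probabilities could be turned into a run favor for a whole game, here is a small Monte Carlo sketch. Everything in it is an assumption for demonstration: the pitch list is invented, the names `simulate_favor` and `expected_favor` are hypothetical, and "favor" is defined simply as the run value the home team gains from calls that disagree with the rulebook zone.

```python
import random

# HYPOTHETICAL taken pitches, invented for illustration. Each entry is:
# (probability an average umpire calls a strike,
#  whether the rulebook zone says it is a strike,
#  run-value swing toward the home team if it is called a strike, not a ball).
taken_pitches = [
    (0.50, False, 1.00),   # bases loaded, 3-2, grazing the edge of the zone
    (0.10, True, 0.20),    # clips the corner, rarely called a strike
    (0.90, False, -0.15),  # just off the plate, usually called a strike
    (0.95, True, -0.10),   # well inside the zone
]


def simulate_favor(pitches, rng):
    """Run favor toward the home team for one simulated average umpire."""
    favor = 0.0
    for p_strike, rulebook_strike, swing in pitches:
        called_strike = rng.random() < p_strike
        if called_strike and not rulebook_strike:
            favor += swing   # missed call: ball called a strike
        elif rulebook_strike and not called_strike:
            favor -= swing   # missed call: strike called a ball
    return favor


def expected_favor(pitches):
    """Analytic expectation of the same quantity, for comparison."""
    total = 0.0
    for p_strike, rulebook_strike, swing in pitches:
        total += -(1.0 - p_strike) * swing if rulebook_strike else p_strike * swing
    return total


rng = random.Random(0)  # fixed seed so the sketch is repeatable
sims = [simulate_favor(taken_pitches, rng) for _ in range(20000)]
mean_favor = sum(sims) / len(sims)
share_favoring_home = sum(f > 0 for f in sims) / len(sims)
print(f"expected favor {expected_favor(taken_pitches):+.3f}, simulated mean {mean_favor:+.3f}")
print(f"share of simulated umpires favoring the home team: {share_favoring_home:.2f}")
```

If most simulated umpires lean the same way, the lean comes from where the pitches landed and from the gap between the average called zone and the rulebook zone, not from the individual umpire.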
Starting point is 01:00:33 So some bad calls by umpires where they don't call a strike that clips the corner of the zone, that might only get called a strike 10% of the time. So is it really an inconsistent call if some umpires don't call that? And you can aggregate this over the course of a game and see that in fact a lot of what comes out from the umpire scorecards and a lot of the kind of run favour is almost determined by where the ball lands anyway and the differences between this average umpire zone and the rulebook zone so there's definitely quite a few cases i think it was a dodgers giants game the one that was the the offender that uh spurred me to make the model but i think that was biased towards the dodgers by a couple of runs and and I simulated it with fake umpires
Starting point is 01:01:27 assigning ball strike calls according to these probabilities that I'd made. And most of those favoured the Dodgers as well, so it wasn't the specific umpire's fault that he gave the Dodgers more runs than the Giants in that case. An average, impartial umpire would have done the same. And so yeah, it's just something that I found quite interesting as kind of context for these scorecards. And some games really are really hard to call accurately. If they have loads of borderline calls with men on base in deep counts, then the umpire is just gonna he's gonna be wrong in one direction by quite a lot for sure so yeah that was something that i thought was a really interesting project and craig from baseball prospectus saw my tweet on it and said oh you should write this up into
Starting point is 01:02:18 an article and uh yeah so that's how that came about so last subject before we let you go if there is one subject about baseball that has resisted being measured and explained over the past several years it is the ball itself i know you've done some work on predicting the difference in offense after the supposed deadening of the ball and you've also tried to do a little work looking into the idea that there were multiple baseballs in play last year multiple forms of baseball so what have you been able to determine and what is still something of a mystery sure so my investigation into the ball was primarily looking at the distances that fly balls traveled in 2021 versus previous years. So MLB said they were going to deaden the baseball
Starting point is 01:03:10 so that hopefully there won't be so many home runs as there have been over the past few years. And I mean, we have all the StatCast data. So I was interested to see, well, can we measure the effect of the deadened ball and see kind of how it might have changed the game and what that might mean going into future seasons? You know, what changes might be sticky? What changes were a bit more kind of random?
Starting point is 01:03:33 So I essentially recreated some work that was done by Alan Nathan, I think, and that was published at Fangraphs, looking at predicting the distance of fly balls and what kind of different factors affect how far the ball travels. So there are the obvious ones such as launch angle and exit velocity, but then spin on the ball actually has a big effect. So when the ball spins, you get more drag and so it doesn't travel as far. Unfortunately, I don't have access to batted ball spin data, but it correlates relatively well with pull angle. So if you pull the ball, you hit it more squarely, I suppose.
Starting point is 01:04:14 And so it spins less and travels further. So you can kind of roll all these into a model. And I also looked at adding the weather. So the wind direction, the temperature, how that might affect things as well. So you have all these different factors. And so I built a model to predict how far the ball would travel in 2020 and earlier based on all these factors. And then I redid all the models, but only using 2021 data and so by looking at the differences in the model predictions you can say kind of what changed about the ball and the main things that i could find out was that well the ball was traveling less far which i guess agrees with the fact that it was
Starting point is 01:04:58 deadened. So that's a positive: the effect was in the correct direction. And it was balls that were hit at more extreme launch angles that were being deadened further. So either kind of flatter, kind of more line drive-y type hits with topspin, they were traveling less far. And as were, you know, the really skied fly balls as well. So those are the ones that I think are more likely to have spin on them. And so, you know, you have more spin, and if the drag on the ball in 2021 was increased, then that might make those balls
Starting point is 01:05:32 travel less far. And so I was like, okay, maybe it is that the drag has increased. And when you compare with the effect of the weather and how that's changed, it does seem to be that the 2021 ball is kind of much more kind of draggy and is affected by air resistance and the wind and spin more than the previous balls.
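The two-model comparison described above can be sketched roughly as follows. This is a minimal illustration, not Cameron's actual model: the feature set, the linear form, and every number in `simulate_season` are assumptions invented here, and the fly balls are simulated rather than taken from Statcast. The idea it shows is only the general one from the conversation: fit one distance model on older seasons, fit the same model on 2021 alone, then score an identical batted ball under both and compare.

```python
import numpy as np

def make_features(exit_velo, launch_angle, pull_angle, tailwind, temp):
    """Design matrix for a simple linear fly ball distance model (assumed form)."""
    return np.column_stack([
        np.ones_like(exit_velo),  # intercept
        exit_velo,                # mph
        launch_angle,             # degrees
        launch_angle ** 2,        # lets distance peak at a middle launch angle
        pull_angle,               # degrees toward the pull side; stand-in for spin
        tailwind,                 # mph blowing out (negative = blowing in)
        temp,                     # degrees Fahrenheit
    ])

def fit_distance_model(X, distance):
    """Ordinary least squares fit; returns one coefficient per feature column."""
    coef, *_ = np.linalg.lstsq(X, distance, rcond=None)
    return coef

def simulate_season(rng, n, carry_penalty, wind_effect):
    """Made-up fly balls. Every coefficient here is invented for illustration."""
    ev = rng.normal(95.0, 5.0, n)
    la = rng.uniform(15.0, 45.0, n)
    pull = rng.uniform(-45.0, 45.0, n)
    wind = rng.normal(0.0, 8.0, n)
    temp = rng.normal(72.0, 10.0, n)
    dist = (-300.0 + 5.0 * ev + 12.0 * la - 0.2 * la ** 2 + 0.3 * pull
            + wind_effect * wind + 0.3 * temp - carry_penalty
            + rng.normal(0.0, 10.0, n))
    return make_features(ev, la, pull, wind, temp), dist

rng = np.random.default_rng(0)
X_old, d_old = simulate_season(rng, 5000, carry_penalty=0.0, wind_effect=1.0)
X_new, d_new = simulate_season(rng, 5000, carry_penalty=4.0, wind_effect=2.0)

coef_old = fit_distance_model(X_old, d_old)
coef_new = fit_distance_model(X_new, d_new)

# Score the same reference fly ball under both fitted models.
ref = make_features(*(np.array([v]) for v in (100.0, 28.0, 10.0, 0.0, 72.0)))
gap = float((ref @ coef_new - ref @ coef_old)[0])  # feet, new ball minus old ball
wind_ratio = coef_new[5] / coef_old[5]             # relative wind sensitivity
print(f"distance gap: {gap:.1f} ft, wind sensitivity ratio: {wind_ratio:.2f}")
```

Because the simulated "new ball" season is built with a carry penalty and a doubled wind coefficient, the fitted models recover a negative distance gap and a wind ratio near two. With real data the same comparison would be run per ballpark, which is how a park-specific result like the Wrigley one would show up.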
Starting point is 01:06:00 So looking at, say, so I split it by ballpark. And so Wrigley Field was most affected by wind direction and temperature by far. It must be super, super exposed there. But the effect of the wind almost doubled, as far as I could tell, in Wrigley with the new ball compared to the old ball in terms of the effect of the wind direction and strength on fly ball distances. So I found that quite surprising, the magnitude of the effect that was changing there. So yeah, definitely some changes. Yeah. And one other aspect of it was, so there was the whole controversy a few weeks ago with the two baseballs, right? And I don't have anything to say on that matter based on this
Starting point is 01:06:48 research. Basically, the kind of errors of the model, which I think were mainly caused by me not having access to spin data, mean that there isn't really that much I can say about whether there are maybe two populations of balls or not. I was thinking, so what happened in 2021 that was different to all the previous years was that the errors on the model got way bigger for some reason. And so I was thinking, okay, maybe that's because there are these two populations of balls. And so, you know, when you sample from different populations, you're going to get a larger spread in the effects that come out. So maybe that is evidence for two baseballs. But after doing some more research on it, I think that's primarily because the balls are more affected by wind and spin than they used to be.
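A small simulation can illustrate why a wider spread of model errors does not, on its own, settle the two-ball question. This is a sketch under invented numbers (`base_sd` and `offset` are hypothetical, in feet), not a reconstruction of the research: it compares errors from one ball type, from a 50/50 mix of two ball types, and from a single ball type that is simply noisier.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
base_sd = 10.0   # hypothetical model error with one ball type, in feet
offset = 6.0     # hypothetical half-gap in carry between two ball types, in feet

# Scenario A: one ball type, original noise level.
one_ball = rng.normal(0.0, base_sd, n)

# Scenario B: two ball types mixed 50/50, each with the original noise level.
centers = np.where(rng.integers(0, 2, n) == 1, offset, -offset)
two_balls = rng.normal(centers, base_sd)

# Scenario C: one ball type pushed around more by wind and spin, with the
# noise level chosen to match scenario B's overall variance.
noisier_ball = rng.normal(0.0, np.hypot(base_sd, offset), n)

sd_one, sd_two, sd_noisy = one_ball.std(), two_balls.std(), noisier_ball.std()
print(f"one ball: {sd_one:.2f}  two balls: {sd_two:.2f}  noisier ball: {sd_noisy:.2f}")
```

Scenarios B and C produce essentially the same error spread, so a larger spread is consistent with either explanation. Telling them apart would need something beyond the spread, such as the shape of the distribution or direct measurements of the balls.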
Starting point is 01:07:37 So essentially that just smears out the distribution a bit more. And so it effectively means my model is less accurate because the balls are being pushed around by all these different forces to a greater degree. I see. So some mysteries remain. Exactly. Yes. Is there anything we haven't touched on yet
Starting point is 01:07:55 that has been an area of interest for you over the past few years that you have looked into already or that you're still hoping and planning to at some point? I think you've covered most of it, to be honest. I mean, I tweet most of the stuff that I end up doing research on, so there isn't much unturned as of yet that I've thought about. I guess the limit kind of comes from the data that's available. So if there's potentially more interesting data sources coming out in the future, then maybe there'll be new questions that can possibly be answered.
Starting point is 01:08:31 Well, last thing then, Meg asked what you would be most tempted by as far as team databases. And you noted that some teams have talked to you, which is hardly surprising. The conceit of this series, or one of them, is that there's still really interesting and valuable public baseball research being done and that not everyone has been immediately snapped up by teams. But multiple people that we've reached out to talk to this week have already mentioned that they have been contacted by teams, as one would expect. And often that's the case.
Starting point is 01:09:02 You see some promising researcher appear on the scene and publish a few things, and you get excited about what they'll work on next, and then suddenly they disappear. And you are, of course, happy for them that they got to pursue that if that's something that they want to do, but sorry that everyone else is deprived of their insights. Now, for you, you have a career of your own that's separate from baseball and you're in England. So presumably you would have to either convince a team to let you work remotely or move and switch continents. And I don't know how much you want to do that, but what are some of the considerations for you there? Is working in baseball an ambition for you and something that you think
Starting point is 01:09:46 you might want to do? Well, one thing that I do know is that I don't want to stay in academia. So baseball is definitely an option. I've got one year left on my PhD, so at the end of this year I'll kind of be writing up my thesis, and hopefully that'll go well. But after that, I'll be looking for real jobs. I would like to stay in the UK. So that's definitely a big consideration. Moving to America would be a big change. I mean, I've been there on holiday, but that's a bit different. Right.
Starting point is 01:10:19 So, yeah, I'm not sure at the moment, but I guess we'll see what the future holds. Yes, you remind me a bit of Rob Arthur, whom I've worked with in multiple places. And he came from a genetics background and was kind of studying that and maybe going into academia and decided not to do that and to kind of give himself over to baseball research and other journalism work. And fortunately for all of us, he has contented himself with consulting for teams so that he has still been able to be a public researcher to some extent. So selfishly, I kind of hope that that's what happens with you, but I wish you the best whatever happens and we'll enjoy your work while we have access to it. So I will link to all the things that we have talked about and all of the various places you can find Cameron on our show page as usual. But thanks for coming on
Starting point is 01:11:12 and we're glad that you discovered baseball. Thanks for having me. Okay, we'll take another quick break now and we'll be back in a moment with Eric Chalek to talk about developing major league equivalencies for Negro Leagues players and other players from the Black baseball game. So less than a month ago we had Adam Darowski on the show, Adam from Baseball Reference and from the Hall of Stats, his website that has a purely statistically based version of the Hall of Fame, who is the most deserving of being enshrined purely based on the stats. And at the time that we talked to him, the Hall of Stats did not yet include Negro Leagues players. And now it does. And that is largely because of the work of our guest today. He has been on Adam's podcast. Now he is on our podcast. He is Eric Chalek. Hey, Eric, welcome. Hi, thanks for having me.
Starting point is 01:12:36 So we are talking to people who have tried to measure the unmeasurable this week. And really, if there's anything that is unmeasurable, it is the hypothetical question of how would Negro Leagues players who were barred from playing in the AL and NL during their careers, how would they have performed if they hadn't been barred? That is a very difficult question to answer and one that we'll never know the answer to for sure, but you have taken a crack at figuring out the answer to that question as best as we can determine it. So this work relies on the concept of major league equivalencies, which is an old Bill James idea that has been applied for decades in various ways. So for those who do not know, can you explain the concept of MLEs and how they have been used historically? Sure. So Bill James created the MLEs and wrote about them in the 1985 baseball abstract. And at the time, he was talking about minor league players.
Starting point is 01:13:36 And he used examples of Dick Schofield and Tony Fernandez. And since those were guys that I grew up following, I feel really old. Anyway, he showed that if you look at the run context in which they're playing and the park in which they're playing, then with a certain multiplier or discount, you can get a pretty good sense of what their seasons would have looked like in the major leagues. And Dan Szymborski has done a lot of work with that with ZiPS, and a lot of other systems do too. So people have been evolving that method, as you said, for decades. The hard part with the Negro Leagues is that not only is it difficult to measure, but you need multiple measuring sticks and multiple ways to think about how to use those measuring sticks. And so what I've tried to do is to look at a Negro Leagues player and begin by asking, what is the outcome that I want to get from this?
Starting point is 01:14:37 And the outcome is, if I dropped, say, Josh Gibson into the National League in 1933, what would his performance from the Negro Leagues look like translated to 1933 National League? And the answer is that, to get to that answer, we have to strip away as much of the context as we can from his documented play against top teams, and then we need to recontextualize it into the National League of 1933. So we need to account for his park, and we need to account for the value of a run in his league. We need to account for some pretty niche-y things like the standard deviation of performance
Starting point is 01:15:20 in the league, because the Negro Leagues weren't as uniform top to bottom as the major leagues were. In fact, there's quite a bit more variance. So there's a lot of little nuance that goes into it. But in the end, we come back with Josh Gibson being able to put out this much value. And then we can take that value statement and actually use another Bill James tool from the Historical Baseball Abstract, the new one, new back in 2001, and use the method he outlined in the Willie Davis and Sam Crawford comments in that book to then project what a traditional stat line would look like. So we can do an awful lot more than you would have thought, because, you know, Negro Leagues stats look a little eccentric to the eyes that are used to major league stats, because major league stats are so uniform and there's totals for everything and everybody plays the same number of games, yada, yada, yada.
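The strip-the-context, then recontextualize idea can be sketched in a few lines. This is an illustration only, not Eric's published method: the `LeagueContext` fields, the order of the steps, the way the discount is applied, and all of the example numbers are assumptions made here for the sake of a runnable example.

```python
from dataclasses import dataclass

@dataclass
class LeagueContext:
    runs_per_pa: float        # league-average run production per plate appearance
    sd_runs_per_pa: float     # spread of player rates around that average
    park_factor: float = 1.0  # 1.0 is neutral; above 1.0 favors hitters

def translate_rate(player_runs_per_pa, source, target, quality_discount):
    """Translate a hitter's run rate from a source league into a target league.

    Hypothetical recipe: remove the source park, express the player as
    standard deviations above the source league, rebuild that standing in
    the target league, then discount for quality of play and apply the
    target park.
    """
    park_neutral = player_runs_per_pa / source.park_factor
    z_score = (park_neutral - source.runs_per_pa) / source.sd_runs_per_pa
    in_target = target.runs_per_pa + z_score * target.sd_runs_per_pa
    return in_target * (1.0 - quality_discount) * target.park_factor

# Invented example values, not real league data.
source_league = LeagueContext(runs_per_pa=0.12, sd_runs_per_pa=0.04, park_factor=1.05)
target_league = LeagueContext(runs_per_pa=0.11, sd_runs_per_pa=0.03)

rate = translate_rate(0.21, source_league, target_league, quality_discount=0.20)
runs_above_avg = (rate - target_league.runs_per_pa) * 600  # per 600 plate appearances
print(f"translated rate: {rate:.3f} runs/PA, {runs_above_avg:+.1f} runs per 600 PA")
```

Standardizing against the source league's spread is one simple way to handle the point about variance: a league with a wider gap between its best and worst players gives its stars less credit per run of raw margin once they are placed in a more uniform league.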
Starting point is 01:16:23 And in the Negro Leagues, we just have to make some different sorts of decisions about how they work with the numbers, because they didn't all play the same number of games, and they had different levels of competition. Sometimes they had different parks during the same year, all kinds of wacky stuff. I can appreciate why translating to a sort of familiar context is useful to us, particularly within the context of trying to assess the Hall of Fame cases of these players. I am curious how you thought about what the absence of Negro Leagues players from the major leagues means in terms of our understanding of the quality of play there, right? Because obviously there were a great many players who were very good who were playing in the majors, and the reality is that they did not have to play against the players who were
Starting point is 01:17:04 kept in the Negro Leagues because of the color line. So how did you think about sort of the question of difficulty and quality of play when it comes to that sort of interchange? Because I imagine it's quite tricky. Meg, that's a great question. I've wrestled with it quite a bit. And it's actually two questions that you asked in one. One is, what's the quality of play in the Negro Leagues? And then the other question is, what does that mean for the quality of play in the baseball universe of the time?
Starting point is 01:17:34 So I'm going to take it in the opposite direction I just said. In terms of how segregation impacted the major leagues, and how segregation affected the quality of play in the Negro Leagues, I've done a study on my own that suggests that we're talking an effect of approximately 10 runs above average over the course of the year. So if Ted Williams was worth 100 runs in the 1940 AL, and of course he's probably worth more, but 100 runs in the 1940 AL, in reality, if we had had integration at that time, we're looking at more like 90 runs. And the same is true for players in the Negro Leagues. If someone was worth 100 runs there, they'd be worth 90 runs instead when you bring the two leagues together. And it has to be that way, as you suggested, because we bring together the two leagues, and now we are putting the best of the Negro Leagues in with the major leagues, which was a much larger talent pool. And we're talking about displacing something like 30,000 to 45,000 plate appearances
Starting point is 01:18:44 and innings pitched. I should say outs, pitching outs. But we're talking about a lot of playing time. And so it's, it's actually a pretty pronounced effect. And I did a really basic study where I just replaced players from the major leagues with MLE players and just directly. And I'm sure there's probably a more scientific way to do it, but that was the one I had at hand. And like I said, it suggested about 10 runs a year per, you know, like 600 plate appearances. Now, the other question about the quality of play
Starting point is 01:19:19 in the Negro Leagues is thorny too. There are lots of different ways in which data is suggesting that the Negro Leagues are around AAA, they're better than AAA, they're worse than AAA. Because Negro Leaguers played in a lot of different places, Cuba, Puerto Rico, and all these other places. So we have to do a lot of sort of disentangling. There's a researcher I know who insists that I've got quality of play pegged completely wrong, that I'm too low. I come in around a discount rate of 20% for most Negro Leagues seasons. That's where I'm at right now. I'm doing some research right now that could change that. And it's an ever-evolving process because
Starting point is 01:20:07 there's so much we don't know that we're going to always keep trying new things to see what we can know. So it may change. But right now, it's around AAA. Yeah. And so even though multiple leagues are designated as major leagues and are certainly major league quality, you would not expect them to be exactly the same. I mean, the AL and the NL sometimes are different quality, right? And so it would be surprising if the Negro Leagues were at exactly the same caliber of play. Given the conditions there, the challenges they faced, the smaller player pool probably that was available to them, etc. Clearly, there were many, many, many players who would have been
Starting point is 01:20:45 more than adequate, would have been stars and among the very best players in the AL and NL at the time. And we know that based on what happened after the color barrier was broken or began to crumble, which took some time. But in the decades after Jackie Robinson, so many of the players in the AL and NL were players who previously would not have been allowed to play there. So you know that there were many such players prior to the breaking of the color barrier who also would have fit that description. And as you are making these MLEs, you are not trying to come up with what would everyone have done in a fully integrated league at the time, which would have been a very difficult
Starting point is 01:21:25 question to answer, I imagine. I mean, the question you're trying to answer is very difficult to begin with. But if you were trying to say, okay, everyone is in the same league, then you'd have to make MLEs for the AL and NL players as well as the Negro Leagues players too. And presumably that would just be an incredibly complex question, and maybe one you don't even want to answer because obviously that wasn't the reality at the time. So you are just kind of projecting what would happen if you dropped in one particular player into those leagues that otherwise would still be segregated, right? That's right. I think about it like if you had a rock and you threw it into a pond,
Starting point is 01:22:04 you'd see these nice ripples go all across the pond. But if you had a handful of rocks and you threw them into the pond, you'd have just chaos because all these rocks would hit and you'd have all these ripples going all over the place and nothing would look very pretty. I think of the MLE as that one rock, and I'm throwing it into the pond, and the pond is the major leagues, or specifically the NL or AL. And then I can get a sense of how the player's performing and rippling across the league. But, like you said, if I have to do that for everybody, it's going to get real messy real fast. Right. So I imagine that, you know, you mentioned that there are a number of challenges that are attendant with this project, and I think we can anticipate some of them. I'm curious if there were particular bits of consternation specific to the position players versus the pitchers and trying to get everyone
Starting point is 01:22:59 on sort of ground that you could agree with so that you could figure out what their place would be in Major League Baseball. Yeah, you know, that's a really, really neat question. And I think the big difference is that with pitchers, you start with runs. And when I started this, I thought, I want to use runs, not batting average, not home runs, not slugging. I want to use runs because runs translate from place to place, from league to league, from time to time. So with pitchers, you've got runs and you have runs allowed per nine innings. It's pretty easy to work with in that way. With the hitters, then you have to go and figure out what their production is, what their rate of production is. And it's a little more involved.
Starting point is 01:23:44 So what I do for them is I take their statistical inputs and I turn them into weighted on-base average. And so that's sort of a proxy for what runs allowed per nine would be for pitchers. But once you're able to sort of get it down to runs, it becomes simpler. And weighted on-base average translates to runs and to runs above average, as you both know, so easily that it's a great tool to sort of equalize. And then once you've got the runs, though, then it's really about figuring out what the contexts are. Then it's about the park and the league,
Starting point is 01:24:21 and it's about standard deviation of performance in weighted on-base average or in runs allowed per nine or what have you. So that's the place where it has to start, with the runs, and then all the little adjustments can happen. And you have some very thorough explainers of your methodology on your website, which I will link to on the show page for anyone who wants to get into the nitty gritty. Or anyone who wants to fall asleep. Right. Yeah, it might not work as well on a podcast, perhaps, but we'll give you the CliffsNotes here. So you mentioned that MLEs have often been applied to players going from the minors to the majors, and that is something that actually happens, right? That is not a
Starting point is 01:25:02 hypothetical. Players get promoted from AA to AAA, from AA to the majors, from AAA to the majors. So you can just sort of see what the exchange rate is in reality. Now, when it comes to Negro Leagues players and players from the white major leagues at the time, I know there's been a lot of great research that's been done to look at what happens when those players would play each other in various exhibitions and barnstorming. And often it seems like the Negro Leagues players held their own or more, although there are always complications with the makeups of those teams and the effort levels, et cetera. So how do you figure out what the exchange rate is, what the league quality difference is when you have a situation like this where obviously there were no players for a time, at least, who were going from
Starting point is 01:25:50 the Negro National League to the National League, let's say. Yeah, that again is a really great question. And so what I tried to do was to fit the quality of play discount into a structure that I understand and that is widely accepted, and that's the minor leagues to major league structure. You know, that single A and double A and triple A each have their own level of play, and then the major leagues are sort of one, and everything is discounted from that. And it's not a perfect match, but we are starting to make some headway on ways that we can improve that. That said, there's some very strange things that happen
Starting point is 01:26:34 when Negro Leagues players went into organized baseball. For example, especially with pitching, we know that the hitters had a fairly smooth transition to organized baseball. And if you look at the National League of the 1950s and 60s, almost all of the best players are African-Americans who would have been or were in the Negro Leagues. So there's no question that the top end of the Negro Leagues translates easily. The problem is that then the pitchers don't translate so easily in terms of the success we see or don't see. So if you look at hitters, there's like 30 or 40 hitters. And once they made the majors, they averaged careers of roughly average major league value.
Starting point is 01:27:15 You've got guys who are way off the charts, you know, but you also have scrubs, you know, like Curt Roberts, who didn't make a big impact. With pitchers, though, there were really only four successful pitchers who went from the Negro Leagues to the major leagues, and that's Toothpick Sam Jones, Connie Johnson, Satchel Paige, and Don Newcombe. And that's it. There are other pitchers who played in the major leagues, but they weren't very good. So I looked then at the performance of the hitters and the pitchers in AAA and in AA and in single A level leagues. And the pitchers at the major leagues were, other than those four guys, or including those four
Starting point is 01:27:58 guys, were way, but like 20 runs below average for a season, career-wise. Not a season, but for their career. When you get to AAA, it's not much better, but the hitters are outstanding at AAA, as befits the fact that they were at least major leaguers on average at the major league level. So when you go down the minor league chains, you see that not until you get to double A-level leagues, or probably what were then probably called A or B leagues, do we see that the pitchers are holding their own. And I've thought about this again and again and again. And so on one hand, it tells you that there was some kind of difference between the pitching in the big leagues or organized baseball and pitching in the Negro Leagues.
Starting point is 01:28:51 And then there's also this question like, OK, well, if the hitters were really going great guns in organized baseball and the pitchers weren't, what does that say about the quality of play? And I'm still working through that. But here's what I can tell you: in the Negro Leagues, it seems like the defensive spectrum was different than we perceive it in the major leagues, and that teams tended to select for their most athletic players first and put them at either shortstop or center field depending on their handedness, and then anyone who could catch went to catcher, and then they selected for pitchers. And I think that's very different than what we understand the defensive spectrum to be in the major leagues, where at least I perceive it as there's pitchers
Starting point is 01:29:35 and there's everyone else, and then we start moving down the defensive spectrum. But that's not how it was in the Negro Leagues, because the talent pool was smaller and the needs were different. So the pitchers in the Negro Leagues weren't necessarily the best athletes. And in the Negro Leagues they didn't take the balls out of play as often as they did in the major leagues. And there's lots of narrative out there about how by the last three innings of a game, you'd be hitting a mushy sweet potato instead of a baseball that was all dark and cut up. So a pitcher who can get by on guile and doesn't have a great fastball can make it for the first few innings of a game in the Negro Leagues, and then they have to start relying on the fact that the ball is defaced, it's mushy, and using trick pitches like using the cuts in the ball to change the spin on the ball, things like that. Things they couldn't do in organized baseball.
Starting point is 01:30:42 So then on top of that, you have the fact that on average, I'm pretty sure that parks in organized baseball were a little bit smaller than those in the Negro Leagues. And the Negro Leagues tended to play at cities that were very near sea level. And that wasn't the case in every organized baseball league. So you've got just a whole bunch of cards stacked against pitchers going from the Negro Leagues to organized baseball. And so it's really hard to say from that what that really means about quality of play, especially when you've got the hitters doing so well. And so my compromise so far has been, all right, we've got guys who aren't performing well
Starting point is 01:31:23 in AAA and may not be, you know, may not really be better than AA pitchers. Then we've got guys over here in the majors who are like high-quality players. In between there somewhere is around AAA level. You know, it's not perfect science, unfortunately. I can't get too much closer than that yet. I imagine that the sort of increase in proliferation of available data and box scores around Negro Leaguers made a huge difference in your effort here. I'm curious if there are, and this perhaps is in one of your 6,500 word explainers, so my apologies if I missed it, but I'm curious if there are other pieces of information that you are keen to sort of add
Starting point is 01:32:03 to this project that you think might improve the precision or specificity of some of these equivalencies? Yes. So the work of Scott Simkus and Kevin Johnson and Gary Ashwill and Larry Lester and all of the gentlemen and ladies who have contributed to the Negro Leagues database at SeamHeads.com is amazing. And without it, none of this is really possible. And the reason it's not possible is that until their work went up, we didn't have a lot of information about league totals. It's very difficult to find anything that says that, for example, you know, the NNL stole, who knows, like 300 bases in 1926.
Starting point is 01:32:49 You just can't find that stuff. But you need that stuff in order to compare players against their own league so that you can then translate them into a major league. And so once that stuff began appearing, those league totals began appearing at SeamHeads, then we had the ingredients where we could really start cooking. And so that's huge. It's huge. And there's holes in the data. There aren't stolen bases for every season.
Starting point is 01:33:20 There isn't hit by pitch for every season. Not every box score has been uncovered. And obviously that's an ongoing task that the guys at SeamHeads are still working on. I think that in terms of missing data, there are individual players. We just don't have anything because they either went off to some semi-pro league. There's a guy named Heavy Johnson who was a tremendous hitter. And unfortunately, he played like six or seven years in the Negro National League in the 20s, starting around age 26 or 27. The preceding six years, he was in the 25th Infantry Wreckers, which was an army division that just basically played baseball. So we don't have any stats for that. And we don't have any stats for 1929 and 1930, because he went off to
Starting point is 01:34:04 the Northwest to play semi-pro ball. And so there's people like that where we're just never going to get a lot of the information. But the information that could come through, that I'm pretty sure that SeamHeads guys are working on, is a full accounting of the Cuban Winter League. Right now, there's about 20 seasons in their database. And getting a fuller accounting of that will really help because it's more plate appearances. It's a bigger sample. And the bigger the samples get,
Starting point is 01:34:37 the more confidence we can have. And the same thing is true for the Puerto Rican Winter League. There's currently no information on that available that's usable. And if that comes online, it'll really beef up the sample for a lot of latter-day players. And it will give us insight into players like Perucho Cepeda, the Bull, Orlando Cepeda's father, who currently we have zero stats on. But this guy's a legend, and I'd love to know more about him. So there's things like that that are likely on the way someday. Gary and Kevin have mentioned to me that that's a goal of theirs. And really, anything we
Starting point is 01:35:16 can do to increase what we know and what's documented is going to be helpful. And if you go to SeamHeads or you go to Baseball Reference, then you see our best accounting available now of the official league games that those players played. Of course, they played in many exhibition and barnstorming contests too. And those schedules, the official league schedules were considerably shorter than what people are used to with AL and NL schedules of today or of that time. And of course, you want to preserve that difference because you want to remember the reason why these players were playing in a different league, why they had these shorter schedules. You don't kind of want to pretend that this is some alternate happy history where there was no color barrier.
Starting point is 01:36:01 But I think one of the dangers of presenting these stats the way they are without anything else, which I think it's wonderful to have them available, and I hope that it does lead to more people discovering these players and their names and their accomplishments. But you could inadvertently lead to some of these players being underrated because people might look at their counting stats or their WAR or whatever and will see lower totals than they expect to or than the greats of the AL and NL of the time. And so one thing that I think your work allows us to do and allows Adam to do at the Hall
Starting point is 01:36:37 of Stats is to present these things on somewhat of a comparable playing field here in terms of totals, in terms of playing time. And they may still be conservative, as Adam explains here, but they at least look a little more like you would expect these stats to look, like the career WARs to look. So I don't know if you want to run through an example or two, say take a Josh Gibson or a Satchel Paige or any player you want to pick and [...] heads with 230-some home runs and, you know, tremendous batting average and all of that. [...] him through the process, what we get is a guy who has about 9,000 plate appearances and 83 WAR, and that's a pretty damn good player. And we're talking about 500 and almost 600 career batting runs. And man, that's not chopped liver, you know, that's high-end stuff. And what that translates to in terms of traditional stats is a .917 OPS, a 160 OPS+,
Starting point is 01:38:16 435 home runs. He's just, you know, he's a monster. He's a great hitter. And it's true that MLEs are slightly conservative by nature because we have to use a lot of measures of central tendency because we don't have all the data. And we have to, in some cases, you know, in a season when somebody has fewer than 200 plate appearances, I try to beef the sample up by using surrounding seasons or using their career averages to increase the sample size so that we're not giving 100 home runs to a guy who has 10 plate appearances and hits two or three home runs. So it is a little conservative. So could Josh Gibson have hit 500 home runs in those 9,000 plate appearances? Absolutely. Absolutely. The fact that I've got him down for 435 simply means that that's what my math is saying right now. But as more data
Starting point is 01:39:12 comes through, we could see a whole different look. I mean, we could see totals shooting up. We just don't know. And we're not going to know until the data does come through. I think that it's important that what you said is really important, that seeing numbers that are familiar looking, putting this in a familiar context, brings Josh Gibson to life in a different way. When I can look and say, gee whiz, I mean, you know, the only people who had hit 500 home runs by the time that Gibson had retired were, you know, Babe Ruth, Mel Ott, and Jimmie Foxx, and he's at 435. That's one heck of a slugger. It puts some frame of reference around Josh Gibson. When we say things like, well, he, you know, hit 800 home
Starting point is 01:40:05 runs against all competition. Well, Babe Ruth hit a thousand home runs against all competition in all likelihood. I don't know the exact, the exact count. So, so we're, you know, how does that all fit together? And it's hard to say, but when we have a, when we have a solidly, internally consistently derived figure, at least we can say, hey, we estimate that he's around 450 home runs. Dang good player. That connects me from the legend of Josh Gibson to what kind of player he really was. And I happen to have Gibson open because we were talking about it. Give me a second to pull up Satchel. Sure.
Starting point is 01:40:47 Okay, so with Satchel Paige, we're actually missing something like 1,000 innings of Satchel Paige's career, believe it or not. And I say believe it or not because we already have somewhere around 2,000. But he pitched in the California Winter League, and he pitched in Cuba and he pitched in Puerto Rico and he pitched in a whole bunch of places that we just don't have numbers for yet. But 2,000 innings is a heck of a big sample. And so we can have some confidence that we're getting
Starting point is 01:41:17 good numbers, especially because we also have his numbers in the major leagues and in the minor leagues. So we know what kind of a pitcher he was late in his career. So when we look at him through the MLE lens, I'm getting a total of about 4,500 career innings. And I'm seeing him saving about 530 runs more than an average pitcher. And that comes out to about 95 WAR. And man, that's a lot. And when I run that through the traditional stats machine, I get about 300 wins. Now, those 300 wins are assuming that he's playing for average teams every year.
Starting point is 01:42:04 If he was playing for the Yankees, he'd be somewhere around 350, 375 wins. There's so much context around wins that we're just sort of saying on an average team, that's what that looks like. If he had Whitey Ford's Yankees behind him, he'd have had a winning percentage much like Whitey Ford's or better. He was an amazing pitcher. He probably would have had somewhere around a 127 ERA plus and probably would have pitched around almost 900 games. I mean, after all, he pitched from age 20 to 46 nonstop and then in 58 made a three-inning comeback with Charlie Finley. So this guy's quite a pitcher. And of course, there's no one really like Satchel Paige. He's an institution unto himself, but he was a darn good pitcher. And he
Starting point is 01:42:53 and Lefty Grove are really the only two pitchers between Walter Johnson and Tom Seaver who have the kind of career where we can ask, were these the best guys between Walter and Tom? Do they have a place in the discussion among the all-time great pitchers? I think, you know, the other obvious move here would be to use those numbers to help us put these guys in context when it comes to the Hall of Fame. And I know that when the Early Baseball Era Committee was meeting, you did a number of tweets and posts about how the candidates stacked up relative to other major league peers that we might be familiar with. And so I guess the first question I would ask is, were there any omissions from the gentlemen who were elected and inducted that you were disappointed by, guys who you hope will
Starting point is 01:43:40 make their way into Cooperstown when the next committee meets? Yeah, it was kind of bittersweet for me because I was really excited for Minnie Minoso, and I was really excited for Bud Fowler and for Buck O'Neil, just like everybody. But I was also really sorry that none of the actual ballplayers from the Negro Leagues era, from the Early committee's deliberations, made it. I feel like in some ways that they probably couldn't reach consensus on the players, and so they went with the sort of pioneers and executive types. And I'm glad for those guys and their families and for all the fans.
Starting point is 01:44:21 I think that one of the places where I wish they had charted a different course was to put Dick Lundy on the ballot. Lundy's an incredible shortstop, a great fielding shortstop. He could hit, and he had all the tools. And I think it's a mistake not to have had him on that ballot. They had Grant Johnson on there, and Grant was from an earlier era than Dick Lundy. He retired a good five to ten years earlier than Lundy did. Lundy was a very young player when he came up at age 18, and he had a really strong career.
Starting point is 01:45:01 Both of them are Hall of Fame caliber players, in my opinion. I think Dick Redding, not electing Dick Redding might be a mistake. He's sort of like the... If Smokey Joe Williams is the Walter Johnson of the Negro Leagues, then Dick Redding is something like
Starting point is 01:45:18 the Pete Alexander. They occupy the same sort of 1-2 ranking in their era, and they're both long careers and were great pitchers. I don't think Redding was as good as Alexander, but they occupy that same sort of territory within the context that they're pitching in. So I think that that was a missed opportunity for sure. But you know what? I also think that these are very difficult deliberations and I don't, I don't want to, I don't want to be like, you know, Mr. Frowny face about this because, because the Negro Leagues got another shot this time around at getting more people in. And that's, and that's important. They've had sort of the, um, been sort of, um, treated not as consistently as the major league players have, which is unfortunate.
Starting point is 01:46:10 And I hope that we have more opportunities than once every 10 years to explore who is a good Hall of Fame candidate from the Negro Leagues. It would be unfortunate if we just left it here until 2031 or 2032. Right. And we did have two of the founders of the 42 for 21 committee, Sean Gibson and Ted Knorr, on the podcast on episode 1785 to talk about their efforts in that area. And I know that you shared on Twitter your 42 for 21 ballot. So I will link to that on the show page as well, the players that you selected as the most deserving for potential induction. One last question for me. You mentioned Grant Johnson as one of the most deserving players, and you have him or you and Adam have him as more than a deserving Hall of Famer, according to your MLEs at the Hall of Stats. And of course, his career at the Hall of Stats classifies it as 1895 to 1914. That predates the organization of the Negro Leagues. So how do you do your MLEs for players from black baseball from before
Starting point is 01:47:18 the Negro Leagues were founded? Just the same way that I do for post-founding of the NNL. And, you know, the quality of play was lower. I ramped down from AAA to about AA in 1905, like, you know, kind of a point at a time. But, you know, I sort of treat the Eastern independent teams and the Western independent teams as their own quote-unquote league. They played each other a lot. So I use those league totals in the same way that I would use the NNL or ECL or NAL totals for latter-day players. I try to use the same procedures and techniques for all players because I don't want to treat anybody differently.
Starting point is 01:48:01 I want everybody to go through the same process so that my, so that my biases and my favorites or what have you are not influencing the, the outcomes. And, you know, I've, like all of you who, who are writers, you know, your, your, your bias might not come out in the, the actual words, but in the choice of subject and things like that. So yes, of course, my bias will be in there. But I try to make it as objective as I possibly can and have as few decisions to make for each player as possible so that I'm not inflicting my ideas on everyone to the best that I can. But it is very much the same process. The issue with someone like Home Run Johnson
Starting point is 01:48:47 is that the first half of his career, the information is very scant. And that's true for all the players from his time, especially the early part of his career. Guys like Sol White, Hall of Famer, and Hall of Famer Frank Grant, and lots of other guys, Abe Harrison, Clarence Williams,
Starting point is 01:49:07 and James Seldon, and George Stovey, guys like that from that period who we just don't have a lot of info on. And I would love to have that information to be able to do more with people from that era. But right now we don't. And until we do, kind of got to sit on my hands. All right. Well, we'd encourage everyone who's been interested in this interview to check out the show page where I will link to a lot of other resources that you can check out to learn more about Eric's work. You can also find him on Twitter at his name, Eric Chalek. That is C-H-A-L-E-K. Eric, thanks very much for your work and for joining us today. Thank you both so much for having me.
Starting point is 01:49:45 It's been a delight to talk with you. All right, that will do it for today. Thanks, as always, for listening. You can support Effectively Wild on Patreon by going to patreon.com slash effectivelywild. The following five listeners have already signed up and pledged some monthly or yearly amount to help keep the podcast going and help keep us ad-free and also get themselves access to some perks. Garrett Sutherland, Mark Bailey, Jonathan Goetz, Stephen R. Christensen, and Alex Kobayashi. Thanks to all of you. And if you sign up like those listeners did, you can get access to the Effectively Wild Patreon-only Discord group with about 450 members now talking about baseball all the time.
Starting point is 01:50:25 You can also get access to our exclusive Patreon-only bonus podcasts, a couple of which we have recorded and published already. You can join our Facebook group at facebook.com slash group slash Effectively Wild. You can rate, review, and subscribe to Effectively Wild on iTunes and Spotify and other podcast platforms. Keep your questions and comments for me and Meg coming via email at podcast at fangraphs.com or via the Patreon messaging system. If you are a supporter, you can follow Effectively Wild on Twitter at EWpod.
Starting point is 01:50:53 You can join the Effectively Wild subreddit at r slash effectively wild. Thanks to Dylan Higgins, as always, for his editing and production assistance. We will be back with another Measuring the Unmeasurable episode a little later this week. Talk to you then. When you were at the school Did the factory help you grow Were you the maker or the tool Did the place where you were living Enrich your life and then Did you reach some understanding Of all your fellow men All your fellow men
Starting point is 01:51:34 All your fellow men
