Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 1170: Are the Robots Ready?

Episode Date: February 1, 2018

Ben Lindbergh and Jeff Sullivan banter about Nori Aoki’s return to Japan, Vladimir Guerrero’s swing rate, Franchy Cordero’s power/speed skills, spring training for free agents (again), and Melky... Cabrera, then bring on Baseball Prospectus Director of Technology Harry Pavlidis to explain BP’s latest pitching-themed analysis, featuring robot umpires’ feasibility, desirability, and potential effects on offense; […]

Transcript
Starting point is 00:00:00 Everybody's looking at me like they know me. There is so much they don't see. And all the cameras in Japan couldn't photograph who I am. The way the colors bleed into one till they block out the sun. Hello and welcome to episode 1170 of Effectively Wild, a baseball podcast from FanGraphs presented by our Patreon supporters. I am Ben Lindbergh of The Ringer, joined by Jeff Sullivan of FanGraphs. Hello. So most of this episode will be an interview. Baseball Prospectus is in the midst of a pitching week. That's what they're calling it.
Starting point is 00:00:50 Hashtag pitching week, I believe. So we will be talking to Harry Pavlidis, the director of technology at BP, about robot umps and about classifying pitchers in a new way that BP is doing it and pitch tunneling. Lots of interesting analysis stuff. We'll get to that in a few minutes, but a few things we wanted to banter about before we get there. I guess we could start. I've received many tweets and messages of condolence about Nori Aoki leaving Major League Baseball to go back to NPB. I don't know exactly how I got associated with Nori Aoki, but for some reason I get tweets every time something happens to Nori Aoki. I think it's because I've been
Starting point is 00:01:31 complimentary about him in the past. I've written about him in the past. I've always felt like he was somewhat overlooked or underrated, I guess. And so he's leaving MLB. He's going to the Yakult Swallows and he signed a three-year deal with them for about $9.2 million in total or roughly 1 billion yen, which obviously I would say that I signed a 1 billion yen contract instead of a $9 million contract. That just sounds better. I understand how currencies work, but still. But I think the thing with Aoki is like he is always kind of an average player. And if you look at just like his career WAR in MLB, you know, per 600 plate appearances, it's 1.8, where, you know, average, kind of the rule of thumb, is two. He's just kind of average. And like every year, like look at his weighted runs created plus: 113, 102, 102, 109, 106, 97. Or if you look at his career defensive runs
Starting point is 00:02:28 saved totals, right, for left field: zero exactly. Center field, small sample, negative two. Right field, bigger sample, plus three. Basically like average-ish everywhere in the data that we have. Add up all his defense, it's kind of average. Add up all his offense, it's kind of average. His base running is slightly below average. But on the whole, kind of an average player. And to be fair, he's only had 600 plate appearances in a season once and hasn't been the most durable and maybe has been sort of used sparingly in situations that would boost his stats a little bit, but I always felt like he's pretty good. And he just kept getting a series of one year, fairly low dollar deals. And I always felt like he was getting a raw deal a little bit. So good for him to get a three-year deal in this winter when lots of people aren't getting any kind of deal. He got a three-year one as a 35-year-old and for decent money, about the same sort of money that he's been making here. He's 36, actually, as of earlier
Starting point is 00:03:31 this month. So pretty good deal for Nori Aoki. I wish him well, and I will miss him. And apparently he's popular with moms. So I've heard moms really like Nori Aoki. Moms and me. So, bon voyage to Nori. In his final year in Japan, so Aoki was a very good hitter back when he played with Yakult before and throughout his 20s. In his final year in Japan in 2011, this is a year where the offensive environment
Starting point is 00:03:58 in Japan just cratered. But in any case, this was Aoki's triple slash line. He batted .292, with a .358 OBP and a .360 slugging percentage. Not actually that good of a player in his final year, although again, that was better than average. So .292/.358/.360. In the major leagues, he hit .285/.350/.387. He was a mirror of what he was in his final season in Japan. If anything, he was actually a little bit better. So you can say
Starting point is 00:04:23 that Aoki played six years in the major leagues and he only hit 33 home runs and he wasn't that much of a stolen base threat. And of course, people made fun of his defensive routes sometimes. And he was, yeah, he was an animated player in ways where he wasn't animated, I think, because he wanted to be. It was just his body language, one of his instincts. But he was exactly what he was supposed to be and pretty solid. Never the solution, really, to what you would want in the outfield, but never a problem. No, just sort of a perpetual stopgap. And I always appreciated, too, that he was just a reach-base-on-error machine.
Starting point is 00:04:59 So some stats at least kind of underrated his abilities. Like his on-base percentage, for instance, effectively was higher. I don't have the exact numbers. I remember running them at some point in a BP article, and he improved significantly in his on-base abilities if you look at his ability to reach base on error. Because, I don't know, he hit a lot of ground balls, I guess, and had decent speed. And I don't know why he was really the reach-base-on-error machine, because he's a left-handed hitter. And usually you see more reaches on error for right-handed hitters who are hitting the ball to third. But still, he was really good at that. And so maybe he was partly underrated a bit because of that.
Starting point is 00:05:42 So he will be missed by at least one person. Agreed. Someone who might not be missed, but I've seen just one too many tweets from national writers to ignore it. For some reason Melky Cabrera seems to be one of like the symbols of the dead market. I don't know why this has happened. I was looking up to see who his agent is, and his agent is listed as Dominic Torres, who, I don't know who that is. I don't know if he works for a bigger agency. This is just going off the Baseball Reference information. But Dominic Torres seems to be trying to get the Melky Cabrera information out there, which is odd, because here's the thing. Over the past three years, Melky Cabrera's FanGraphs WAR is 1.6. Over the past
Starting point is 00:06:21 three years his Baseball Reference WAR is, let's take a look. Well, it's 3.9, but still below average. He's 33 years old. He's no Nori Aoki, that's for sure. He's no Nori Aoki. And last year, according to Statcast's catch probability, the only outfielder worse than him was Matt Kemp. And that's a whole different conversation. According to Ultimate Zone Rating last year, Melky Cabrera was 10 runs below average, but according to Defensive Runs Saved, which David Forrest said over
Starting point is 00:06:49 the weekend he likes better than Ultimate Zone Rating, he was negative 20 runs in the outfield. He's not a good defensive... look, I don't mean to pick on Melky Cabrera. There's lots of mediocre players out there who are unsigned. But this is just a weird one to have seen floating around, like, oh, you know, the market is broken because Melky Cabrera doesn't have a job. Well, Melky Cabrera isn't a good player and he's getting older. So I don't know who cares. Every team has a Melky Cabrera in AAA. And the only difference between him and Melky Cabrera is that the player in AAA is younger, less proven, and doesn't have a performance enhancing drug suspension on his record.
Starting point is 00:07:19 That's right. Yeah. So you wrote something about a couple players. I have read one of those articles. The other has not been posted as we speak, but will be as you listen. So you wrote about Vlad, and I like this article because it's one of those articles and they'll get a reputation for having been great at something, whatever. Pitching to the score is an obvious example, swinger, very much borne out by the numbers. And not just that he was a free swinger. There are lots and lots of free swingers.
Starting point is 00:08:10 But that he was a free swinger who was way better at swinging than any other free swingers. He was amazing. He's definitely one of a kind. Nothing too surprising in the article. But you think of it like swing rate is such an obvious statistic to look at. And we can look at it for the past 16 years basically on FanGraphs, no problem. That's how far back the plate discipline leaderboards go. But there is pitch-by-pitch information going back to 1988. It's available on Retrosheet.
Starting point is 00:08:38 And the problem is that it's not easy to combine those numbers. So you can see all the single season swing rate metrics that you want for 1988 or 1993 or whatever, but there's nowhere where all that information is put together so that you can compare careers. So this was basically an article that, if the information were available, you could have seen written in 2006 or 2012. But Sean Dolinar, an employee of FanGraphs and someone who actually knows computers, unlike me, was able to run some research for me. And I just asked him to get me swing rates for all players going back to 1988. So for many players we're looking at whole careers. For other players like,
Starting point is 00:09:16 I don't know, Kirby Puckett, we're looking at partial careers. And all I did was I looked at career swing rates and career weighted runs created plus, wRC+, pick your offensive metric, it doesn't really matter, and confirmed Vladimir Guerrero's swing rate ranked just outside the top 30 and his wRC+ ranked just outside the top 30. That might not sound amazing on its own, but if you look at the actual distribution of wRC+ and swing rate, yeah, Vlad is out there by himself. He swung as often as Jeff Francoeur, but he's basically a dead match for Andrew McCutchen's career wRC+. He's just barely behind like David Ortiz and Chipper Jones and Mike Piazza. Chipper Jones wound up with a wRC+ of 141. Guerrero is at 136. So very free swinger, obviously, but the fact that he was able to swing that much, and it's not just that Guerrero was a free swinger when he was young,
Starting point is 00:10:03 he stayed the same way his entire career even after he established himself as a threat which means pitchers you'd think would presumably throw him fewer and fewer pitches in the strike zone because they're more afraid of him that's the way that this works but he would still keep on swinging that much and the closest comparisons I could find like Freddie Freeman is aggressive and he's a good hitter but he still swings like five percent five percentage points less often than Vladimir Guerrero did his entire career and of course Freeman has only played a few years relative to Vlad's whole career so again nothing that really stands out except to demonstrate that at least over the past 30 years there really hasn't been anyone like Vladimir
Starting point is 00:10:38 Guerrero, not over their entire career. And whether or not you believe that Vladimir Guerrero belongs in the Hall of Fame, I like to give him some bonus points for just being completely himself, having a style that no one else could match. And I love it. So I'm a big Hall guy. It's great. The second graph that you had, or I think it was the second, of just his career year by year: first-pitch swing rate, swing rate, contact rate, the same the entire time, basically. Like he just, he never really, like some guys still get more selective, more disciplined as they age. Sometimes that's a good thing. But with Guerrero, never really changed.
Starting point is 00:11:16 He was just the same guy at the tail end of his career as he was in his prime. Yep. I was comparing his 20s to his 30s. And when Guerrero was was younger he swung at 58 percent of all pitches and when he was in his 30s he swung at 58 percent of all pitches when he was in his 20s he made contact with 79 percent of all pitches in his 30s he actually made that 81 percent and he swung at 47 percent of all first pitches in his 20s but in his his 30s, he swung at 46% of all first pitches. And for anyone who doesn't know, the league average is several percentage points under 30%.
Starting point is 00:11:50 So Guerrero was just up there to swing. And of course, the classic highlight is Guerrero hitting like a bloop single off a ball that bounced and whatever. Dee Gordon has done that. Corey Dickerson has done that like three times. It's a good highlight. But the real highlight of Vladimir Guerrero's career is just hitting the crap out of balls that weren't strikes. That is Vladimir Guerrero. Yeah. Tell me about Franchy Cordero. Okay, okay. So this piece hasn't published. I'm looking forward to it. Franchy Cordero. I've been bored this offseason. I don't know about you. You've got other things to write about. I don't. So I get bored, and you know, when you get bored with baseball, you start just hunting around leaderboards.
Starting point is 00:12:25 And long story short, I was poking around on some Baseball Savant leaderboards. And, of course, that's where you get the Statcast information. And looking at sprint speed, which I trust, it seems to bear out that fast players are fast and slow players, I think, are slow. So I trust sprint speed as it's calculated. And one of the very fastest players in baseball last season was Franchy Cordero. You might remember, because I certainly remember, last year when we were doing our Padres season preview, we had Dennis Lin on to talk about the Padres. And I asked him, what is a Franchy Cordero?
Starting point is 00:12:54 Because I'd never heard the name before. He's very fast. Then I was looking at the exit velocity leaderboards, and pretty close to the top was Franchy Cordero, which I also didn't expect. Franchy Cordero only had 49 batted balls tracked in the major leagues. That's not a lot, but we still learned some information in that 10 of those batted balls, he hit more than 105 miles per hour. And there were only six hitters in all of baseball who had higher rates of batted
Starting point is 00:13:21 balls hit at least 105 miles per hour. And this is like Aaron Judge, Giancarlo Stanton, Miguel Sano, Joey Gallo, Nelson Cruz, etc. There was a name or two in there that I forgot. Powerful hitters, all of them. Franchy Cordero, he topped out at 113 miles per hour in a small sample in the majors. He is extremely fast, and he hits the ball really hard. And he's only 23 years old. He was 22 last season. And he's never been like a top prospect.
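The hard-hit rate Jeff describes is simple arithmetic: 10 balls at 105 mph or harder out of 49 tracked batted balls. As a rough sketch only (the function names are made up for illustration, and the binomial standard error is an added assumption about how one might express the small-sample caveat, not something from the episode):

```python
import math


def hard_hit_rate(hard_hit: int, tracked: int) -> float:
    """Share of tracked batted balls at or above the chosen exit-velocity cutoff."""
    if tracked <= 0:
        raise ValueError("tracked must be positive")
    return hard_hit / tracked


def binomial_standard_error(rate: float, tracked: int) -> float:
    """Rough binomial standard error, a crude way to show how noisy a small sample is."""
    return math.sqrt(rate * (1.0 - rate) / tracked)


# Figures quoted in the episode: 10 of 49 tracked batted balls at 105+ mph.
rate = hard_hit_rate(10, 49)
error = binomial_standard_error(rate, 49)
print(f"rate: {rate:.1%}, rough standard error: {error:.1%}")
```

With only 49 batted balls the uncertainty is large relative to the rate itself, which is the "that's not a lot" caveat in numeric form.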
Starting point is 00:13:44 He's not anywhere in the Padres top 10 by any system I've seen so far. Baseball America left him out of the top 10. MLB.com has him like number 12. John Sickels has him number 11. It's a deep system, don't get me wrong. But this is a guy, there are not many players who are very fast and very strong. It's just not a skill set that many people have. And you could point to him being maybe like a younger Keon Broxton. He's a center fielder? He's a center fielder. Sounds like he's the new Keon Broxton. Yeah, right. He was a shortstop until like 2015, so he's new to center field, but last year in a small sample Statcast says his defense was great.
Starting point is 00:14:21 So you have a fast, good defensive center fielder who presumably has a good arm because he was a shortstop. He hits the ball really hard. Here's the one problem: he strikes out, like, a lot. Yeah, he struck out 44 percent of the time in 99 major league plate appearances, and that's bad. But he was really good in Triple-A. He struck out like a little more than a quarter of the time in Triple-A. He doesn't walk. He is just one of those not-a-whole-lot-of-plate-discipline guys. So he's like a slightly more interesting Nick Williams, I guess. But if you look at what Nick Williams did as a rookie with the Phillies, he actually hit well. So long story short, I don't think that Franchy Cordero is the best player on the Padres right now.
Starting point is 00:15:00 That's probably, I don't know, Wil Myers or Dinelson Lamet or Manny Margot. You choose. They don't have a lot of great players. But Franchy Cordero, to me, without question, is the most exciting Padre. And that's the headline of the article. He's the most exciting player on the Padres because of everyone who is on the Padres depth chart right now that has Statcast information, Franchy Cordero has the fastest sprint speed and he has the highest exit velocity. Even if they sign Eric Hosmer, he still has the highest exit velocity.
Starting point is 00:15:26 And to have that blend of skills, I don't quite understand why he's being relatively ignored. Of course, he's a high-risk player given the swings and the misses, and he doesn't have a polished approach. But he hits the ball as hard as Yoan Moncada, and he runs as fast as Yoan Moncada. So keep your eye on Franchy Cordero. He's a Padres player who isn't boring. All right. Well, there's your unofficial or official Jeff Sullivan sleeper for 2018, or one of them, at least.
Starting point is 00:15:54 Franchi Cordero. How was your response? It was what? I think it was Ken Rosenthal, maybe it was Jeff Passan who tweeted out Wednesday morning that the Players Union at this point does not think it will have to put together a free agent players camp in spring training are you are you happy about that or disappointed by that i think i'm happy about that i'm happy that it happened the one time because i got to write about it and it was a fun story but everyone i talked to who was there said i hope this doesn't
Starting point is 00:16:20 have to happen again i mean you don't want it to happen because it is a symptom of problems that could lead to no baseball at some point down the road. And no one wants that to happen. No one wants people to be unemployed if they could be employed instead. So that would be probably for the best. One thing I didn't mention on the last episode that I wrote about in the article was the original plan for the 1994 strike, or really the 1995 strike if it had continued at greater length into that year, was that there was going to be a barnstorming tour. MLB Players Association had already put together plans. They were going to put together four teams of players and staff, and they would just barnstorm around the country for six to eight weeks or so, playing games only on weekends. And it would have been like not a singular thing, but certainly there's a long history of barnstorming in baseball history.
Starting point is 00:17:14 But generally from the days when players had to keep playing in the offseason to make extra money, this would have been unusual. I guess it would have been kind of like the exhibition tours we sometimes see in other countries, except during the regular season. And it would have just been time away for players to stay in shape, to remind people that baseball existed, to maybe have some positive PR for the players and against the owners, just by reminding people what they were missing, essentially. So that's kind of an interesting footnote I came across while working on that article that I had never heard of before. Barnstorming tour during the strike. Would there have been wages? Yeah, actually. I think I have a document in the article that was provided to me by Jackie Moore,
Starting point is 00:18:02 who was the manager of that Homestead spring training team. And one of the things he sent me, like when they sent him the offer letter to manage that team, they didn't know at that point whether there would have to be a barnstorming tour or not. So there is some detail in this letter about the terms of what that would have been. And let's see, it says compensation. I don't know. This may have just been for Jackie Moore, I guess, $1,000 per weekend, up to $8,000 for complete tour plus expenses.
Starting point is 00:18:35 And the players were not paid for that spring training homestead camp, although their expenses and travel were covered. So I think they probably would have been paid because there were sponsors lined up for this to pay for the expenses and there were going to be proceeds going to charity too. So it would have been an interesting story, but fortunately that didn't have to happen either. So yeah, let's keep the Homestead Spring Training a unique event, I think. I think the argument at present is that players now have their own private facilities and agents have their own private facilities.
Starting point is 00:19:08 So players are staying in shape, but they're not playing with one another. So something could be lost, but I guess you never know. If Melky Cabrera can't get a job, then he's going to have to do something in a month. That's right. All right. Well, we will take a very quick break and we'll be back with Harry Pavlidis. I cracked the cold and the wall
Starting point is 00:19:46 So we are joined now by Baseball Prospectus Director of Technology and co-host of BP's Stolen Signs podcast and person who has classified every major league pitch thrown in the PITCHf/x
Starting point is 00:20:04 slash Statcast era and made those classifications available to people like us at Baseball Prospectus and FanGraphs and Brooks Baseball. It is Harry Pavlidis. Hello, Harry. Hello, Ben. Hello, Jeff. Can I get dibs on the profile when you finally retire, when you hang them up from classifying every pitch? You'll have classified like a half century's worth of baseball history at that point. I think I'm up to like 20 million pitches probably at this point.
Starting point is 00:20:32 Because there's all the pro baseball. What you see in Major League Baseball is just the tip of the iceberg. So it's a ridiculous amount of data out there that we've handled going on 10 years. Yeah, that's right. More than 10 years. A picture of 80-something-year-old Harry Pavlidis still crouched over the keyboard classifying those pitches. And then I'll have a... That's the plan.
Starting point is 00:20:55 That's absolutely... Your ambition in life is to just age and continue doing what you're doing. Hang up on the shuffleboard courts. Yeah, I'll just have a mawkish line about the first pitch thrown the following season without Harry behind the keyboard. What was it? Won't be a dry eye in the house. No, we won't know.
Starting point is 00:21:17 We won't know. Yeah, well, by that point, they'll have developed an AI Harry that can do what you do, which is relevant to what we are discussing today. Because it is pitching week at Baseball Prospectus. There are three heavy-duty research pieces up: one on robot umpires, one on classifying pitchers via various attributes into command pitcher, power pitcher, and so on, and then one on pitch tunneling and some upgrades that BP has made to that metric. So we'll get to all of these, I guess, but we can start at the beginning,
Starting point is 00:21:52 start with the first piece that went up, which is the robot umps piece. And I wrote about this at Grantland, I think about four years ago, and you helped me with that article. As I recall, you were quoted in that article. And a lot has changed since then. In other ways, not a lot has changed. That was the PITCHf/x era, and, yeah, we're in TrackMan/Statcast now, but a lot of the caveats and
Starting point is 00:22:19 the hang-ups and the reasons why maybe robot umps aren't as easy as everyone thinks they would be are sort of the same today as they were then. Yeah, I mean, I think the first thing is, you know, we're not anti-robot umpire. We're not Luddites. You know, there's this notion that we were against the use of technology in baseball tracking, which, okay, that would be very ironic. I would be experiencing so much cognitive dissonance. It would be unimaginable. You became the director of technology just to keep technology down. Just exactly.
Starting point is 00:22:51 My whole goal with working and massaging and understanding pitch tracking data is to make sure we don't use it for anything. That's what we all want, right? Is to retire or not retire with having nothing accomplished. So basically the problem is there's unrealistic assumptions about what the quality of the technology is. And this is something that the commissioner of baseball keeps trying to tell people, but he's right. It's like, it's not fast enough. It's not quite accurate enough. And it was a little faster and a little
Starting point is 00:23:20 more accurate when they still had the pitch FX optical camera system in there. And even then you still had issues. I mean, there's just, you know, it's hard to really explain that the robot is inaccurate when we're comparing the robot to the human, because what we keep like, you know, if we plotted the pitches and call them as they were plotted, then that's the umpire was more or less accurate. But we're assuming that the tracking system had it accurately. And so there's the problem of calibration, the ball could be a little bit left or a little bit up, a little bit inaccurate in a way. And that should be relatively easy to fix, but there's just also the pitch to pitch error that happens. And everyone seems to assume that that would be less prevalent of a problem with technology than with humans, despite the fact that that has not been demonstrated.
Starting point is 00:24:06 Yeah. And that's the thing. It's like, proof through rigorous assertion is not accepted. That's bottom line. It's like, so people who were, like, yelling at us, literally, like, in capital letters in the comments or through television screens that we're wrong are not actually demonstrating or answering the question. So it's like, you have to have a system that's fast.
Starting point is 00:24:26 You have to have a system that's 100% reliable. So you're going to need redundancy. And you're also going to have to have like on-site calibration and fixing. So the way you do that is like you fire a projectile at a foam board and then see where the ball hit the board at the front of the plate and then compare what the computers and the cameras and the radars are saying. And that's your calibration. Because there's one very, very, very, very important thing here.
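The foam-board check Harry describes amounts to comparing where test projectiles physically struck the board with where the tracking system said they crossed. A minimal sketch of that comparison, assuming hypothetical `(x, z)` positions in feet and made-up function names (this is not any vendor's actual calibration routine):

```python
from statistics import mean


def calibration_offset(measured, reported):
    """Average (dx, dz) by which the system's reported crossing differs from
    the physically measured mark on the board, over paired test shots."""
    if not measured or len(measured) != len(reported):
        raise ValueError("need the same non-zero number of measured and reported points")
    dx = mean(r[0] - m[0] for m, r in zip(measured, reported))
    dz = mean(r[1] - m[1] for m, r in zip(measured, reported))
    return dx, dz


def apply_correction(point, offset):
    """Subtract the estimated systematic offset from a reported crossing point."""
    return point[0] - offset[0], point[1] - offset[1]


# Hypothetical test shots: marks on the board vs. what the system reported.
board_marks = [(0.0, 2.5), (0.5, 3.0), (-0.5, 2.0)]
system_says = [(0.1, 2.45), (0.6, 2.95), (-0.4, 1.95)]

offset = calibration_offset(board_marks, system_says)
corrected = apply_correction((0.85, 1.60), offset)
```

A constant offset like this only addresses the "a little bit left or a little bit up" kind of bias; it does nothing about the pitch-to-pitch error mentioned earlier in the conversation.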
Starting point is 00:24:47 The umpire is actually watching the ball cross and making their decision based on where they think they saw the ball cross. The so-called robots we have are estimating the trajectory of the pitch via a formula. And that is not the same as an actual observed measure. So, first of all, people are advocating for a technology kit that is not an actual observed crossing. It's an estimated crossing.
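To make "estimated crossing" concrete: PITCHf/x-style data describes each pitch with a constant-acceleration fit (an initial position, velocity, and acceleration), and the plate location is computed from that fit, not observed directly. The sketch below assumes that model and a coordinate convention where `y` is the distance in feet from the back point of home plate toward the mound; the function name and sample numbers are illustrative assumptions, not any system's real code.

```python
import math

# Home plate is 17 inches deep, so its front edge sits 17/12 ft from the back point.
FRONT_OF_PLATE_FT = 17.0 / 12.0


def estimated_crossing(x0, y0, z0, vx, vy, vz, ax, ay, az, y_plate=FRONT_OF_PLATE_FT):
    """Extrapolate a constant-acceleration fit to the front of the plate.

    Returns (t, x, z): flight time in seconds and the estimated horizontal and
    vertical location, in feet, where the fitted path crosses y = y_plate.
    """
    # Solve y0 + vy*t + 0.5*ay*t^2 = y_plate for the earliest positive t.
    a, b, c = 0.5 * ay, vy, y0 - y_plate
    if abs(a) < 1e-12:
        if b == 0:
            raise ValueError("ball is not moving toward the plate")
        roots = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0:
            raise ValueError("fitted path never reaches the plate")
        sq = math.sqrt(disc)
        roots = [(-b - sq) / (2.0 * a), (-b + sq) / (2.0 * a)]
    positive = [r for r in roots if r > 0]
    if not positive:
        raise ValueError("fitted path never reaches the plate")
    t = min(positive)
    x = x0 + vx * t + 0.5 * ax * t * t
    z = z0 + vz * t + 0.5 * az * t * t
    return t, x, z


# Made-up fit: released 50 ft out, about 130 ft/s toward the plate, slowing from drag.
t, x, z = estimated_crossing(-2.0, 50.0, 6.0, 5.0, -130.0, -5.0, -8.0, 25.0, -20.0)
```

Any error in the fitted parameters carries straight through to `x` and `z`, which is why an estimated crossing is not equivalent to a measurement taken at the plate.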
Starting point is 00:25:15 like drop lasers or whatever, and figure out how not to get that triggered by a bat or a glove or the remains of Randy Johnson's pigeon. You just, like, even if you did that then there's still all sorts of other complications that could you know that we talked about in the article but the fundamental thing is like there's this the technology literally is not doing what people think it is doing and it's literally not designed to do what they want it to do and therefore we're not that close and that's what we're saying and what comes back is you guys are naive and you
Starting point is 00:25:43 don't know anything about lasers. Before we get to everything else, and there's a lot in here to get to, the one thing that probably doesn't get nearly enough coverage when people talk about this strike zone, the automated strike zone idea, is that delay. You were talking about the time that it takes to register a call. Of course, right now, you have a pitch and a call is delivered within half of a second. Sometimes with very obnoxious umpires, it might take a full second, but it's uncommon. What have you observed in the evolution of any sort of delay now? What is the fastest amount of time that you can go from having a pitch recorded to having a judgment rendered? We're talking about tens of milliseconds for broadcast purposes already. So right now, these systems are designed for broadcast. So when it was Sportvision, who was fundamentally a broadcast enhancement company, their product met this tight stopping point where, okay, you can't process any more data. You have to now
Starting point is 00:26:36 beam that to the truck so it can get on the TV screen for the K-Zone. And you also have to have this time because that's just measuring the pitch, estimating where it went. But also there's this barrel distortion stuff at the center of the camera that you're mapping it back to for TV displays. All this, all these algorithmic things have to happen. So how long does it take it right now? It's based on getting it to TV. So when they went with TrackMan, they discovered that just by nature of things, the processing of the radar signal takes longer. So they were able to get less processing done.
Starting point is 00:27:02 So what you were seeing on TV in the K zone at the beginning of the season was like really bad because it was partial. So you were missing, they're missing the zones by a lot. And you know, they fixed that. They did some work and they fixed it. So that's, you know, so the answer to the question is it's probably too fast because it's made to be starting the transmission to the broadcast before all the data has been processed and estimated cleanly in some, that's the risk. So it's like the faster you go, the way this is designed, the better it is for TV, but not for baseball, like strike calling. So you actually probably want a system that is faster than the human umpires. So when you say a half a second to a second is the call, I'm guessing with the automated systems, they're not going to want that because that's
Starting point is 00:27:43 going to look suspicious or, okay, there's a delay. Oh, it's a ball. It's going to be expected to be instantaneous. I think it's fast enough now, assuming all these broadcast timings are relatable to what would be something you get down to the field for a visualization on the field. I think they have it now, but I don't know if it's reliably that fast. I mean, sometimes the system just forks and you don't get a few pitches. And sometimes it's a software problem. And, you know, so sometimes a network latency problem. So you start to get into this, like it should work fast enough, but I'm not sure it's going to work fast enough all the time. Like you really have to field test the heck out of that. So I know with the radar processing, it's really close to the edge. But again, remember, we're not waiting necessarily for the full track of the ball in all
Starting point is 00:28:29 cases. Now I think they are, but for all I know, maybe they still have it truncated. I'm not 100% sure. So I may be saying it's fast enough now, but it may still not be right. So to kind of go back to the thing I said before, that this isn't observed: to wait for an observed system, you're losing like three tenths of a second before you can start processing, you know. So that may actually slow things down and push it close to the limit that Jeff mentioned, where it starts to seem familiar at least. But again, I think it's going to have to be faster, like if it's automated. Uh-huh. And then there's also the issue, as you mentioned in the article, about the top and bottom of the zone, which is difficult to set programmatically. Oh, anyway, the rulebook is super vague.
Starting point is 00:29:09 Right. I mean, it's hard for umpires to do currently even, just because the zone is moving on every pitch, essentially. Yeah, well, it's literally moving. And I think, I don't know, you'd have to really query the umpires hard on this, and they might not even be able to consciously explain how they do it. But where are they setting the zone?
Starting point is 00:29:25 Because there's no point in time that says this is when the zone occurs. It's like when the hitter is ready to swing. What is it? You know, something like that. There's very vague language. Yeah. So it can change pitch to pitch. And conceivably, if the player is like ducking or changing his stance, the umpire can choose to, you know, change when he picks his zone.
Starting point is 00:29:43 And right now, like, they're graded against what humans mark. Or, as David Kagan noted at The Hardball Times last week, if they're going to go to like an automated bottom zone, what they're graded on is going to be derived from what the umpires call, and it ends up being totally similar, I think, to what the humans, the stringers, do historically. So I don't know if they have changed what they're publishing and if it's actually their derived data or something. We tried to do a derived version, but it's tricky because the strike zone is curved. It's not square. And to me, that's like the number one thing. I mean, the technology stuff is kind of, we can argue, I probably bored half. You're going to have to
Starting point is 00:30:17 edit out half of me. This is boring stuff. It's like, oh, well, the ball tracks this way and the laser does not exist. It's not, it's a camera, fellas. It's just a camera. But the thing is the game will change. So even if we nail down all this, like is it fast enough? Does it feel fast enough? Does it not seem suspicious? Is it secure, not being hacked? Is it calibrated?
Starting point is 00:30:36 So even if we nail down all the technical aspects of this. Yeah. Will it not miss a pitch occasionally? Yeah. Let's just pretend that it does.
Starting point is 00:30:49 We've created this redundant system that is self-calibrating between pitches. And just pretend that we're a few years down the road and the technology is rocking. And we have a reasonable way of setting the top and the bottom of the strike zone that everybody feels comfortable with. This is going to change baseball a lot because the strike zone is really strange. There's two kind of rules of thumb in baseball. And let's just lay out what those are first and kind of talk about why they are, maybe. One is left-handed hitters have a strike zone that is shifted such that balls that are on the plate inside are not called strikes, but balls that are off the edge of the plate outside are. It's like the whole strike zone is just moved over. Less so than it used to be, right?
Starting point is 00:31:28 This is true. It has become less extreme, but it is absolutely still there, no doubt. The other rule of thumb is that the umpire is going to be, we'll call them sympathetic to whomever is ahead or behind in the count. So if it's a two strike count, they don't give you the corner. They call it a ball. When it's a three-oh count, yeah, sure. That's a strike, whatever. So what happens is the strike zone kind of swells and expands through the count. And it's not, and it's, so here's maybe the third rule of thumb. It's round. It's elliptical. It's kind of gushy. It's not a rectangle. So pitches above the zone over the middle of the plate might be called a strike. Pitches, you know, actually that are in the shoulder
Starting point is 00:32:09 line height of the zone and still over the plate but on the outside won't get called, because that's a corner. You don't get corner calls. And those corners, where they exist or if they exist, change based on the count. So you have the situation where the strike zone is going to move for lefties, and the strike zone won't change count by count. So you're going to see more guys get punched out on borderline pitches on two strikes. And you're going to have this, you know, immediate change to how the game feels. It's just going to be a little bit different where, and I think that is going to be the thing that I'm most curious about. Like if you solve all the technical problems and eventually, sure, you probably will. I don't know how long it's going to take, a few years. But even if you do, what's going to happen to the play of the game? What's that going to be like? Why has the strike zone
Starting point is 00:32:53 developed that way? Why have umpires been that way? Why do they call strike zones that change by the count? How important is that in, like, baseball's DNA? We don't know. It could be nothing. It could be, you know, this is fine. They'll get used to it after one spring training. But at least that's one thing we have to go out and discover is what's the effect going to be on how the game feels and how balls and strikes are called and how people react when they see things and immediately how the players, you know, bang off that change and how they immediately realize, oh, wow, the zone is different. And I have to change my at-bats, you know, change my approach.
Starting point is 00:33:28 It may hurt some pitchers in certain ways and help others in other ways. So the guy is kind of wild and always missing his target, but he's actually still throwing a strike, but not getting the call because the catcher's like diving all over the place. Like the guy's hard to frame. Well, suddenly that's not as much of a problem. Yeah. You know, so. Right. Yeah. I think working on my article at Grantland and reading your article at BP,
Starting point is 00:33:58 even though I think we cataloged all the reasons why robot umps maybe couldn't exist or are more difficult than people think they are, I don't know that that necessarily would have swayed me on my position. I mean, I don't know. I tend to be sympathetic to that argument that, well, we know that humans are getting plenty of pitches wrong, and we may not know how many the robot umps would get wrong, but it seems reasonable at least to think that, I mean, if you were starting baseball today, right, that's an argument that a lot of people make, like, we probably wouldn't just have a guy back there calling pitches. I mean, I think that's... Oh, that's totally nonsense. Really? Well, totally. Show me the budget to do this, you know,
Starting point is 00:34:36 in your Babe Ruth league. Well, sure. Yeah, okay. That's the thing here, man. It's like until you have this kit working down below, there's another problem here. So suddenly there's no home plate umpiring in baseball. Nobody wants to be an umpire anymore. So you don't have all these amateurs striving for it because they just stand out there behind second base twiddling their thumbs. But then again, also, are you really going to have this technology? Is playing the game at a moderate competitive level going to become dependent upon some technical innovation and become a requirement for play? That sounds like a bad idea.
Starting point is 00:35:09 So, I mean, it's like you're going to have a human umpire on the bases and all over the place, but the only place you're not going to have it is home plate. I mean, what do they do? You've got to have someone there. Yeah. I mean, I guess to me, you know, we have this system now where human umpires are essentially evaluated by robot umps in a sense, like not in a binding way. But I mean, it seems almost backward, like it has helped, right? And so, I mean, if MLB is trusting the robot umps essentially to tell the umpires whether they're doing a good job or not, which is, you know, kind of the case. They've made adjustments to the zone based on those automated readings they give umpires you know
Starting point is 00:35:50 printouts after the game, right? Yeah. So, I mean, isn't that tacitly sort of saying we trust the system more than we trust the humans? Or that, no? I mean, it's a little different because I guess it's not in real time. So maybe that makes all the difference. But the fact that it's not in real time should tell you something, but it might not. Maybe it's the umpire saying go away. It's a union, you know, negotiated term. So strike zones have gotten more standard over time between umpires. So from 2000 or so, when they switched from having umpires by league to having one pool of major league umpires where the crews went through all the leagues instead of the way it used to be,
Starting point is 00:36:27 zones got better within like a year. When they started doing the QuesTec evaluations in 2000, they started getting more consistent. And then in 2009, they switched to PITCHf/x, a more precise system. They got better still. So I think with the fact that the data quality was a bit of a challenge this year. So just talking about this past season, a lot of these new radar installations, there was interference, there was just some challenges that they've been working and they're spending a lot of time.
Starting point is 00:36:54 I'm hearing from people at the teams talking about what's going on on their field in the off season. The systems are going to be better in 2018. But in 2017, I think what was happening was they were probably getting less utility out of their grades. And they knew that because the measured accuracy, as far as they've measured it and estimated it, is not the same as it used to be from what I understand and from what I can see in the data myself. But obviously, I'm not doing the field testing. But from the field test that I have heard about is they're like, yeah, we need to make this better. And they are.
Starting point is 00:37:23 So they're not at all. They correctly look at all these data things very skeptically. And the way they grade the umpires, you know, and this information is from when they were still with PITCHf/x, right when they started and actually were making the transition, they would discard from their grades the pitches that are the ones that I think bother the fans the most. So like if the pitcher missed the target, the catcher stabs crazily across the zone, they don't want to like, they're not going to ding you for that, for missing that. You know, if the hitter did something, squared around to bunt or checked or whatever, and that got in the way, they might not ding you for that. So there was like this consideration and they kind of delineated at a conference, like, you know, the representatives
Starting point is 00:38:00 kind of explained to us what all the things were. And I was like, ah, no, those are all the things that you want to use the technology to show the umpires what they're training up. Like it just shows you that you need to train them on how to handle that situation where the catcher sets up inside and goes reaching across the plate. The ball is dead middle, but it's called a ball. You want to teach them how to, and come up with training techniques so that their brain handles that situation better. And if you don't make the incentives attached to that, no one's going to do that. So no, like they're not quite, you know, like saying this is the better system. They're saying they're using it almost to just grade themselves and not to develop. If that makes sense, there's like more to it. It's like, it's like, okay, you gotta have your zone more
Starting point is 00:38:39 consistent, but we haven't seen the stopping of the zone swelling and growing. If they really took the robot, quote, stuff that they're using for the grading as seriously as you would hope, we would have seen this zone growth go away in the last 18 years. Like, you wouldn't have two zones, and the left-handed hitter zone would have been not just slowly getting back over, but, like, fixed. Like, no, guys, you have to stop doing this. Because apparently they're allowed to miss those calls. Otherwise they wouldn't be, because it's a clear bias in shifting the zone against left-handed hitters. And they're basically saying the way the
Starting point is 00:39:15 humans call the game is right. We just want to see how right they are along their parameters. In doing the research for your article, part of the process was that you compared sort of a calibrated theoretical strike zone to the actual strike zone that was called. I'm just going to read two sentences that you wrote out loud to you, and you will see why. Quote: "We looked at every major league game in 2017 to see how many calls would change from one thing to another based on our model. The results ranged from just four changed calls to 50." What was the 50-call game? I need to know about the 50-call game.
Starting point is 00:39:49 I won't tell you. Sorry. But there were a few. How many innings were in the 50-call game? If the answer is nine, you need to tell. It was a regulation game, and that umpire was having a very, very, very bad day. And I would look at that as an anomaly for that person. But I think
Starting point is 00:40:07 it was pretty clear that they were just not getting their calls right that day. Pretty much any chance there was to have a disagreement, there was. And at a certain point, when those disagreements are in every direction... There are times when this is the data being too noisy and the humans are getting it right. But when it's up to 50 bad calls, even if it's just 30 missed calls, that's a lot. But in every game there were a few, like literally every game. What's funny is that when you look at it on average, though, what we kind of expected was, like, you know, each team will probably see three to five calls change. So maybe, let's say, they'll get three strikes added and two balls added, each team. So maybe 10 calls a whole game.
Starting point is 00:40:51 It's like not that much will change. But where it changes is kind of interesting. Not too many guys take 0-2 pitches, okay, for obvious reasons. You know, you're battling things off, trying to battle the ball out of the way. But what happens on 0-2 pitches right now, as I said earlier, the zone shrinks so much that if you went to the robot, there would be a 30% increase from baseline in called third strikes on pitches that are taken. Now, it wouldn't happen very much because not many guys take a borderline two-strike pitch. But when you do, you're not going to get the generous call. So there's like this, like, whoa, oh my gosh. That would change.
Starting point is 00:41:26 And that's going to happen probably once a game, maybe. But the 3-0 automatic won't happen. You'll notice it. There'll definitely be situations where you would notice the difference. And I think that would correspond to watching an umpire having a bad day. It's like, that call looked weird. So it's really obvious when there's a lot of those weird calls, but I think you're going to have situations where like, oh man, that was a strike on the last pitch. He literally called that a strike just a minute ago. Literally the last pitch, same spot, now it's a ball. Yeah, because the zone changed. Those little instances will be what... That's the type of thing that's a difference, but it's rarer to have that day where the guy is just missing
Starting point is 00:42:04 calls left and right. But umpires have bad days. It's just like anybody else. In the course of a season, you're going to have a handful of games where the umpire just kicked it badly. And I think we checked that game and there had been four ejections by the time. It's one of those things where it's like they were letting him know and he was letting them know that he knew that they were letting him know. So, I mean, I don't think this is going to be – you can probably just go figure it out on your own. And I'll blink twice when you do, right, Ben?
Starting point is 00:42:35 Yes. You've used that approach with me before. Many years. Blink twice. I have an article to write. So the last question I have on this, even though I am generally a believer in technology and think that we will get there if we're not there yet to the point where it would be feasible and better to do this, I'm not really a robot advocate. I've kind of changed my mind on that over the years for some of the reasons that you
Starting point is 00:43:02 were mentioning in an earlier answer about how it would affect the game. You didn't really even get into framing and, you know, pitchers expanding the zone and that element of strategy, which, you know, I really enjoy. And I'm probably biased because it's fun to write about and analyze that stuff. I don't know if it's as fun for a fan, but. And this is the thing. I think that stuff's really important for baseball. I think what's fundamental about baseball is the batter-pitcher competition. And that is the fundamental thing to the game. And I think mucking about with that is always very risky. And we've done it before. The game has done it before. And don't get me wrong, this wouldn't be the first time that the mound has changed. It used to be underhand pitching. I mean, this is... I don't think it
Starting point is 00:43:40 will destroy baseball, but there's no question that it would have a ripple and there'd be things like you mentioned, like some strategic stuff that you find cool, entertaining. No, that'd be gone. But maybe something new would emerge. I don't know. I don't know. I feel like it takes more away. One question we get often, I agree, but one question we get often is what the net effect of it would be.
Starting point is 00:44:00 And it's really hard to forecast, but I'm sure you've thought about it. Would it help pitchers more or hitters more? Would we see a decline in offense or an increase in offense, if either, based on your best estimate of all the many ripple effects that could come from this? I think overall you're going to lose strikes. Uh-huh. But you're going to, hold on a second, I think I have the pack. You're going to have more first-pitch strikes. Okay. So I think that you're going to, so that's like the fundamental thing. It's like right now, the 0-0 count is a little hitter friendly, relatively speaking. So I think that's like the word, that's the only thing that I, because anything after that, we don't know.
Starting point is 00:44:37 But the first place where the rubber meets the road is on 0-0, and there should be a small percentage of pitches that are more likely to be called strikes with a robot than they are with a human. And I think that will make hitting more aggressive. Whether or not that increases offense, I don't know. You tell me how your ball, you know, balancing research is going. I mean, this is, it depends. Like, will there be an increased incentive to put balls in play like there is today with the ball being livelier? Then, you know, if you then have it, you increase that, like, go ahead and swing thing with, you know, a bigger zone, is that going to increase offense? I think it would in this context. I think though, what's harder to figure out is,
Starting point is 00:45:15 I think this is something I mentioned earlier, that really hard throwers or wild throwers are generally harder to frame, right? So they lose some strikes. Those guys suddenly, you know, so suddenly stuff will become more valuable. Pure stuff will have just a little more of an advantage. So you might actually increase the amount of power pitching in the game. Or alternatively, you could say, finally, breaking balls are going to get called. So I can do all sorts of backdoor, top shelf,
Starting point is 00:45:40 you know, corner nipping, breaking pitches that are junk. This is the thing, man. I don't know. I think initially what's going to happen is you're going to have batters turn more aggressive. And that's about all I can guess. I think there's going to be a change, and every individual player is going to have to be looked at by their front office, to say, how is this going to impact his pitching? Some guys less, some guys more. The real command artists will probably figure it out. Even if we're thinking, well, they lose that advantage, they're no longer going to benefit from framing and their own frameability. But the command pitchers will probably figure out some other thing,
Starting point is 00:46:11 like maybe one of the backdoor, frontdoor stuff I was mentioning earlier. But you're also, it's just, you know, this could fork off into three different ways, but it's going to change things. That's all we, you know, that's basically it. I think initially, slightly more offense, all else being equal. But overall, I don't know.
Starting point is 00:46:35 It may eventually play into the pitcher's hands because, in effect, you're giving them more work, more places to punch guys out when they're ahead in the count relative to the baseline. We had an email come in last week that was asking about this and the effect that it could have on catchers, because of course if you have the automated strike zone at the major league level but you don't have the budget to do that at other levels, then you're going to have catchers who are prioritizing, you know, catching as they're growing up, developing, even getting close to the major league level, but all of a sudden it would no longer be that important in the bigs. And so, what? Well, potentially... Go ahead, you take it from here. The running game would go funk because catchers would just be, like, setting up to throw, not receive. They're not worried about jumping in front of the umpire and losing a call and things like that. And you won't have to worry about guys with crappy framing skills who have a cannon not being safe to play.
Starting point is 00:47:15 So you basically bring back Ryan Doumit. It's like, okay, you can suddenly put these guys with butcher hands behind the plate. So you might be able to suddenly start having more hard hitting catchers and hard throwing catchers. So I think that's how the profile of the catcher would change. And it's even possible that their game calling, swing reading and things like that would be slightly less important because there's no, hey, I know this umpire, you know, I've caught 100 games in front of him. There's no more of that. So it changes the role. I think they become a less nuanced position. But I think this would have an impact on running.
Starting point is 00:47:49 And bunting. It would kill bunting. Catchers would be so much quicker able to bunt. This would be terrible. This would be end of bunting. Oh, no. Oh, no. Oh, no.
Starting point is 00:47:56 Oh, I'd miss some types of bunting. So robot umps now, right? Yeah. Kill the bunt. So let's transition to the second of these pieces. I'm sure we won't spend as much time on the next two as we did on the first one. But still, we want to recognize this research here that you did with Kate Morrison and Jeff Long. And you essentially classified all pitchers on a three-category scale, power, command, stamina, ratings from zero to 100. I think they
Starting point is 00:48:29 largely match up with what you would think, but there's some surprises. Yeah, there's enough like, yeah. I guess the difference here, people are probably familiar with, like, Baseball-Reference's power/finesse splits. This is different because it's using process essentially rather than results. It's not saying this guy has a high strikeout rate. It's saying this guy throws a fast fastball and he throws a lot of fastballs and he throws hard breaking balls. So it's classifying them based on that. So I mean, it's fun to just browse and just look. And if it's just kind of like a Bill James style toy, that would be fine and
Starting point is 00:49:05 wonderful. But do you see a lot of analytical applications for this too? Are there things you're looking forward to doing with this? Always. Yeah. I mean, there's multiple. Yeah. One is absolutely that there's definitely a Bill James toy type of thing to this, like no question. But what's different, and I don't think that is something that I'd be familiar with in terms of Bill's work, is that it's process driven, not results. So we don't look at your walk rate and your K rate and your hit batter rate or anything like that to determine if you're a command pitcher. We don't look at how many strikeouts you had to determine if you're a power pitcher. We look at what miles per hour you're actually throwing. So how hard is your fastball relative to the rest of the population? How often do you throw your fastball? And if you're a hard cutter guy, that's your
Starting point is 00:49:48 fastball too. So how often do you throw that? And then as you mentioned, is your breaking pitch a soft or hard breaking pitch? Where does that fall on the spectrum? And that's what a power, that's what the power profile is. It's those things. And it's, so it's mostly how much, it's mostly about your fastball, but it's a bit about your breaking pitches and how you use them. And it's theoretically sound, not complete, but sound in such a way that we can use it to kind of just create a set of benchmarks of saying this is over the course of the 10 years of data we have, this is the range of powerness. So like Chapman's most fastball heavy throwing smoke season is our 100, sets the 100 mark. And R.A. Dickey is like, it's literally a zero. He may actually have a negative score, but we scale everybody to zero. There's definitely some manipulation of the
Starting point is 00:50:35 distributions to get them as equivalent as possible. So that's something we want to fix in another iteration is make sure they're truly the same shape each year or representing the proper shape that they should be each year. So as opposed to looking across the whole population. But anyway, so you have this, you know, set of parameters that say this guy is a power pitcher. And they pretty much jibe with what you would think. But it doesn't necessarily jibe with when you see the guy's name. So the top power pitcher from starters was Chad Kuhl.
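A minimal sketch of the scaling described here, not BP's actual formula: the three process inputs (fastball velocity relative to the population, fastball usage, breaking-ball hardness) come from the conversation, but the weights, the function names `raw_power` and `scale_power`, and the simple ratio-to-the-top-season scaling are all assumptions made for illustration. It only shows the idea of pinning the most extreme season at 100 and flooring negative raw scores at zero.

```python
def raw_power(fb_velo_pct, fb_usage, brk_velo_pct, weights=(0.6, 0.25, 0.15)):
    """Combine process inputs (each on a 0-1 scale) into a raw power score.

    The weights are hypothetical; the transcript only says the score is
    mostly about the fastball and a bit about the breaking pitches.
    """
    w_velo, w_usage, w_brk = weights
    return w_velo * fb_velo_pct + w_usage * fb_usage + w_brk * brk_velo_pct


def scale_power(raw, top_raw):
    """Scale a raw score so the top season is 100 and nothing drops below 0."""
    if top_raw <= 0:
        raise ValueError("top_raw must be positive")
    return max(0.0, min(100.0, 100.0 * raw / top_raw))


# Made-up raw scores: the top season defines 100, a negative raw score is
# floored at zero rather than reported as negative.
top = 2.0
print(scale_power(2.0, top))   # the benchmark season
print(scale_power(1.0, top))   # halfway to the benchmark
print(scale_power(-0.5, top))  # knuckleballer-style outlier
```

A real version would also need the per-year distribution adjustments mentioned in the conversation, which this sketch leaves out.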
Starting point is 00:51:03 And that's probably not the name people would have picked. Yeah. I saw a comment. It was a comment. Yes. Who is Chad Kuhl? Yeah. Yeah.
Starting point is 00:51:11 It's like, you know, and this is a good question. He is a sinkerballer for the Pirates who throws like 95 to 99 with a two-seamer. He throws that pitch 60% of the time. His changeup averages 89 and his slider averages 89. And he does not have a third speed. He does not have a slow breaking pitch. It's a two-speed pitcher. So yeah, he's a power sinker pitcher. This doesn't mean it's good. Right. And you did find that relievers as a whole, like if you sort by power, all the top guys are relievers. Of course it is because they're throwing harder,
Starting point is 00:51:43 shorter. We didn't want to position adjust it. We just went, this is what they've laid out on the field. And if you moved him into the rotation, you'd probably start losing points because he's going to start throwing more breaking stuff. But this is just how he threw. And that should make sense because you are going to see more pure power profile guys
Starting point is 00:52:00 coming out of the bullpen than working out of the rotation. So it's like, if we're not accounting for that signal, it's like, we don't want to account for it. We want it to be that way. We want it to reflect what they did out there. And yeah, you would project them differently if they moved to the bullpen. And again, it doesn't mean, so back to Chad Kuhl, it doesn't mean that's good. It says two things. One, he throws a lot of hard pitches, and you may look at that and go, he could benefit from having a third speed. Maybe his changeup needs to be softer. Maybe he needs to be one of those guys.
Starting point is 00:52:30 It's not an answer as much as it just sets you up to ask questions. So we totally hope that people will look at these things over time because there's aging. There's the power. You can see power declining. You can see command improving. Actually, I want to talk about command. I know we're probably running, killing on time, but the command thing.
Starting point is 00:52:48 Is the thing I'm actually most interested in. Cause that's the. That is like our theory of pitching. Like, you ask any person in baseball what command is, and they're going to, it's, it's crazy. So we're just going to talk about what we kind of think
Starting point is 00:53:03 are aspects of command. Yeah, because it's something that obviously incorporates the pitcher's intention and target, which is something that is either difficult or impossible to measure. Right, exactly. And there are some measures and they can help. So, for example, the frameability is a big part of it. And so we would take a look at their called strikes above average. That's like the first thing in the recipe for command. But what we have done is said, look, there's a hundred things we want to measure. We'll just start with a few,
Starting point is 00:53:32 a lot of things we tried and threw away because they didn't really have any, they didn't seem to have any analytical value, and there's clouds, you know, they weren't a skill. And some things we threw out because maybe we're not measuring them right yet. So the second component to it is cornering ability, and this is not how fast you can drive your car around turns. It's, if we take the strike zone and find the 50 percent line, okay, we'll just say this is your initial target on a count. And, you know, that's a circle, basically, around the edges of the zone or just inside the zone. And put a dot on that in each of the four quadrants, like where it is on the diagonal, and those are targets. So every pitch that's thrown on every count has a target, a set of four targets. And it's going to be how much of a strike that is,
Starting point is 00:54:12 is going to go up and down based on what the count is, because on an 0-2 count, you're not as likely to fill the zone. And also if you're throwing an 0-2 fastball versus an 0-2 breaking pitch, it's also a different assumption about what your target would be. You're not going to throw a back foot fastball, but you may be throwing a back foot slider. So at 0-2, your fastball target may still be 50% strike, but your slider, we found based on actual distribution of pitches, is about 20%. So we have an idea of how pitchers throw to corners. So for every pitch, we establish a corner. We measure the pitch's location relative to all four corners and pick the closest one. And that's your corner score for that pitch. And then we also look at what the called strike probability was. And if you're in
Starting point is 00:54:52 the 50, let's say it's 50% band, if you're right on the dot, that's great. Zero. If you're on the edges, but not at a corner, that's still okay. But if you're starting to move off out of that band, either in or out of the zone, you start to get penalized. And the penalty is greater when you go out. So ideally, you're throwing along the edges. If you're not throwing on the edges, throw in the middle. If you're not throwing in the middle, throw off the plate. That's the worst.
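The cornering idea above could be sketched roughly as follows. This is an illustration under stated assumptions, not BP's implementation: the corner coordinates, the feet-based units, the band width, and the `out_weight` multiplier are all hypothetical, as are the function names. What it takes from the conversation is the shape of the calculation: measure the pitch against all four target corners and keep the closest, score zero inside the target called-strike band, and penalize misses off the plate more heavily than misses toward the middle.

```python
import math


def corner_distance(px, pz, corners):
    """Distance from the pitch location to the closest of the target corners."""
    return min(math.hypot(px - cx, pz - cz) for cx, cz in corners)


def band_penalty(strike_prob, target_prob=0.5, band=0.1, out_weight=2.0):
    """Penalty for leaving the target called-strike-probability band.

    Zero inside the band. Outside it, the penalty grows with the miss and is
    multiplied by out_weight when the pitch drifted out of the zone
    (strike probability below the target).
    """
    miss = abs(strike_prob - target_prob) - band
    if miss <= 0:
        return 0.0
    return miss * (out_weight if strike_prob < target_prob else 1.0)


# Hypothetical corner targets (horizontal, vertical) in feet for one count and
# pitch type; a real system would derive these from the 50 percent line.
corners = [(-0.7, 1.8), (0.7, 1.8), (-0.7, 3.2), (0.7, 3.2)]

print(corner_distance(0.7, 3.2, corners))  # pitch right on a corner
print(band_penalty(0.5))                   # on the target line
print(band_penalty(0.9))                   # drifting to the middle
print(band_penalty(0.1))                   # drifting off the plate
```

Changing `target_prob` per count and pitch type (for example, a lower target for an 0-2 slider) is how the count-dependent targets described above would enter this sketch.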
Starting point is 00:55:13 And if you look at it statistically, the guys who throw on the edge are actually the best group of pitchers. The guys who throw to the middle are the second best group of pitchers. So there are results at a very large population level. But at the individual level, there are not. Yeah, so that was going to be sort of my next question. How do you decide to validate your command measurements? Because obviously you can generate a sequence of numbers,
Starting point is 00:55:34 but then you have to come down to, am I measuring the right thing? And so what is your determined process? It's basically, it's totally hypothesis testing that if this is a good thing to do, then pitchers who do it are going to have, in this case, maybe higher strikeout rate. We just look at a range of things. So like from DRA, our deserved run average, to strikeout rates, walk rates, home run rates,
Starting point is 00:55:56 things about balls in play. Like we look across all of those. We say, yeah, this does seem to matter. So we thought at one point that distributing your pitches evenly to all four corners would be important. So there was like, what's your variance in which corner you go to, man. So that turned out to have nothing to do. We couldn't find any relationship at a group level to performance. We couldn't find anything when you just lump the guys into a pile. Okay, who's the 15 most extreme guys in this? No, that doesn't
Starting point is 00:56:25 feel anything right. So we're thinking we're just not measuring it right, but we put it aside. So corner variance we got rid of, but we kept cornering because guys who throw to the corners tend to have the better deserved run average, better K rates, better walk rate as a population. But you go into the population, you get really crazy stuff. There's a serious similarity between Dallas Keuchel and Wade Miley. And like, okay, what? There's only three points of ERA between those guys. Wade Miley was terrible last year. But a lot of his numbers in these metrics, they line up to Keuchel.
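The validation step described here, lumping the most extreme pitchers into a pile and checking whether that group performs differently, could look something like the sketch below. It is a minimal illustration rather than BP's method: the function name, the dictionary keys, and the assumption that a higher metric value means "more extreme" are all hypothetical, and the sample data in the usage note is invented.

```python
from statistics import mean


def extreme_group_gap(pitchers, metric, outcome, n=15):
    """Mean outcome of the n most extreme pitchers minus everyone else's.

    `pitchers` is a list of dicts; `metric` and `outcome` are keys into
    them. "Most extreme" is assumed to mean the highest metric values.
    A gap near zero suggests no group-level relationship.
    """
    ranked = sorted(pitchers, key=lambda p: p[metric], reverse=True)
    top, rest = ranked[:n], ranked[n:]
    if not top or not rest:
        raise ValueError("need more than n pitchers to compare groups")
    return mean(p[outcome] for p in top) - mean(p[outcome] for p in rest)
```

With made-up numbers, `extreme_group_gap(sample, "cornering", "k_rate", n=15)` would return the difference in average strikeout rate between the 15 most extreme pitchers and the rest. A metric like corner variance, which showed no relationship at the group level, would produce a gap close to zero across outcomes.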
Starting point is 00:56:59 They throw the same way. They pitch the same way. Down and out of the zone. But one of them is a Cy Young winner and the other one was unceremoniously cut and certainly not having a contract at the moment. So what does that mean? Well, it means that there's power and stamina and they're kind of the same there, but there's other components that we haven't released yet. Yeah. And this should be useful. I'm sure you helped me with my article during the last postseason about hitters facing Dallas Keuchel and about how guys who are patient have done so much better against him. And so now that we have these scores for everyone, and like the idea might be like hey if you're good at two or three of these you're probably a good pitcher if you're
Starting point is 00:57:47 good at five of these, you might be Justin Verlander. So, and the five things are, you know, what we want to get at, and we haven't finished this. We want to keep working on power a little bit. Stamina, we want to start putting in PITCHf/x information to see how you hold your velocity; we did not look at that. Command, we talked about all those other things we want to do. But we also want to get a stuff grade, like a pure stuff grade. And that's actually way harder to do than people give credit for because of arm angle variances. And then we also want to have a deception or funk score. And that might just be the unusualness of your release point. And like Clayton Kershaw, he doesn't seem to score very high on anything except his release point is off the charts extreme. So he will probably end up high in the funk pile, which may be
Starting point is 00:58:28 counterintuitive to people a little bit. But our hope is that we put together these set of five components eventually that will all both start understanding how they combine. And that will be ripe for analysis. That will be, you know, but we have to constantly tune and improve, you know, build out the last two and then have a set of five that we keep refining. But we think that will be a base for analysis along with our Ringer MLB Show listeners may remember when you rolled out pitch tunneling last year. Michael and I talked to Jeff Long and Jonathan Judge at length about that. You've upgraded the metrics in certain ways to try to make them more accurate, more accurately reflect the batter's experience.
Starting point is 00:59:27 Martín Alonso made a really cool interactive tool so that anyone can look up any matchup between a hitter and pitcher in 2017 and see it from the hitter's perspective as detailed as not only the hitter's handedness, but the hitter's height, which is really cool. So you can get a sense of why certain sequences may be difficult to time, that kind of thing. So this data is incredibly complex. And I think that it's amazing that it's out there. It's also intimidating to people. And I was going to ask you which articles you've most enjoyed or what applications of this database that you've seen over the last year you've enjoyed the most, and you helpfully put a lot of links in your piece at BP pointing to some of the best research that's been done using this information. And, you know, for people who aren't clear, this is looking at basically the separation between pitches, seeing if there's some added effectiveness to, you know, mixing pitches that look the same up to a certain point
Starting point is 01:00:27 and there's of course a point between the pitcher's hand and the plate where the hitter can no longer really track or adjust to things that the pitch does so certain guys seem to tunnel and use it really effectively other guys seem to not really tunnel at all and are just fine. So it's hard to say, like, sort a leaderboard and say, this is good tunneling, this is bad tunneling, which makes it complicated. So what are the best sort of applications of this information that you've seen? And what would you like to see people use this information for? What I want to see people use it for is pretty much what you just explained, which is that it's how they do it and what they're doing. And you have to then understand why that works or doesn't work or has the results it has or what results you expect it to have.
Starting point is 01:01:15 I'm also interested in what people would say about, you know, optimizing pitch sequences. I think there's a lot of information potentially there in this data that I would like to see people mine for, but there's, you know, there's, it's about understanding the individual pitcher and that's the stuff I like the most. So it's like the stuff that breaks down, like, this is what this guy is doing with, with tunneling. And this is what this guy isn't doing. And like Darvish does both. He can throw you a sequence that's the Greg Maddux calm of milk, where it's a 94 mile an hour cutter, and it just goes a little bit away. Or it's a split at 89, and it just goes thunk, and you couldn't see it. Or he'll drop you a 59 mile an hour curveball after whizzing some fastballs eye high past you that you can still tell. So he may be setting
Starting point is 01:02:03 you up for something intentionally. It has a purpose. And so I think that that's the thing. It's like, there's like, why? Why is this advantageous? Why do you want to change the eye level? Why would you want to deceive a guy? What, you know, that's, it's about understanding individual differences in pitchers, and then understanding how to appreciate, develop, and, you know, project those types of pitchers and model how they might behave with a new pitch. So I wanted to ask just real quick about the idea of deception. Now, when we had a podcast with Fernando Perez back during the previous summer, I asked him when he was a hitter what he thought was the most deceptive thing that a pitcher could do. And in defining deception,
Starting point is 01:02:41 he pointed to basically knees and elbows, sort of the Chris Sale thing. But then you have another, uh, perspective. Francisco Cervelli was asked at one point what makes Jake Arrieta so difficult, and he's like, well, he's throwing the ball from shortstop, so it's just really hard to pick up the ball. So deception is, it's sort of in one sense a missing ingredient, but then if you take the Arrieta perspective, that at least does show up in some trackable information we have, of course, the release points. But is something mechanical that comes pre-release, is that just forever doomed, I guess, to be an analytical blind spot? Yeah, no, absolutely.
Starting point is 01:03:14 We're not going to be able to, like, the fact that, you know, like, Luis Tiant has turned his back to you, and that's very disconcerting. Oh, excuse me, this is modern times. Johnny Cueto has turned his back to you. And that's very disconcerting. I mean, there's this, it's the whole thing is, you know, it's the old saying, you know, just, just, you're disrupting timing. And so how do you do that? It's like, well, you know, throw different speeds, throw from different release points, change the rhythm, you know, so the guys who quick release and stuff like that to work on guys, like all the, those are all types of things that are hard. You can't really measure.
Starting point is 01:03:45 We can look at pacing to some extent, but you can't tell if a guy's speeding up his mechanics. You can't tell if he's got a really high leg kick. You can't tell if he's raising his glove high front side and, and, you know, catching your eye with that. So we can tell based on the movement of the pitches,
Starting point is 01:04:00 what their arm slots are. And that's something that's actually currently missing from the visualizer. We have to put that in somehow, we forgot, but it's the, what's called the Lentzner axis, named for Matt Lentzner, where you can calculate the arm path, the arm slot of a guy by the shape of their pitches. So I think that along with how far off to the side, wherever, what the angle is, the viewing angle for a hitter has to be and how unusual that power and path is combined with that and possibly something with the pitcher's height, although I think the arm angle and the release point may cover that.
Starting point is 01:04:29 I think that will get us a little bit, but we're always going to have a need to actually look at the pitcher on the mound. You know, I mean, that's the thing. Let me put it this way: the coolest thing would be if you looked at this data and nothing made sense about a guy until you saw him. Yeah, right. Sometimes people ask me, is there a research project or is there an article about baseball that you've wanted to write for a long time and you haven't been able to, that sort of thing. And deception is probably at the top of my list. I've been wanting to write some kind of in-depth piece about pitchers who do something strange, hide the ball, as hitters say. And I actually, I got Yusmeiro Petit's agent to send me, at one point early in Petit's career he went to, I forget if it was ASMI or one
Starting point is 01:05:14 of those places where you get your whole delivery tracked. Yes, he was. Uh, I've heard he's one of those guys who always, you know, is cited as deceptive, and he's more effective than you would think based on his somewhat pedestrian stuff. So I have, like, the full record of Yusmeiro Petit's windup and delivery and, like, all the tracking points and everything that were placed on him during that process. That's cool. Yeah, I've been trying to figure out what to do with it because it's really complicated and I don't know how to interpret it. Yeah, I mean, you need a lot of it. I mean, it's like one thing we had, you know, where I was the co-author on a paper with ASMI and Dan Brooks and Glenn, um, and we basically showed that the Lentzner axis concept was accurate, that the biomechanical data matched our Lentzner estimates. And so that's what we did with it, like, basically proved out the theory, which was
Starting point is 01:06:06 just a really nifty thing to do and get to see all kind of that data is kind of cool too yeah but yeah i think i think if you had that stuff and enough of it then you could start getting different types of analysis with like how is this body you know action affecting the hitters. And that's probably really hard to do. But I would imagine it's possible. Yeah, I would not be surprised if there's someone who works for a team who is listening to this and either nodding along up and down or side to side because he or she doesn't want us to continue down this path because there's some kind of competitive advantage here.
Starting point is 01:06:44 Right. I mean, there's everything from the way guys do little things to hide the re-gripping. I remember when Ryan Dempster started that flip, flip, flip, flip, because he was tipping his pitches and he wanted to like, you know, move his glove while he was gripping or not re-gripping the pitch so the batter couldn't tell.
Starting point is 01:06:58 So I think guys do things like that, even in terms of like, you know, how they break their hands and, you know, how high up they let their front side go. And and i think you know you could start then quantifying like hey these guys you have really high front side and their stuff is otherwise the same and their arm slots otherwise the same but they're coming up just a little bit more you know after the set then they seem to have a five percent more success rate you know whatever i'm making this up and it's like that that's the type of thing where I imagine you could go. But you need a lot. That's some pretty hardcore data analysis. It is. Maybe one day you can just end up doing sort of statistics by
Starting point is 01:07:36 omission and declaring the deception is all of the missing gap between a pitcher's actual performance and all the things that we're able to measure let's look that's like our game calling with catchers right yep there's this yes that must be his how that's look at that that's amateur psychology manifested in an error bar yeah yeah chad cole not but no you're right do you think so like it's like or like if you look at the guys like their change-up isn't working it has all the numbers you want. It must be his arm speed. You can start guessing at like things, you know what I mean? It's like, you can start saying, I got to look at this guy because I need to, but you
Starting point is 01:08:12 should be able to have like questions. And the idea would be from like a baseball operations standpoint. It was like, if you can have data, give you specific questions, it's maybe easier to have less skilled people go do the video evaluation on that question. All right. Well, all of this information is available at Baseball Prospectus. We will link to the series. You can go to the stats pages, find all the pitch tunneling stuff, all the pitcher rating stuff, spend many hours just diving through all of that, figuring out what it means and what to do with it. And
Starting point is 01:08:45 you can listen to Harry on the Stolen Signs podcast, where I'm sure he'll be discussing some of this stuff soon as well. And you can find Harry, of course, on Twitter at Harry Pav. And as soon as pitches start being thrown again, he will be classifying them. So Harry, thank you, as always. Thank you. Good to be here. All right, so that will do it for today. By the way, some of you may be wondering about the possibility of robots assisting umpires.
Starting point is 01:09:54 That is something I mentioned back in my 2013 Grantland piece. Could work. I think the problem is, though, that once umpires have something in their ear or their pocket that's telling them the quote-unquote correct call, it'll be hard for them to overrule that. They'll need a good reason to do it, and their focus will be elsewhere. They'll be paying attention to other things because they'll be confident that the machine will be telling them whether it's a ball or a strike. That's kind of
Starting point is 01:10:17 the problem with the system occasionally missing pitches, too. You might just say, well, the umpire will be standing there, so if the system misses a pitch, the umpire can just call it. But if the system misses one out of every X thousand pitches, the umpire's not going to be mentally prepared to call that pitch the same way he would be now, when he has to be on at all times. You could imagine some sort of augmented reality device working, where maybe it would just overlay the path of the pitch on the umpire's field of view. But then again, I guess you run into the problems that Harry is raising here about things not happening quickly enough. That would have to be accurate in real time to work. But yes, I figured some of you might write in to suggest something like that.
Starting point is 01:10:56 And in principle, yes, it could work. You may be wondering what became of the listener email show. Usually our midweek episode is the email show. We decided to push it back this week just so we could talk about BP's work in a more timely manner, but we will be answering your questions very soon. You still have time to get them in
Starting point is 01:11:13 via email at podcast at fangraphs.com or via the Patreon messaging system. Speaking of which, you can support this podcast on Patreon by going to patreon.com slash effectivelywild. Five listeners who have already done so include Sean Viziac, Benjamin Litvin, Michael McDonald, Shane Shuby, and Jeff Good. I agree. Jeff is good. You can join our Facebook group at facebook.com slash group slash effectivelywild. You can rate and review and subscribe to Effectively Wild on
Starting point is 01:11:42 iTunes. Thanks to Dylan Higgins for editing assistance. Jeff and I will be back very soon to talk to you again. Living by a default Me and you around As born a baby that didn't cry I programmed robots to make them lie 2020 vision to see I can see
