Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 1853: What Are the Odds?

Episode Date: May 26, 2022

Ben Lindbergh and Meg Rowley talk to Kelly Pracht, the CEO and co-founder of predictive analytics startup nVenue, which has provided the real-time probabilities displayed on this season’s MLB Network-produced Friday Night Baseball broadcasts on Apple TV+. They discuss nVenue’s origin story, its sports-betting ambitions, its 100-plus-input machine-learning model, which factors are and aren’t predictive […]

Transcript
Starting point is 00:00:00 Hello and welcome to episode 1853 of Effectively Wild, a baseball podcast from FanGraphs presented Let you win again it and we will next time before the end of this week. But today we are going to give you something that we've been hoping to bring you for a while. We are devoting all of this episode to the probabilities that have been appearing on Apple's Friday Night Baseball broadcasts, which are co-produced or produced maybe entirely by MLB Network for the entirety of this season. Every Friday since the start of the year, there have been two games every Friday evening that have appeared on Apple TV Plus. And as many people have noticed, those broadcasts include real-time probabilities, which are updated from pitch to pitch on the bottom right of the screen. So sometimes they will show a probability that
Starting point is 00:01:22 the hitter will get a hit. Sometimes that they'll strike out. Sometimes that they will reach base, etc. So, of course, we were interested in this from the start. In fact, I think even before these first appeared, I may have mentioned on the podcast that they were going to appear because I read it in a press release. And we were interested. Our interest was piqued, I would say. Right. We were kind of into this idea.
Starting point is 00:01:43 Yeah. We were into the initial concept for sure. Yeah, I was curious to see what it would look like at least. And so I tuned in and I still thought the concept was kind of cool. I don't know if it's essential. I don't know if it would enhance the enjoyment of the broadcast to everyone, but basically on board with the concept. But there were certain odds that were raising red flags to me or at least not what I would have expected to see. Let's see. As I would put it, basically, either I fundamentally misunderstood baseball in some way or there was something weird going on with these odds. It was really either one or the other. And I am open to being
Starting point is 00:02:26 fundamentally wrong about baseball. That is entirely possible. I've been wrong about plenty of things. And in a way, I would welcome being wrong because it would give me an opportunity to learn something. But I've been watching these odds. And a lot of our listeners have also been watching the odds. And we've gotten emails and we've gotten tweets and we've gotten people talking about them in the Discord group, in the Facebook group. And so I just kind of wanted to figure out, well, how do these work? And it was reported by SportTechie that a company, a startup called nVenue, has been providing these odds to Apple and that they've raised more than $4 million from various
Starting point is 00:03:07 investors and they're interested in getting into sports betting, et cetera. So in mid-April, I reached out to nVenue and asked, we've been getting some questions about some of these odds that are kind of confusing people and not just the Effectively Wild audience, but baseball Twitter as a whole, I would say, and just asked if we could get some explanation or if we could talk to someone potentially and have just kind of been corresponding ever since for several weeks with their chief marketing officer and had kind of an informal conversation with their CEO at a certain point and conveyed some of my confusion about these odds and sent numerous examples about ones that I had a hard time following. Specifically, there are cases,
Starting point is 00:03:53 and we will cite some examples later on in this episode, but there are cases where the probabilities of reaching base, let's say, or making an out move in a direction that we would not expect them to move within a plate appearance. So it's not so much about the number. Sometimes it's about the number, but also it's just about it went up when I thought it would go down or it went down when I thought it would go up based on what we think we know about baseball, which is that if the count becomes more favorable to the batter, if they go up 1-0, 2-0, 3-0, et cetera, that generally they have a better chance of reaching base,
Starting point is 00:04:25 which is what we see on the whole. And there are some published odds here that contradict that, which kind of raised my eyebrows, I suppose. So that's sort of what I was wondering about, asking about. nVenue told me that they would maybe be publishing an FAQ or explainer of some sort on their website. I don't believe that they've done that yet. But eventually, we arranged an interview with Kelly Pracht, their CEO and co-founder, who will be joining us in just a second. Now, independently, as some of that was going on, FanGraphs writer
Starting point is 00:04:57 Ben Clemens had some of the same questions and curiosities. And I don't know if he heard us talk about this on the podcast because we had registered some of these concerns in an April episode or two as well. But he decided to look into this also by actually gathering the data that was published on these broadcasts and doing a study to see how predictive it was. And so he has now published that study at FanGraphs, and we will talk to him a little later in this episode as well to explain some of his methodology. But first, we will talk to Kelly and we'll hear a little bit about nVenue and how they were founded and how they ended up providing these probabilities. And then we'll get into the model and some of the specific numbers that we had questions about.
Starting point is 00:05:40 And then we will talk to Ben. Is there anything else you think we need to explain before we get to this? We like math. We vote yes to probabilistic thinking as a useful lens through which to understand the world and also baseball. And we hope that a good version of that can find its way into the public forum. I don't know. That sounds so formal, Ben. It was a long interview and there were a lot of things covered. So yeah, I don't know. I don't have anything else to say. I think you summed it up well. Let's get to this interview, won't we? All right. Well, we are joined now by Kelly Pracht. She is the CEO and co-founder of nVenue, the company that has been providing the odds that you've been seeing all
Starting point is 00:06:25 season on Apple TV Plus's Friday Night Baseball broadcasts. Kelly, welcome to the show. Hey, thank you for having me. Really excited to talk numbers with you. So to start, could you summarize nVenue's origin story? When, how, why did you co-found the company? Yeah, no problem. So, you know, I've been leading big tech for quite some time, my whole career, in fact, almost 20 years. And, you know, I come from this crazy sports family growing up in West Texas in the world of Friday Night Lights, where sports was a way of life. And as I became a technologist, I really felt a huge disconnect between how tech is used for sports for fans that are watching. I saw a huge gap when I was leading supercomputing teams. I saw that live data could
Starting point is 00:07:13 be used so much better. That was around 2017. Probably wrote that code, wrote the first lines of code that exact day when I had the aha moment. And here we are five years later with the nVenue algorithms that do live prediction for real-time sports. And how did your relationship with Major League Baseball and Apple TV Plus for these particular broadcasts start? Yes. So in 2021, we participated in the Comcast NBC Accelerator. And it was so great because we found this product market fit for these algorithms that we created. And we spent a lot of time with NBC and with Sky Sports and just a numerous amount of people. And that turned into a relationship where around September of 2021, NBC put our content on air for some A's and some White Sox games because that's where they have the regional sports network. Out of that, MLB Network caught hold of that, gave us a call
Starting point is 00:08:12 and said, wow, we've been talking to them through Comcast Sports Tech. But when they saw it on air, they said this could be something pretty powerful. Well, we got a call maybe two weeks before the start of the season. I think the lockout had just been resolved. Spring training was just starting. And we got a call from MLB Network saying, hey, there's an opportunity to put your content on air. We did everything we could in the course of those two weeks to get ready. And it landed on April 8th. We were on air for the first time with Apple TV and MLB Network.
Starting point is 00:08:43 And could you tell me a little bit about your partners and who has helped you develop the model and some of their backgrounds as well? Well, our algorithms are really homegrown. Myself and my CTO, Mick Stearns, created this from scratch in the very, very beginning, based on our knowledge and deep understanding of how to do compute, how to do machine learning, and how to make things really fast. So when you talk about partners on creation, it's all Houston grown and our own. And those are obviously useful backgrounds to have in play here when you're developing an algorithm. Where did the baseball
Starting point is 00:09:23 expertise at the company come from? Well, that's one of the things that my co-founders and I and our CTO, we all have in common. In fact, the entire executive team, we are, like I said earlier, we're crazy sports fans. So even developing the algorithm, I had, you know, just love baseball, grew up with baseball, grew up with football, basketball, we've all played sports. But while I was creating the algorithm, we had season tickets to the Astros games. And I think I've probably spent well over 100 games in stadium with our algorithm, watching it live, coming back, working on the code, repeating the next day. So we're just true sports fans in every sense of the word. So there's no one, I guess, that you've consulted with who's kind of a baseball person specifically.
Starting point is 00:10:29 I mean, I know you all have some expertise when it comes to baseball and sports, but in terms of someone who has been in the industry before or that kind of background, I don't know that that's necessary, of course, because people from outside the baseball establishment have really revolutionized the game in the past couple of decades and in many cases can pick up on things that experienced baseball people might have missed. But I was just curious whether that has been a part of the process or even maybe working with your broadcast partners, whether that's been a part of the process. Oh, it's so huge that you bring that up because it only takes looking at the products that come out of tech for sports specifically. It doesn't take us very long to pick out who knows sports and who doesn't. And so there are sports that are not in our wheelhouse, like, for example, hockey, just not a sport that I know super well. So as we deliver algorithms for sports that I don't know as well, we take these methods that we've done for creating the live machine learning and apply them.
Starting point is 00:11:12 But we are bringing on board people with massive sports expertise, people deep within the league, coaches within the league, sometimes the league itself. So in baseball, that wasn't really necessary because we have between, you know, across our founders, we have every bit of expertise that's needed. But there are sports that are coming that we don't. So it's a great call. Even when we watch data feeds come from some of our providers, we can tell if they're American based or not. Like when they're talking about football and they say, you know, something about this side of the pitch, it's clearly they're, you know, clearly they're more European based. So you can always tell.
Starting point is 00:11:50 And you talked about some of the early efforts with NBC and then obviously the version that people are going to be most familiar with this season are the Apple TV broadcasts. Were there any changes or refinements to your process that were made between those initial broadcasts and what we're seeing this season? There's always changes and improvements. The core algorithms remain the same. They get better. Deep learning and machine learning, that's its goal, is to learn from previous. So it does get better.
Starting point is 00:12:18 I think one of the big differences from what NBC showed to what Apple showed is NBC showed a really large tombstone of data that took up a bunch of the screen. And as sports fans, we were like, oh my gosh, that's too much. And quite honestly, the feedback that we saw come in through Twitter was, it's too much, it's too big. They used multiple decimal points, but it took away from the game, we thought, because it just covered too much of the action. But in Apple TV, it's still the same algorithms, but they've taken a different approach. They said, let's go simple. Let's keep it clean and brief.
Starting point is 00:13:05 Now, the downfall to that, to using just one data point, whereas NBC used five, six. They chose at will what data points to show. The downfall is now we hear questions about, oh my goodness, well, that didn't make sense. That reach went up when a strike came in. Whereas on the NBC broadcast, you could see how the other fields would go up and down. And I think that made for an interesting experience, but they both have pros and cons. And were there any particular game states or situations that you found that early versions of the algorithm struggled to fully capture, even just in terms of, you know, it's striking you as someone who knows the sport as like, that doesn't seem quite right. Oh, goodness. You know, I've always questioned the data from day one. Like, how could this be? How could we pick up on this? But every time
Starting point is 00:13:46 I had a question and every time I had the question, I would go back to the core data that was used for the machine learning algorithm, because it's all built on the same thing that stats generate, right? It's all built on what happened with each and every pitch, like all of the metrics, like what goes into the model. And I would come back and say, that's why. It always, always made sense. I have yet to find a case that doesn't make sense. There are a few that have not made sense to me, but we can get to that. So before we get there, I did read in the Sports Business Journal that you had recently hired a lead
Starting point is 00:14:25 betting advisor, right? And I know that you have some ambitions in that area as well. So is nVenue currently working in that space or are you primarily involved in broadcast now and hoping to expand in there? What do you see the long-term applications beyond the kind of implementation that you've had on the Apple broadcast being? Well, our mission, and it's always been our mission, by the way, is to allow fans to engage with the data. I think just putting data up is one thing. It's not exciting until you can engage with it. By the way, when you don't agree with it, to be able to say, I don't agree with that and to make yourself heard or bet against it. So we found out that in the betting world, this live predictive analytics, like with each pitch, it's a pretty rare thing. Only a handful of companies are doing it.
Starting point is 00:15:12 And it makes perfect sense for us to really further this fan engagement by providing those odds for the real time bets. Now, we also see a world where betting and watching are going to merge. So right now, a couple of weeks ago, I was in New York and I was watching the Yankees game and I was betting on the DraftKings app and I could not watch and bet at the same time just due to the latencies on the screen. As those things go away, that betting and watching in real time is going to be really powerful and we're excited about that. So that's a natural application for nVenue. And quite honestly, it's our number one plan. I appreciate that this is your company.
Starting point is 00:15:54 And so you're probably not keen to pull back the curtain too much. But given that the obvious application, both that you envision and that people might take away from this, is them putting their actual money at stake, either because they disagree with the odds that you're showing or because they agree with them, what sort of outside validation or testing has gone on to verify these odds? I know that, it sounds like, your team does that on a fairly regular basis, but I don't know that, if I were sitting at home, the way that I would express my sort of belief in the accuracy of the odds, if I thought they were wrong, would be to put money on it. It might be to say, these don't look right to me.
Starting point is 00:16:31 So what kind of validation is going on here, given the stakes for people? Yeah, no, it's a wonderful question. It's actually my favorite question is to talk about accuracy. First of all, the accuracy of what a sports prediction is, is not a widely known metric to communicate. So even looking at stats, for example, you know, if I asked you, could a stat tell you what's going to happen? I think most of us would cross our eyes, pull up 40 stats, all the splits and come back
Starting point is 00:17:00 with the answer of, I don't know. It really comes down to sports intuition. So we actually had to really dig deep and create the methods to prove regression. So we've run, at this point in our game, four years of baseball seasons under our belt. We've run millions, I would say probably on the order of 100 million predictions. So what do we do to prove the accuracy? It really comes down to, did our models train? Do they pick up on the unusualities that are happening on the field? Does it pick up when there's a streak or a slump? That's really how you communicate that. And outside of going into a lot of data science explanations on calibration curves and how we fit the line over the course of these millions of pitches, that would take a lot longer.
Starting point is 00:17:51 I'd be happy to go into it at some time. But to answer your question, Meg, we study every bit of the data year over year, check that the calibrations work. And our accuracy, we believe and we can articulately show, is much better than any stat could provide. One more question before we move on to some more odds-specific questions. This is more on the business-y side. I know you've put some information out there about investors and seed rounds and money that you've raised. What, if anything, can you share about who has invested in the company or how much money you've made or your plans to expand, etc.? Sure. Yeah, very, very public information. I'm happy to talk about it.
Starting point is 00:18:32 So after our Comcast stint, Comcast NBC accelerator stint, we did raise funds. We raised with KB Partners, which is a sports tech VC in Chicago, as well as Corazon Capital, which is also based in Chicago. And the founder and lead partner of that is Sam Yagan, who has some Texas roots and roots in some of the big tech type applications. Lots of smaller investors, but those are our two leads. And what they did was deliver us enough capital so we could take this algorithm that we feel like we've mastered in baseball. And yes, I'm happy to get into the challenges of that. But we're taking that algorithm with those funds to go and deliver more sports, to package it into what we call micro bets, and to really, really expand our entire product and offering. So as for the model, and don't hold back on our account because our audience is fairly stat savvy, I think, and would be interested in whatever details you're able to offer, though I know this is obviously a proprietary model
Starting point is 00:19:39 and you can't give the whole game away here, but to the extent that you can share, tell us how the model works, I guess, and you can go into whatever level of detail you're comfortable with. So first of all, and y'all are baseball people, what we look at is exactly what you would think when a batter comes up to the plate, right? What things do you think of, Ben? You think of, where are we at in the game? How's the batter doing so far? Is he 0 for 2? Is he 2 for 2? How's he done in the past? What else comes to your mind? Personally, I probably wouldn't consider how the batter has done in that game. I mean, you know, I wouldn't think that that would be predictive personally, but I would certainly look at the batter and the pitcher and the umpire and the
Starting point is 00:20:21 catcher and the ballpark and the defense and all of that. I assume that that's all part of the soup here. And I know that you've mentioned, or it has been mentioned, 120 inputs. I don't know if that's the hard number or whether that's grown or shrunk over time, but I was kind of curious about examples of inputs other than some of those obvious ones, because I can think of a lot of factors that could potentially impact the outcome of a plate appearance, but 120 is a big number. I don't know if I could come up with a list that long myself. So I was wondering about some of the subtler or less obvious factors that you might be taking into account. You got it. So what comes from the field, what comes from the data stream from the live play
Starting point is 00:21:02 is literally thousands of data points, right? So trimming it down into between 100, 120, you know, it takes a little bit of gumption to say, okay, let's add in all the things that we know about the batter, you know, about how he's doing this year, how he's done this game. And a lot of folks do look at, you know, like if a batter is 0 for 2 and he hits, you know, .320, probably thinking he might get a hit. Now, we know that that doesn't really apply always. It's not a given.
Starting point is 00:21:33 But it is a piece of the puzzle. We look at the ballpark, the distances to the outfield. We look at the weather, the temperature, where we're at in the season. We look at, is this American League versus National League? We look at, are they in the same division? How are they ranking versus each other? Like, is this team, you know, 20 games behind or half a game behind? We feel that that adds in there. Now, we also put in a lot of times when people watch a matchup, they think about the batter only. Like, think of all these batting stats. And we tend to kind of understand those a little bit better than we do some of the
Starting point is 00:22:09 pitching stats. But honestly, when we start to put in the pitcher, how he's doing, how he's done the last five batters, when we put in his performance over years, weeks, and months, then all of a sudden we see a much more accurate version of what's going to happen. So all of that, it comes up to somewhere between, it depends, somewhere between, I think we're sitting at 110 right now, but all of those things, they just add up. And by the way, that's already trimmed out so much. And after that, we let the machine pick up which is the most important feature. And it varies for every single matchup, which is really interesting. We did a study on an at-bat that didn't make sense to us a few weeks ago. It was a Max Muncy batting. And we could not for the
Starting point is 00:23:00 life of us figure out why one of the factors kept such a great importance. And it turns out in the Max Muncy situation for that particular count, the pitch count was a really big deal. That's what the data said. And that's something that perhaps we don't always pay attention to. Yeah. And I know that there may be some circumstances with this type of model where you can't always necessarily explain why it's saying something it's saying, which maybe doesn't necessarily mean it's wrong. It's possible that it's picking up on something that is not obvious. I guess that's kind of the trade-off when it comes to the complexity of some of the advanced stats that are being brought to bear
Starting point is 00:23:38 these days. It may not be quite as transparent, but it's also maybe a little bit hard to figure out if it's going wrong in some way at times, which is why I wondered, you know, when you're talking about these various factors that seem like they might have some predictive value and then you add them to the model, is that based on testing every step of the way so that, hey, let's see if this actually improves the predictions and if it does, we will keep it. And if it doesn't, we will jettison it. Or is it just sort of we think this should matter or help and we'll try it? So our models are per player, right?
Starting point is 00:24:14 There's a lot of players in baseball, right? There's a lot of pitchers. There's a lot of batters. And when we tried to really deep learn on which player, like if it's Altuve, we're going to find these things are important, so let's jettison it. We really found out that our methods of machine learning were solid enough that we didn't have to do that too often. You know, after you get, you know, past 100 inputs, removing one does not change the scenario very often once you get them fairly right. Now, one of the pieces, and this is kind of in the
Starting point is 00:24:45 secret sauce of what we do, but I'm happy to share this really important method that we do. And that's, we've heard other people say, oh, we look back 10 years to understand what's going to happen. But you know, and I know that Altuve five years ago is not the same as four years, not the same as two years ago. Heck, this month, he's really not even the same as he was last month because he was injured. So in our data, we segmented it out. We do year models, month models, week models. And we were able to find this really cool thing. We're able to find trends, streaks, and slumps.
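As a rough illustration of the year, month, and week segmentation described here, the sketch below computes a reach-base rate over three lookback windows and pulls each one toward a league-average rate when the window holds little data. Everything in it is an assumption made for illustration: the `PlateAppearance` record, the window lengths, the `prior_weight` value, and the shrinkage formula are hypothetical and are not nVenue's actual method, which the interview does not spell out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PlateAppearance:
    """Hypothetical record of one plate appearance."""
    day: date
    reached_base: bool


# Hypothetical lookback windows, in days.
WINDOWS = {"year": 365, "month": 30, "week": 7}


def shrunk_rate(successes, trials, league_rate, prior_weight=50):
    """Blend an observed rate with the league rate.

    prior_weight acts like that many plate appearances of
    league-average performance, so small samples stay close
    to the league rate.
    """
    return (successes + prior_weight * league_rate) / (trials + prior_weight)


def window_rates(history, today, league_rate, prior_weight=50):
    """Return a shrunk reach-base rate for each lookback window."""
    rates = {}
    for name, days in WINDOWS.items():
        cutoff = today - timedelta(days=days)
        recent = [pa for pa in history if cutoff < pa.day <= today]
        reached = sum(pa.reached_base for pa in recent)
        rates[name] = shrunk_rate(reached, len(recent), league_rate, prior_weight)
    return rates
```

With a hot week of only ten plate appearances, the week estimate moves away from the league rate but not all the way to the raw 10-for-10, which is one common way to keep short streaks from dominating a prediction.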
Starting point is 00:25:20 And what you don't know that's deeper than what you're seeing on Apple TV with every single outcome. We include in our API things that we want to know as sports fans, which is why and helping us to understand. So we have little factors that tell us this is really different than the league. It's much higher. There's a trend going on here. The pitcher over the last few months has been giving up more home runs. That sort of thing is present in our API. It's just not always the easiest thing to display when you have 14 seconds between the pitch. Yeah, I was curious about that because I assume there's always some risk of overfitting, right? If you're looking at some smallish sample, some recent sample, I mean,
Starting point is 00:26:05 just to throw some numbers out there, if someone is 0 for 10, the last 10 plate appearances that they started out 2-2 or something like that, and you're looking at the last 10 plate appearances as predictive, then maybe the next time they come up and they start out 2-2, then you'd say, oh, well, they have no chance to get on base here, right, because they haven't the last 10 times. But you could potentially get yourself in trouble there because if you're focusing on too small a sample, then maybe you're not throwing out some recent performance that is useful and predictive, but you're also not basing your predictions and probabilities too heavily on that thin slice. So that seems like an area where you have to be pretty vigilant and do testing to make sure that you are looking at the right timeframe. And that's exactly what we do. So within each prediction,
Starting point is 00:27:05 we have up to 72 models that we run. We run pitcher for years, months, weeks, batter for years, months, weeks. We run matchup models for months and weeks. And we also grab what the league average is for that particular situation. So we take all of those together and then we look and figure out based
Starting point is 00:27:25 on our algorithms, which ones are going to be the most effective, how to fit those and merge all those back together into the best prediction. Because by the way, we get one shot of telling you what the prediction is and it's a set of multiple outcomes. Like you may be seeing one on Apple TV, but behind the curtain, there's actually a lot more of the outcomes. And by the way, that all has to add up to 100%. So if you see something like a reach go up, that actually contains elements of single, double, triple, home run, walk, hit by pitch. So we're actually showing you just a summary of the result, but we're really excited about how we can merge all those together. And again, the regression numbers that we have are truly fantastic. And we show without a doubt
Starting point is 00:28:11 that our models are able to calibrate to what we call a line of perfection. And we're not saying we're perfect, but we're saying that if for 200,000 times this count happened and we said 10% was the predicted outcome, did we in fact, were we right 20,000 of those times? Does that make sense? That's how we calibrate. And we're dead on. It's in good shape for all of the areas where we have plenty of samples. There's always outliers. And those actually can be kind of fun to look at.
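The calibration check described here can be sketched in a few lines: group predictions into probability buckets, then compare the average predicted probability in each bucket with how often the outcome actually happened. This is a minimal sketch and not nVenue's code; the function name, the number of buckets, and the input format (pairs of a predicted probability and a 0/1 outcome) are assumptions.

```python
from collections import defaultdict


def calibration_table(predictions, n_buckets=20):
    """Summarize (predicted_probability, outcome) pairs by bucket.

    outcome is 1 if the predicted event happened, else 0.
    Returns a list of (mean_predicted, observed_rate, count)
    tuples, one per non-empty bucket, in increasing bucket order.
    """
    buckets = defaultdict(list)
    for prob, outcome in predictions:
        # A probability of exactly 1.0 goes in the top bucket.
        index = min(int(prob * n_buckets), n_buckets - 1)
        buckets[index].append((prob, outcome))

    table = []
    for index in sorted(buckets):
        rows = buckets[index]
        mean_predicted = sum(p for p, _ in rows) / len(rows)
        observed_rate = sum(o for _, o in rows) / len(rows)
        table.append((mean_predicted, observed_rate, len(rows)))
    return table
```

For a well-calibrated model, the first two numbers in each row are close. In the example from the interview, 200,000 predictions of 10% should line up with roughly 20,000 occurrences, which is an observed rate of 0.10.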
Starting point is 00:28:44 So I guess this is a related question that might just be rephrasing Ben's in slightly the same way, but I guess I'm curious, you know, you're doing a lot of this predictive modeling on the per player basis, but obviously, you know, there are plenty of comps that exist across a player population. We do studies at FanGraphs where we are trying to understand how a player is likely to perform in a given moment. And we're not simply going to look at that player's performance, but other players like that player in similar circumstances facing, you know, similar kinds of pitchers throwing, say, pitches that are of a similar velocity band in a similar part of the strike zone. And you've talked a lot about the machine
Starting point is 00:29:25 learning and the algorithm sort of confirming itself, but I'm curious what sort of just cross baseball work you guys are doing to try to understand where there may be gaps. Because I know that you've said, you know, your regressions look great and you're managing to a line of perfection, but as experienced baseball people, I think we still sit here with moments during this broadcast where we are scratching our heads about how you could be arriving at that. And maybe that's a result of you not showing enough or the broadcast not showing enough, but there are, you know, instances here where I don't know if you are overfitting or if the sample is too small or what have you, but it still doesn't pass the sniff
Starting point is 00:30:05 test. So I'm curious, like what sort of baseball wide studies you're doing to try to say, how does this jibe with what we know about baseball? Well, no, it's a great, it's a great time to actually get in a few numbers. So let's talk about one of the examples that you guys mentioned ahead of the podcast, which was the Semien and Javier matchup from this past week. The Astros and Rangers were facing off here in Houston, right? Now on the broadcast, so this was quite a long at-bat. It went, you know, 0-0, 0-1, 0-2. He took a few fouls, and then he took a couple of balls. And, you know, after six pitches, he finally struck out swinging. And so, you know, on Apple TV, they showed the reach probability in that case. And I think one of the things that might have been counterintuitive that would cause you to scratch your head was as he took the first, you know, I think reach was somewhere around 23% at the beginning of the at-bat.
Starting point is 00:31:00 A strike came in and reach went up to 33%. Another strike came in and it went up slightly more. And then it started to go back down as the at-bat progressed. Would I be right in saying that would cause you to scratch your head? Yes. Yeah. Okay. No, and us too, right? So we look at it every time we see one that doesn't make sense, we say, okay, why? Well, I went a little bit further on this particular at-bat, and I love FanGraphs, by the way. We're huge fans, you know, been on your site many, many times looking at splits and all the wonderful work that you guys do. And I went ahead and grabbed what the splits are for, I only grabbed Semien in this case.
Starting point is 00:31:38 And I said, I wonder if we could learn anything from the splits that would be nonsensical. And as I looked at the batting average for Semien and for this particular count progression, it made me scratch my head. And that's just the stats and it's built on data. So I would claim that stats always, if you looked at stats in the same way and the same types of progression, you might scratch your head. But stats don't have one of the limitations that we have because stats can stand on their own. And very often do you really break down what's in a batting average versus the slugging or OPS.
Starting point is 00:32:14 A lot of times in 15 seconds, you just don't have time to dig into those splits. We could do it after. We could do it post-game. But we can't really do it in the game. But I claim that in this particular at-bat, you might find some things that would make you scratch your head. However, we took that at-bat, since you guys had mentioned that ahead of this, and said, I wonder why reach went up when a strike came in. That, yeah, it's like, goodness, why did that happen? Well, here's the thing. We dug in.
Starting point is 00:32:46 We looked under the hood. We looked at all the predictions that were coming out. And something really interesting happened. The single percentage doubled between the first and second pitch. Why did it double? It didn't show up in the Marcus Semien data, but it did show up in the Javier data. So for that situation, Javier was giving up more singles. Now that was in his more recent performance, but that was the cause. And it truly was in the data
Starting point is 00:33:14 that in this matchup, that's what happened. And two models, both the pitcher models, as well as our matchup models confirmed it. So that's why it came in higher. I'd love to explain that on Apple TV and tell you why. But at this point in time, you just get the number. So two questions. One, who controls which type of probability is displayed on the broadcast? Is that you or is that Apple or is that MLB Network? Because sometimes it'll be reach or out percentage. Sometimes it'll be hit percentage, strikeout percentage, walk percentage, home run percentage. Who decides what will be shown at any particular time? So we are total fans.
Starting point is 00:33:54 And that week we were in New York, my chief product officer and I had the opportunity to go to Secaucus and go sit with the MLB Network for both broadcasts that Friday night. And there is a human being that looks at all of the data that's coming in and picks which ones to show. Super sports smart. It's what they're interested in. Now, under the hood also for the nVenue API, we also have flags and signals that tell you this is the most game impactful thing to show. This is the most relevant because we want to help our future customers, you know, be very sports smart. But in this case, the MLB Network, they have it down.
Starting point is 00:34:38 By the way, sitting there with these guys, sitting there with the crew, watching the game, watching them put up their numbers, they would ask me questions like, Kelly, why did the numbers go up here? Or why did this go down? And then we would talk about it and be like, oh, you know, because, you know, the groundout probability went way up because he throws, you know, wicked curve, which, you know, pulls more groundouts, you know, like it always made sense and it made for great conversation, but they're just super sharp and know exactly what they want to show. But every now and then they'll show something that's a little counterintuitive to me. You know, like I think they showed extra base hits a few times, you know, a home run a few times because that's an exciting one.
Starting point is 00:35:17 But, you know, the numbers are always a little bit lower. So it's human controlled. It doesn't always have to be. Got it. So I see that there are some cases where there could be a counterintuitive movement that might actually make sense. For example, if a hitter takes a ball, let's say, and the count is more favorable toward that batter. There are cases I could imagine where maybe their percentage of a hit actually goes down
Starting point is 00:35:39 because, well, their percentage of a walk went up, let's say, right? So that might not be clear to everyone because you're only seeing that single number on the screen. However, for reach percentage or for out percentage, that seems like a case where you wouldn't necessarily need to see the other numbers or the other numbers wouldn't necessarily change the story because you're just talking about is the guy going to get on base, right? They don't always explain exactly what is meant by reach base probability, but I assume that it means he's going to get on base by any means, right? He's going to just end up not making an out in this plate appearance. And that, just based on everything I think we know about baseball, suggests to me that there really should never be a case where I could come up with a logical explanation for why a batter's odds of reaching base should go down when they have a
Starting point is 00:36:34 strike thrown to them or two strikes in Semien's case. And, you know, if there were some way in which their odds of reaching base actually did go down, then it should be consistent, I would think, within the plate appearance. So that if Semien is starting out with a 22-ish percent chance of reaching the base and then he takes strike one and it goes up over 30 and then he takes strike two and it goes up again and then there's another ball and it's going down and then ball two and it's down again, that seems to me counter to, you know, like if you were to ask Marcus Semien or the pitcher
Starting point is 00:37:11 Javier there, like, would they want to start in that count or not? You know, I'm pretty sure Marcus Semien would not want to take the 0-2 count just because the probability said that he had a better chance of reaching base somehow there. That kind of example, which is not an isolated occurrence, that happens quite a few times per game. That's the one where, to me, there are a lot of probabilities that you display that I don't bat an eye at, and they look like they are completely reasonable. And sometimes they should surprise me, I think, because if it never surprised me, if it never told us anything that we didn't think already, it wouldn't be very useful. But that just seems like a core baseline thing about baseball,
Starting point is 00:37:58 that as the count gets more favorable to the batter, then the outcome of the plate appearance should get more favorable to the batter. So even if that hasn't been the case very recently for that batter or for that pitcher, that just seems to me to be outweighed by the larger pattern of how we know the league as a whole performs in those counts. So here's my question to you. Would you rather see a prediction for Javier versus Semien, or would you rather see what happens typically with the league? Because we do have both. And you're right, the way you described what should or what would happen on average, you're exactly right. That is what our minds think, because that is the average in baseball, 100%. But what you're talking about from the first pitch received to
Starting point is 00:38:48 0-0 to 0-1, what you're talking about is a difference of 9-10%. And so if the odds of the single went up 10% and other things held, something has to go, but the reach easily can go up 9%, 10%. You're not talking about it went from 5% to 90%. I would argue if it went from 5% to 90%, wow, there's something really unusual there that doesn't follow baseball. But you're talking a 9% shift. And a 9% shift, in our opinion, and over the millions of pitches we've analyzed, 9% shift is not dramatic or drastic. It actually does explain this matchup, what's happened in their past. And we think that's pretty darn interesting. But we've also offered additional content that says it went up 9%.
Starting point is 00:39:41 It went up because he has the higher odds of getting a hit based on performance over such and such time frame. Right. I guess I would certainly rather see tailored to those particular players, but I would not expect that it would be so dramatically different that it would go against that league-wide trend. Like if you were to tell me that Semien was particularly well-suited to match up with Javier and that maybe the expectation here should be higher than I would think based on their respective past performances, I could buy that and maybe the model would be picking up something that I'm not aware of. I think the sticking point for me is that you have these movements within the count for these two players where, you know, if you were telling me
Starting point is 00:40:31 this is what has happened in the past, sure, but this is supposed to be a prediction, a probability of what will happen next, right? So there's some balance there where maybe you are taking into account what has happened in the past, but are you taking that into account so heavily that it is swamping what we know about what typically happens between players? And that's the thing that gets me because if there were a higher or lower than expected probability on, say, an 0-0 count or something, that's okay. But then what about that matchup between those two players would actually give Semien an upper hand as, you know, the count gets less favorable to him, you know, as he's taking strikes. If he's down 0-1, he's down 0-2 now. And we're saying that he has a better chance to get on base.
Starting point is 00:41:19 That is the part that perplexes me just kind of based on, you know, any matchup, really. Well, Ben, I think it's always going to perplex you for a couple of reasons because you know baseball so well. You know, you're an esteemed author and have a podcast on it. So you understand intuitively how the league performs in general. But what you might not know is Semien versus Javier, their recent, in the course of 15 seconds, their recent past, how, I mean, how they've been performing. And quite honestly, I don't see anything in existence
Starting point is 00:41:53 that could tell us that in 15 seconds. Again, now, if we dove into the splits post-game, they start, they support this. Because by the way, the splits is built on the data, which is built on the same data that we build our models on. And so it does make sense, but you're always going to scratch your head because we as fans are not attuned to A, talking anything outside of stats. We talk stats, but we don't talk probabilities. And that's a gap that quite honestly, we haven't spoken in that method before when we're describing baseball. So we've got a new method. And we've also got something that's doing pitch by pitch. And in football, we do play by play. And, you know, in basketball, time segment by time segment. It's a new way to experience the game. It's built on data. It makes sense. It's proven to do very well. It's just you're always going to scratch your head until you're used, if you ever are used to this way of looking at it. And, you know, I could add, I read a really great Fangraphs blog, I think it was written a
Starting point is 00:42:59 few years ago, in which one gentleman was writing about, he did a very simple predictive modeling algorithm where he took some past data. He came up with the probabilities, very similar to what we do. I mean, the methods were different, but he came up with probabilities for the outcome of the at-bat with each pitch. And then in order to communicate that, he took those same probabilities and he turned them back into stats to help people understand. So I think we just have a different method of talking, but I think this is very helpful to folks who probably don't always understand what the slugging average is or OPS. Sure. So I guess I have a couple of questions about that and we don't need to dwell overly
Starting point is 00:43:41 long on the Semien at-bat, but I guess part of what we're trying to suss out here is how much weight is relatively small-sample recent performance actually being given in the model here. Because I just listened to your explanation. You're right that we are people who deal with this stuff every day and are familiar and comfortable thinking probabilistically. And I have to tell you, I don't come away super satisfied with that explanation that you just gave. And so I guess I'm curious about the data piece of it and maybe more broadly sort of how you guys are thinking about data in the model that might be true, but that might not be sort of meaningfully true in terms of it being important to the outcome of any given plate appearance. And then, so let's start with that.
Starting point is 00:44:30 And then I have a follow-up question on the idea of this, you know, being for folks who don't understand baseball particularly well, because especially if we are marrying that to hope that they will then bet on these odds in some way. That strikes me as kind of concerning if you have to, you know, have the kind of experience that Ben and I have to intuitively know, well, there's something off about that. So let's start with the data and how much the recency of data and how large that sample is and the role that that plays
Starting point is 00:45:06 in the model? Because I think that that probably merits some clarification here. So part of that is in the secret sauce of how we do things. But I will tell you that hundreds of thousands of pieces of data go into every single prediction, sometimes more. It's not taken lightly. And we do look at like the past several years, we look at the past several months, and we look at the past weeks. And we look at them all individually. And then we marry them together. That's super important.
Starting point is 00:45:30 It makes sports sense. It's not an area that I think can be easily explained over a podcast. But one of the questions when we first talked to Ben was, my question was, we can explain data science, the data behind it, the modeling and the methods for a long time. And we think that's really interesting and happy to do that in lots of settings. In fact, I said while I was in New York, why don't we I'm happy to meet. Let's grab a whiteboard and I'll go through all the details. Sign an NDA. We'll go through all the details until I can convince you.
Starting point is 00:46:00 But I'm going to ask you the same question that I asked Ben right now, that I asked him a few weeks ago. How can I convince you that it works? Like what methods can I use to convince incredibly stat savvy people like yourselves that this works outside of intuition? What would it take? It might involve a thorough enough explanation of the model such that you wouldn't be comfortable doing that from a trade secret perspective. But I think that part of what might be helpful here to swaying people that this is actually describing probability in an accurate way, because we know that not everything you predict on the screen is going to come to pass. We understand that. But this approach doesn't seem to really speak to a very robust and longstanding body of work around what we know to matter in baseball in terms of the likelihood of a particular player getting a hit versus not.
Starting point is 00:46:58 So I think part of what we are maybe reacting to here is that this seems largely divorced from an established literature that would maybe have raised some red flags about, say, the probabilities we saw in the Semien at-bat, for instance. Yeah, I guess if I could hop in, I know that there are a lot of people who might just object to the presence of probabilities on a baseball broadcast at all, you know, whether they were correct and accurate or not. They just might not like that. We aren't those people. Yeah, we are not those people. So we're kind of like in your target demographic here. I think
Starting point is 00:47:34 not so much with the wagering maybe, but at least with the concept of real-time probabilities on a baseball broadcast. When I first heard that these telecasts would be displaying these stats, I was pretty intrigued and excited to see them. And, you know, I've even kind of been on the other side of it. Briefly, saw the blowback to that from people who questioned the point or the practicality or, you know, how can you predict anything about baseball or it was wrong in this one specific instance. And so therefore it must be broken, that kind of thing. So I'm sure that you've gotten that kind of feedback. And I sympathize because that's just going to happen inevitably, I think. So the ones that kind of raised the red flags for me are the ones that seem to run contrary to my understanding of the sport. And I guess when it comes to the question of how I could be
Starting point is 00:48:39 convinced, I guess it kind of comes back to the old Carl Sagan extraordinary claims requiring extraordinary evidence. To me, this would be extraordinary to suggest that a batter would be less likely to reach base as balls are thrown to that batter, for instance. So I guess, you know, the only thing that maybe could convince me is if you were to basically open the books, publish all the validation you've done, all of the previous predictions and shown that they work. And I know that you may not be able to do that, but. Such good news in that we are planning to publish. Like all of this will come to pass
Starting point is 00:49:16 as we go forward with nVenue. So we look forward to publishing because we're really onto something here. And I'm a huge respecter of the work you guys do, the things that you've written, Ben, in particular. Huge, huge fan of what you do. And, you know, another one of the questions that I asked you when we first talked was, in addition to what would it take to convince you? Then the next one was, how much grace do you give us? If you saw, like, how many pitches are there in a baseball game, right?
Starting point is 00:49:46 Like 300-ish, give or take, right? For 300 pitches, if you're watching the entire game and it's up for every single pitch, how many times of scratching your head, you know, like what's the rate of acceptance? You get what I'm saying? Yeah. And so we feel that we fall well into not just an okay result. We feel that our results are solid and sound. They're based on true data, just like the stats are. It's just a different way of looking at it. And, you know, I'm going to pick on the Semien-Javier situation again.
Starting point is 00:50:22 You know, when the AB, the at-bat, started and we said somewhere around 22 percent odds of reaching. Right. Do you know what the league average is for reach on an 0-0 count? Oh, well, it depends, I guess, what you're counting in reach, but a little over 30 percent. Right. It's actually 30 percent. So and when we do that, we take all the 0-0 counts for several years, and we say, how did baseball perform? And we came up with, I think I rounded, but 30%. So if you're questioning the shift of moving up, you should question at first, why is it so low? Why is he being predicted to get on base at a 22? And we do the sum. A reach is the sum of a hit, a walk, hit by pitch,
Starting point is 00:51:07 or anything that gets him on base. If you're questioning the transition of an 8%, 9% shift later in the progression of this at-bat, you should question this first. And by the way, if you look at Semien's batting average this year, it's not so great, right? I think he, at the time, he was a .175. He's doing terribly, yeah. So that right there should tell you we're good at an 0-0 count, right?
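The "reach is the sum" description above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not nVenue's model: the per-outcome probabilities below are hypothetical numbers chosen so the totals land near the roughly 22% and 31% figures mentioned in the conversation, and the outcome names are assumptions (reaching on an error, for instance, is left out).

```python
# Minimal sketch: reach probability as the sum of the on-base outcomes.
# All per-outcome numbers are HYPOTHETICAL, chosen only to illustrate the sum.

REACH_EVENTS = ("single", "double", "triple", "home_run", "walk", "hit_by_pitch")


def reach_probability(outcomes):
    """Sum the probabilities of every outcome that puts the batter on base."""
    return sum(outcomes.get(event, 0.0) for event in REACH_EVENTS)


# Hypothetical outcome distribution before the first pitch (sums to 1.0).
before = {
    "single": 0.09, "double": 0.03, "triple": 0.00, "home_run": 0.02,
    "walk": 0.07, "hit_by_pitch": 0.01, "out": 0.78,
}

# Same distribution after the single probability doubles and everything else
# except "out" is held fixed; "out" has to shrink by the same amount.
after = dict(before, single=0.18, out=0.69)

print(f"reach before: {reach_probability(before):.2f}")
print(f"reach after:  {reach_probability(after):.2f}")
```

The sketch shows why a single on-screen number can move when only one component changes: with the other on-base outcomes held fixed, any increase in one of them comes directly out of the out probability. It says nothing about whether such an increase after a strike is plausible, which is the point the hosts are disputing.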
Starting point is 00:51:31 Like his batting average was .175. We predicted 8% lower than the league of getting on base because he's doing bad. So we're starting from a solid place. Yeah. And now when the next pitch comes in and we see, oh, the pitcher's giving up some hits and we put some weight on the pitcher, we put some weight on the matchup. And I realize that that can be questioned. This is fair. All is fair in data and statistics and interpretation. And so it went up. It's just what the data told us. Now, by the way, the league average went from 30 to 24. So your intuition was right. It did drop six. We went up nine, it dropped six. It's just the way it is. We could tell you what
Starting point is 00:52:11 the league average is and you might be happy. But, you know, if we told you the league average at 30 when it started and he's really doing so poorly, you might be, no, I don't believe that. So you'd have a question the other way too. Yeah. Right. But if it went down once he had taken a strike, that would at least be directionally consistent with what we would imagine there. You know, I think that some of the odds in these moments are striking on their own, but I don't want to be overly fixated on that. I think that I appreciate that this is all being done very quickly and that sometimes we are surprised by odds and that doesn't necessarily mean anything bad, right? Like it is pleasant to be surprised by odds sometimes as Ben said, but I think that where we are maybe chafing at this is that
Starting point is 00:52:55 directionally it doesn't make a ton of sense. And I agree with Ben, I would rather have odds that are tailored to the particular players involved. But I think that I would imagine that a balance can be struck between doing that with a sufficiently robust sample versus being so fixated on this particular matchup or the recent results that these particular players have had that we kind of get turned around directionally on some of this stuff, which is where I think. Let me clarify. We do not use tiny amounts of samples. We look at large samples, medium samples, and small samples, and we make conclusions from them. But here's the most cool thing about all of this. And this is why I really appreciate what you guys bring to this whole table. We're discussing these numbers and we think that's a win for baseball, right? Like,
Starting point is 00:53:51 do people really discuss that Semien went from a .175 to a .180, then down to a .132 for his batting average over the course of this at-bat? No, but now we've got, it's a win for Apple, it's a win for MLB, it's a win for those of us who love numbers. And by the way, time will tell, right? Like you've been looking at probabilities and these transitions for, you know, all of, gosh, not very long, right? Like since April 8th, which I think was the first on-air broadcast. And I think it's going to take time to get used to. It's going to take publications
Starting point is 00:54:16 and we're really, really excited to lead that charge because we stand by what we're doing. It's mathematically and data-wise very, very sound. But altogether, we can't describe the full science in a podcast, but my offer stands. Willing to sign an NDA, I'll fly to wherever you're at and we can take a whiteboard. And I can convince you, Meg, that this is not based on small samples and it is based on science. Yeah, I think it is definitely more the direction than the magnitude for me. There are certain examples. I emailed about this before, the example that
Starting point is 00:54:52 you highlighted in the playoffs last year about Jorge Soler in a case where he hit a home run and the odds had jumped up to 19% prior to that pitch. And so you cited that as an example of, hey, this seemed to be picking something up about him. It had gone from, I don't know, 1% or 2% to 3% to 19%, something like that. And that raised my eyebrows because I just have a hard time imagining that the best hitter against the best pitcher in the most favorable count would ever have a 19% chance to hit a home run in any particular plate appearance. That sort of, it strains credulity for me to hear that. I mean, you know, you could look at Barry Bonds or Mark McGwire in their record-setting
Starting point is 00:55:33 home run years on those counts, and they did not have those kind of odds. So, you know, that just kind of comes down to, I guess, the burden of proof, right? That if you have made those kind of predictions, well, did they pan out or not? But it's not even just the extremes so much that kind of caught my eye because I do buy that sometimes they could just be lower than I would think or higher than I would think. And that could be reasonable and it could actually be helpful in perceiving something subtle that maybe we should take note of. But it is those directional changes within the plate appearance that kind of just calls into question the other numbers, even the ones that to me seem perfectly reasonable that just, you know, if it is also producing those ones that I just cannot explain those movements in what seemed to be the direction opposite of what I would expect, that just, I guess, cost me some confidence. So, you know, I guess we could go back and forth forever.
Starting point is 00:56:29 No, I fully note your concern. And by the way, we're going to take that and we're continuing to make suggestions to the users of our data and the media sense of how to take some of these quips and some of these things that we have that live under the covers and help explain that so it doesn't leave you feeling so confused. But again, it's an exciting opportunity to drive fan engagement. It's a new day. There are a lot of things that Apple TV is doing really right and keeping it crisp and clean, putting it down on the right. We look forward to a day when
Starting point is 00:57:00 we can help influence a little bit more information and tell you the cool things about the whys of the numbers. This kind of feedback that we're giving about these particular perplexing ones, are these things that you have heard from other people? I mean, I've seen on Twitter some people pointing out similar things, and I don't know whether you would care to confirm or refute this, but I also did hear from people who were involved in the White Sox broadcast last year that they had had some similar concerns and had communicated those and maybe had actually cut the trial short because of the concerns that they had about some of these numbers. So is this something that you've encountered as you have made your way in the
Starting point is 00:57:42 industry here? Sure. And I'd love to talk about that because it gives a real sense of the style of adopting the numbers. So in the Oakland A's trial, we did three games and we saw Dallas Braden et al really lean into it and they enjoyed it and they made fun out of it. It was good. It was a very good experience. Now, Jason Benetti, huge respect for Jason Benetti and the whole NBC crew. They had some questions about why did an RBI change? And it was a valid question. And we did a follow-up with the entire NBC crew, both the A's and the White Sox. And Jason Benetti's feedback was, and he was on the call, was, hey, love the numbers. It's good. We understand that numbers are up and down. He didn't question the numbers, but what he said was, hey,
Starting point is 00:58:29 nVenue, I really want to know a little bit more information about the why. And so what we've done from that September game to now is we've made sure that we have available those whys, and those are in our API, and we're super excited to help folks explain what's going on. But again, our mission is really to engage fans in the numbers. We feel they're very good. We can clearly show they're good. And we're really excited about it. But watching Twitter, that can be a sport in itself. We see a lot of great comments, and we see some rather troll-like comments as well. We see people enjoy them. We see people hate them.
Starting point is 00:59:13 And I think we're just seeing a new world. You know, they're not necessarily a fan of all things in the broadcast outside of the nVenue contribution because it's different. We're watching different. They have to watch their Apple TV. So we take all of that with a grain of salt, but we watch every single comment. We poll, we watch, and we seek to improve. And that's why when I asked you the question a while back, Meg, I asked you the same question. What will it take to prove to you?
Starting point is 00:59:41 How can we improve your experience? And how can we pass that information on to anybody in media that might use our numbers? And we're always looking to improve. So I guess the last thing, we don't have access to all of the history of the predictions you have made. And so we can only go on what's out there. There is a writer, Ben Clemens, who's done some research just based on the games that have happened thus far and has gathered the probabilities and just tried to compare them to the most simplistic model that he could come up with, which is basically just using the league average outcomes for that count, so not even anything batter or pitcher specific. And just comparing those predictions, sort of the naive model to the ones that have been on the broadcast, and then looking at the outcomes, he has concluded that the more naive, the simplistic model with just the one factor
Starting point is 01:00:38 has been more accurate thus far through the games that we've seen, not including, I think, the very first week. And I know that maybe the model has been adjusted as it's gone on, and it seems to have become more accurate as time has gone on. And I've noticed fewer kind of eyebrow-raising, personally, probabilities as time has gone on. But he seems to have found that even if you compare to just the league splits, that that actually compares favorably to what has been published thus far. And, you know, that's just going based on what is out there, and we don't have access to all the things that you may have access to. Well, I ask again, and I welcome that conversation.
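The comparison described above, a count-only league-average baseline against the broadcast probabilities, can be sketched as follows. This is a minimal illustration, not the actual study: the conversation does not say which accuracy metric was used, so the Brier score (mean squared error between forecast and 0/1 outcome) is an assumption here. The lookup table is also an assumption, with only the 0-0 and 0-1 values (30% and 24%) taken from figures quoted in this conversation, and the pitch records are made-up toy data.

```python
# Minimal sketch: scoring a count-only baseline against another model's
# forecasts. The metric (Brier score) and all data below are ASSUMPTIONS.

# Hypothetical league-average reach probability keyed by (balls, strikes).
# Only (0, 0) and (0, 1) echo figures quoted in the conversation.
LEAGUE_REACH_BY_COUNT = {
    (0, 0): 0.30,
    (0, 1): 0.24,
    (0, 2): 0.18,  # hypothetical
    (1, 2): 0.21,  # hypothetical
}


def naive_forecast(count):
    """Baseline: ignore batter and pitcher, use the league average for the count."""
    return LEAGUE_REACH_BY_COUNT[count]


def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities and 0/1 outcomes (lower is better)."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Toy records: (count, displayed model probability, 1 if the batter reached).
# The displayed probabilities are placeholders, not real broadcast values.
records = [
    ((0, 0), 0.22, 0),
    ((0, 1), 0.31, 0),
    ((0, 2), 0.33, 0),
    ((1, 2), 0.25, 0),
]

outcomes = [reached for _, _, reached in records]
model_score = brier_score([p for _, p, _ in records], outcomes)
naive_score = brier_score([naive_forecast(c) for c, _, _ in records], outcomes)

print(f"model Brier score: {model_score:.4f}")
print(f"naive Brier score: {naive_score:.4f}")
```

A handful of pitches from one plate appearance proves nothing either way; a comparison like this only becomes meaningful over many games, which is why the hosts ask for the full history of published predictions.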
Starting point is 01:01:19 So, you know, we've offered a number of times to open the books with you guys and FanGraphs specifically. We welcome that. As a technologist and with experience here, I would love to chat with Ben. Perhaps he's the author of that article that I saw a few years ago. But my question will always remain, what is accurate? So as we watch this progression go from 22 to 31, and then it plummeted back down to 20. And by the way, he didn't reach, he struck out. And if you watch
Starting point is 01:01:52 our strikeout percentage, which you didn't see, it also moved a variety of ways. And so, yeah, there's not a hardcore answer to this. But again, we stand by our data. We stand by the math. We stand by what we're doing. And we're so excited to share with the world a lot more about what we're doing. But right now, this is where we're at. And I would love to continue to chat and share more as we go along. So I hope you guys take me up on that.
Starting point is 01:02:23 And I hope that we can build, you know, build the relationship. All right. Well, thank you. We do appreciate your coming on and answering the questions to the extent that you're able. And I do support the project of making baseball broadcasts more stat savvy and stat rich, as long as the stats are sort of, I guess, have some basis in accuracy. I guess the question is, you know, I think it can be beneficial to publish these sorts of probabilities and can teach people things about baseball.
Starting point is 01:02:55 But I guess the question is, is it teaching them the correct thing or not? And is it potentially turning off people if they're seeing something that to them does not make sense for some defensible reason where they start to question every number they see on a baseball broadcast? That would be my worry. But I am in favor of the probabilities being accurate, and I hope that they are accurate and would enjoy them if I were more confident that they were, I guess. But I do appreciate your coming on and talking to us today. Well, thanks so much. Thanks for having me. And thanks for asking the tough questions. You know, we welcome it. And again, mad respect for what you guys do, your podcast for Fangraphs itself,
Starting point is 01:03:35 and love to talk numbers. And like I said, we'll continue to talk. All right. Thank you, Kelly. Do you want to recommend anywhere where people could find out more information about the company or you or anything else before we let you go? Certainly. Partnerships at nVenue.com. Feel free to shoot an info email over and we'd be happy to respond. You can also find us on our website, nVenue.com. There's a way to submit your email so that we can reach you. We're also on the standard social, so feel free to use any of those methods. All right. Thank you, Kelly. All right. Thanks, guys. Thanks, Ben. Thanks, Meg.
Starting point is 01:04:13 Thanks. All right. So you just heard us mention at the end of that conversation with Kelly, a study that Ben Clemens of Fangraphs has done that should be available now on the Fangraphs website as you were hearing this podcast. So we figured that we should probably bring Ben on for a few minutes here to explain exactly what it is that he did and make his methodology transparent. So hello, Ben. Hey, Ben. How's it going? Hey, Meg.
Starting point is 01:05:02 Hello. All right. And to be clear, when I reached out to nVenue initially and corresponded with them for the first few weeks at least, I was not aware that Ben was planning to research this. I don't know whether you were planning to at that time or when you decided to, but I was just kind of initially, because of the probabilities that I had seen, kind of mystified by some of them and then later learned that you were intending to do a study, and some Effectively Wild listeners and I also did some data collection for that too. And you can see all of that data because Ben is linking to it in his piece. So if you want to look at any of the individual probabilities from those past games, you can. But Ben, do you want to explain what your basic approach to testing these predictions was?
Starting point is 01:05:52 Yeah, the basic approach was pretty straightforward. I and you and some Effectively Wild listeners just recorded them all. We just wrote down the count and the prediction for every pitch where there was a prediction, and we did some, like, methodological things about at-bats where the state changes during the at-bat and they change what they're predicting. But basically, we just wrote down what was on the screen. And then I didn't have anything to compare it to. And, you know, as anyone who makes predictions will tell you, like, it's not clear what you should benchmark them to. Right. Yeah. So I ran what's called a Brier score on the probabilities that are listed on the screen, and I got a number of, I think, 0.2. And, yeah, okay, cool, Brier
Starting point is 01:06:39 score of 0.2, does that do anything for you guys? No, not for me. It's not like the ice cream, right? Right, exactly. So what I decided to do was, for some predictions where I could, create a kind of simple dummy model of my own, which is basically anything that I could model using just the count. So odds of reaching base, odds of striking out, walking, getting a hit, something that doesn't take into account the runners on base. So nVenue makes predictions for RBIs
Starting point is 01:07:09 and grounded-into double plays. And I'm not confident that I can test those in a reasonable one factor way. So all I did was take the league average result probabilities after each count on the day before each game. So for an April 15th game, April 14th.
Starting point is 01:07:26 For an April 22nd game, I took stats through April 21st. So league production through the day before the game. So if I were guessing using only the count on that day, what data would I use? And I would just use, you know, league average after that date.
Starting point is 01:07:38 And not league average for the team or the player doesn't take into account the batter, the pitcher. It's exactly one factor. Incredibly simple model. Yes. I mean, I guess you could say it's got several dummy variables, but anyway, it's a one factor model.
Starting point is 01:07:51 Like all I looked at was what the league did overall. And I just ran the two of them through the same battery of tests. Because that doesn't tell me, you know, which one's better or which one will do better going forward. It just says, how were these two at predicting what happened? And also, I recorded what happened. That's something I skipped earlier. Yeah, I just also recorded what happened on every plate appearance. And so we had two different models. They each made predictions. I mean, I would say mine made them not particularly well. It's a very simplistic model on purpose. And then we just compared the results.
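As an illustration of the count-only baseline Ben describes here, the following is a minimal Python sketch, not his actual code. The function name `count_baseline`, the tuple layout of `history`, and the toy data are all assumptions made for the example; the only ideas taken from the conversation are that the model uses league-wide results after each count and only through the day before the game.

```python
from collections import defaultdict
from datetime import date


def count_baseline(history, game_day):
    """Estimate P(event | count) from plate appearances played before game_day.

    history: iterable of (day, counts_seen, event_happened) tuples, where
    counts_seen lists every count the plate appearance passed through
    (for example ["0-0", "1-0", "1-1"]) and event_happened is True/False.
    This layout is hypothetical, chosen only for the sketch.
    """
    seen = defaultdict(int)      # plate appearances that passed through each count
    happened = defaultdict(int)  # ...and that ended with the event (e.g. reached base)
    for day, counts_seen, event_happened in history:
        if day >= game_day:
            continue  # only use results through the day before the game
        for count in counts_seen:
            seen[count] += 1
            happened[count] += int(event_happened)
    return {count: happened[count] / seen[count] for count in seen}


# Toy, made-up history: three plate appearances on April 14, one on April 15.
history = [
    (date(2022, 4, 14), ["0-0", "1-0"], True),
    (date(2022, 4, 14), ["0-0", "0-1"], False),
    (date(2022, 4, 14), ["0-0", "1-0"], False),
    (date(2022, 4, 15), ["0-0"], True),  # same day as the game, so ignored
]
baseline = count_baseline(history, game_day=date(2022, 4, 15))
print(baseline["1-0"])  # share of plate appearances reaching 1-0 that ended in the event
```

Nothing about the batter, pitcher, or team enters the lookup, which is what makes it a one-factor model.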
Starting point is 01:08:26 There are, I think, 2,077 pitches that the two both made predictions on. And in aggregate, the one-factor model did a little bit better. It marginally outperformed the nVenue model in Brier score over the 12 games we recorded, and it outperformed them pretty significantly when, what you could do is, one model sets the odds and the other model bets on the outcome of the game based on those odds. That's kind of a simplistic model-versus-model test that I've picked up as a way of saying, like, basically, if you are making mostly good predictions but a few outlandish ones, that's maybe not great if you're going to let somebody gamble against your odds. And I think the idea was kind of suggested to me because nVenue is, you know, partially a sports
Starting point is 01:09:19 gambling thing, and it's just a very natural way to test predictions against each other, to say, oh, you want to bet on it? And it's pretty easy to set that up as well, which appealed to me. Both Brier score and this betting-against-each-other model test are pretty, like, intuitive. Brier score just measures the average squared difference between your prediction and the outcome. And the gambling thing is, I mean, I explained the exact mechanics of it in the article, but it's straightforward. It's just you cut, I choose, for every play. And, yeah, so the one-factor model, it did very well in the gambling-against-the-other-model thing. It did better, although not overwhelmingly so, in Brier scores. And there's some
Starting point is 01:10:07 serial correlation there. You know, if I'm predicting the same at-bat eight times because it's an eight-pitch at-bat, then I could just get lucky once. Like, the one-factor model could be wrong, but, you know, we only have one observation that counts 10x in our sample. So I ran a separate subset of just 0-0 counts. So that's a zero-factor model, right? Like, it's only 0-0 counts. So mine, just like the dummy control model, just gives league average every time, because league average is after 0-0 counts by definition.
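The two tests mentioned in this exchange can be sketched in Python as follows. The Brier score follows the definition given in the conversation (the average squared difference between the prediction and the 0/1 outcome). The betting test is only an assumed reading of "you cut, I choose": the exact mechanics are in the article, not in this transcript, so `betting_return`, its one-unit stake, and its fair-odds payout rule are hypothetical, as are all of the numbers in the example.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def betting_return(setter_probs, bettor_probs, outcomes, stake=1.0):
    """Bettor's total profit when the setter's probabilities are used as fair odds.

    Assumed rule (not taken from the article): on every prediction the bettor
    risks `stake` on whichever side its own probability says is underpriced.
    """
    profit = 0.0
    for p_set, p_bet, happened in zip(setter_probs, bettor_probs, outcomes):
        if p_bet == p_set or not 0.0 < p_set < 1.0:
            continue  # no disagreement, or no finite fair price to bet at
        if p_bet > p_set:
            # Back the event: win stake * (1 - p_set) / p_set if it happens.
            profit += stake * (1 - p_set) / p_set if happened else -stake
        else:
            # Back the non-event: win stake * p_set / (1 - p_set) if it fails.
            profit += stake * p_set / (1 - p_set) if not happened else -stake
    return profit


broadcast = [0.53, 0.29, 0.41]  # hypothetical on-screen probabilities
baseline = [0.30, 0.38, 0.60]   # hypothetical count-only probabilities
reached = [0, 0, 1]             # hypothetical outcomes (1 = reached base)

print(brier_score(broadcast, reached), brier_score(baseline, reached))
print(betting_return(broadcast, baseline, reached))  # baseline bets into broadcast odds
```

For scale, a forecast of 50% on every pitch scores exactly 0.25 no matter what happens, and lower is better. The betting test punishes a few outlandish probabilities more than the Brier score does, because the payout grows quickly as the posted price moves away from the true rate.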
Starting point is 01:10:40 And on these 0-0 counts, there was still a slight Brier score advantage for the league average model and a slight positive gambling return to the count only slash no information model. Not as significant as all the counts, and I think that has to do with some of the wrong way stat movements that you guys
Starting point is 01:10:57 talked about in the interview, but I think there was a pretty significant, not statistically significant, I mean, honestly, I did not try to measure the statistical significance of this, because it's some odds on the screen and they could change next week, and I don't feel like I have any, you know, any understanding or any ability to predict how their model is going to evolve and change over time. It does seem like it's gotten better in the six weeks of games that I scored. I should mention there have been seven weeks of games. We only did six because one set of Apple TV Plus games was a day after the first day of the season.
Starting point is 01:11:30 Right. And that's kind of weird for season to date stats. So I just threw it out because I didn't think that I could actually come up with a... I think it would be a little bit unfair to use any stats that hadn't occurred yet. Yeah. For example, like on April 8th or whatever it was, I didn't... Like we didn't know and their predictive models didn't know the ball would be dead. And so I just tossed that one out.
Starting point is 01:11:51 I think that's, you know, probably the right thing to do. And it seems like there were still some kinks with getting the integration going at that point. Like some of the odds are coming up late at that point. And that's very understandable. So I thought it was just easiest to toss it out and kind of go from there. And is that 2000 plus pitches? I mean, is there any way to say whether that is sufficient to reach any kind of conclusion about those already published predictions? Like, obviously, we have no idea if they will refine their model and improve it. But is there any chance that over that span that compared to your extremely simplistic model, the lack of overperformance of that model is not telling, is random, is just by chance?
Starting point is 01:12:35 I mean, how much can you conclude from what has happened so far? I'm not a statistician. I should probably mention that from up top. But basically, no. That doesn't really say anything about the go forward. Like you said, we're looking at what comes out of a box, which is very different than the workings of the machine. It's entirely possible that just through sheer chance, the types of situations that were testable against my one-factor model happen to be the things that they do worst at. And that if we had looked at a different subset of batters and pitchers or a different subset
Starting point is 01:13:07 of counts that had come up, they would have done better. There's just no way of knowing that. And I don't want to make any claims about what that means going forward or anything like that. What I can say is that the predictions that have been on screen so far are, I think, pretty clearly have not been as accurate as just using count-based predictions. That, yeah, again, that doesn't say anything about anything other than that. Just they haven't matched up to the count-based predictions as of yet. Well, and I think that part of our reaction to all of this, well, I'll speak for myself. I think part of my reaction to all of this is the use of it. Like if we care about stats being accurate and one of the great sort of
Starting point is 01:13:47 challenges of our collective lives is having to explain probability to people who aren't super well-versed in it. And I don't mean that in a snarky way. Like I think that part of the power of what we all do with baseball analysis is that it can illuminate probabilistic thinking in a way that's useful beyond just baseball, right? And so if this were less accurate than just a count-based model and it were being purported to be as accurate as it is being purported to be, like, I think that would still bother us, right? Because we like things to be more accurate than not. And because we'd like to stop having to tell people why our playoff odds are fine, actually, for instance.
Starting point is 01:14:25 But I think part of why I find this so flummoxing is that, you know, there are going to be people who potentially make decisions about gambling based on the odds that they're seeing and might be doing that with less complete information than they think they're making those decisions based on, which, you know, we can debate how much that matters relative to the decision to gamble at all. But it seems like if you feel like you have information, you might make bolder choices than you would if you didn't think you had information that was, you know, more accurate than just a count-based model. So I think that's part of why this rankles for me. Yeah. I think for me, I mean, the way the odds are currently, I guess for one thing, I would be sort of surprised if they are used in that way, just because if there were actual money at stake here and given Ben's results, it seems like this would not be the house winning in this case, at least so far. Yeah, that's a fair point.
Starting point is 01:15:25 I mean, if anything, maybe it could convince people that it is easier to bet on baseball than it actually is in general. Like I'd be tempted to bet on baseball if I could bet on some of these very perplexing probabilities. I think also, yes, it is something that could color people's perceptions of just the use of stats and probabilities and sabermetrics in general. Like people who see these things on the screen, something that really clearly doesn't seem to make sense even to us, and then use that to kind of cast a wide net and say, oh, the stat nerds don't know what they're talking about, right? Because I think this could be an educational tool. It could enlighten people. It could illuminate things about baseball. But if it is doing the opposite of that, then maybe it makes it harder to get predictions or probabilities on the screen the next time around if there is a model that comes out that maybe doesn't produce probabilities that are confusing in this same way. So I think that's the thing for me. The reason that I started doing this is
Starting point is 01:16:33 because I watched the NBC Bay Area telecast last year and I was like, this is awesome. I have been waiting my whole life for them to have this in a little box on the side of a baseball broadcast. And not every pitch, but I love that, like, it pops up and you can see, like, a ball just came in and, like, wow, his chances of getting a hit are going down even though he got a ball. Oh, but his chances of getting a walk are going up. I loved it. It was a thing that I had not seen on a broadcast before that I thought was really cool. And so when Apple started doing it, I thought, oh, wow, that's neat. I wonder how good these predictions are.
Starting point is 01:17:07 You know, that's kind of hard to tell. It's one thing when you see it once and I was like, oh, that's cool. Like, I'm glad they're showing probabilities. I think in probabilities a lot and I love that they're putting them on a broadcast. And then once they were there every pitch, I thought, ah, you know, I like testing things. Yeah.
Starting point is 01:17:21 I wonder how good these are. And it's really hard to make predictions. What is it Niels Bohr said? It's hard to make predictions, especially about the future. Like, that's very true. I would never purport to be able to make a model that could do better than my own count-based model. I think everything I added would just make it worse. And so rather than saying, I wonder if I can beat this, I just thought, ah, I wonder how good it is. And, I don't know, I think that the endeavor of trying to put odds to what's going to happen and, like, tell a story of what might happen next can be very interesting narratively. I do worry that if the odds are kind of counterintuitive
Starting point is 01:17:58 on their face, that people are less likely to say, oh, I like this, and more likely to say, why are there so many numbers on the screen? Right. And so that was kind of my initial impetus, is like, I think odds are cool. And I would like us to use them in more places in life. And I wonder if this is a good place to use them. Yeah, right. I tuned into the first Apple broadcast specifically to see this really, because I thought it was kind of a cool idea. I understand if people aren't that into the concept regardless of whether they're accurate or not, just because, you know, like if you don't need to see the probability on every pitch, I get that.
Starting point is 01:18:32 Like, I think that it is perfectly fine to say this does not enhance my enjoyment of the game. Like maybe I already have a sense of what the probabilities are or maybe just seeing them and then seeing different results happen, it just, like, cheapens it in some way, or it makes you just feel like, oh, this is one trial that I happen to see. I don't
Starting point is 01:18:52 know, maybe it would actually affect people's enjoyment of the game in the other direction. So I'm not saying that, like, if you are a stat head in general that you will love this concept. I don't know that you need to. It's just that the idea of things being published that maybe reflect poorly on just the endeavor of doing it at all. I guess that's the thing that gave me some misgivings about this. Yeah. I mean, I think that putting probabilities to a broadcast is an interesting narrative tool. And Meg, we've talked about this before about playoff odds. If you give something 5% playoff odds and somebody makes the playoffs, that doesn't mean, man, these odds were just terrible and bad. I mean, something awesome just happened.
Starting point is 01:19:33 Right. And that's the way that I kind of approach all these is, hey, if the odds look pretty reasonable from some kind of outside test, and we ran these tests on our playoff odds to kind of see how often the things that we predicted happened. And they did pretty well. Yeah. Like I find odds to be more useful as a narrative tool than as a, like, here's what's going to happen because they're not what's going to happen. Right. Like someone is not going to reach 20% of a base. There's just no chance. Like, I don't know if his arm gets there or something. Like either he'll get on base or he won't. What is on screen will never happen.
Starting point is 01:20:10 But giving you an idea of like, hey, this is an easy situation to get on base. Hey, this is a hard situation to get on base, et cetera. I think it's quite cool. And just to give people a sense, because we really fixated on that Semien plate appearance, which it just was one that Kelly said she had looked at the numbers for and it was last week. So we focused on that, but that's just kind of a microcosm. There are many, many cases like that, that same sort of problem or what seems problematic to us. And you actually calculated how often that has happened and it has become less frequent as time has gone on, which I assumed based on my watching and seeing it a little less
Starting point is 01:20:46 often, but it's still not uncommon. Like just to give people a sense of how many times per game, roughly, that kind of quote unquote wrong way movement within a single plate appearance happens how often. Yeah. So I should mention that this is only for things that are kind of absolute, as in the odds of a hit don't need to automatically tick up when you take a ball. Yeah. The odds of a plate appearance ending in a hit after 3-0 count on just a random day I picked were 10.3%, and on a 0-0 count, they were 20.5%.
Starting point is 01:21:18 So they actually halved going to 3-0. So this is only for things like reaching base. Reaching base is monotonically increasing. The more balls you take, the more it goes up. The more strikes you take, the more it goes down. And that's also the case for strikeouts, walks, and outs. So for only those four, I looked at times where the count ticked one way and the odds of success or failure, as it were,
Starting point is 01:21:41 ticked in a way that is counterintuitive. And there were 18 per game on average out of, you know, 270 possible pitches on average, and the number of tracked pitches that I was comparing for these is about 150 per game. So, I don't know, 10% of things or so. But if you exclude the first two weeks of games, it's only 14 per. So it's definitely sharpening up over time. And it's not like those odds can't be correct. You know, those aren't necessarily incorrect, but they definitely do kind of raise my let's-think-about-this-one-more radar a little bit. Yeah. Yeah, I'll give a few examples here because I think people probably have not been paying
Starting point is 01:22:29 as close attention to these things, I would assume and hope for their sake, but just a few that stood out to me and that I had shared with nVenue weeks ago in some cases. There was a White Sox-Rays game, second inning. Eloy Jimenez was batting against Drew Rasmussen, so this is a right-on-right matchup. And when the plate appearance started, there was a 52.7% reach probability. Or I guess that was with a 1-1 count maybe. And I was kind of like, whoa, because Eloy Jimenez's career on base after 1-1 is under .300. You know, he's not exactly Ted Williams. And this is right on right.
Starting point is 01:23:10 Then he takes a ball. It's 2-1. His reach base probability falls to 28.8%. And again, in real life, you know, on 2-1, his OBP is 80 points higher. It doesn't go down. So that's the kind of thing that drew my attention. Or, you know, fourth inning of a Cardinals-Reds game, TJ Friedl, a left-handed hitter, had a 61% chance to reach base against a left-handed pitcher with a 1-1 count. And no offense to TJ Friedl, but he's not peak Barry Bonds. And I don't think even peak Barry Bonds had a chance
Starting point is 01:23:46 that high against a lefty probably. So then he took a ball and his probability of reaching fell to 29%. So that same kind of thing, like even in that game, Corey Dickerson was batting for the Cardinals. He had a 41% chance to reach base with a 3-0 count, okay, sounds possible. But then he takes a strike and his probability of reaching almost doubles to 83%, which is like, what? So again, this is going back a bit. And I think that kind of thing, which is maybe more glaring, they've cut back on that somehow. But even in more recent games, I'm still seeing that sort of thing. Like I think a late April game, maybe it was, there was a Yankees game. John Carlos Stanton
Starting point is 01:24:30 came up in the first inning. He had a 36% reach base probability before the first pitch. Fine. Then he takes a ball. No, not at all. But then he takes a ball and that falls to 30%. He takes another ball and it falls to 28%. So he's up 2-0, but his chances of reaching base have fallen since the plate appearance started. And then he swings and misses and suddenly his reach base probability goes way up to 47%. And then he fouls off the ball and it goes down to 24%. And it's just like, it can't move in both of those directions. I mean, there's nothing saying it can't. It tends not to. Right. But I think that one thing that is important to be careful about in these things is you should do the math. Yeah.
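The "wrong way movement" check discussed in this stretch can be sketched in Python using the sequence just read out for the Stanton plate appearance (36%, then 30% and 28% after two balls, 47% after a swinging strike, 24% after a foul). The function name and the list layout are assumptions for illustration, not Ben's actual method. The sketch applies only to reach-base probability, which the conversation describes as rising with balls and falling with strikes, and it treats the foul as a strike because it came with fewer than two strikes.

```python
def wrong_way_moves(start_prob, pitches):
    """Count pitches where a reach-base probability moved against the count.

    pitches: list of (result, prob_after) pairs, where result is "ball" or
    "strike" and prob_after is the probability shown after that pitch.
    A ball should not lower the probability; a strike should not raise it.
    """
    wrong = 0
    previous = start_prob
    for result, prob_after in pitches:
        if result == "ball" and prob_after < previous:
            wrong += 1
        elif result == "strike" and prob_after > previous:
            wrong += 1
        previous = prob_after
    return wrong


# Reach-base probabilities as quoted in the conversation for this plate appearance.
stanton = [("ball", 0.30), ("ball", 0.28), ("strike", 0.47), ("strike", 0.24)]
print(wrong_way_moves(0.36, stanton))  # 3 of the 4 pitches moved the wrong way
```

As the conversation goes on to stress, a wrong-way move is a flag worth a second look rather than proof that an individual number is incorrect.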
Starting point is 01:25:11 Yeah. Like, I'll give you an example. On that Semien at-bat, nVenue actually outperformed very slightly. Actually, underperformed very slightly. I just looked at it again, and it was quite close. They actually did about the same as the naive count-based model in both Brier score and gambling, because they got it right on 0-0. They were shaded the right way. Then there were some counterintuitive counts, and they did worse on those because the probability was going up while he was taking strikes, and so that didn't do well for the fact that he ended up striking out. And then they, again, shaded him lower than the league average on 2-2 counts.
Starting point is 01:25:48 And that worked out again. Like you can get an okay overall result, even with some of these like, oh, that's kind of a strange direction to tick in if your initial kind of prediction was pretty good. Right. And now in that game as a whole, the two models did almost exactly the same. But I guess my point is it's really tough to just look at examples and say, like, oh, this can't be right. This has to be wrong. That's one of the things that is so interesting to me about this is it really is tough to evaluate these models.
Starting point is 01:26:18 And, like, you kind of want it to be telling you counterintuitive stuff. Right. Yeah, exactly. It's like the Bill James, I think it's often attributed to him. Yeah, the 80-20, you know, any useful stat should surprise you 20% of the time or else what's the point? So, but if it's surprising you 80% of the time, then maybe something's wrong. Yeah.
Starting point is 01:26:36 So, you know, there was another case like Darin Ruf was facing Aaron Sanchez in one of these Apple games and he came up in the, and he had a 39% reach probability, which kind of caught my eye because in the third inning, same batter-pitcher matchup, Ruf started with a 28% reach probability. So it went from 28% to 39% in the next plate appearance. Now, OK, maybe he's facing him again, and so Ruf has a better chance of getting on base. But a 40% increase seemed like a lot to me. But then it goes from 39% to start that plate appearance to 24% after he takes a first pitch ball. Then he fouls a pitch off and it goes to 32%. He takes a called strike and it goes to 27%. So with a 1-2 count, he had a higher chance of reaching base than he did with a 1-0 count. Like, that kind of thing, you know, one thing is, if you keep making predictions like that and they aren't borne out by the data, then eventually we'll just see it, because you won't do as well at predicting those counts in the long run. But it is more important, I feel like, to get the initial batter-pitcher matchup right, always, than the direction things go. Like the Semien one is a great example where, you know, yeah, the movements were kind of weird.
Starting point is 01:27:52 But by just shading right in the first place, they more than made up for that. And I think that's one of the things that is so interesting and promising to me about this kind of idea. And I mean, I am not a data scientist. I am not someone who is well-versed in machine learning. I dabbled in it for some of my data-driven hitter predictions this year. But like, you know, not fancy models like this. I'm just using off-the-shelf stuff you can put in Python. It's a little bit easier.
Starting point is 01:28:19 Like kind of some of the unavoidable fact of life with this is that it's going to be black box-ish. If you're running a bunch of different models and having the computer decide which of those is the best and then using its outputs, if it works, we will probably think something weird is happening a lot of the time. Yeah, right. That's the thing. that I mentioned where he goes from like 1% or 2% to homer. 19. Yeah, to 3% after the first ball to 19% after the second ball. Now, you know, that confused me because A, how does it go up one percentage point after one ball and then it goes up from three to 19 after one additional ball, but also just anyone having a 19% chance to homer in a plate appearance, even starting up 2-0. Like Barry
Starting point is 01:29:06 Bonds, the year he set the record, he homered in fewer than 7% of the plate appearances that he started up 2-0. And so did Mark McGwire the year he hit 70. Now granted, those guys walked all the time, but still 19% is astronomical. It's high. I bet you that, well, I can tell you that in all of the recorded data that we have this year, there have been no 19% home run percentages. So that's kind of, you know, aiming at a past target. Yeah. I picked that one out only because Kelly cited that one specifically as an example that the time it worked. Yeah. I believe that, yeah, that press release is actually quoted in my article. Well, that kind of thing. It's like, if that were right, how cool would that be? Yeah. I'll give you an example of a very cool one that, yeah, did work and was right. So Kole Calhoun came up to bat
Starting point is 01:29:49 against Cristian Javier in the Astros-Rangers game, and before the first pitch, there was an 8% home run probability. And that's enormous. Yeah. It's triple the naive probability of a home run in that at-bat. And Kole Calhoun hit a first-pitch home run, and he just, you know, bought the brand of the park. And that is something that you might say, that doesn't make any sense. But it actually did make a lot of sense when you look at the factors a little bit closer. Kole Calhoun has power. Cristian Javier is home run prone. They're in Houston. Like, there was just a lot of stuff that kind of lined up that pushed it higher. And having the ability to show that, I think, is really cool. Yeah. I think that's a valuable contribution to the discourse about, like,
Starting point is 01:30:36 baseball. Yeah. I think that knowing that, if the announcer said, like, hey, this is a great spot for Calhoun, like, he's a lot more likely to hit a home run than he is in an average plate appearance, like, you know, his swing matches up well against Javier, and Javier is a fly ball pitcher, and, like, Calhoun hits a lot of home runs, so, right, getting the ball in the air sounds good for him. Yeah. Or, I don't know how granular the data gets, but teams are obviously looking at things like swing plane and pitch movement and comparing similar pitchers and similar batters. And maybe there's something that wouldn't be immediately obvious, but would be real. It's just, we can't really know from any individual example,
Starting point is 01:31:19 whether, oh, it was picking up on this actual real special proclivity to hit a home run in this plate appearance, or it was just a weird one that happened to be right that time. So you need many examples of that to see. Yeah, and hence the large sampling. Right. And, I mean, honestly, I'm probably not going to keep doing this because it's a lot of work. I did four. It was a lot of work. Yeah, it took a couple hours. So, yeah, I'm probably not going to keep doing this. Not because I'm not interested. I am. But just because, I don't know, I don't want to do this every week.
Starting point is 01:31:52 I watch the games live because I like the broadcast crews, actually. And you can't pause Apple TV. No, you can't. You can't record it live. The picture quality on Apple broadcasts is great. Yeah, it's really good. It's beautiful cameras and everything. I love the design in general. I find the kind of color muted aesthetic really pleasing
Starting point is 01:32:09 and I like the EDM music that comes in and out of commercial breaks. Like, it really gets me in the mood that this is an event. Yeah, actually, like, I'm really enjoying the Apple TV broadcasts so far, which is, I don't know if that's everyone's experience, but I found them quite... Yeah. ...clarify one thing I mentioned. I had spoken to a few people with the White Sox broadcast that were using this very briefly last year until they decided to stop using it because of some concerns. And I won't quote anyone. No one went on the record because they didn't want to speak for the company or anything, and they weren't sure whether there could be a future relationship there. But the comments were not kind, I will say, but I don't want to just lob anonymous critiques out there. But the one thing that Kelly did mention about
Starting point is 01:33:13 a specific case with an RBI probability, from what I was told, that was a situation in this White Sox game in September where Yoán Moncada was batting with Yasmani Grandal on second base and the probability of an RBI was listed as far higher than the probability of a hit, which confused the crew because it was unlikely, very unlikely, that Grandal would score on anything other than a hit. And so they said that they communicated that concern and that they were told that the algorithm was saying that either a hit or a fly out would have resulted in an RBI, which didn't seem right because there was almost no chance that Grandal, a catcher, would score from second on a fly out, which happens extremely rarely, but generally not with a player like Yasmani Grandal. So that was that specific example she brought up. But I think there were other examples there. I could play a quick clip of the very first pitch that it was
Starting point is 01:34:11 used on those broadcasts. With Next Play Live, a company powered by nVenue, and so they give you real-time data of what possibly might happen on the next pitch. So we're going to play around with that a little bit here tonight. So one hundred, two hundred, three hundred factors that they analyze to predict what's coming next in a game. So you'll see it in play here in the second inning. I know you have your own hundreds of factors that you
Starting point is 01:34:40 use, sometimes thousands. All right. So, okay, so based on nVenue and what they've got on this at-bat, what Moncada's done, the very small sample size for Riley O'Brien, 17% chance of a hit, 33% chance of a strikeout, at least right now.
Starting point is 01:35:03 That's 0-0. First pitch a strike. How about 0-1? I think it's going to get worse for a hit, I have to tell you. Well, it's better. How about that? Plus nine, 26 percent. I mean, one of the very difficult things in doing something like this is if you have something that is like, I don't know, let's say, let's stipulate that it is better than a baseline at predicting what will happen next. But it is also dumb. Like it is just essentially dumb, right? It doesn't know baseball, right? Right.
Starting point is 01:35:40 It's looking at a big pile of numbers coming in. Right. And if you don't specify, like, enough initial things, it's going to make some things that are just evidently wrong, even if overall it's a good predictor. It's hard to put the right guardrails in. And I'm certainly no expert on that. But, you know, these, like, little things where it's evidently wrong are not disqualifying. But if you're not really sharp on everything else, they're certainly going to drag down your predictive accuracy. And one thing that I think is an issue separate from anything that I did in terms of studying their accuracy and testing them against other models is, does it take you out of looking at the odds?
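A minimal sketch of the kind of guardrail being described here, not anything nVenue is known to run. The function name, the outcome labels, and the flag for a slow runner are all hypothetical, and the rule that an RBI chance should not exceed the hit chance is a deliberate simplification for cases like the catcher-on-second example above.

```python
def check_guardrails(probs, runner_likely_scores_only_on_hit=True):
    """Return warnings for displayed odds that contradict basic baseball logic.

    probs maps hypothetical outcome names ("hit", "reach_base",
    "strikeout", "rbi") to probabilities between 0 and 1.
    """
    warnings = []

    # Every probability has to be a valid number between 0 and 1.
    for name, p in probs.items():
        if not 0.0 <= p <= 1.0:
            warnings.append(f"{name}: {p} is not between 0 and 1")

    hit = probs.get("hit")
    reach = probs.get("reach_base")
    strikeout = probs.get("strikeout")
    rbi = probs.get("rbi")

    # A hit is one way of reaching base, so P(reach base) >= P(hit).
    if hit is not None and reach is not None and reach < hit:
        warnings.append("reach_base is lower than hit")

    # A hit and a strikeout cannot both happen in one plate appearance.
    if hit is not None and strikeout is not None and hit + strikeout > 1.0:
        warnings.append("hit plus strikeout exceeds 1")

    # Simplifying assumption: with a slow runner on second, a run scores
    # essentially only on a hit, so P(RBI) should not exceed P(hit).
    if (runner_likely_scores_only_on_hit
            and hit is not None and rbi is not None and rbi > hit):
        warnings.append("rbi is higher than hit")

    return warnings


# Made-up numbers, purely for illustration.
print(check_guardrails({"hit": 0.20, "rbi": 0.40}))
```

Checks like these would not make a model more accurate on their own; they only catch the displayed numbers that would make a viewer stop trusting the graphic.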
Starting point is 01:36:24 And like that one did, right? It took them out of looking at the odds. And so that to me seems more important than whether you can test them against my odds. Honestly, like, most people watching Apple TV don't actually care that I tested the odds. No, I'm sure they don't pay any attention. And frankly, the broadcast doesn't seem to pay any attention, right? They never really mention it or explain it.
Starting point is 01:36:41 I caught exactly one mention of the odds on the broadcast. Yeah, they never like say, hey, by the way, we're producing these odds and this is how they work or this one is interesting. It's just there, kind of. or that there isn't a version of this that would be really interesting. I think one of the things that I appreciate about the way that just the aesthetic of the broadcast lays out is that those odds aren't, they're pretty unobtrusive down in the bottom corner. I think that there is a version of this broadcast, and I agree with you. I've liked these booths generally
Starting point is 01:37:19 and have thought that they've done a good job. I think that the potential exists here for folks like us who care about this stuff too, provided that some of the kinks get worked out of the model and so we are not driven mad trying to figure out how it is that Marcus Semien's odds of reaching base went down after taking a ball, that we can have what we like as much as we want to engage with it. And folks who don't care
Starting point is 01:37:45 about that don't have to, you know, like it doesn't have to be the focus of the broadcast. And maybe, you know, it would be to the benefit of both the people producing those odds and those who enjoy them to like have the broadcast speak to them a little bit more and use the educational opportunity in small doses so that they're not, you know, beating people who don't care about that over the head with it, but are also engaging the people who do and kind of sparking curiosity. So I think that, like, we have been very down on this. And I do think that the, like, the potential gambling aspect of it does make me nervous. Although I think that you're right that, like, it doesn't seem like it's particularly actionable right now, so, like, that seems fine, but
Starting point is 01:38:30 there is potential here for something really cool. We just have to, well, not the three of us, but, like, you know, there needs to be improvement made to sort of, whether it's the inputs to the model or how those inputs are weighted or sort of which things have an effect on a plate appearance, but not knowing how they really matter. You know, so there's work to be done here, but there is still potential. So that part is good, too.
Starting point is 01:38:55 I feel like we've been very harsh, so I felt like I should say that. One thing that is tricky, too, is you don't want to too much just say, give it to the baseball people and let them, right, what's fair, because... Because sometimes we're really annoying. Well, but also, like, one of the reasons to do this is because if you just let people say, well, that doesn't make sense, get it out... Like, you do want to find new insights, right? Yeah. So it is tricky to say, like, you know, just
Starting point is 01:39:22 just iron out the parts I don't like and keep doing it, which is, I think, why it's more useful to look at it as, look, here it is against this, like, one-factor model. Here's how it did. What that tells you is that you would have done a better job in the games that we watched predicting this if you just looked at the count. No broader judgment on the long-term utility of it or how they should do it or how they should build it. I don't know how they should build it. I'm not good at this stuff. It sounds hard. But what I can tell you, yeah, it's just the study. Like, that was my approach to it, and I think that's actually a good way to look at it. It's hard to be good at machine learning. There's a lot of people that go into that field, you know. Yeah, and I certainly am not good at it. I don't even
Starting point is 01:40:02 barely know what it is. So the people we know who are good at machine learning and care about baseball work for teams. Right. There's a pretty hot market for them, in fact. So I guess, yeah, I wouldn't claim to know how to do it better by any means. I just think it's interesting to look at it and compare it to kind of a simple model. Right. Because I'm naturally suspicious of complexity in general. And so I always like to compare things to the lowest possible common denominator.
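A minimal sketch of that kind of comparison, under assumptions that do not come from the study itself: the Brier score (the mean squared gap between a predicted probability and the 0/1 result) is used as the yardstick, and every number below is made up for illustration. A lower score means the predictions were closer to what happened.

```python
def brier_score(predictions, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(predictions) != len(outcomes) or not predictions:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum((p - o) ** 2 for p, o in zip(predictions, outcomes))
    return total / len(predictions)


# Made-up example: did the batter reach base (1) or not (0)?
outcomes = [0, 1, 0, 0, 1]

# Hypothetical odds from a many-input model for the same five pitches.
complex_model = [0.40, 0.30, 0.35, 0.10, 0.20]

# Hypothetical odds from a one-factor model that only looks at the count.
count_baseline = [0.25, 0.35, 0.20, 0.15, 0.30]

print("complex model: ", brier_score(complex_model, outcomes))
print("count baseline:", brier_score(count_baseline, outcomes))
```

Five pitches prove nothing in either direction; the comparison only becomes meaningful over many plate appearances, which is the point about needing a large sample.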
Starting point is 01:40:30 Right. Yeah, that's the thing. I think in the grand scheme of things, whether the odds are perfectly accurate or not on a baseball broadcast, we've outlined some reasons why it may matter in some ways, and maybe it matters more if there's wagering involved. But just in general, I guess we're kind of, even in non-baseball arenas, we're just bombarded by data constantly and advanced manipulation of data in ways that are too complex for laypeople to follow. And I include myself in that. And there are a lot of examples where models are flawed in harmful ways, like societal ways, because people design the models often. And so even if it is a computer actually
Starting point is 01:41:15 spitting out the numbers, like humans have to make some choices about what the model looks like and how it works and that kind of thing. And there are a lot of cases where in other industries, in industries where it might actually matter or have some real world effect, people will come out with very flawed models and maybe they know the models are flawed or maybe they don't. And there are serious ramifications and consequences that can come from that. And it's also bad because then it maybe makes people think that they can't trust data in general and just be very skeptical about that, which it's always smart to be skeptical probably, but also not to dismiss out of hand that there might be some utility there. So I wouldn't want it to turn anyone off and say, oh, there's no useful
Starting point is 01:42:01 application of this and the whole premise and purpose is misguided. So that's kind of why I am rooting for the probabilities to be right. Oh, me too. Yeah. All right. Well, thank you, Ben. We will link to your study, of course, and where people can find you at FanGraphs and on Twitter and so forth.
Starting point is 01:42:20 But thanks for doing the work and summing up what you did. Thanks for having me on. All right. Just for reference here, I thought I would read you the numbers in MLB entering Wednesday's games here, give you the splits by count that we've been talking about in this episode in case you're not familiar with what they typically look like. These are generally roughly the same every season. After the first pitch of the plate appearance, there's really no such thing as an even count. Sometimes we say that 1-1 is even or 2-2 is even, but not really. Results-wise, every count after that first pitch favors either the pitcher or the hitter on a league-wide basis.
Starting point is 01:42:54 And these are all after the count that I'm going to say, not on the count. So not a plate appearance that ends on 3-0, but plate appearances that start 3-0 and may end on 3-0, but may also end on a subsequent pitch. And this will be in the form of tOPS+. So 100 is average. Higher than 100 means better for hitters. Lower than 100 means better for pitchers. So starting with the hitter friendliest and going to the pitcher friendliest, after 3-0 is a 256 tOPS+, after 3-1 is 210, after 2-0 is 173, after 1-0 is 129, and after 2-1 is 126. So those are the hitter favoring counts. Then after 1-1, 85, after 0-1, 67, after 2-2, 58, after 1-2, 35, and finally after 0-2, 26 tOPS+. And given a big enough sample, this is about as inviolable a rule in baseball as I can think of.
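The splits just read out can be collected in a small lookup, which makes the ordering easy to see. The values are the tOPS+ figures quoted above; the variable and function names are only an illustrative way to organise them, and 0-0 and 3-2 are left out because they were not read out.

```python
# tOPS+ after each count, as read out above (100 = league average).
TOPS_PLUS_AFTER_COUNT = {
    "3-0": 256, "3-1": 210, "2-0": 173, "1-0": 129, "2-1": 126,
    "1-1": 85, "0-1": 67, "2-2": 58, "1-2": 35, "0-2": 26,
}


def favors(count):
    """Say whether results after this count favor the hitter or the pitcher."""
    value = TOPS_PLUS_AFTER_COUNT[count]
    if value > 100:
        return "hitter"
    if value < 100:
        return "pitcher"
    return "neither"


# Counts ordered from hitter friendliest to pitcher friendliest.
ranked = sorted(TOPS_PLUS_AFTER_COUNT, key=TOPS_PLUS_AFTER_COUNT.get, reverse=True)

for count in ranked:
    print(count, TOPS_PLUS_AFTER_COUNT[count], favors(count))
```

Note that the so-called even counts, 1-1 and 2-2, both land on the pitcher's side of 100.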
Starting point is 01:43:53 Just going to Stathead, I looked for hitters who have a higher career on-base percentage when they are behind in the count than they do overall, and I set a career minimum of 50 plate appearances when behind in the count, and there are only six, six since 1988, who have a higher on-base percentage when behind in the count. Two of them are pitchers, and of the other four, the highest number of plate appearances is Bubba Crosby. Bubba Crosby, who played from 2003 to 2006, he had a .255 OBP overall and a .258 OBP when behind in the count. And again, 94 plate appearances when behind in the count for him, 269 total. So there is essentially no one who reaches
Starting point is 01:44:32 base more often when the count is less favorable toward them over any non-small sample. Just to close, after we recorded the interview with Kelly, Meg followed up and asked nVenue if they had any additional comment on Ben Clemens' study. Their statement is, we all know that in sports, player averages can't paint the whole story. nVenue believes in going beyond the average to generate predictions for each and every individual matchup and situation. Our team has run millions of regression data points outside of the 12 baseball games aired during Friday Night Baseball that have been included in this study. Our studies validate that our data is more relevant and accurate than an average. We love talking data, especially around baseball. We look forward to reviewing any studies as we
Starting point is 01:45:14 prepare to release our own in the future. All right, that will do it for today. Thanks, as always, for listening. You can support Effectively Wild on Patreon by going to patreon.com slash effectivelywild. The following five listeners have already signed up and pledged some monthly or yearly amount to help keep the podcast going, help us stay ad-free, and get themselves access to some perks.
Starting point is 01:45:35 Michael Vespi, Daniel Gonzalez-Stewart, Look for Overlap, Eric Schropp, and Nick Holcomb. Thanks to all of you. Patreon perks include access to a patrons-only Discord group, monthly bonus pods hosted by me and Meg, we'll be recording another one of those this weekend, and a couple of playoff live streams later in the year, among other extras.
Starting point is 01:45:55 You can also contact me and Meg via email at podcast@fangraphs.com. You can join our Facebook group at facebook.com slash group slash effectivelywild. You can rate, review, and subscribe to Effectively Wild on iTunes and Spotify and other podcast platforms. You can follow Effectively Wild on Twitter at EWPod. And you can find the Effectively Wild subreddit at r slash Effectively Wild. Thanks to Dylan Higgins for his editing and production assistance. We will be back with a probably newsier episode sometime soon. So talk to you then.
Starting point is 01:46:32 Oh, even headed, yes. You've got in our numbers. Oh. I'm your boss.
