Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 1481: Multisport Sabermetrics Exchange (NASCAR and Cycling)

Starting point is 00:00:00 Hello and welcome to episode 1481 of Effectively Wild, a baseball podcast from FanGraphs presented by our Patreon supporters. I am Ben Lindbergh of The Ringer. This is our first episode of 2020. Happy New Year to everyone. This is also the penultimate episode of the Multisport Sabermetrics Exchange series. If you're just joining us now, this is the sixth episode in the series in which we are talking to experts who can provide a primer on the past, present, and future of advanced analysis in their sport. We have already covered 10 sports, American football, basketball, hockey, cricket, tennis, golf, soccer, rugby, esports, and volleyball. And today we're tackling a couple of racing sports, NASCAR and cycling. Different velocities involved, but some of the same principles and team tactics apply to both. We'll begin with NASCAR. And to talk about that, I am joined by David Smith, who is a NASCAR writer and analyst for The Athletic.

Starting point is 00:01:16 He is also the founder and proprietor of Motorsports Analytics. And he is also the co-host of the Positive Regression Podcast. Hey, David. Welcome. Hello, Ben. How are you? I am doing well. And the first line of your bio at The Athletic says that you spent 13 years traveling the

Starting point is 00:01:34 country as a NASCAR driver talent scout on behalf of three different sports agencies. And now I want to know everything about that. What is a NASCAR driver talent scout? How do you scout NASCAR driver talent? Yeah, so a little bit different in NASCAR. There is no traditional talent draft every year. It falls on to sports agencies that represent young race car drivers to get them noticed and get them placed. There are thousands of drivers across the country and

Starting point is 00:02:05 overseas, and we need to find homes for them. We need to find them just in general, identify who they are. And very early on, 2007, every weekend, seemingly, I was going to smaller locales, just looking at local talent, young drivers. And one of our original signing classes, there were five of us out in the field, so to speak. But one of our original signing classes contained five drivers, two of which went on to win NASCAR championships in Ricky Stenhouse and James Buescher. So from there, just kept digging. But there was a point in time, I was in a Hampton Inn hotel room in Sparta, Kentucky, and had just pressed send on a report.

Starting point is 00:02:55 And I kind of came to the realization that I might not be very good at this. Not that the scouting report was not loaded with good observations, but there wasn't anything empirical that I could point to that could be fact-checked, and there was just something gnawing at me. I felt that that needed to be corrected. So, you know, I had read John Hollinger at ESPN. I had followed baseball fairly closely and noticed the rise of statistics and the analysis and dove headfirst into both of that. There are scoring loops around every major racing facility. So the data is there.

Starting point is 00:03:37 It just had not been parsed at that time. So I turned to that to help me identify talent just out of selfishness to become better at my own job. And yeah, lo and behold, I'm never been less confident of the answer than I am this time. So if you would place NASCAR on a spectrum of ease of analysis where, say, a 1 is a sport that's just impenetrable to analysis, you can't analyze it with numbers, and a 10 is baseball, where would NASCAR be on that spectrum? Actually, I'm going to go high. I'll say it's an eight. Okay. And when you consider what the sport is, I mean, I can't speak for other sports, but I have to imagine auto racing, and that includes NASCAR, is far more amenable to an analytics approach because at the epicenter of the sport are automotive engineers. And if they know two things,

Starting point is 00:04:45 they know cars and they know math. And I would actually pinpoint the rise of the use of analytics, maybe the beginning with the mid nineties when college trained engineers first permeated NASCAR. This used to be, I mean, look, NASCAR was born as something to do for the Southern moonshine runners and bootleggers that had these really souped up cars used to escape the police. They wanted something to do with that. It was a hobby. It then became an industry. And when the engineers came in, of course, they're going to turn to numbers to help facilitate what they do. We've seen the influence of analytics in the amenities being built. Both Toyota and Ford have full-scale car simulators that emulate every track and every

Starting point is 00:05:42 road profile. Stewart-Haas Racing, which is one of the top organizations in the sport, has a rolling road wind tunnel on its campus. And to my knowledge, it's the only one of its kind on the continent. And then more than a decade ago, if teams wanted to create race-winning setups, they would load cars into a hauler and go to a track for a test day. They would burn man hours and daylight. And now, because setup simulation software is so sophisticated, winning race setups are created with a click of a mouse. It was a big deal back in 2016 when NASCAR invited its teams

Starting point is 00:06:21 to participate in an open test session in Miami. It was the track that hosted the championship race. And one of the key contenders that year, it was a team out of Denver, Colorado, opted to not participate because they felt confident in their computer simulations. They didn't win the championship that season, but they did a year later. And that was certainly a win for the new school approach. We've seen engineering grads come in, just be engineers, then get selected to become crew chiefs, which are the team leaders. And now they're getting front office jobs. with Team Penske. He might not have been the first, but it was a landmark occasion when he was named their competition director and he was a Vanderbilt engineering grad. So that was one thing that I was going to ask you, whether there was a team, a driver most

Starting point is 00:07:15 associated with the analytical approach, whether there was sort of a Billy Beane or an Oakland A's of NASCAR. Yeah, I think it's just a collective. I mean, with all of these new brains coming into the sport and understanding, taking what they learned in school and in college and applying it as you would to auto racing teams that had, I mean, at least up until the end of the 2000s, some pretty sky-high budgets. They put those budgets to use on the right things. So we saw technology improve markedly, but we also saw people smart enough and with the right analytical tools to take advantage of that technology. So I mean, when I walk into the sport or walk into a racetrack, I'll talk to a crew

Starting point is 00:08:06 chief or an engineer. And the questions that I'm asking are probably high level. It's not normal racetrack chatter, but they welcome it, they appreciate it. And they're excited to talk shop about that because that's tangential to what they do. So you mentioned that this got started in the mid-90s. What have been some of the developments, the progressions, the major milestones, the increasing availability of data over the past 15, 20, 25 years? I think we saw sort of a nexus point when the economic recession hit, it felt like the majority of the sport was spending more at a time when sponsors participating in the sport were spending less. And as the NASCAR industry had to make the inevitable market correction, the decision

Starting point is 00:09:01 making became more shrewd. So we saw older drivers that in years past, they probably would have driven until their early 50s. We saw them come to the negotiating table, asking, commanding for their $10 to $20 million contracts and not getting them because teams realized that there was younger, cheaper talent kind of ready to go. And even though that was a bit polarizing for the fan base, because fans... Sounds like baseball these days. Yeah, yeah, exactly. And fans flock to different drivers. So there may have been some misunderstanding among fans, but that is what teams were doing. They were making these shrewd decisions

Starting point is 00:09:45 based on, look, I mean, they had the data in front of them and they realized there wasn't that much of a drop-off from a grizzled vet to a newcomer. And they were able to then save money on driver salary, put that back into the technology. So in most of the sports we've been talking about, there's always a point where someone starts charting things that are happening on the field or the court or the rink or the pitch, and it's usually just by hand and trying to record as many data points as possible. So when did NASCAR shift over, as I presume it has, to tracking everything in an automated way?

Starting point is 00:10:26 In 2005, I believe it was the first year of loop data, which is when they installed scoring loops at all of the racetracks. That was collecting every jostle, every movement of a car. Ben, when you think about it, if you've ever watched a race on television, you're seeing two cars on your television screen at once. You're missing, I don't know, 95% of the action that takes place. So the ability to capture the action that we couldn't see with our eyes was huge. And that was what prompted me to just to dive in and understand what exactly was happening that I wasn't even, I was at the races and still unable to capture all of it. These facilities are miles long. It's impossible. That was a key point. And then lately, the advent of GPS data in cars, there's a software called SMT that actually

Starting point is 00:11:21 creates, it captures every car on the racetrack, every movement, and also the telemetry, which is what a driver is doing with the throttle and the braking. And if you're, say, a driver and you're in between practice sessions, you can go back to the SMT data and superimpose your car over a car that's significantly faster and understand what your competition is doing compared to what you're doing. And then you can make the adjustment. I had one driver come to me. He struggled with his restarts this year, and he just asked for his restart data. And he wanted to take that and go confirm it, what he read with the SMT software to see what he was doing wrong and what he could adjust going forward.

Starting point is 00:12:07 That didn't exist 10 years ago, but now we're seeing drivers take a pretty active path to their own personal improvement. And that would not have been possible in the 1990s when nothing was being captured. It was just kind of fly by the seat of the pants, go by, gut feel. That sounds a lot like baseball too. So this is going to be a very basic question, but I don't even drive. I don't have a driver's license, so I don't even know much about driving regular cars, let alone NASCARs. I know that the cars are not named NASCARs. That's an acronym. But what makes a driver good? What are the skills of a driver? And what are maybe some of the ones that are difficult to detect with the naked eye?

Starting point is 00:12:52 Right. So I mean, this is probably the common question, right? I mean, if there's a driver of a passenger car can't really compute what NASCAR drivers do, I suppose. But so there's two things. And I've actually been fortunate enough to compete in just goof-off go-kart races against NASCAR drivers. So I've seen firsthand what they're able to do that I am not. And I can tell you that talent exists in the corners at racetrack. And that is your ability to work the throttle, the brake, potentially use the brake as an offensive tool. Our NASCAR champion, Kyle Busch, this year, his off-throttle brake, on-throttle cadence is so unique, almost herky-jerky. And he's doing it in a car that has a tight handling condition, which means it's very

Starting point is 00:13:45 difficult to turn. With the SMT software I just mentioned, drivers know what he's doing, but they can't duplicate what he's doing because he's just that good and he's able to do things with a car that not a lot of drivers can. The second thing is being able to deliver feedback on what the car is doing to your crew chief and your race team in order to improve it. Jimmie Johnson is a seven-time champion of the sport. He's going to retire after the 2020 season, and he struggled this year. And I talked to him for The Athletic about his struggles, and he mentioned communication. One of the issues he was having is he had a new crew chief wasn't able to determine what the solution was to remedy what was occurring. They made a crew chief change and he discussed the difference was simply back to basics approach. They discussed everything the car did in every part of the

Starting point is 00:15:00 corner and on the straightaways. And that is something that not a lot of young drivers have experience with verbalizing what is happening with a car in order to improve it. It is very difficult. If you think you're driving a normal car and you think that something could be wrong, it's tough to pinpoint. But these drivers have a lot of, and the best are the ones that are able to make changes quicker than others to make their cars much faster. So what are the go-to sabermetric stats of NASCAR, and how do you isolate the performance of the driver from the performance of the crew and the crew chief and the car itself, for that matter? Yeah, I mean, I've played a hand in developing some of these. My keystone stat for Motorsports Analytics is called Production in Equal Equipment Rating. I look at the race result. I'm able to use the timing and scoring data at my

Starting point is 00:15:58 disposal to handicap what the equipment and team strengths are and isolate what the driver is doing. And we're able to come up with a rating. And that actually helped me create an aging curve, which didn't exist before I did it. But yes, it's age 39 is the statistical peak for race car drivers. So baseball players, once they peak at 27 or so, maybe they should just get into NASCAR and they could have a whole second peak. You know what? I mean, look, we're open to anything then. But also I've looked at passing efficiency. I've looked at restart position retention, different things like that. Can you explain restarting? Okay. So when an

Starting point is 00:16:42 accident happens, a caution flag comes out. It pauses the race for the time being. The field is reset. You keep the order, but you lose the distance that you have over the cars behind you. So when you start again, you're coming to the green flag at a set location, and this is called a restart. And recently, NASCAR has implemented stages, which are no different. It's like it's inning breaks, quarters. It's a purposeful pause in order to get TV commercials out of the way. But because of that, restarts have become more prevalent. There are drivers that are able to get to the gas quicker than others, to accelerate

Starting point is 00:17:26 quicker than others, and there are drivers that also have to defend their position because they may not be restarting in what is a statistically preferred location. There are actually two restarting grooves. There's a preferred groove and a non-preferred groove, and it's different at every track. From what I've been able to do, I've been able to identify what those grooves are and how drivers fare within them. So that was something that was not being recorded until I did it. And I still do it by hand and was doing it earlier today for one of the minor league series that I need to catch up on. But there are different peripheral statistics that capture moments that build into the overall race results. And what that gives us is a look at different driving profiles.

Starting point is 00:18:19 It's very stylistic. Some drivers are good in spurts, short runs. Some drivers are good on long runs when their tires wear and they're able to conserve their equipment and make passes. So we're now learning more about driving styles versus drivers that we just think are good and go out and win races. Now we're able to pinpoint why exactly they are winning races. Or if we're going to pick winners of future races, we can determine how might that occur. So when we have a winner take all race at the end of the year in Miami, we can understand

Starting point is 00:18:58 that, well, if a race broke in this manner, it would favor these drivers. If it broke the opposite way, these drivers would stand out. So how affected is the outcome by random, unpredictable events, just the way that baseball games might hinge on how a ball bounces? Is that the same with NASCAR and cars bouncing off each other? It is. So, I mean, the common joke is, you know, the fastest car is going to win a NASCAR race. In reality, the fastest car wins 40% of the time. So there is a whole 60% where it can come down to good fortune or forward thinking decisions, or just figuring out a way to get faster towards the end of the race, even if you didn't have speed at the beginning. There are various pathways other than the ones just a commoner would think exist.

Starting point is 00:19:54 And so that's what we're able to parse through looking at data, just determining there are a multitude of pathways to victory. Some teams aren't going to win without right speed, so they're going to have to take turns with pit strategy and different setups or different tire pressures, what have you, and just in a quest to get a good finish. non-racing sports that you've borrowed that are similar or analogous in any way, whether it's, I don't know, is there a win expectancy equivalent or something that fans of non-racing sports might recognize? Yeah. Well, Chris Mitchell actually did one of those for Motorsports Analytics during the playoffs this year. Chris Mitchell, former FanGraphs writer, former Effectively Wild guest, and he was the one who had FanGraphs' KATOH minor league projection system. And he had to leave FanGraphs because he started working for a team. And as he just announced, he's working for the Twins as a pro scouting analyst, but we lost

Starting point is 00:20:56 him. And that was your gain because now he's doing NASCAR stats too. Yeah. It's like you get him to write about racing, but yeah, there are a few things. I do equate PEER to something like a wins above replacement. But there is something that baseball and most stick and ball sports have that NASCAR does not. And I wish it did. We don't know how much NASCAR drivers get paid. It's not made public. There's no driver's union or spending cap to drive such a discussion. But that for me would be the next frontier of transaction analysis. On my browser, I've bookmarked the FanGraphs free agency tracker, and I enjoy comparing the actual contracts against the crowdsourced estimates. And I listened to your free agency over-unders with Sam. And I thought, wow, that seems like it'd be a lot of fun to do, but we don't have that luxury. So I'm not able to put a price on how much does a win cost. That would be invaluable to the analysis and the story of our sport and could probably highlight some of the smarter people in front offices making these decisions. I wish we had that. We don't, so we're just going to have to make do. So in NASCAR, are there drivers who are consistently or over a long period underrated or overrated because people are looking only at victories in races, let's say,

Starting point is 00:22:26 and because there is this element of randomness, maybe someone is driving really well and he's doing all the things that he should be doing to win races, but things just keep going wrong. And yet you can kind of look at the underlying metrics and say, this driver is very promising or is performing at a level higher than one would think based on the victories alone? Yeah. So, I mean, just for starters, this is very much a team sport disguising itself as an individual sport. So any driver that wins a race, it's because they had something go right that was out of their control, whether it was equipment or something that the team did. So there's already that hanging over a win.

Starting point is 00:23:11 But there was a young driver this year. He did win one race, but his name is Erik Jones. He is a 23-year-old Michigan native. He's in his third year competing in NASCAR's top series. And for a while, it was rumored that he would lose his job with Joe Gibbs Racing, which was the championship-winning team in NASCAR, in favor of one of the top prospects in the sport, who is also a year older than him. And on the surface, this other prospect, his name is Christopher Bell, had a lot of wins in the Xfinity series and was garnering a lot of excitement. But

Starting point is 00:23:51 Erik Jones had moved up from the Xfinity series three years ago. He had already conquered that. And 20 to 24 years of age is kind of the wilderness for race car drivers. There is a ton of inconsistency, and that's the norm. That's what should be expected. But now he's embarking on his age 24 season. His contract is up at the end of 2020. He is wildly talented. Just in terms of Production in Equal Equipment Rating, he's ranked in the top 12 in each of his first three years. And that's pretty impressive considering his age. But not a lot of fans put emphasis on this because they look at his just two wins across three years for the team that put three cars into the championship four in the final race, and he wasn't one of them. So it looks as if he is the black sheep when

Starting point is 00:24:46 in reality he is a star in the making, and I think fans lose sight of the young age, its impact on performance, and what it means down the road, because that's a very talented driver that is being easily dismissed. All right. I'm buying stock in Erik Jones. So do you have any way to tell the relative value of a driver versus other members of the team? In other words, is it better to have a great driver and an average rest of the team? Or is it better to have a great rest of the team and a good driver? Then I would almost break it into tiers. Maybe at the very top tier, if teams are in the top tier,

Starting point is 00:25:31 then the driver is the separator. If we looked at the bottom tiers, then it's going to become the people touching the car, the crew chiefs, the engineers that are able to just gain something out of it. Because at that level, that far back in the field, you can mash the gas pedal to the floorboard. It's not going to go any faster. There's a cap on that. So it will eventually come down to how fast the car is. So I don't know that that's a complete answer to your question, but I think if you broke the field into maybe four tiers, I think you're going to get four different answers. But when it comes to the elite equipment, the top teams, they need the drivers that have the requisite skill level and then are able to

Starting point is 00:26:19 apply feedback to make those fast cars faster. And how many adjustments do you have to make for the track or the race length, let's say? Are there the equivalent of park factors as track factors, the surface or the number of laps, that sort of thing? Not necessarily the race length, but certainly the track size. When I look into pass efficiency, I break it into five different track types based on mileage. And even then, we can pull apart, look at the one-mile tracks, for instance. The track in Phoenix is asphalt. The track in Dover, Delaware is concrete. And if tires wear on those two surfaces, your car drives differently on top of them. So certain track types cater to specific drivers.

Starting point is 00:27:12 There's one up-and-coming driver named Alex Bowman who is, I would say, maybe a series average driver who happens to be exceptional on this one specific type of track. And I find it interesting that in 2020, Phoenix, one of those one mile tracks is going to be the site of the championship race. So we have a playoff in NASCAR. It lasts 10 races with a winner take all championship scenario. And in that sense, it can be as much of a crapshoot as you and Sam complain about come baseball playoff time. You have this long regular season and this very brief playoff to determine a champion. We have that as well. So there is some volatility and it's some of these stylistic differences and adjustments that we have to consider for track types that increase or decrease the win expectancy for some of these drivers. Have there been changes, aesthetically speaking, to racing as a result of analytics? I

Starting point is 00:28:13 mean, is there any strategy that has fallen out of favor or that is ascendant now that is maybe spectator-friendly or unfriendly? I don't know what that would be. It seems like the cars still go really fast. But what, if anything, has changed just from a visual perspective? So it's going to be the passing. I know that NASCAR has changed rule packages. In 2019, they went to a low horsepower, high downforce rule packages in hope that cars would race side by side because people like seeing cars race side by side. But the problem with that is that when two cars are going side by side, they're wasting a lot of time and they are increasing the on-track delta to the car that is directly in front of them. So teams don't view that as conducive to their goal. They instill in their drivers the need to, if you have the pass, make the pass. If not, then don't worry about it and

Starting point is 00:29:17 don't get into that battle because, yeah, we may end up winning the battle. We lose the war. That's probably a bad analogy, but we lose the war in the long run. And that isn't something we need to do. Of course, fans want to see all cars next to each other and racing and side by side and beating and banging, but the teams don't. And they strategize around that. How much of an analytical component is there to pre-race planning when it comes to, gosh, I don't even know enough to speculate about what it would be, but the route that you take or the pace or when you plan to stop? Are there things that are very precisely mapped out before the race begins? And is that largely analytically determined? That is a good question. So I believe that these may be, the teams in NASCAR may be the most prepared going into a race as any team going into a contest in any sport.

Starting point is 00:30:14 They know what they have because they have the simulation tools to expect the pace of their car. They have an idea of when they are going to pit. What they don't know is when those cautions will actually fall and when that will allow them to pit. So they go in prepared, but as the race breaks, maybe in a favor that they weren't expecting, they have to adjust on the fly and they have to consider what their surrounding competition is doing as well, because most of the time they're going to have to bob when their competition is weaving. And that's the only way that they're going to jump positions on the racetrack. Crew chiefs and engineers have figured out ways when cars stop under green flag conditions, which means the race is happening. It's not a caution. It's

Starting point is 00:31:01 going on. They are actually able to gain positions out of that without passing a single car, just timing their stop, perhaps getting on fresher tires where they can produce faster lap times sooner than some of the other cars. And they're making up a ton of ground that way. And we've seen the more forward-thinking crew chiefs do this in recent years. And that's become a commonality of figuring out how to jump positions and also protecting your own position, knowing that the crew chief behind you is probably going to do the same thing to you. So is there a way to compare cross-era racers from earlier eras? Do we have precise enough data on

Starting point is 00:31:42 the cars and how the technology has improved? Have the speeds increased? And is there a way to adjust for that and say that racers as a whole have gotten better at just racing, even aside from the cars? Yeah, that is a very tough question. I was asked to do a comparison of Jimmie Johnson to Richard Petty and Dale Earnhardt, two other seven-time champions, first ballot Hall of Famers. And Richard Petty won 200 races in an era where it wasn't popular to compete in every race. A lot of NASCAR was a very regional-based series, and most drivers cherry-picked where they were going to race. Richard Petty raced everything and won seven championships. And the average number of total competitors when he

Starting point is 00:32:31 won those, full-time competitors when he won those championships was four. And when Jimmie Johnson won his seven championships, his average number was 37. So it's grown from there. I know now that there are more drivers with better everything at their disposal. And look, I might be biased, but I think with the advent of some of this technology, drivers are addressing their weaknesses a lot quicker than they used to. And I have my eye on that aging curve for that reason. I think we might be seeing a statistical prime lengthened. Maybe they're getting to their peak earlier. And even on my podcast, Positive Regression, we posited something as simple as LASIK surgery as a performance enhancer, because as eyesight goes, that's the ability to perceive depth and have peripheral vision and everything that makes you a good race car driver.

Starting point is 00:33:31 It wears off. You're no longer that good race car driver. Well, what if a driver put that in their hands? So I think as we move forward, we're going to see the aging curve change its shape. So I don't know where that stands on the comparison of eras, but I'm excited to see what this does just because I think we're just going to have better competition going forward. And has all of this improved safety? I mean, I assume that the cars themselves are safer, but is there any analytical component to that too? I mean, obviously that's an engineering problem. And so therefore I would imagine that there are numbers involved in that at some point in the process. Yeah. And I don't know that it's an analytical component either, but there has not been a driver death in a top-flight NASCAR series since Dale Earnhardt in 2001. They made, NASCAR made an emphasis on coming out with different generations of race cars

Starting point is 00:34:31 where the driver was more protected with a roll cage and crush panels and safety gear like the head and neck restraint, things like that. There have been scientific discoveries that have kept some severe injuries from taking place. There certainly have been very hard accidents, but fortunately we've seen everyone at least come away alive, walk away in most cases. I don't know how that falls on analytics, but that is a point in the direction of improved technology and that folks in our industry are able to address things much quicker than they did in the early years of NASCAR where driver death was pretty common. Are there any misconceptions that have been overturned by this new wave of analysis,

Starting point is 00:35:27 things that people believed about NASCAR that turn out not really to be true? Yeah, I think, I mean, what's the, I don't know that the Will Ferrell movie helped, but I think the perception of NASCAR is that it is a southern sport and those on the east and west coasts can't really relate to it. But the industry itself is so forward thinking, so engineering centric that, and I live in Charlotte where the majority of race teams are, so I feel like I'm inside the bubble, but there's a ton of innovation happening. It isn't what it is most likely perceived to be. It's full-throated competition, but the innovation is something that creates optimism. So things like what we're doing here, even on the analysis side,

Starting point is 00:36:21 it's one of my goals. It's why I recruit someone like Alan Cavanna from Fox Sports and why I have Chris Mitchell ply his trade. I'm corralling the smart people and trying to create conversations that provoke thought because the sport itself has evolved. The analysis should as well. And my hope is that it's the analysis that brings in these people that maybe didn't know all of this was happening in NASCAR. And is there or has there been a scouts versus stats debate in NASCAR? Because if so, I guess you've been on both sides of it. Yeah, well, I went the other way. I was the old school scout first and then just became a stat Bible thumper, I suppose. But no, I think even with my original group,

Starting point is 00:37:09 the team that I scouted with, we all faced the same problem. We were going to different areas of the country and scouting different genres of racing, but we all wanted data, evidence, more things that helped us understand that we were right. We thought that a lot of these young race car drivers were talented, but we didn't know and we wanted to know. So capturing as much as we can with data has only allowed scouts to be better. I think the correct talent is being identified. I don't believe anyone has really fallen through the cracks. And I produce a top 75 prospects list every year. I feel like I'd catch one if it did. But to this point, no. I mean, I think analytics is something that has been embraced. And not only that, it's built this sport into something that it just wasn't, you know, over 25 years ago.

Starting point is 00:38:15 It's all the better for it. Are there still things that a human can see that would improve an evaluation even added to complete tracking trajectories and all the other data that you have now? I mean, is makeup, is psychology, is that part of a scout's assignment in NASCAR? Because it seems like that would be very important since it's such a high stress activity. Yeah. So when I scouted originally, I just asked a lot of questions. I've always been inquisitive, but one that I always focused in on was just asking questions of people that are around the driver just to, not intellect, but I was always looking for just someone with a general curiosity about everything.

Starting point is 00:39:05 Because if a young driver was interested in what made his car tick and why things break in a race the way that they do and getting to the bottom of those answers, to me that always suggested someone that was going to always focus on that, was always going to study that. And knowing your competition, especially in this sport, seems to be a big separator when it comes to talent perception. Kyle Busch is, look, I mean, maybe the comparison is Trevor Bauer. He's certainly a polarizing personality, but Kyle Busch has made being a student of the sport something of a way of life. The only difference is I think he's tight-lipped about it, Trevor Bauer not so much. But Kyle Busch, because he knew that other drivers were becoming hip to what he's able to do in the corners with his race car, stopped going to the Toyota simulator just because he didn't want anyone to find out.

Starting point is 00:40:14 And that makes sense. That's kind of his intellectual property. And someone close to his team has told me that he elects to do his own homework. And I have prodded as much as I can to write that story and understand what all that entails. I would love to tell it one day if I'm ever allowed. But I mean, that's one guy at the top level. All of these drivers are open to a lot of things. I think it was Martin Truex, who was the 2017 champion earlier this year, said that with the constant flux of rule changes in the sport, a driver can't just rest on his laurels. You have to be open to changing your style to maximize the package that has been given to you.

Starting point is 00:41:03 So you're now also considering those drivers who kind of have this open-mindedness. They're sort of prepared for anything because if you buy into there is only one correct way of doing things in NASCAR, you're not going to last very long because that one way won't always work just because we're changing cars. In 2021, it's already been announced there will be a new car. So drivers will have to get used to that. Is it a juiced car? Are they juicing the car or deadening the car?

Starting point is 00:41:37 Deadening, I would say. It's going to be lower horsepower, and that is in part by design to influence more manufacturers like a Honda, like a Volkswagen, to come in and participate. And when I say participate, I mean pour money into the sport. And for the overall health and longevity of the sport, that is probably the correct thing to do. But with less horsepower comes less of a reliance on traditional talent and drivers have to adjust, figure out, okay, my past talent no longer works. I have to create a new one. And have you done any work in other motorsports, Formula One, anything else? Do you keep up with research in those areas? And are there a lot of commonalities or big differences that you're aware of? I have been assigned by The Athletic to cover some sports car racing in 2020. I look forward

Starting point is 00:42:36 to that. But since launching Motorsports Analytics in 2012, I've only stuck with NASCAR and the lower series underneath it. And I just tell people it's not, IndyCar is fine. Formula One is great. I watch it whenever I'm able, but mentally the bandwidth only goes so far. So NASCAR is this, especially in America, is this giant beast and it kind of takes away all my attention. So what's next? If there's anything that we haven't touched on, new car, but is there any new data on the horizon, any new analytical techniques that you are working on,

Starting point is 00:43:18 anything that people should look forward to as the next frontier of NASCAR analysis? For analysis, I think the implementation of some of this telemetry. I'm eager to do that to better explain driving styles. There's a top driver named Kevin Harvick. He is unique in that, at least in the prior rules package, he rode the brake a lot in the corners and you wouldn't think that that would work in a race car. He's kind of the only one that can make it work. But I would like to be able to explain that in a visual form, in a graphic form. Unfortunately, I haven't been able to do that in the past, but I'm optimistic going forward. And one wish for the horizon is

Starting point is 00:44:02 some kind of biometric data. I'd love to know what these drivers are experiencing health-wise. I'd love to know when they get tired. That dives into some tricky territory with wearables and turning over health data. But that would be very interesting to look at and understand how a driver was impacted, because sometimes you're talking 140 degrees inside of a race car. That's pretty tough to bear. You know, in the summer months, how are our drivers in shape? How does their fitness figure into their result? I would love to study that. So fingers crossed going forward.

Starting point is 00:44:46 And I just noticed as I was browsing through your archives, you did a piece last month on speed rankings in crunch time. So is there clutch in NASCAR? Well, okay. So it's a little bit different than baseball, right? A run in the first inning counts just as much as a run in the ninth inning. But here, the whole point of the sport of auto racing is to be in the highest running position possible at the conclusion of the race. So everything comes forward, all the jostling. I don't know that there are clutch drivers just when it comes to personality, but there are teams that habitually get faster as races progress. I noted that as a strength for Team Penske. The trio, Brad Keselowski, Joey Logano, Ryan Blaney, the thing that they probably do best as an organization is figure out throughout the

Starting point is 00:45:46 course of a race, and this can be three hours, how to make a car go faster on the fly without really debriefing. And that is just not easy. You do not see that. And that's something I do pay close attention to. Kyle Busch crept into the championship race with, I think it was something like the fifth fastest car, but he was the tenth fastest in the fourth quarter of races, suggesting that a major disconnect was occurring. They were getting slower as the race continued. Never got to the bottom of that. They ended up winning the championship. I do think that that's something they're going to have to address before 2020, because if that is a sign of an underlying problem, that's only going to continue. So I like looking at things like that just to determine, you know, what is actually happening in a race that we don't see

Starting point is 00:46:42 and that's a big part. All right. Well, this was fascinating. The whole point of this series, the goal, was to learn something from other sports, and I have just learned a ton talking to you. So you can also learn from David by reading his writing, which appears frequently at The Athletic, and also at Motorsports Analytics, where you can also find Chris Mitchell's work these days,

Starting point is 00:47:06 at least the public-facing part of it. And you can hear David every week on the Positive Regression podcast. So thank you so much. This was extremely enlightening. Thank you, Ben. I appreciate it. All right, let's take a quick break, and we'll be right back with Robbie Ketchel to discuss analytics in cycling. All right, we have reached the final sport in our series, cycling, and our guest is Robbie Ketchel. Robbie has served as the director of sports science for Garmin-Sharp Pro Cycling and later as the data scientist for Britain's Team Sky, which he helped to three Tour de France titles. He brought a number of technological innovations to cycling, including an ultra aerodynamic skin suit, a suite of onboard

Starting point is 00:48:09 sensors called the Bat Box, and a real-time tracking and planning app called Platypus. More recently, he helped plan Eliud Kipchoge's historic sub-two-hour marathon in Vienna in October. Robbie, thank you for joining me. Yeah, thanks for having me. This will be exciting. So let me pose this opening question one last time. On a 1 to 10 scale of ease of analysis, where a 1 would be a sport where data analysis isn't revealing at all, and a 10 is, say, baseball, which lends itself very readily to this sort of analysis, where would you say cycling falls? You know, I truly believe that it falls in the potential of being a 10, but we don't utilize a lot of the information that we're gathering. And it's a unique sport in that a lot of the stuff that is gathered isn't really

Starting point is 00:48:57 quantified from like video and you know visuals and things like we capture stuff on the physiology of the riders and uh power meters and such and so i'd say that we capture stuff on the physiology of the riders and power meters and such. And so I'd say that we're only utilizing in the range of about a six or a seven. And can you give me some brief history of advanced cycling analysis and when teams and riders really started tracking and applying some of this information when it kind of became quantified to the extent that it has been? You know, it's actually an interesting story because in cycling, we care a lot about power data. Pretty much every pro rider has a power meter on their bike.

Starting point is 00:49:36 And so they measure the energy or the power that's produced by the riders as they train and as they race. And you have this interesting aspect of, that's kind of an abstract view of what the rider did when they were out either doing their workouts or in the race, because there's no context to that other than this was their physiological output. And that's where most of the analysis goes into. And so it was probably in the late 1990s where there were a lot of platforms for analyzing that power data and heart rate data and speed data, the elevation gains and losses that you did throughout the day. And it was a time of Garmins and different data collection devices. Other ones, if you're a cyclist, you'd be familiar with an SRM or any of the other power meters. Now we have tons of other brands out there that make these power meters.

Starting point is 00:50:30 And so it was about this time where that started to become more and more popular. And then there were tools like TrainingPeaks or Today's Plan that came out with ways to analyze that. Today's Plan being the more modern version of it. And so different people are developing tools. And so as these tools started to kind of come into the sport, it became more of a data-driven analysis, not just for pro guys, but for amateur athletes as well. And was there any backlash to that? Were there people saying, this is taking the romance out of cycling, it's turning it into a science, anything like that? You know, there always is that no matter what the technology is that's put into the sport,

Starting point is 00:51:11 but you see that in any sport, you know, car racing, any of the Olympic sports, athletics, cycling, baseball, as you know, basketball, football, all that is really data driven and technology driven in terms of the sports science behind it, because everybody's always trying to push the edge and get better. And so I guess it just depends on where you stand on that line of romance, if you will, and what you view of it. But there always will be those that don't like the technological innovations that come into any sport, I believe. But I truly believe that's part of what keeps it exciting.

Starting point is 00:51:53 And in cycling, was there an early adopting team or rider who kind of became associated with this movement early on or was the first to really embrace it fully and derive some great advantage from it? You know, I don't believe so. I have to keep in mind that a lot of this came about even before I started working in pro cycling. And so, you know, I think that there were some early adopters of an SRM power meter. Greg LeMond back in the day was an early adopter. There were others that, you know, as they became available, first started using them for other riders. But nowadays, even when I first started working in the sport at the pro level around 2008, it was very common that everybody had a power meter on their bike.

Starting point is 00:52:34 And even when I raced as an amateur and everything, I had a power meter. And so as far as I can remember in my involvement in the sport, it's always been data driven from some aspect, but it definitely has picked up in terms of how many different versions of those products are now available, how many choices you have on the consumer side, the different coaching methodologies based off of the data that's come out. And so it's definitely taken off in terms of, you know, the information is available. So now people are creating more ways to use it. And can you explain the power meter? How does it work and what does it tell you? Yeah, great. So the power meter is essentially some sensors that exist on your bike somewhere.
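For readers who want to see the arithmetic behind this answer: the relationship Robbie describes, power as pedaling torque multiplied by cadence expressed as angular velocity, can be sketched roughly as below. This is only an illustrative sketch and not how any particular power meter or analysis platform computes its numbers. The function names (`pedal_power_watts`, `best_average_power`), the one-sample-per-second assumption, and the sample values are all hypothetical.

```python
import math


def pedal_power_watts(torque_nm: float, cadence_rpm: float) -> float:
    """Power (W) = torque (N*m) * angular velocity (rad/s).

    Cadence is assumed to be given in revolutions per minute and is
    converted to radians per second before multiplying.
    """
    angular_velocity = cadence_rpm * 2.0 * math.pi / 60.0
    return torque_nm * angular_velocity


def best_average_power(powers: list[float], window: int) -> float:
    """Highest average power over any `window` consecutive samples.

    With one (assumed) sample per second, window=5 gives a best
    5-second sprint and window=1200 a best 20-minute effort.
    """
    if window <= 0 or window > len(powers):
        raise ValueError("window must be between 1 and len(powers)")
    running = sum(powers[:window])
    best = running
    for i in range(window, len(powers)):
        # Slide the window forward by one sample.
        running += powers[i] - powers[i - window]
        best = max(best, running)
    return best / window


if __name__ == "__main__":
    # Hypothetical reading: 21.2 N*m of torque at 90 rpm.
    print(round(pedal_power_watts(21.2, 90.0), 1))
    # Hypothetical per-second power trace in watts.
    trace = [180.0, 220.0, 650.0, 700.0, 300.0, 210.0]
    print(best_average_power(trace, 2))
```

Comparing the best average over the same window length from one workout to the next is one simple way to track the kind of improvement over time that comes up in this conversation.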

Starting point is 00:53:21 They can be in the crank arm of the bike. They could be in the pedal. Some have put these sensors in the rear wheel, in the hub. And so essentially what they do is they measure the torque that you produce as you're pedaling, and if you know the cadence, which is your angular velocity of the pedals, then you can multiply those two to get the power output. And so when you have the power output, you know how much power the rider is putting into the pedals. So if you do a 20-minute all-out effort and you had an average power of 200 watts, you could compare that effort to another training bout where you did a similar effort and try to improve it over time. Or you can look

Starting point is 00:54:04 at, you know, sprint efforts over five seconds, 10 seconds. You can look at the different efforts that you do if you have a line plot of your power throughout, say, a three or four hour race. You can look at those different efforts and kind of get down to the nitty gritty detail of how you performed at different parts of your workout or your event. And what else about riders tends to be tracked on a high-level team? Because I know you're getting complete telemetry of the rider's path, and then you're also getting various sensors about maybe the rider's posture

Starting point is 00:54:37 and position of the bike and his or her kind of physiological condition at the time. So what does a high-level data analyst for a pro riding team know about its riders, whether in training or in the actual event? Well, I think what it comes down to is you start to develop a profile for that rider. And so over time, you look at what their training load is in a given week, in a given day, compare that over the last four weeks, and you have the ability to analyze an acute training load versus a chronic one or a longer term training load. And you could use time training, you could use the average power output, you could have all kinds of other methods for determining the number of efforts that they've

Starting point is 00:55:25 done throughout those workouts. And you combine all these analytics together to kind of have this, how you have a snapshot of what their progression is as they're training more and more compared to what their load is. And so you're always trying to balance this, the amount of fatigue versus the amount of freshness and the ability to perform at their best level. And so in pro cycling, it gets complicated because you have 30 riders on a team. And there are this year 170 days of racing. And so you're mixing and matching riders that have different ability levels compared to events that have different skill requirements in order to perform well there. If you have a

Starting point is 00:56:07 flat one-day race, you need somebody that can produce a lot of power, maybe a little bit bigger, but that same person is not going to perform well if you put them in the Alps during the Tour de France. And so you're always moving around these riders to support what the team's goals are, what the individual rider's goals are, And you have this kind of this orthogonal problem of trying to solve for getting your individual performances plus your team overall performance and managing when riders are injured, when they're fatigued, when they're sick, when they're performing the best, trying to kind of mix and match to move things around constantly. And so it just becomes a rolling window of monitoring those profiles and snapshots of the riders and how they're performing at any given time, and also trying to help them improve

Starting point is 00:57:00 so that they can become better at what they're trying to achieve. And just as you did with Kipchoge more recently, you really made some innovations when it came to air resistance, and then there are innovations in the bikes themselves. So is all of that regulated? Do you have to keep everything within a certain specification? Or is it, well, anything goes just, you know, lighter, faster, more aerodynamic, the better? And how do you, I guess, develop the expertise to derive those small gains from the equipment? You're always trying to improve and it's such a hard world to make those improvements on too because if you're talking about making the aerodynamic improvements, well, a lot of that really is determined by the individual rider. And so if you're trying to develop a skin suit for an entire team, you have to consider, are we just going to focus in on

Starting point is 00:58:05 this narrow window of key riders that you want to make those improvements to? And if you try to improve for one person, for one event, in one position, you can always find those ways to make things better. But then as soon as you start to generalize it across your entire team for different events, team time trial, individual time trial, you find that you're constantly trying to make something good for different events. And so I think it really depends on the specificity of it. And that's why with Eliud and the sub-two-hour marathon attempt, there was a lot of specificity to that, right? We were doing one event, one time. We chose the day and time, so it allows you to fine-tune all those specific details. I like to, you know, say with cycling

Starting point is 00:58:54 you're kind of on this evolving battlefield where, you know, you're trying to adapt to these different situations. For weather, you can't control the weather. You try to predict what it is, but if you're trying to predict from this far out so that you can make a skin suit and different things good for that, you know, you've got a different problem to solve for. And so, yeah, it's always a problem of generalizing for a bigger group versus trying to make

Starting point is 00:59:19 a very specific technological improvement with like a skin suit or, you know, an aeroformation. You're just always trying to tweak things to make it better. But the one thing that I wanted to mention, because you say, how do you keep on making those small gains? Well, cycling is also unique in that the riders, the team, and the courses all change every year. And so, for example, the Tour de France this year is not the same course as it was last year. And so the equipment that you want for that is different than you would want from last year. So you always have to change to make things perform better for what are this year's conditions

Starting point is 00:59:56 while trying to look in the future and say, okay, if we think that these are different types of scenarios that we could have next year, what could we start thinking about to make those improvements? So as you're preparing for a race with a team, how are you doing that exactly? How does technology enter into that? Because I know that you can train on simulators that will mirror the conditions that you expect the riders to encounter, and you can plan out the routes very precisely and who's going to ride which stage and so forth. So how accurate is all of that in advance?

Starting point is 01:00:34 And just how, I guess, precisely plotted down to the last detail is it before the race begins? So you're always trying to analyze what are the demands of the event. And like I said, the demands change every year. You never know. So when you're, again, looking back at Kipchoge's performance in Vienna, that was trying to reach a specific time, an individual trying to go under two hours. When you go to a bike race, you just need to beat your competitors, right, in order to win. And so those are two different problems to solve for. And so the demands of the event also include what are the levels of your competitors that are showing up. And you also need to mix and match which riders you're putting at which races so that you can

Starting point is 01:01:22 protect your ability in order to perform well at races later in the year. And so you're moving these riders around, and then at the same time, you're trying to analyze these evolving demands of the event that include the weather, the road surface, the elevation changes, et cetera. And so a lot of reconning goes into it, where nothing ever replaces being there, actually being present and scouting it with your own eyes. But there are a lot of online tools and things that you can use to gather data, to have a really good understanding, because you're not going to go to all 170 courses around the world and recon at the same level. And so you're choosing which ones you really want to take a deep dive

Starting point is 01:02:05 into and understand the really tiny details. I see. And is data publicly available on all teams and all riders, or do you only have access to your own team's data? You only have access to your own team's data in terms of all that physiological data, power data, et cetera. You have kind of similar to baseball in the sense that you have some stats on how riders perform at different races historically. You don't have things like, you know, a rider did seven attacks, you know, or the rider was in the breakaway. So that's kind of the missing link that I always think would be really powerful if we could solve that problem. You can go back and look at video, but again, you have a lot of video to watch

Starting point is 01:02:49 in order to gather all that data. And we don't have that in this sport. We don't have that problem solved for. And so I think things like that are just a little bit different in the sense of having access to other data, but because you race so much, the team directors that are out there in the cars with the other riders, they also know a lot of the other riders in the race, not just on our team, and they have a really good understanding of how people are performing. And because they see them, they see them dropping back to the cars, they see them in breakaways, you know, they see them with their own eyes. And so that information is kind of captured from just being present at the races.

Starting point is 01:03:33 And can you give me any examples from your time with Garmin or Team Sky where some analytical effort that you made helped shape the direction of the race or some decision came down to data and was beneficial in the end? Yep. So a lot of time we have looked at the team time trials and tried to understand the best formation that we could put the riders in. You know, you look at all the different iterations of that. And so team time trial is a nice thing because it's super complicated, but it's also controllable in the sense that you don't have to deal with the other team tactics. And so being able to look at the orientation of the riders, the aerodynamic performance of putting them in different formations. You don't have to

Starting point is 01:04:19 finish with all riders, so if you start with eight, most of the time you have to finish with five. And so you can lose riders at certain points in the race. Who are they? At what point makes the most sense for them to do it? What is the pacing strategy to perfect their performances? So team time trial is a big one. This year, there aren't a lot of team time trials. There isn't one in the Tour de France. And so that's not a big thing for us. But another area that becomes big is really knowing and forecasting the weather conditions prior to the start of a race. Because one of the big things during a stage race is if you have these really high wind conditions. If you have high wind conditions, but you have high coverage with trees and buildings and things around you, it really doesn't matter that much. But if all of a sudden you have this open field and there's a strong crosswind and your team is not at the front and other teams are at the front, they're going to know that and they're going to step on the gas and put in a big effort to try to separate you from the front of the peloton. And so those kind of conditions

Starting point is 01:05:21 have always been critical. And being able to kind of have that, you know, you think about 21 days in a Grand Tour, how many times those critical moments come up. And so you're always kind of crunching with the weather to have the most accurate information. And has all of this analysis changed the look of a typical race in a way that I guess either has improved it or maybe made it less entertaining from a spectator perspective? I mean, in terms of just what a race looks like, if you were watching one now, as opposed to 20 or 30 years ago, is there different pacing? Is there just a different approach to a long race, let's say, and has that made an aesthetic difference to the

Starting point is 01:06:07 sport? Yeah, I think so. If you look at how the different teams interact even with each other, you collaborate at different points in the race. Teams are thinking about, okay, these are my competitors, but at this point, we have a common interest. And so there's a lot of that happening with team tactics. You'll notice that the strategies are pretty similar to how they were in the past, but they might be a little more optimized. At a Grand Tour, we care about really optimizing the energy demands of the entire team over three weeks of racing. And so what that means is on given days, you're going to have guys that are going to ride easier because their overall time doesn't

Starting point is 01:06:45 matter. It's just the leader of the team that they're trying to help. And so they have a lot of, you know, you think about three weeks of racing, but some guys have easier days and some days, no day is an easy day, but they may not have as many roles to play. And so just the amount of, the amount of thought that goes into all of that to kind of optimize the performance, I think is big. But at the same time, I think the training, monitoring the training is what's really changed. We just have, we have the ability, if you think about where data analytics has come in a lot of sports and different things, you kind of have this massive improvement in data warehousing, right?

Starting point is 01:07:31 Being able to collect the data and put it in one central repository so that you can at least go and visually look at it is a big step and a big leap forward. Whether we're doing advanced analytics on that is up for debate, but there's at least the appetite for now, what can we do with the information? Because it's all been stored in a way that we can do something meaningful with it. I see. And is that something that fans ever complain about? The fact that maybe there's an interest in pacing and looking at the whole three weeks. And so the rider you're watching at any given moment is not necessarily going all out at that moment. I mean, is that something that's perceived to have made racing any less entertaining or not really? Actually, I think it's the opposite. It actually helps a fan understand it. Because I

Starting point is 01:08:12 think the methodologies were always still there, even though the data may not have been available to really analyze it and crunch it. But now that a lot of amateur athletes, amateur cyclists are also collecting this type of data, they have a better understanding of how to approach the strategy in a bike race. And so they can get more involved. And, you know, you ask somebody, how can you sit in front of a TV and watch four hours of racing during the Tour de France on any day? Well, if you understand it at that level, you can. And have the analytical movements in any other sports helped spur cycling's analytical

Starting point is 01:08:47 movement? Is there any parallel to either other racing sports or any sports at all that has transferred over to cycling or even just the way that it's been embraced in other sports maybe has made people in cycling sit up and take notice? Yeah, I think that there's kind of uniquely now, I don't know if you're aware, but INEOS has partnered with the Mercedes F1 team. And they also own an America's Cup sailing team. And the three teams, including the cycling team, are all partnering together because there's a lot of similarity in terms of the approach, the data science, the human performance, everything that exists there because they're all about racing. And when they're all about racing, you're really trying to perfect your speed, right, and the overall strategy of how you execute on that speed. And so there are a lot of things that you think about

Starting point is 01:09:49 in Formula One, way more advanced in terms of the amount of data they collect, how they utilize it, rapid prototyping of technology, et cetera, that we can learn from. And then similar in sailing, you know, they're a lot about building the technology to optimize the performance. And so I think there are, you know, there's some unique things about those sports and probably many others out there that kind of share a similar desire to make use of data and science to optimize human performance.

Starting point is 01:10:27 And how much faster has all of this made riders? And is it difficult to isolate the components of that improvement, whether it's the technology, the improvements in equipment, the training, the recovery, or just the pacing and the planning? I mean, how much better and faster are riders now than they were before data became a big part of the sport? And what do you think is responsible for the greatest gains? That's a really great question. I don't know the answer and how much faster, but I really want to say that I think that riders are more durable, and the durability is really a result of being able to analyze the training load over time and get a better understanding of that rider and

Starting point is 01:11:05 how they're doing. Because what has changed with having these repositories of data in a way to kind of track your team of riders is that you've also improved on the communication with all of those riders. You're kind of in constant contact with them. And I think when you're in constant contact, you can better help them, because you're not always going to be in the same location as them. They're going to be at a race. You're not at that race. You're helping another rider, a different race, you know? And so being able to have that connection so that you can better monitor them. And so what I mean by durability is not getting sick as much, not getting injured as much, things like that.

Starting point is 01:11:52 And because of that, I think riders' performances improve. And again, I don't know if the speed is what the metric should be, more of how many races you win. And so I think you see teams that are really getting good at it are more consistent at winning. And I mean, you think about how many times INEOS has won the Tour de France. That's because the planning and the preparation and the amount of details that goes into that. And so I think consistency maybe is another way to say it. And how great a role does randomness play? I mean, if you are the best true talent team, how often are you going to actually win the race? Or is it going to be derailed fairly regularly by collisions or accidents or injuries or unpredictable conditions, that sort of thing? Yeah, the unpredictability is what makes all sports so exciting to watch, right?

Starting point is 01:12:40 And so that element is always going to exist. But what you're trying to do is you're always trying to mitigate the prevalence of something hurting your ability to perform well. And so you're constantly trying to think, okay, how can we eliminate the natural talent of the rider, obviously that remains very important, but has the receptiveness to data and information and the willingness to really well and all that, but isn't interested in training as scientifically as someone else is or, you know, planning, drafting or pacing the way that someone else might be. Is it now more important, relatively speaking, to have a rider who, you know, reaches some baseline of performance, obviously, but is more receptive to this information than in the past? Yeah, it's interesting because you would actually think that having this information available would help somebody that may not be as talented, as physically talented, bridge that gap to those that are and maybe wouldn't be as interested in utilizing the

Starting point is 01:14:04 science behind it because they're already so good. But it's actually just the opposite, I think, in some circumstances, because the sport's so unique in that when somebody is really talented, they may not have to develop other skills that help them perform well. For example, having the craft to know when to tuck in behind all their other competitors and draft. If you're super talented, you can expend the extra energy to be out in a certain area. And so I think that sometimes it's actually helped those talented riders be even better because they realize, oh, I didn't think that because I didn't have to, and now I can become even better. And so there's, but you can look at it the other way

Starting point is 01:14:53 too. And it's also with the next generation of riders. As younger riders come into the sport, they're more accepting of technology. You know, they've got all the cell phones and the latest laptop computers and different things that they're used to growing up with. Because of that, I think that helps bring that information to them because they're already kind of used to interacting with things like that. And so at the beginning, you mentioned that cycling has the potential to be a 10 on that scale of analysis and applying that information, but that it isn't there yet. So what is holding it back and what's the future? I think it's just that we're learning, that we are continuing to understand how to get there.

Starting point is 01:15:39 It's just the sport is really, really complicated. You're never going to be perfect in hitting a 10. There's always going to be something that you can do a little bit better. And so I think that we'll gradually get further ahead. But then that 10 is going to move away from us, right? Because more information from a different area is going to come in that we're not utilizing it. So you're always chasing that 10. And so I think it's just a matter of getting as close as you can and continuing to

Starting point is 01:16:05 improve any way that you can. So I also wanted to ask because performance enhancing substances have been a problem in baseball and in every athletic endeavor and cycling is no exception where doping scandals happen every now and then. In baseball, it's very difficult to apply data and statistics to identifying what a player might be doing because it doesn't produce a clear measurable output that you can isolate and say, oh, this guy started doing this thing at this time. array of outputs and power ratings and all of that. Does that give people a tool to try to pinpoint when someone might be doing something that they shouldn't be? You know, I have to say that I'm very fortunate to have come into the sport at a time where this isn't really a consideration for somebody as young as me. And so I'm not involved in any of that. The sport's in kind of a different place and a really good place. And so I think that, yeah, I'm fortunate to not have, not be able to

Starting point is 01:17:10 answer that question for you. Okay. And so I guess lastly, when you were helping out with the sub two hour marathon, what was the biggest analytical component of that? What were you able to do to help shave seconds off at that time? You know, my boss, Dave Brailsford, said to me constantly, your only job is to continually find every second. And so I got really obsessed with that. And it was just optimizing. It was giving Eliud the best opportunity to run as fast as he can, optimizing everything about the course and about the day and the time that he was going to start and do it. It was everything from the surface

Starting point is 01:17:49 to the elevation changes on the course, to the turn radius, to the weather, to the wind, picking the right location, time of year, time of day, all of that. And it all just accumulated half a second here, half a second there. And so it was just an optimization of bringing everything together

Starting point is 01:18:05 and a really cram-packed timeline of five months. Right. And I know that, you know, you believe very strongly in sort of a lack of limits when it comes to human performance. So do you believe that in cycling and track, in all of these endeavors, it's just going to be a continual improvement? Will we ever hit a wall when it comes to how good people can be, how fast people can run or pedal? Yeah, right. And I think no, because we've shown over how many hundreds of years that we continue to get better, right? And we get smarter, we prepare better, we give people opportunities. And so we continue to build on all that information and performance. And so I think that you can just always keep on pushing that limit. I know that you have trained for marathons yourself and you've ridden. So are

Starting point is 01:18:58 you able to apply some of these same insights to your own performance, or do you need to be an ultra-elite performer for some of the things that you have discovered to really be of use? Yeah, it's funny that you say that, because I run the New York City Marathon for a charity for my son every year. And the first year I did it, which was last year, I was running to the left and the right side of the road to cheer on everybody that was super special running, whether they were blind or they had some other challenge that they're overcoming. And that was special for me that I was doing that. But then after going through the INEOS 1:59, I realized how important it was to not run too far, because there's an optimal path on that course. And so when I ran it this year, I was like, I need to run the straight line between

Starting point is 01:19:49 the corners, which is hard to do in the crowds and everything. But yeah, I always learn things that may not specifically apply to me, but there's always something there in some context. I think, oh wow, I could do that a little bit differently. Yeah. Well, that reminds me of one more thing I meant to ask, which was, have there been any major misconceptions that have been overturned in any of the fields you've worked in, whether it's cycling or track, things that people used to believe and used to teach that the use of data and technology has exposed as maybe not being the best practice?

Starting point is 01:20:26 I think that it definitely has changed the way that we train. I know that what you're asking is kind of a big thing where I think that there are strategies that we use in team time trial and things that we wouldn't have thought of before. And, you know, a little simple things. There's no big thing that comes to mind that we think back and we're like, why did we do it that way? It's so obvious now. I think it's just a continuation of all these little details that, yeah, nothing really big comes to mind. All right. Well, this has been really fascinating. You can find Robbie on Twitter

Starting point is 01:21:01 at his name, Robbie Ketchel. And thank you very much for your time. Yeah, thanks for having me. All right. Thanks to David and Robbie for sharing their knowledge. We have one more episode in the Multisport Sabermetrics Exchange series to go. That'll be up next time. I have a few follow-ups here. First, on our previous episode, my usual co-hosts, Sam Miller and Meg Rowley and I were talking about famous home runs in history that were not walk-offs, and we couldn't come up with that many aside from Hal Smith's in the 1960 World Series. Well, Anthony Sheff, our Patreon supporter, wrote in to suggest the very obvious one, Bucky Dent's home run in the 1978 AL East

Starting point is 01:21:35 tiebreaker game. Not a walk-off, but of course an iconic one, the Bucky bleeping Dent blow, one of the more famous home runs in history. Also in the baseball news department, since our last episode, the Twins signed Homer Bailey and Rich Hill. We've been wondering when the Twins would get around to adding some starting pitching, so they've done it. I don't know who this leaves for the Angels now, but that's a topic for another time. Rich Hill is a folk hero of this podcast, and so it's notable that he signed somewhere,

Starting point is 01:22:00 and extra notable that he signed with the Minnesota Twins, who also employ another hero of this podcast, Willians Astudillo, which means that it's conceivable that Rich Hill could throw to Willians Astudillo in a major league game at some point this season. Won't necessarily happen because Hill has to come back from injury, Astudillo has to stick on the roster, he has to be playing catcher on a day that Hill is pitching, so a lot of things have to come together for that memorable moment to occur. But at the very least, they are teammates right now, and that's pretty exciting for fans of this podcast. And now a couple of addenda to the cricket segment of this series, which it seems like a lot of you enjoyed, as I did. First thing is that someone suggested that

Starting point is 01:22:39 I mention the Duckworth-Lewis-Stern method in cricket, because this is the rare example of a statistical formula becoming part of the rules of a sport. I'm only going to summarize this because my understanding of it is imperfect and because the Wikipedia page goes on for thousands and thousands of words, but essentially the Duckworth-Lewis-Stern method, formerly known as the Duckworth-Lewis method, until Professor Stephen Stern came along. This was a method devised by two English statisticians, Frank Duckworth and Tony Lewis, to resolve rain-affected cricket matches. So there were multiple methods in the past to try to figure out what to do if a cricket game was suspended. And just reading from the Wikipedia page here, the Duckworth-Lewis-Stern

Starting point is 01:23:22 method is a mathematical formulation designed to calculate the target score for the team batting second in a limited overs cricket match interrupted by weather or other circumstances. When overs are lost, setting an adjusted target for the team batting second is not as simple as reducing the run target proportionally to the loss in overs because a team with 10 wickets in hand and 25 overs to bat can play more aggressively than if they had 10 wickets and a full 50 overs, for example, and can consequently achieve a higher run rate. The DLS method is an attempt to set a statistically fair target for the second team's innings, which is the same difficulty as the original target.
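The proportional adjustment described in this part of the episode can be sketched in a few lines. This is only an illustration, not the official DLS implementation: the function names are made up for this example, the resource percentages are hypothetical inputs (the real method reads them from published tables based on overs remaining and wickets lost), and only the case where the second team has fewer resources than the first is handled.

```python
import math


def naive_target(team1_score: int, overs_available: int, full_overs: int = 50) -> int:
    """Scale the target by overs alone (the approach the passage says is too simple)."""
    return math.floor(team1_score * overs_available / full_overs) + 1


def resource_target(team1_score: int, team1_resources: float, team2_resources: float) -> int:
    """Scale the target by the ratio of combined resources (overs and wickets).

    Resources are percentages in (0, 100]. Assumes team 2 has no more
    resources than team 1; the other case uses a different rule and is
    deliberately left out of this sketch.
    """
    if not 0 < team2_resources <= team1_resources <= 100:
        raise ValueError("expected 0 < team2_resources <= team1_resources <= 100")
    par_score = team1_score * team2_resources / team1_resources
    return math.floor(par_score) + 1  # one more than par is needed to win


# Hypothetical scenario: team 1 scored 250 in a full 50 overs (100% resources).
# Team 2 is cut to 25 overs with all 10 wickets in hand, which we ASSUME is
# worth 66.5% of a full innings (an illustrative figure, not a quoted one).
print(naive_target(250, 25))
print(resource_target(250, 100.0, 66.5))
```

Because a side with all its wickets and only 25 overs can bat more aggressively, the resource-based target comes out higher than the one scaled by overs alone.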

Starting point is 01:23:57 The basic principle is that each team in a limited overs match has two resources available with which to score runs, overs to play, and wickets remaining, and the target is adjusted proportionally to the change in the combination of these two resources. I know that may not have been easy to follow if you are not a cricket connoisseur, but this method came about, and I'm continuing to read here, as a result of the outcome to the semi-final in the 1992 World Cup between England and South Africa, where the most productive overs method was used, and the problem with that method was that it didn't take account of wickets lost by the team batting second,

Starting point is 01:24:29 and so it penalized the team batting second for good bowling by ignoring their best overs in setting the revised target. So in this 1992 semi-final, rain stopped play for 12 minutes, with South Africa needing 22 runs from 13 balls. The revised target left South Africa needing 21 runs from one ball, a reduction of only one run compared to a reduction of two overs, and a virtually impossible target given that the maximum score from one ball is generally six runs. And so these two statisticians realized that this was a problem, and according to the Duckworth-Lewis method, in that match, the revised target would have left South Africa needing 4 to tie or 5 to win from the final ball. This was first used in international cricket on January 1st, 1997,

Starting point is 01:25:09 and then it was formally adopted in 1999 as the standard method of calculating target scores in rain-shortened one-day matches. You can read up on that further if you care to, but kind of cool that math and analysis was used to actually improve the sport and deal with a flaw that had prevailed until that time. Lastly, in the outro to that cricket episode, I mentioned that cricket had just introduced a bat speed measurement to its broadcasts, which it was calling the smash factor. I was lamenting that this didn't exist on baseball broadcasts because we have no public measurement of bat speed. And I was alerted by listener Lucas Apostoleris of Baseball Prospectus, who was watching an old ESPN broadcast from 2000, that in addition to the pitch speed,

Starting point is 01:25:50 the broadcast displayed the bat speed. I will link to that on the show page if you're interested in seeing it. And after doing some digging, I turned up something I had completely forgotten about if I had ever known, which was the ESPN Bat Track, which was a tool that was developed by Sportvision, the same company that developed PITCHf/x and the glowing puck in hockey and the first-down line in football. And Bat Track, for I guess a couple years on ESPN broadcasts, would actually measure and display the bat speed on the swing. And I really couldn't find much information about this at all, so I kept googling and I found the name of someone who had worked on it, Dan Ryazansky, and I emailed him to

Starting point is 01:26:24 ask about it. And this is what he said. Wow, this brings back memories. It was my first job out of college. I was working for Sportvision in New York City. Basically, on my first day in the job, this is back in 2000, they dumped spreadsheets on me and said, This is pitch and bat speed data from the Mets and Braves from their real games. See if you can make something of it.

Starting point is 01:26:42 If I remember correctly, BatTrack was multiple radars designed to precisely measure pitch and bat speed. I remember this is 8 to 10 years before PITCHf/x debuted. He continues, I believe they had radars in various places in the ballpark to get the number correct, as opposed to a single radar behind the plate. I wasn't involved in the hardware side of it at all. I think that was done out of Sportvision's California office. So I start organizing the data, load it into a database, and create a program to statistically analyze it. Basically trying to figure out if there's a correlation between pitch and bat speed and the outcome of the hit. I don't remember the pitch speed results, but the bat speed was obvious immediately. There was absolutely no correlation at all.
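As a rough idea of what a correlation check like the one described might look like, here is a minimal sketch that computes a Pearson correlation coefficient by hand. It is not the program described in the email, which is not available; the function name and the sample numbers are invented purely for illustration and are not real BatTrack measurements.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up example values (NOT real data): bat speed in mph and a numeric
# outcome such as hit distance in feet for the same swings.
bat_speed_mph = [68.0, 74.0, 71.0, 80.0, 77.0, 69.0]
hit_distance_ft = [310.0, 180.0, 350.0, 220.0, 400.0, 150.0]

r = pearson_r(bat_speed_mph, hit_distance_ft)
print(f"r = {r:.2f}")  # values near 0 suggest little linear relationship
```

A coefficient near zero only rules out a linear relationship in the sample at hand; it says nothing about whether the underlying measurements were accurate, which is the question raised later in this exchange.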

Starting point is 01:27:18 The graph was splattered all over the place. It was obvious to even a novice like myself that bat speed was a completely useless statistic. I showed the bosses the data, they saw the uselessness, and that was it for me and BatTrack. Moved on to other projects and then quit for a better paying job soon after. This was about June of 2000, so no idea what happened to BatTrack since then, but I think it was retired from ESPN broadcast shortly after. And I replied to Dan and I said, I wonder if it just wasn't measuring what it was supposed to be measuring, or maybe there was another way to do the analysis that would have shown the analytical value of this bat speed measurement, because you'd think that if it was accurately tracking this, it would have been somewhat useful. You could find ways to dissect that data and have

Starting point is 01:27:58 it be pretty telling. And Dan said, that's what I was thinking when I was writing the initial response. What if there was some other way to look at that data other than what I and whoever else followed me did? I remember charting pitch speed versus bat speed and then trying to analyze if bat speed led to home runs, but perhaps if there was data about the length of the hit, regardless of its outcome, or correlating pitch speed, bat speed, and swinging strikes. I do remember that even for specific players, bat speed was all over the place, and when I looked up the big hitters of the era, they did not stand out from the rest. It just seemed completely random. My guess is that the teams that participated did not see the extra benefit and bowed out. I did see

Starting point is 01:28:33 somewhere else online that Mark McGwire was tracked at 99 miles per hour, and that that may have been the fastest at the time. So now we're in this era with swing sensors, and those clearly have some analytical value. So hopefully this will come back to baseball broadcasts at some point. But I was wrong in thinking that this was something cricket did first. Baseball broadcasts did have it, just not for very long. You can support the podcast on Patreon by going to patreon.com slash effectively wild. The following five listeners have already signed up and pledged some small monthly amount to help keep the podcast going and get themselves access to some perks: Lance Daniel Hepper Colleen Barr Gordon Kristin Nick Sievers and Tim Wolf. Thanks to all of you. You can join our Facebook group at facebook.com slash group slash effectively wild. You can rate, review, and subscribe to Effectively Wild on iTunes and other podcast platforms. Keep

Starting point is 01:29:21 your questions and comments for me and Sam and Meg coming via email at podcast@fangraphs.com or via the Patreon messaging system if you are a supporter. We'll probably do an email episode next week. Thanks to Dylan Higgins for his editing assistance, and we'll be back with one more episode this week. Talk to you soon. He is faster than everyone Quicker than the flame in a man Like a flash, you could miss him, no fun No one knows quite how he does it, but it's true They say he's a master at going faster No one knows quite how he does it But it's true they say

Starting point is 01:30:07 He's a master going Faster
