Scuffed | USMNT, World Cup, Yanks Abroad, futbol in America - #457: Sports-Reference founder Sean Forman joins the pod

Episode Date: December 13, 2023

Sean Forman, who founded Baseball-Reference in 2000 and launched fbref.com in 2018, joins Belz to discuss the origin story of the company that runs several of the world's highest-traffic sports stat websites, including the one that has made soccer fandom richer in the past half-decade. Lots to discuss, we covered a lot of ground. Second half of the episode is available to patrons of Scuffed. See link below for details. Subscribe to Scuffed on Patreon! You get exclusive episodes one or two times a week, plus access to the Discord and live call-in shows, by signing up for as little as $2 a month: https://www.patreon.com/scuffed Skip the ads! Subscribe to Scuffed on Patreon and get all episodes ad-free, plus any bonus episodes. Patrons at $5 a month or more also get access to Clip Notes, a video of key moments on the field we discuss on the show, plus all patrons get access to our private Discord server, live call-in shows, and the full catalog of historic recaps we've made: https://www.patreon.com/scuffed Also, check out Boots on the Ground, our USWNT-focused spinoff podcast headed up by Tara and Vince. They are cooking over there, you can listen here: https://boots-on-the-ground.simplecast.com And check out our MERCH, baby. We have better stuff than you might think: https://www.scuffedhq.com/store Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Transcript
Starting point is 00:00:03 Welcome to the Scuffed podcast, where we talk about U.S. soccer. Our guest today is the guy who started the company that runs several of the world's highest-traffic sports stat websites and launched FBref.com, the site most germane to our podcast, in 2018. He originally started Baseball-Reference in 2000, loading data from CD-ROMs onto the internet. He's an Iowa native. He's been a patron of Scuffed since 2019, so as long as pretty much anyone, and he runs Sports Reference from a building at a church in Philadelphia. Sean Forman, welcome to Scuffed. Thank you for having me,
Starting point is 00:00:47 y'all. It's great to be here. First of all, I just want to thank you for your work. I asked the people in the Discord for questions, and several of the questions were just thank you, Sean, for launching FBref,
Starting point is 00:00:59 making it so much more fun to be a soccer fan. Well, I'm very lucky that people like a lot of the same things that I like, so I get to do these things and people respond positively to them. So it's really good for my ego. It's a lot of fun to do this. So I really enjoy it. Good, good. Now, you grew up in Western Iowa, right? Can you tell us about your upbringing and how you got into sports and math? Sure. So I grew up in a very small town in Western Iowa called Manning, Iowa. It's about 1,500 people. I had 33 in my high school class when I graduated. My dad was the
Starting point is 00:01:37 football coach in town. And so he would bring his stats books home on Friday nights after the games. And being a third or fourth grader, I would pore over them, start calculating how many rushing yards different people had. And I always really enjoyed math and was good at math. And so those two things kind of came together, you know, in my interest. So I was the guy at the fantasy draft who had the crazy spreadsheet, you know, with all the calculations in it and auto-updating and stuff like that.
Starting point is 00:02:10 So it's always been an interest of mine. And yeah, so I played sports, you know, went to a small, it was a small town. We didn't have a soccer team. I did play through like third grade before we moved to Iowa and then soccer kind of just went away as part of my background. But, but yeah, sports have always been, you know, pretty central to my life. So you, so were you actually born in Nebraska, moved to Iowa? Well, I was actually born in South Dakota.
Starting point is 00:02:39 So I was born outside. I was born in Sioux Falls, South Dakota. And then my dad was a assistant football coach in Valley, Nebraska, through like 1980. And then got the head coaching job in Manning. And he actually retired there, 2018, 17, something like that. So the football field is actually named after him in town. He's in the Iowa High School Hall of Fame. and won a state title and many, many trips to the playoffs and that.
Starting point is 00:03:11 So he was a very successful football coach in Western Iowa. So he stayed there for 30-plus years. And that's really where I consider myself to be from. Okay. I see. And that sort of helps explain why your family is Nebraska fans and not Iowa fans. Yes. Yeah.
Starting point is 00:03:30 Yeah. So we grew up in Western Iowa. We subscribed to the Omaha paper, and my dad was a huge Tom Osborne fan and kind of, you know, modeled his coaching career on Tom Osborne a little bit. So I grew up, you know, listening to KFAB out of Omaha, you know, every Saturday and, you know, living and dying with the Cornhuskers. And so I don't follow them anywhere near as closely now as I did, you know, through the 80s and 90s, but, you know, still occasionally we'll check in. My brother actually works for the Purdue men's basketball team. So our family is probably more nominally Purdue fans now, thanks to him at this point.
Starting point is 00:04:12 Well, that was the golden age for the Cornhuskers, 80s and 90s. It's always fun when you're beating everybody 63 to 7, you know. And, you know, that's easy, easy to be a fan of a team like that. Yeah, poor Iowa State, you know, poor Greg Velasquez. Exactly. So enough about that. Can you give us a basic update on the company? How many employees now? What has been taking up most of your time? What is the next big thing? So Baseball-Reference launched in 2000, as you mentioned. And so it grew very slowly through the years. I didn't even do it full time until 2006. And so in 2007, we formed the company with basketball, baseball, and football, American football, as our three core sites.
Starting point is 00:05:02 And then we started growing from there. And it was still very slow. And then we really started picking up steam like 2018, 2019. And so now we're up to 33 employees. And it's been a pretty gradual growth curve since then. We operate seven sites. We also have college football, college basketball, hockey. We did have an Olympic site for a while, which we shut down.
Starting point is 00:05:25 And then we have a subscription feature, Stathead, which we actually just launched for FBref, for soccer, in the past month. And so those are kind of our main core offerings. And then this summer we bought the game Immaculate Grid and also launched Immaculate Footy, which is kind of a new direction for us, but, you know, fits really well with what we do generally. So we've been growing in those directions and probably we'll be doing a little bit of hiring next year as well, I think. So that brings up a question.
Starting point is 00:06:00 Claire in Texas asks, are you really good at Immaculate Grid? I bet you are. I am not. Kind of my joke is that I created the site so I don't have to remember anything. And so I just know where to look for it. And so occasionally I'll have some good days. I'm not playing every day now at this point, but baseball is my sport on Immaculate Grid. If I try basketball or hockey, I'm terrible, and football and basketball are a little bit more of a struggle.
Starting point is 00:06:30 But baseball, I can, you know, depending how risky I want to make it, I, you know, usually can get nine out of nine. It's how much I'm going for rarity scores that kind of dictates whether I make mistakes or not. So, yeah, we have some pretty big sickos in the office who can name like, you know, 1920s players and stuff like that. So you get some pretty crazy rarity scores that people are boasting. Nice. You told the New York Times, I think it was in 2017 or 2018, I can't remember exactly when, that you expected Basketball-Reference to overtake Baseball-Reference or you thought maybe it would. Has that happened?
Starting point is 00:07:12 So it was kind of trending in that direction. And then Immaculate Grid really kind of kicked baseball in the pants a little bit. And just because a lot of the use, so after people play the game, they're often looking up who all the potential answers would be. And baseball is like half of the play for Immaculate Grid. So that's driven a huge amount of traffic. So that hasn't actually happened yet. Baseball has actually kind of reestablished itself as the lead site.
Starting point is 00:07:43 So basketball is very popular. We get probably the most worldwide traffic for basketball at this point, but, yeah, it's not quite catching up to baseball anytime soon. Okay. It was like a billion page views, which really impressed the pastor of your church in that New York Times article. Is it like 2 billion now or what is it? We actually did crack 2 billion this year thanks to the game. So, yeah, we actually raised banners in our office. So we have, there's a company in New England that we, so we have.
Starting point is 00:08:16 So we have a couple of banners raised for different milestones and stuff like that to celebrate things. That's cool. We updated the banner this year for 2 billion served. So we're lucky we hit them, sure. As the Reverend said, 2 billion is a lot. I know 2 billion is a lot. Yeah.
Starting point is 00:08:40 Yeah, we're probably averaging, we're definitely averaging over a million people a day across all of the sites. And so it's, you know, and the nice thing about the sports is that they're pretty counter-cyclical. So when baseball is out of season, you know, football and basketball are in season. So it's pretty consistent throughout the year. Let me ask another, well, going back to 2000 when you started, when you started this, I believe you were working on your dissertation at the University of Iowa in applied math and computational science.
Starting point is 00:09:18 Why did you start doing, like entering data from CDs onto a computer? Like what got you going on that? I mean, I, so I was, you know, like I said, I was one of these people who like created my own projection system for fantasy baseball. And I started, I was into sabermetrics at the time. And so there were like these books like, I was spending a lot of time. Most of my free time was spent on rec.sport.baseball, which was an old Usenet group, which was kind of the Discord server of the late 90s and early 2000s. And so really, I've been, you know, most of my adult life, I've been arguing with people about different things online in one form or another, whether that's a web forum, a Discord server, or a Usenet group.
Starting point is 00:10:07 And so I, you know, I got connected with like-minded people, sort of like, you know, the Scuffed Discord. And so, you know, and I was like, you know, I can probably figure out a better way to identify prospects than how people are doing it now. And so I created my own prospect evaluation system. I ended up scraping the old stats billboard service, bulletin board service, which was like, I don't even know what. I mean, it was like a terminal that I had to log into and like go through all the pages and capture the screen contents and stuff and then loaded them into Excel, cleaned it up. And so I was publishing these reports online. It was called the Iowa Farm Report, which I thought was a pretty clever name. And I still own that domain name in case
Starting point is 00:10:52 anyone wants to start like a futures newsletter or something like that. But I, so I started this minor league rating system. I was publishing it on Usenet and then somebody who wrote a book, which was kind of the successor to the Bill James Abstract, approached me. I started writing for that book. And I really created Baseball-Reference mostly to scratch my own itch, but I also thought it would promote the book somewhat. And it kind of, so I launched it in February, so I was avoiding my dissertation, doing this project.
Starting point is 00:11:23 I was living in Macon, Georgia. My wife was a teacher at Mercer University in her first year. And so I was, you know, this was like the internet bubble as well, so a lot of froth around and I was reading every article I could on the internet and the future of the web and things like that and how it changed the world and all that. So I was really motivated. And basically over two or three months created Baseball-Reference, you know, while avoiding my dissertation. And my wife was working long hours. And so I was kind of at loose ends.
Starting point is 00:11:58 So it launched and it immediately got a lot of buzz. It wasn't like hugely popular, but it was doing well. I was spending all my nights and weekends on it. I got a job here in Philly at St. Joe's, was teaching. And eventually you had finished your dissertation at this point? I did. It took me longer than it should have. I actually taught at St. Joe's for a year and a half before finishing it, which I think
Starting point is 00:12:25 my chair called me into the office and said, if you don't finish this dissertation, we're going to cut your pay. And so it motivated me pretty significantly to get my PhD. So I did eventually finish my PhD. I actually got tenure at St. Joe's and then took a leave to do the site full time for a year just to see how it would work out. And it worked well.
Starting point is 00:12:48 And so I quit my job and, you know, kind of the site's been going. You know, basically it's been just one day after another building on what we have since then to, you know, to get to where we are now. Was it, this is sort of a non-sport question, but was it difficult to go from being like a single person
Starting point is 00:13:11 doing all this work that was definitely like a passion of yours to now managing 30 people, 33 people who help you do it? Like how did you make that transition asking for a friend? Yeah.
Starting point is 00:13:32 It's, you know, it's an interesting process. And I've always been very, you know, like I have a collection of probably 80 to 100 business books in my library here that I've read over the last, you know, 20 years. So it's always been something I've always been motivated to do. I know a lot of people like I just want to do software development. I just want to do product development. I don't want to have to worry about the business, which is I completely understand. But like, I don't know. I've always been the type of person when I get a challenge.
Starting point is 00:14:03 When some hurdle comes up, I like do all the reading necessary that I can to try and clear that hurdle and understand the problem and solve it and find the best approach, whether that's, you know, managing people and dealing with conflict in the office or, you know, teaching myself accounting so that I can learn, you know, enter things into QuickBooks and all that and understand. I still don't, can't keep debits and credits straight in my head. I know enough to know like what a balance sheet is, what a PNL is, and double entry accounting and stuff like that. But, you know, and so it's really just been a matter of like, you know, when a challenge comes up, just deciding I'm going to, you know, having the motivation to do it.
Starting point is 00:14:44 And so sometimes I do miss, you know, programming. I do occasionally get to program a few things and stuff. But we're at a point now. You know, we've got an engineering team of like 16 people. And so things are moving beyond like my understanding. You know, I haven't done much programming for the last five or six years since we created FBref. I did program more or less a lot of the initial FBref stuff. But after that, it's been delegated to other people.
Starting point is 00:15:15 And so we're probably at a point where I wouldn't be able to do a lot of the programming that I was doing, you know, five, you know, 10, 15 years ago. So it's, yeah, I miss that a little bit sometimes. I definitely do. But it's, you know, as long as you can hire good people, I think it's fun to work with them and see them grow and do new things. We have a, you know, we have a really great team.
Starting point is 00:15:40 And so that part of that is a lot of fun as well. Okay. Another question from Claire. Do you have a favorite stat? Actually, a couple of classes. Yeah. So I am, so the one stat, that I'll point to on the soccer side is we, I, I, I, so, you know, on our other sports,
Starting point is 00:16:03 we're well known, we're not like bleeding edge for analytics and coverage, but we try to be pretty, you know, far along the continuum. And so we wanted to do the same thing for FBref, probably even more bleeding edge on FBref than we are on the other sports that we cover. But I always felt like people who created shots were undervalued in soccer. You know, because really the only stats we had for that were assists and then key passes and things like that, which are passes that lead to a shot being taken. And so, like, Kylian Mbappé can do, like, four guys and then make a pass across the
Starting point is 00:16:43 face of goal and, you know, and then, you know, the striker just, you know, taps it in from two yards. And the striker gets most of the credit. And Mbappé does get an assist, but I felt like that was undervaluing what those players did. So we created our own stat called shot-creating actions or goal-creating actions, which are the two on-ball, the two actions that lead to a shot or a goal being taken. And so, you know, so if Pulisic receives a ball, dribbles a guy, and then cuts, you know, passes it to Weah, who taps it in, he gets two shot-creating actions. So it's the last two
Starting point is 00:17:27 actions before the shot or before the goal. And so I think that captures much better who's contributing to those actions. We include stuff like, you know, tackles, if you take the ball off a defender and that leads to a goal. So I think Brenden Aaronson's only goal-creating action with Leeds was when he tackled Mendy, you know, facing Chelsea, and, you know, passed to a teammate, or maybe, I can't remember, he scored himself, but he would get a shot-creating action for that as well. And so, you know, being fouled is another example. Like if you're fouled in a dangerous area, you know, that's a valuable action, and so that can lead to a shot or a
Starting point is 00:18:14 goal-creating action. So we try to track all of those things. I've seen it, and occasionally I've heard it on the Bundesliga broadcast. I've heard it a few other places. So it's starting to catch on a little bit, I think. And I know Bob Morocco likes it and stuff. So it's, you know, I think it has some value. And I think it's getting a little bit of traction. Cool.
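Editor's note: the shot-creating-actions idea described above (credit the last two actions before a shot) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not FBref's actual implementation: the `Action` record, the action names, and the `shot_creating_actions` helper are all hypothetical, and the event sequence is made up to mirror the Pulisic-to-Weah example from the conversation.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """One on-ball event in a possession (hypothetical record layout)."""
    player: str
    kind: str  # e.g. "dribble", "pass", "tackle", "fouled", "shot"


def shot_creating_actions(sequence, window=2):
    """Credit the last `window` actions before each shot in the sequence.

    Simplified sketch: every action immediately preceding a shot counts,
    whatever its kind, and credits are tallied per player.
    """
    credits = {}
    for i, action in enumerate(sequence):
        if action.kind != "shot":
            continue
        # Look back at most `window` actions before the shot.
        for prior in sequence[max(0, i - window):i]:
            credits[prior.player] = credits.get(prior.player, 0) + 1
    return credits


# Illustrative possession: a dribble, then a pass, then a tap-in.
possession = [
    Action("Pulisic", "dribble"),
    Action("Pulisic", "pass"),
    Action("Weah", "shot"),
]
print(shot_creating_actions(possession))  # {'Pulisic': 2}
```

A goal-creating action would be counted the same way, restricted to shots that result in goals.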
Starting point is 00:18:38 I have tons more questions about soccer stats. But let's do a few other sort of more general things. You allow gambling banner ads, but you do not have. have any affiliate relationships with gambling companies. Can you explain this and why, you know, I know you do actually explain it quite well on the website, but could you explain it in your own words here? Yeah. So, you know, I think one of our strong, one of our core strengths is that we're well aligned
Starting point is 00:19:07 with our users. And so, you know, we really want to, you know, we want our users to be successful in whatever actions they're trying to take on our site. And so gambling affiliate ads, the way, you know, so if you go to almost any, many sports sites now, they have relationships with a sports book. And the way those work is that they promote the sports book, users go and sign up for the sports book, and then whatever losses those users incur at the sports books, the websites get a portion of that, get a significant portion, you know, over the course of maybe,
Starting point is 00:19:45 you know, two, three years even. So our team just did not feel, we didn't feel great about like, that we're going to make a lot of money when you lose a lot of money. And it just felt like a misalignment of our interests with the users. The other thing is you kind of have to like give yourself over to the sports books for those relationships to even be worth doing. And really kind of pepper the whole site with that sort of stuff. And we also, you know, thought, you know, just from a strategy standpoint, everybody's doing this. there's a lot of exhaustion around it. You know, it probably would be, you know,
Starting point is 00:20:21 there's probably an opportunity to zag while everybody else is going in this other direction as well. So it was a combination of all of those things. I do gamble a little bit on, you know, on game outcomes or end of season championships and stuff like that. But so it's not like anti-gambling. It was more like wanting to be aligned with what our users' successes were. So we don't accept the affiliate deals.
Starting point is 00:20:51 Our banner advertising is like bid out to the highest bidder generally through an ad company that we deal with. So we don't have a lot of insight into exactly who's advertising at all times. So we do accept gambling advertising in that regard, just like we'd accept, you know, automakers or, you know, maybe another consumer product type of advertising. But we're not getting a cut of any losses you have or anything like that on those sites. I'm all about zagging or zigging when everybody else zags or whatever.
Starting point is 00:21:29 However the statement goes. I think that works too. As long as you're doing the opposite of the other person, I think that can be a useful approach. Bilal in New York, or actually New Jersey, I guess, asks which reference is actually your favorite? Well, I mean, you know, Baseball-Reference is my baby. So that's, you know, I'm probably spending more time on FBref in terms of like I'm actually kind of, I'm nominally the product manager for FBref. And so spending a little more time thinking about that, trying to get that off the ground and grow that. But yeah, I mean, I definitely probably use,
Starting point is 00:22:11 I'd probably use Baseball-Reference and FBref about the same. Okay. And going back to that New York Times article again, I think you told them that advertising was like 95% of revenue back then. And you were hoping, and I'm guessing, I think I understand that Stathead is a push in this direction, hoping to make more of the pie of revenue come from subscriptions. How's that going? It's going all right. I mean, Stathead is growing. I mean, actually our advertising business has been growing as well, so it's still, it's probably close to 85, 90% advertising at this point. Just, you know, it's not the worst thing in the world, but, you know, you are a little bit susceptible to, you know, economic cycles and
Starting point is 00:23:02 things like that as well. Right, a recession would hit us. I mean, you guys would feel a recession because people may pull back on subscriptions and things like that, but subscriptions I think do a little bit better in those negative economic cycles than like an advertising-based business would. So we're still working on that. So yeah, Stathead is our subscription tool. It really allows you to like interact directly with our database. So we provide tools we call finders, so users can come and they can set up searches. So you say I want to order this. So for example, we just launched Stathead FBref.
Starting point is 00:23:42 It's free through the end of January. And so anyone who wants to sign up for an account can come in and do as many searches as they want through the end of January, then it will be around $9 a month, a little bit, you know, give you a discount if you sign up for a year. But it really allows you to go into our database and search for a specific item. So we don't have national team play as part of this yet. So you can't search for which US men's national team game did they have the highest xG or, you know, who had the most shots against Mexico in a U.S. match. It's mostly league play at this point, but you can do things like, you know, show me all of the, you know, under-20 Americans to score goals in a big five league in the last 10 years. You can look at, you know, which right backs have the most minutes in a top-tier
Starting point is 00:24:38 league, you know, across the top 14 leagues we cover, which, you know, everything from the Eredivisie to, you know, to Ligue 1 to, you know, to MLS. We have advanced stats for all of it, for like 14, 15 leagues on the men's side. We have advanced stats for eight leagues on the women's side as well. And so you can really, you know, one of my concerns about the national team is we don't have enough progressive passers in our lineups. And so, you know, you
Starting point is 00:25:23 can look for, you know, which midfielders have the most progressive passes per 90, you know, over the last three years. And, you know, who are eligible for the U.S. national team. So you can add a nationality filter on there and look for Americans. And so that's, you know, an aspect of the site that certainly I think would be relevant, you know, to this audience at least. We don't currently have a date filter, so you can't look at like last 60 days or something like that if you wanted, but we're probably going to add that here pretty soon, I'm guessing, in the next month or two. So, you know, if users have suggestions, always open to suggestions. You know, you can reach out to me either on the Discord or anywhere, and I'm always happy to take user feedback on things that would be more useful.
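Editor's note: the kind of search described above (midfielders ranked by progressive passes per 90, with a nationality filter) can be sketched as follows. This is only an illustration of the arithmetic and filtering, assuming a made-up table of season totals; the field names, the `top_progressive_passers` helper, the 900-minute floor, and every number in `players` are hypothetical placeholders, not Stathead's interface or real data.

```python
def per_90(total, minutes):
    """Scale a season total to a per-90-minutes rate."""
    return total * 90 / minutes


# Placeholder rows, invented purely for illustration.
players = [
    {"name": "Player A", "nation": "USA", "pos": "MF", "minutes": 1800, "prog_passes": 120},
    {"name": "Player B", "nation": "USA", "pos": "MF", "minutes": 2700, "prog_passes": 150},
    {"name": "Player C", "nation": "ENG", "pos": "MF", "minutes": 1800, "prog_passes": 200},
    {"name": "Player D", "nation": "USA", "pos": "MF", "minutes": 300, "prog_passes": 40},
]


def top_progressive_passers(rows, nation, pos="MF", min_minutes=900):
    """Filter by nationality, position, and a minutes floor, then rank by rate."""
    eligible = [
        r for r in rows
        if r["nation"] == nation and r["pos"] == pos and r["minutes"] >= min_minutes
    ]
    return sorted(
        eligible,
        key=lambda r: per_90(r["prog_passes"], r["minutes"]),
        reverse=True,
    )


for row in top_progressive_passers(players, nation="USA"):
    print(row["name"], per_90(row["prog_passes"], row["minutes"]))
# Player A 6.0
# Player B 5.0
```

The minutes floor is there because per-90 rates from very small samples (Player D here) can look misleadingly high.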
Starting point is 00:25:55 So it's, again, it's completely free to try. Go to stathead.com and look for the FBref banner that we have on there. All you have to do is give us an email and sign up. You know, give it a whirl. Yeah. Well, somebody was asking what they could do to support your work, and maybe that's one thing. That would be one thing. I would just say just citing us and telling other people about us is, you know, it's kind of
Starting point is 00:26:19 the other thing. So it's, you know, links are always good for us as well. So that's another valuable thing people can do for us. Links are better than screenshots? Well, I would, yes, links are always better than screenshots, but if you screenshot it, feel free to add a link below it, which is always useful. Okay.
Starting point is 00:26:44 It's interesting because like we launched a soccer site five years ago, like you mentioned. It's interesting to me the degree to which soccer users are constantly like repackaging the data and like creating scatter plots, creating, you know, like here's like xG versus xA, you know, and then plotting all the players on there and pointing out Messi's, you know, up here in the upper right-hand corner, or Sophia Smith is up here in the upper right-hand corner of these charts, which is not something that you really see as much on basketball or baseball or hockey, you know, I'd say. Yeah, that's been a significant difference in how the users have interacted with the site, you know, the repackaging of some of the data or creating these little cards that show
Starting point is 00:27:33 you know, Weah had, you know, 56 touches and was 45 of 48 passing and created two shots and, you know, stuff like that. I, you know, I think historically people would just like, here, look at Weah's stats for this game, you know, and provide a link to that. But it's a little bit different. The content creation on soccer is a little bit different than what we've seen in the other sports. Are you a fan of the scatter plots as a way to communicate information? Scatter plots are valuable.
Starting point is 00:28:01 Yeah. Yeah, I'm not as big of a. We have the scouting reports obviously that get a lot of play. And we take a lot of stick for that among some people who, you know, I've already been accused of ruining baseball with stats. So you know, you say we're ruining soccer as well. So I'm not as big of a fan of like the radar plots that a lot of people like, which, you know, I understand what the audience wants is probably more important than what I want.
Starting point is 00:28:25 But so yeah, there are some visualizations I don't like as much. But yeah, scatter plots are certainly a valid visualization tool. I would just plead with all scatter plot makers to be clear about what the X and the Y axis are actually showing. But that's just my little axe to grind. D.K. Nash in Nashville asks, probably a long time ago now, you used to be able to individually sponsor pages on the
Starting point is 00:28:54 Sports Reference pages. Was it just not worth it from a revenue or content moderation perspective? And how much would it theoretically cost for the Discord, the Scuffed Discord, to sponsor, say, Patrick Bamford? So, yeah, so way back when I first launched the site, before we even had advertising on the site, I hit upon the idea of sponsoring individual pages. And basically, and this was started probably 2002, 2003, somewhere in there. And so that was actually like our main revenue driver for like, it was kind of an NPR
Starting point is 00:29:28 model is how I fashioned it in my mind. So it was the main way we supported the site while I was doing it part-time. And so the problem was that as the site, so we tried to tie it to what like a standard advertising rate would be. So the issue became that as the site became more popular and was getting more traffic, to sponsor Derek Jeter would cost you like $3,000 a year or something like that, at which point the number of people willing to sponsor Derek Jeter for $3,000 is very small. And so, you know, we were kind of pricing out the people who were sponsoring Brad Radke because he and his wife went to their first game and saw Brad Radke pitch and they wanted, you know, a nice little note to his wife on the page that he could show her, you know, on their anniversary and stuff like that. So in order for it to be a viable economic model, it was getting misaligned with the size the site was getting. And so there are still some out there, like I know Joel Embiid's page,
Starting point is 00:30:40 anybody that had an active sponsorship when we sunset it, we actually like set it up so that it would appear on the page in perpetuity. And so the only one I remember off the top of my head, I know the Joel Embiid one still has a sponsorship very low down on the page. And we moved them down on the page and stuff. But I'm not sure what Bamford's traffic would be, you know, it would probably be in the hundreds of dollars, I'm guessing, something like that. I'm sure the other issue with the Bamford sponsorship
Starting point is 00:31:13 is that we generally only allow like more positive messaging on the sponsorship. And so I suspect he's not planning for a positive message or maybe it's just support, you know, Patrick, keep trying. You'll eventually learn how to finish it. It'll all come good. That's probably the message you. I'm sure that's what it is.
Starting point is 00:31:33 Yeah. So here's kind of a three-part question, all focused on sort of competitors, people also in the stats space. I think take them a little bit tongue-in-cheek, but I'll read them all and then you can maybe sort of take them all together. Andre asks, how do you feel about Nate Silver? Bilal asks, do you
Starting point is 00:31:55 have beef with the guy who started FanGraphs. And I spit hot fire asks, how does it feel to have people like Jeremy Frank, @MLBRandomStats, leveraging your sites into full-blown successful sports data careers? Has that happened often? And do you see similar people coming up in the soccer space right now? Yeah. So those are all people I have a lot of respect for. And so, you know, Nate Silver, yeah, I've had the opportunity to talk to a few times, always very gracious. You know, his outfit was one of the better ones
Starting point is 00:32:27 about linking to us when they cited us, when they used us for articles and things like that. But I certainly have no issue with them. And then David Appelman at FanGraphs I know a little bit. We talked and brought a couple things this past year around the use of WAR in arbitration hearings and stuff like that. So I would say I try to have, you know, reasonably cordial relationships with, you know, even sites that are competitors. We, you know, we often have similar issues and, you know, so I'm not saying we're trading notes regularly, but, you know, occasionally we, you know, we bump into each other at
Starting point is 00:33:08 things. And, you know, we talk about how it's going and stuff like that. So, regarding Jeremy Frank, Jeremy was actually an intern at Sports Reference three or four years ago, I think. So, you know, Jeremy was great. We really enjoyed having him with the team for the summer. And so, you know, if people are using us, again, as long as we get cited every once in a while, I'm very happy for people. You know, that's why we're here, for people to use us and find information. So certainly no concern, you know, with doing that. We have a pretty robust internship program, which I probably should mention.
Starting point is 00:33:43 You probably have some students who listen to the podcast. So we hire six or seven interns a summer. We pay $20, $22 an hour, and it's a full-time job over the summer. We have people in the data space, people in marketing, people in engineering. We're going to have a product intern and kind of an operations analytics intern, not sports analytics, but more business analytics intern this summer as well. And so, you know, you work on real problems that, you know, a lot of the initial data we used on the site was cleaned up and ID'd by interns
Starting point is 00:34:22 one summer that we had, which was probably the most grinding, dull internship we've ever had, just because it was going through like thousands of lines of soccer data to, you know, add IDs to players. But generally we have some more interesting projects and, you know, stuff that actually makes it on to the site and stuff like that. So we're actually going to be advertising those probably at the end of this month, early next month, and then we usually fill them in January or February. And the nice thing about them is we, so the interns are, obviously they're doing work, but we also invite like outside speakers to come in and talk to them. So we've got a couple people.
Starting point is 00:35:02 We've done this long enough that we now have kind of alumni who have actually got jobs working in teams and they've, some of them have been able to come back and talk to our interns about what it's like working for teams and, you know, the assistant GM for the Mets is a former intern of ours. And so he came back last year and talked to people and we've had people from the WNBA office and things like that come in. So it's been a really good program for us. And so, you know, obviously if, you know, if anybody listening out there would be interested
Starting point is 00:35:34 in an internship, we have, it's pretty competitive. We get lots and lots of applicants, but, you know, certainly encourage them to apply. So Jeremy was one of those. On the soccer side, by the way, you've done an excellent job of remembering the entire question here. Yeah, so who is the soccer side? Yeah. So on the soccer side, I'm probably not in tune enough to actually answer that question real well at this point.
Starting point is 00:36:04 I mean, in terms of analytics, like I, not to promote other podcasts, but like the Double Pivot podcast I listen to quite regularly and they talk a lot about analytics and numbers. Football Tactics Podcast would be another one that I listen to. In terms of reading, I'm not doing as much, let's see, I'm trying to, yeah, I don't have any real good answers for that as well. I know there are sites out there that are promoting. I'm not on Instagram a whole lot, and I know a lot of them are on Instagram and so, you know, a little bit. I don't know, there's an LCD Soundsystem song about losing my edge. So sometimes I feel like that's an appropriate song for me because I know, you know,
Starting point is 00:36:47 at my age, I'm 52, I'm going to be 52 soon. So I know I'm probably not going to be up on the most relevant things at this point. So hopefully we have people here who are keeping their eyes on that. Yeah, yeah, okay. You said people have said that you are ruining baseball. And maybe some people will say you're ruining soccer eventually too. But, you know, I saw on LinkedIn, I'll just say where I saw it, a guy named Bill Chuck, a baseball writer, said, quote, there are very few individuals who will ultimately affect the history
Starting point is 00:37:20 of the game like Sean has. He's talking about baseball, of course. Is that true? Was that what you set out to do? What do you think? Yeah, there was no grand scheme behind all this. Yeah, Bill's very gracious and I, you know, certainly appreciate what he's said in the past at different times. I mean, like I said, it was mostly scratching my own itch and wanting to know things myself and wanting to be able to find things myself. And I, you know, I do think we've always been very focused on, I think one of the skills we bring to this is that we can imagine, you know, one, we're sports fans ourselves, but two, we can
Starting point is 00:38:07 imagine what others are thinking when they're looking for things. And so, you know, and then not being satisfied with kind of halfway satisfying that, you know, that question or that interest that they have. And so, you know, trying to make things as easy to use, trying to make things reliable, trying to build them well, you know, that's always been something that's been really important to us. And, you know, it's kind of fed on itself. We, you know, we've had the good fortune. You know, so if any of your users out there are sitting on like a trove of soccer data, please call me because I'm happy to, you know, discuss like incorporating it into our site.
Starting point is 00:38:51 But we, you know, we've been very fortunate that we, I think we've had a good reputation. And so people who, you know, people seek us out and are willing to like deal with us. And so there are a lot of individuals who, just as a passion, you know, we have an individual who has collected like, you know, I don't know how, like probably 15, 16,000 seasons worth of college basketball data from like the 50s to the 90s, just transcribing information off college websites and old books and things like that, that he's been willing to work with us on. And so our college basketball site is dramatically better because this person produced this work, and we all benefit from it and we're able to increase the site. We, you know, a year or two ago, we launched, we added, so sacks became an official NFL stat in 1982, but there were a couple gentlemen who would go to NFL Films, they would watch old game film, they would track how many sacks, you know, they would read newspaper accounts,
Starting point is 00:39:53 they would look through media guides, you know, all these things. And we knew how many times the quarterbacks had been sacked, but we didn't know who had done the sacking. So they like did a giant reconciliation and gave us sack data back to the 1960s, which, you know, we were happy to have. We put it on the site. And it just kind of blew up. We didn't think that, you know, we didn't even do like a big promotional aspect of this. But like the level of interest that it generated was just massive. And so there was this.
Starting point is 00:40:23 When did this old sack data go live? Like, did you say two, three years ago? Sometime during the pandemic. So it was sometime after 2020, maybe 2021. And so we added that and then, you know, what happened? Basically, Bubba Baker, who was a player in the, I'm going to screw up the year, but like in the 1960s, this sack data said he was the all-time single-season sack leader. And it was pretty buttoned down.
Starting point is 00:40:52 I mean, they knew what they were doing. And, you know, I thought, well, that's really cool. That probably means a lot to them. But then like this, some people were like reaching out to interview him. And he was like breaking down and crying during the interviews because people had finally recognized his contributions to the game and stuff. And it was really beautiful. And, you know, not something that you really imagine you're going to have an impact doing. And so, you know, so that's been great.
Starting point is 00:41:17 I mean, the other project that's very similar to that is, you know, we launched Negro League baseball stats in 2021, the summer of 2021. You know, Major League Baseball, you know, we had kind of come to that conclusion ourselves, but Major League Baseball also said, you know, the Negro Leagues should be considered major leagues. And so, you know, kind of one of the biggest projects we'd ever done was integrating the Negro League stats into the white major league stats, the AL and the NL. And so, you know, we kind of take these issues very seriously and like data democratization is a big value of our company. And I think that shows up on the soccer side because we're like really the only
Starting point is 00:41:58 source for a lot of women's soccer data that's available anywhere on the web. And so, you know, we actually did a big, big project to get Women's World Cup data available in 2019. So we have a record of every game in the history of the Women's World Cup on our site. All the players, we have stats for all of them. You know, as far as I know, I think we may be the only licensee of Opta's women's soccer data, across, you know, eight top leagues we have, you know, full advanced stats from the Frauen-Bundesliga, NWSL, Women's World Cup, Women's Euros, you know, La Liga Femenina, you know, Italian, you know, French,
Starting point is 00:42:45 etc. So, you know, it's important to us. You know, those pages are not getting as much traffic right now as the men's side, but I think, you know, we're hopeful that over the next five to ten years those are going to grow and we're going to be there, you know, allowing people to do the same sorts of studies on the women's side that they can do on the men's side and have the same sorts of insights. And so, you know, that's something that's also, you know, been very important to us. So, you know, I would encourage people. If you're a WoSo fan, I absolutely believe we're by far the best source for women's soccer
Starting point is 00:43:20 data that's available. What's the ratio of, like, you said the traffic's not as high as the men's side. Is it like 10 to 1, 5 to 1? It's probably more like, you know, 15 to 1 at this point. It's definitely slanted. But, you know, we saw a nice uptick around the Women's World Cup. We did a newsletter that got, you know,
Starting point is 00:43:42 probably 10, 15,000 subscribers to it. So it's, you know, there's an audience there for it. And we're definitely trying to serve. Yeah. Hopefully the audience keeps growing. Let's take a little break. Come back in a second and talk just strictly about soccer for a while. Okay.
Starting point is 00:44:02 The second half of the episode will be available to patrons. Much more to discuss about FBREF, the data, how to interpret it. The link to our Patreon is in the show notes. Thanks for listening. We'll see you.
