The Data Stack Show - 258: Confidently Wrong: Why AI Needs Tools (and So Do We)

Episode Date: August 20, 2025

This week on The Data Stack Show, John and Matt dive into the latest trends in AI, discussing the evolution of GPT models, the role of tools in reducing hallucinations, and the ongoing debate between data warehouses and agent-based approaches. They also explore the complexities of risk-taking in data teams, drawing lessons from Nate Silver's book on risk and sharing real-world analogies from cybersecurity, football, and political campaigns. Key takeaways include the importance of balancing innovation with practical risk management, the need for clear recommendations from data professionals, the value of reading fiction to understand human behavior in data, and so much more.

Highlights from this week's conversation include:

- Initial Impressions of GPT-5 (1:41)
- AI Hallucinations and the Open-Source GPT Model (4:06)
- Tools and Determinism in AI Agents (6:00)
- Risks of Tool Reliance in AI (8:05)
- The Next Big Data Fight: Warehouses vs. Agents (10:21)
- Real-Time Data Processing Limitations (12:56)
- Risk in Data and AI: Book Recommendation (17:08)
- Measurable vs. Perceived Risk in Business (20:10)
- Security Trade-Offs and Organizational Impact (22:31)
- The Quest for Certainty and Wicked Learning Environments (27:37)
- Poker, Process, and Data Team Longevity (29:11)
- Support Roles and Limits of Data Teams (32:56)
- Final Thoughts and Takeaways (34:20)

The Data Stack Show is a weekly podcast powered by RudderStack, customer data infrastructure that enables you to deliver real-time customer event data everywhere it's needed to power smarter decisions and better customer experiences. Each week, we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

Transcript
Starting point is 00:00:00 Hi, I'm Eric Dodds. And I'm John Wessel. Welcome to The Data Stack Show. The Data Stack Show is a podcast where we talk about the technical, business, and human challenges involved in data work. Join our casual conversations with innovators and data professionals to learn about new data technologies and how data teams are run at top companies. Before we dig into today's episode, we want to give a huge thanks to our presenting sponsor, RudderStack. They give us the equipment and time to do this show week in, week out, and provide you the valuable
Starting point is 00:00:38 content. RudderStack provides customer data infrastructure and is used by the world's most innovative companies to collect, transform, and deliver their event data wherever it's needed, all in real time. You can learn more at rudderstack.com. Welcome back to The Data Stack Show. We've got Matt, the cynical data guy, here with us today. Matt, welcome to the show. Yo, I'm here. Right. We're going to break from our norm a little bit today. Matt may have a link to share with us, but we actually have a couple topics we want to cover today. So I'm excited to jump into those. And then I think this is our first one where we've got a cynical data guy recommendation for a good read. It's a cynical recommendation. Stay tuned.
Starting point is 00:01:26 Stay tuned for that at the end. Okay. So we're going to just launch right into AI stuff today. First topic here. I'm curious when it comes to data or just maybe every day. What are your thoughts on GPT5, cynical data guy? So mine are probably a little different because I don't need to code as much with it. But I don't know. You know, you read stuff and people are like, this is the beginning of super intelligence. Reed Hoffman. And then you get others that like it's total crap and to me it was you know it's about the same it's a little maybe a little bit better in some ways it's not as sycophantic which i appreciate but otherwise i don't know i do notice that it doesn't take my instructions very well so i'd like to know how to fix that problem
Starting point is 00:02:13 but other than that it's not a ton different from what at least what i use it for yeah you've mentioned before the show that i thought it was a fascinating response to the sycophantic like nature If you will, and that like some people, and I don't want to like put words in their mouth. Maybe this wasn't the reason they missed it, but it seemed like some people missed 4-0 because of the like personality and like kind of style, because they did seemingly make some stylistic choices for the default between 5 and 4-0. Yeah, any hot takes are interesting experiences when it comes to that, that stylistic change. I mean, I think probably the biggest ones are the people who love to use it for their like,
Starting point is 00:02:55 you know the like AI girlfriend or boyfriend or you know the one you know there's a whole group of people that really love the fact that it will tell them just how amazing they they are i saw an article the other day about a guy who was having a conversation with four oh and was convinced he had figured out like a whole new realm of mathematics wow because four oh told him that he's just asking questions others aren't comfortable with and stuff like that what was the reality though oh the reality was it was total crap you know it was like it was like it was just it was one of these like branching off of pie or something like that and so he'd come up with this he'd even given it a new name for what
Starting point is 00:03:42 this thing was and then he went and searched for something on google and jemini was like this is an example of an lm that tells you it tells you this is really brilliant when really there's nothing there. I kind of burst this bubble. Yeah. It happens. Okay. So I do have one, I do have one LinkedIn submission for us
Starting point is 00:04:02 if you don't mind because it's related to this topic. Sure. Go ahead. All right. So this is on the GPTOSS model. So if you're not following so this is like the subheadline stuff that kind of happened right before the 5GPT5
Starting point is 00:04:19 release around the open source model. So So here's the bits of the post. It's a longer post. So I'm going to edit it a little bit. All right. So I'm obsessed with GPTOSS, a model that hallucinates over 91% of the time. So 120 billion variant, still nearly 80, the larger variant, which is the 120 billion, still nearly 80%.
Starting point is 00:04:42 We're talking about a model so unreliable to instinct would lose to a magic eight ball in a geography quiz. And that's exactly the point. So he goes on, and this is what I think is really interesting. And he says, and the guy like poster here, it says, a model that hallucinates 91% of the time, how can that possibly be safe? And he says, it's not. And then this is like the really interesting part. If you deploy either of these models, you know, open source for GP5,
Starting point is 00:05:10 you'd get fired faster than a recruiter slides into your DMs right after it suggests users eat a rock a day like geology. It's like geologists recommend. But the open, and then, like, the Open AI team knew this and made it a feature, not a bug. But they built models that reached for tools instead of hallucinating facts. So then he goes on to, like, start, cite some stats on how well this thing performs when you give it tools. So I thought that was really interesting. And I don't know enough technically to know if there's an intentional tradeoff thing here. But I thought it was fascinating.
Starting point is 00:05:45 And I didn't realize until I read this that apparently, in this like GPT open source model that it's really heavily reliant on tools to get you know to make it useful so yeah what are your initial thoughts or reactions to that clearly this is the beginning of Skynet and go out and learn on its own
Starting point is 00:06:04 we can't have this the tools is a big thing obviously you know even the place I work at it's a large part of how we we have our agents work is there's a set of tools they can go and they can call on I think it also makes me have a little bit of a side laugh from all the people who are like LLMs can do everything. And it's like, yeah, you still need some deterministic stuff in control.
Starting point is 00:06:29 And a lot of ways it's kind of more of like, you know, I think there's there are forms of this where the, the LLM is not the primary hub of what's making decisions. but there's probably some deterministic kind of, you know, app or something or whatever you want to call it, like controlling it. And it is more of like the interface or the translation for a lot of things. And I think that's probably, especially for these things where you want to be able to say, oh, we're going to have it and it's going to be able to do all these different things. Well, yeah, you're going to have to use tools for that. And the tools are deterministic in a lot of ways. Yeah. Again, this is a long post.
Starting point is 00:07:09 But at the end of the post, he says, what if we made a mob? that's confident, fast, and wrong about everything unless you give it a calculator. And essentially says they nailed this. This is fast, a tool-first model that you don't need to run in the data center. Really, really, and I don't have a lot of first-hand experience yet with this model, but really interesting approach here
Starting point is 00:07:32 because, like, when I'm watching GPT-5 or these other, like, the content coming out around them, it's all like, we've reduced hallucinations by X percent. Like, there's all this, like, interesting work being done on the commercialized products. And then it's interesting to the open source. Like, we're going to swing the exact opposite way coming from, you know, the same companies. Right. So I thought that was.
Starting point is 00:07:57 Because a lot of the Open AI five announcement was around less hallucinations was, like, part of the pitch. Right. So it was fascinating. I think it's also going to have expose you a little bit to, like, okay, so you're, you know, what if there's a bug in one of your tools or what if tools down is the one thing you can say about some of these large foundation models is at least they can pull from their own like quote unquote knowledge set you know from their training data if you don't have that you're going to still you could run into some problems if you get some silent failures in the background yeah like we're always like having this human analogy thing
Starting point is 00:08:35 right where they comes into play with agents like people are like oh think about it like a junior I've heard that 100 times as you have too. Like, think about it as an intern. Think about it as a specialized researcher. So applying that here is scary. Think about it as somebody that has all access to all your systems and super powerful tools. But in and of itself, can't do anything outside of the tools.
Starting point is 00:08:57 Like, it was a little scary. Oh, it's a 23-year-old at McKenzie. Good. Good to know. Okay. Oh, man. All right. So this is the Datastack show. I got to ask this question. related to, you know, again, we've had this, these latest models come out.
Starting point is 00:09:14 How do you think this relates to data? How do you think it relates to the model that we've been under, which is some form of, if you're a data lake person or a data warehouse person, some form of like, hey, let's get all the data in one spot or at least kind of accessible from one spot versus the other paradigm of like, oh, like MCP and tools, like that's the future. You're like, let's just have the AI things reach out to where the data lives and it's home. And, like, it's responsible for, like, gathering everything and it can do all the collection and analysis. And, like, we don't have to worry about all this other data stuff.
Starting point is 00:09:48 Oh, is that a question for me? Yeah, I guess I'm curious. I'm happy to react to it as well, but I'm curious, what do you, from what you've seen, what do you think? Do you think there's enough out there to think that we're moving in the one direction of, like, essentially, like, like, yeah, tools, MCP, like, data's just going to live in home systems and, like, the AI can, like, take care of it? Or do you think there's still, like, a compelling kind of warehouse, lake house, you know, component to people's stacks? I can reject your premise and say, this is going to be the next big data fight over the coming years. this will be
Starting point is 00:10:30 Python versus R for AI because my bet is and I know from us talking that I think you have a similar view it's going to depend on your use case it's going to depend on how much data you have it's going to depend on how you're going to use it
Starting point is 00:10:45 because there's some situations where I think it makes a ton of sense to be like leave it where it is have an age and go get it. Give it tools and let it go and there's other ones where it's like no we really need to get this
Starting point is 00:10:57 everything needs to be in one spot for a variety of reasons and of course we will have no subtlety on this and we'll have team warehouse and we'll have team agents and they will just battle on LinkedIn about this. Yeah
Starting point is 00:11:13 I think I yeah we talked about this for the show I think I agree on some level here but I think I like to think about it like who are these actual people and I think there's like data people who are like most comfortable with SQL
Starting point is 00:11:27 the warehouses, their happy place, or the lakehouse of their happy place, they're going to tend to opt for that solution. And there's going to be pros and cons. The one I can think of right now is with given technology, AI included, I don't know of any practical way to do a good job of taking millions of records from multiple systems, all in-flight, like, in-memory doing complex analysis and transformations and applying all this business logic and getting something used. full. I haven't seen that. If that's out there, man, that'd be cool to see, but I have not seen that. And I think there's some just practical, like, like, we were talking about recently around, like, context windows. Can't do it all in context. I know that much. Maybe there's some clever, like, vectorization, rag stuff you could do, like, end memory. But that still seems, like, fairly out of reach, given that. That also feels like, even if you pulled all that stuff and threw it in memory, that's going to get very expensive, very quickly.
Starting point is 00:12:25 Yeah. Right. Right. And the blessing occurs of AI is people are pushing hard to get on the democratization of it, which is great on a lot of levels. But like, say you did get all that working with some like magical cool and memory, you know, vectorization stuff. And then you like let a bunch of people loose on it. Like, say it works, that's been crazy expensive. And nobody will have a quantitative idea. Like, I just asked the question. Like, I didn't know that was going to sell the $1,000 dance. for that question. Right. Well, I think that's also, because it gets caught up in that whole, we want real time, we want real time, we want real time. And if you want real time, the idea of agents and, oh, we can just pull it whenever we need it. And it's the most recent.
Starting point is 00:13:08 It'll be great and wonderful. We'll fool you, partially because, as you said, if you're pulling millions of records and then trying to somehow stick them together and do something with them from different systems, real time is not going to be real time at that point either. Right. Right. It's going to take a long time for that to process. relatively speaking.
Starting point is 00:13:27 And then you look at the thing. It's supposed to be real time. What's happening here? Why did this take three minutes to run? Because you pulled 12 billion records from three different systems and they had to be joined together and do all this other stuff with them. Yeah. That's funny.
Starting point is 00:13:42 Yeah. So I like to think of the data persona that I'd think of like a developer persona who like, if they need data, like maybe they, maybe even historically they reach for like Python and just hit an API. Like that's where they're most comfortable. triple or whatever language they like. If that's my persona, then I think, like, oh, man, like, MCP's so cool. And, like, I'm just going to, like, anytime I have to do any analysis, like, I'm
Starting point is 00:14:05 going to reach for, like, AI tool MCP. If it gets kind of complicated, I'll just have it dump out Python and I'll run the Python again, you know, if I need whatever it was. I think that's, I think that's a valid way to do it from an analysis standpoint. And I'm imagining, like, a, you know, a more technical team that, like, yeah, like, we don't, we haven't hired any analyst yet or something, and they're kind of, that's a persona. And then the third that's interesting to me is the,
Starting point is 00:14:32 is kind of almost like an integrations, what used to be an integrations engineer, like a data engineer like that persona. Like they're mainly concerned with like moving data around. It's like, how are they going to feel about this problem where like, they're very familiar with APIs moving data that way. They're also very familiar with databases and like that way. Like, what are they going to reach for?
Starting point is 00:14:51 So I think that's going to be a really interesting. like evolution I'm going to point out you just defined the two sides of my LinkedIn war right there and you described why you're going to have it because you're going to have one side that's like I like databases this is what I'm used to
Starting point is 00:15:08 therefore we should do it you're going to have another side that's like APIs are all you need why are you doing anything else and they're going to just talk past each other in greater escalating posts and conferences and stuff like that Yeah. That's how you know there's an escalation. It starts with social media and post. And then the ultimate escalation is like literally separate million-dollar sponsored conferences with like essentially opposing views.
Starting point is 00:15:36 Where they just take shots at each other just because. Yeah. Right. Right. Oh, man. We're going to take a quick break from the episode to talk about our sponsor, Rudder Stack. Now, I could say a bunch of nice things as if I found a fancy new tool. But John has been implementing Rudder's for over half a decade. John, you work with customer event data every day and you know how hard it can be to make sure that data is clean and then to stream it everywhere it needs to go. Yeah, Eric, as you know, customer data can get messy. And if you've ever seen a tag manager, you know how messy it can get. So Rudderstack has really been one of my team's secret weapons. We can collect and standardized data from anywhere, web, mobile, even server side, and then send it to our downstream tools.
Starting point is 00:16:20 Now, rumor has it that you have implemented the longest running production instance of Rutterstack at six years and going. Yes, I can confirm that. And one of the reasons we picked Rutterstack was that it does not store the data and we can live stream data to our downstream tools. One of the things about the implementation that has been so common over all the years and with so many Rutterstack customers is that it wasn't a wholesale replacement of your stack. it fit right into your existing tool set. Yeah, and even with technical tools, Eric, things like Kafka or PubSub, but you don't have to have all that complicated
Starting point is 00:16:57 customer data infrastructure. Well, if you need to stream clean customer data to your entire stack, including your data infrastructure tools, head over to rudderstack.com to learn more. All right, so I want to leave plenty of time for this. Let's talk. I want to talk risk now,
Starting point is 00:17:15 which I always think of the fun topic. when it comes to some of this data and AI stuff. But you specifically read a really interesting book. And this isn't actually risk, like risk, I think people, oh, security and PI, like, you know, privacy and stuff. Not that kind of risk. So I want to you to you up for your cynical data guy recommendation here on a book that you read recently.
Starting point is 00:17:37 Yes. So in case you guys don't know, twice a year I put out on my own stuff, basically, what did I read and what are some of the highlights of it? And my big highlight from the first half of this year was I read On the Edge by Nate Silver. So if you've never heard of Nate Silver, he's a guy who started with like kind of baseball stats and predictions and stuff like that. He's probably most well known for at this point when he moved over to do election predictions. And that was he was the founder of 538 that did all of the like, you know, predictions of who's going to win the elections, who's going to win the Senate, all those types of things. he is also a like semi pro professional poker player he used to be he used to make his living this way for a short period too so this is his book on basically looking at the world of risk taking and how people kind of quantify it how they work through it how they live with it um specifically through the lens of professional poker players who have a very high risk tolerance but also like pride themselves on being very good at quantifying risk and like you know and chances and everything so
Starting point is 00:18:41 I thought that was, it's an interesting book in that sense. If you like poker, you will like some of the, a lot of the stories that come out of it. One of my biggest takeaways from this, and this is why I wanted to kind of give it as a recommendation is, one of the things he talks about is how people don't take enough risks in their own careers and jobs. And that's something that I feel like in my time, I have seen in the data world of you have these people who, in theory, are supposed to be helping businesses make better decisions, quantify risk, and for some reasons, you know, we get into what we think they might are, I have noticed a lot of data people are extremely risk adverse to the point of where like they don't like to recommend
Starting point is 00:19:24 things that have risk to them. They don't like to do stuff in their own lives that they feel have risk associated with them. And I think it's one of these things where it's hurting the individuals. And I think it's also hurting like data teams. and companies, too, that we're going through this. So this is kind of like, as part of this book recommendation, it's kind of my pitch to people to like take some more risk. You cannot live a risk-free life. And you're probably quantifying risk wrong anyways in your attempts to do so.
Starting point is 00:19:53 So yeah. So that's kind of the overview I would have from it. Like I said, it's a good book in that sense that you get to, you know, it's a narrative that you get through it. But I think for a lot of people, it's this idea of like you're actually being riskier than you think in your attempts to minimize risk, if that makes sense. Yeah, I'd like to drill in on that point because this is something that I've sought, I've thought some about not in a little while. I've read some of Nate Silverers other books
Starting point is 00:20:20 and read some other books that kind of touched this topic. And I think, I think it's really interesting to look at it from the, and I'll give an example in a minute, to look at it from the business perspective of the measurable versus unmeasurable risk perspective. And then like the perceived risk versus like actual risk. Like I think there's probably more accesses than that. But I think the most interesting one to me is actually how companies treat cybersecurity and security. Yeah. I think that's a really fascinating one, especially like there's one that I've interacted with,
Starting point is 00:20:57 company I've interacted with that in my opinion the actual biggest risk for the company was the company sailing doing essentially imploding doing to such high sauce of ever getting anything done yeah from like layers of and I've seen this happen like when smaller companies like fall to the trap of hiring tons of employees that are used to being a multi-billion dollar companies and start running the $20 million company like a multi-billion dollar company doesn't typically go well. So from an outsider,
Starting point is 00:21:34 like, okay, the risk here, in my opinion, is like, you don't land clients like in the business shuts. Like, you don't blame clients, you can't move fast enough to like satisfy the needs of your existing clients and your business shuts down. Like, that is actually the biggest risk.
Starting point is 00:21:50 Yeah. But a lot of the conversations internally are all about like, like, minutia around like security, like very, like very, specific security protocols and this, that, and the other, because they had a, they had a cyber event, like, several years ago that was a big deal and, like, it caused a lot of disruption for the company. So, it's so interesting how these pendulums can swing, so you got $20 million
Starting point is 00:22:12 dollar your company, like, rough numbers, like, three years ago that was just operating wildly, probably wildly insecure, like, not thinking about it at all. And, like, and, but maybe, like, you know, a little bit better, go-to-market motion and like a little bit better speed and like getting things done for customers, right? And then like you fire a layer of management, you fire some people, this was a major cybersecurity thing,
Starting point is 00:22:37 this was a black guy for the whole organization, lots of drama and you fire a bunch of people. Then you bring in like the risk-free team, you know, you bring in like, oh, they worked at like X Fortune 500 company and Y, Fortune, like they were going to eliminate all the risk. And they do, the job that you hired them to do,
Starting point is 00:22:54 in a very real sense and, like, eliminate all the risk. But what you don't realize it can happen is you just, like, essentially break your go-to-market machine, you break your service that you used to be, like, fast and reflective, and now everything's like layers of ticketing systems and, like, and then, like, you know, it takes three months to ongoing or a client because of the security protocols. So, like, you break all that. It's a sense of more organized and a sense it's more secure for sure, but you break the thing that, like, your customers loved about you.
Starting point is 00:23:22 and then like you can accidentally essentially kill the whole company so like great you've got this like locked up secure process driven thing that's going to die yeah yeah no it's and it's that idea of also not being able to tell the difference of like when risk is okay and when risk isn't okay because it's like you know you get through you know you get into situations where it's like oh man if we do this we might you know as a company it might hurt us or something like that it's like yeah we are already slowly dying like Like, it does not matter at this point, whether we hold it off for six months or something like that. Like, you need to do something different there versus when you're in other situations where it's like, well, no, we don't need to take as wild of a swing. But you're always going to take on some level of risk. You cannot go if you're not taking on some level of risk. Right. And that's like where I've seen that with, you know, you get like data teams that have analysts on them or, you know, when you have engineers recast as analysts, which is usually not a good role for them anyways. Right.
Starting point is 00:24:20 And there's this complete reluctance and refusal to make a recommendation because the recommendation could be wrong or there could be this. Or there's pros and cons to each choice. And so they try to hide behind this like, well, here's what the data says. Right. And all you're basically doing is showing a bunch of information, but you're not telling people what makes sense to do. Right. And now you get into a situation where they're like, well, but I don't want to, you know, like, oh, but I could be wrong. I don't want to be wrong.
Starting point is 00:24:48 like this sense of like I'm going to lose something from doing that. One of the reality is that everyone's just getting pissed at you because all they feel like you're doing is compiling spreadsheets and handing charts and handing it. Right, right. Like you're not of any value if you're not actually pushing something forward. Yeah. Well, but the problem is the incentives.
Starting point is 00:25:06 Go back to the security thing. If I'm like the security officer or person that got tasked for security and it's only like my part-time job, which is like a lot of companies that would be in this like market space, like, if I have a major security incident on my watch, I'm held responsible, bad things happen to me. Maybe I get fired. Maybe I get demoted. Like, whatever. Yeah. Really bad things happen. If I'm, let's flip it the other way, if I'm, like, going to have a massive security budget, spend tons of money on it every year, like, lock everything down super tight, get in the way of
Starting point is 00:25:37 everybody working, and then just say, like, well, it's in the name of security. And then, like, and then I'm a typical, say, I'm the, like, typical CEO. Like, I don't know who's right. I don't know, it's like less security, we'd still be fine. Like, how am I supposed to know? Right. And in one sense, like, how is anybody supposed to know? Because, like, cybercriminals are getting better and more crafty every day. Like, there's new attack vectors.
Starting point is 00:25:57 Like, there is a real sense where this is one of my favorite topics when it comes to risk because, like, it is nearly impossible or it is impossible to fully quantify, like, hey, what's my cyber exposure, you know, at my company? It's like, well, do you have people that work there? Well, you have exposure. Good people work there. People are the biggest weakness you're going to have in any company. And then obviously there's some really great tools out there and layering like an AI solutions, right, and all sorts of things from like your inbound communication, from your network perspective, from your, you know, desktop, whatever.
Starting point is 00:26:32 So there's tons of like good solutions in the space and people that are good at implementing it and such. But like, to me, I think it's a fascinating space because the people that like are able to nail, the, hey, let people still do their jobs, part of it, are the ones that really can take on so much of the market, but can balance it reasonably with risk. And it's not in either or. There are plenty of, like, I think, secure solutions out there that don't necessarily have to, like, make people's world impossible. But there's at least occasional tradeoffs and occasional small trade-off that I think, at least for a while. Like, maybe this changes, is like, the, like, people that can, like, successfully quantify, like,
Starting point is 00:27:19 hey, this tradeoff makes sense. The, you know, we're going to do it. Like, that's valuable and very hard to quantify. And therefore, like, because it's hard to quantify, like, it's easier to opt toward, like, well, just to be saved. Right? Yeah. I think you're also getting towards the two things that I see from that are there's
Starting point is 00:27:41 this quest for certainty, and there is no certainty. You are always bearing a certain level of risk one way or the other. Cannot have certainty. You can have clarity on what your strategy is going to be and the risks you're willing to take, but you cannot have certainty on it. And to kind of go with that, like when you're in that position as a cybersecurity person, and I think this, I would say it also applies to like a lot of data teams, you're in kind of this wicked learning environment of like,
Starting point is 00:28:11 cause and effect are not always going to be coupled. Even more so, you can do everything right, quote unquote, right, and it could still not go well. Yeah. Right. You're a security person, you can find the perfect balance, and you still have a breach. And now it's your fault. And that's the like, and that's the, and I really feel for security people on teams on this, like, because you could have a security breach, and it's literally like a one in a million, like thing that happened where it was like an immediate exploit.
Starting point is 00:28:41 of a bug nobody knew about and they got into like this thing and like you you would always historically patched your firewall every week like you could be like on it and then there's this like breach it's like literally not your fault and the exact same thing could happen to like somebody that's completely lax like doesn't know what they're doing and there's no like I mean as a technical person you could kind of probably suss this uh suss that out but like downstream to like customers of customers like nobody cares right it happened you know yeah And to bring it back to kind of the book, this is one of the things that like if you're going to be a good poker player, you have to learn to do, which is can I quantify the risk based on the knowledge I have? And am I okay with the fact that like, you know, yeah, I've got I should win this hand 75% of the time. That still means one out of four times I'm going to lose. And it's not about the result necessarily. It's about the process. And I think for like a lot of data teams, like I've written about this before, the idea of like you've got like two years. if you're like a new data, you know?
Starting point is 00:29:43 Yep. And there is a chance that you will do everything right. You'll work to build the right culture, get the right foundation, and you will not get a project that will actually get you what you need. And you're going to be out in two years. And how are you going to handle that? Is it something that you can look at it and say, you know what? I know this will work and I was just unlucky in this situation.
Starting point is 00:30:03 Or are you going to like overreact and be like, okay, I don't care about any of that. We just need to get the things to the people right now as fast as we can. we will just bubble gum and duct tape it for as long as we have to. Yeah. Well, since we're coming up on fall, I think, and we've talked about this before, like, it's that football analogy, right? You come as a head football coach to the team, and you've got, I don't know, maybe you have a year nowadays, maybe you have two years depending on,
Starting point is 00:30:29 it's like, it's a similar thing we're like, okay, every, you know, recruiting's broken, like, I don't have the talent I need. I've got a number of coaches that, like, I need to fire, but I have to keep them for a certain amount of time because my boss told me to keep like you could have so many variables because they just fired the previous head coach and they're going to be paying him for the next four years so they don't want to do that with any other coaches. Yeah, they don't have money to, yeah, you don't have money to recruit the coaches you need. And I think people end up in the data equivalent of that. Yeah. And part of the people that are successful to be quite honest are the
Starting point is 00:31:03 people that suss out the situation ahead of time and don't take the job. Honestly, that's part of it. And then the other part of it is the people that in the context can develop. the skills, tools, and processes to be successful, realized that, like, in two years, like, all right, I was essentially set up for failure. Like, I couldn't have been successful. But I can go somewhere else, and I learned what I needed to, refine what I needed to,
Starting point is 00:31:25 and I can be successful somewhere else. Both of those are options. Yeah, and sometimes it's going to be a thing, because the football analogy is my favorite one for that, because it's like, what's one of the most important things if you're going to be successful in the long term as a football coach? It's like, you've got to have the right culture in play
Starting point is 00:31:39 that's going to sustain you. You know what? doesn't win right away, having the right culture in place. And so you get this thing of this weird balance between that. And you can look at really good coaches. And it's a little bit of like the pieces had to fall together. And they didn't fall together in the first time, but they fell together in the second time. Or they were there for the first time, but not the second time they did it.
Starting point is 00:32:01 And like, there's still good coaches. There's still great coaches. They still have it there. But there was that one situation where they didn't do it. You know, you can even look at like, it was the former head coach. like the Carolina Panthers, Matt Ruhle, right? Very successful before. He's been successful since.
Starting point is 00:32:17 There were things that got in the way from him to be successful, but it wasn't like he necessarily was the one who completely screwed it all up or something. You can still see some of the same good qualities there. But it didn't work out and he didn't have time. And there's a bunch of things that go into that. And so I think that's with a lot of the data teams, you get that too.
Starting point is 00:32:35 And kind of like what you said, you got to be willing to have that idea of like, I may do everything right. And in two years, it won't work. Right. And even in the situation, where you're like, this is a slam dunk, it doesn't always work. Yeah, and can you handle that?
Starting point is 00:32:47 And are you okay with that? And can you kind of be like, can you tell the difference between here's what I need to change and here's what I, here's what it just didn't work out this time. Yeah. And for data teams, like, you're in a support role, right? It's a support role. And like, imagine you were like working on a political campaign and like, and you're like helping to try to get somebody elected.
Starting point is 00:33:07 It's like, okay, like, cool. Like, I can nail it as the data. team here, but, like, if we lose, like, we lose, and it's not really my fault. Like, I can contribute to winning by whatever data teams with political organizations do. Not super familiar with the space, but, like, I can't really affect winning by any, like, direct contribution, even if I'm over data for the whole, like, thing. I mean, I can affect it to some extent, but, like, if there's good, there can be wins against me where, like, literally this, this won't happen regardless of how hard I work, how good I am, you know, what we
Starting point is 00:33:41 You can have the best understanding of the electorate. You can have the best targeting that you've got out there. You can know where all the persuadables are. If your candidate is a terrible speaker, or if their policies are just not popular in that cycle, it doesn't matter at that point. Yeah, exactly. Right. Can you tell I actually worked on a campaign? Yeah, I brought that up when I was like halfway through this.
Starting point is 00:34:03 I was like, oh, yeah, Matt worked on this. Like, I'm glad he's able to speak to it. Awesome, man. We can go forever. I think we're almost at the buzzer. other little tidbits maybe from your from your kind of six month summary here like any learnings or
Starting point is 00:34:16 tidbits through your six months reading and learning summary. I will so this is always my plug to tell people to read more fiction because you will learn more especially if you want to be a leader in any space like you have to know people and you're not going to learn
Starting point is 00:34:33 that from a textbook you've got to learn it from actual people and fiction and novels and things like that are a very good way of doing that. So It's my, every time I talk about this, it's always one of my plugs I put in there. So I think that's always a good one. Nice. So, yeah.
Starting point is 00:34:48 So there you go. On the edge by Nate Silver. And read some fiction. Get away from all of the pseudoscience crap out there. Yeah. All right. Awesome. Thanks for coming on, Matt.
Starting point is 00:35:01 And we will catch you next time. The Datastack show is brought to you by Rudderstack. Learn more at rudderstack.com. I'm going to be able to
