The Data Stack Show - 258: Confidently Wrong: Why AI Needs Tools (and So Do We)
Episode Date: August 20, 2025

This week on The Data Stack Show, John and Matt dive into the latest trends in AI, discussing the evolution of GPT models, the role of tools in reducing hallucinations, and the ongoing debate between data warehouses and agent-based approaches. They also explore the complexities of risk-taking in data teams, drawing lessons from Nate Silver's book on risk and sharing real-world analogies from cybersecurity, football, and political campaigns. Key takeaways include the importance of balancing innovation with practical risk management, the need for clear recommendations from data professionals, the value of reading fiction to understand human behavior in data, and so much more.

Highlights from this week's conversation include:

- Initial Impressions of GPT-5 (1:41)
- AI Hallucinations and the Open-Source GPT Model (4:06)
- Tools and Determinism in AI Agents (6:00)
- Risks of Tool Reliance in AI (8:05)
- The Next Big Data Fight: Warehouses vs. Agents (10:21)
- Real-Time Data Processing Limitations (12:56)
- Risk in Data and AI: Book Recommendation (17:08)
- Measurable vs. Perceived Risk in Business (20:10)
- Security Trade-Offs and Organizational Impact (22:31)
- The Quest for Certainty and Wicked Learning Environments (27:37)
- Poker, Process, and Data Team Longevity (29:11)
- Support Roles and Limits of Data Teams (32:56)
- Final Thoughts and Takeaways (34:20)

The Data Stack Show is a weekly podcast powered by RudderStack, customer data infrastructure that enables you to deliver real-time customer event data everywhere it's needed to power smarter decisions and better customer experiences. Each week, we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Hi, I'm Eric Dodds.
And I'm John Wessel.
Welcome to The Data Stack Show.
The Data Stack Show is a podcast where we talk about the technical, business, and human challenges involved in data work.
Join our casual conversations with innovators and data professionals to learn about new data technologies and how data teams are run at top companies.
Before we dig into today's episode, we want to give a huge thanks to our presenting sponsor, RudderStack. They give us the equipment and time to do this show week in, week out, and provide you the valuable content. RudderStack provides customer data infrastructure and is used by the world's most innovative companies to collect, transform, and deliver their event data wherever it's needed, all in real time. You can learn more at rudderstack.com.
Welcome back to The Data Stack Show. We've got Matt, the cynical data guy, here with us today. Matt, welcome to the show.
Yo, I'm here.
Right. We're going to break from our norm a little bit today. Matt may have a link to share with us, but we actually have a couple topics we want to cover today. So I'm excited to jump into those. And then I think this is our first time where we've got a cynical data guy recommendation for a good read.
It's a cynical recommendation. Stay tuned.
Stay tuned for that at the end. Okay. So we're going to just launch right into AI stuff today. First topic here. I'm curious, when it comes to data, or just maybe everyday use, what are your thoughts on GPT-5, cynical data guy?
So mine are probably a little different, because I don't need to code as much with it. But I don't know. You read stuff, and some people are like, this is the beginning of superintelligence. Reid Hoffman. And then you get others saying it's total crap. And to me, it was about the same. Maybe a little bit better in some ways. It's not as sycophantic, which I appreciate. But otherwise, I don't know. I do notice that it doesn't take my instructions very well, so I'd like to know how to fix that problem. But other than that, it's not a ton different, at least for what I use it for.
Yeah. You mentioned before the show what I thought was a fascinating response to the sycophantic nature, if you will. Some people, and I don't want to put words in their mouth, maybe this wasn't the reason they missed it, but it seemed like some people missed 4o because of the personality and style, because they did seemingly make some stylistic choices for the default between 5 and 4o. Any hot takes or interesting experiences when it comes to that stylistic change?
I mean, I think probably the biggest ones are the people who love to use it for their, you know, AI girlfriend or boyfriend. There's a whole group of people that really love the fact that it will tell them just how amazing they are. I saw an article the other day about a guy who was having a conversation with 4o and was convinced he had figured out a whole new realm of mathematics, because 4o told him that he's just asking questions others aren't comfortable with, and stuff like that.
Wow. What was the reality, though?
Oh, the reality was it was total crap. It was just one of these things branching off of pi or something like that. And so he'd come up with this, he'd even given it a new name for what this thing was. And then he went and searched for something on Google, and Gemini was like, this is an example of an LLM that tells you something is really brilliant when really there's nothing there.
It kind of burst his bubble.
Yeah.
It happens.
Okay. So I do have one LinkedIn submission for us, if you don't mind, because it's related to this topic.
Sure. Go ahead.
All right. So this is on the GPT-OSS model. If you're not following, this is like the subheadline stuff that happened right before the GPT-5 release, around the open-source model. So here are the bits of the post. It's a longer post, so I'm going to edit it a little bit.
All right.
So: I'm obsessed with GPT-OSS, a model that hallucinates over 91% of the time. The larger variant, which is the 120 billion, still hallucinates nearly 80% of the time. We're talking about a model so unreliable it would lose to a magic eight ball in a geography quiz.
And that's exactly the point.
So he goes on, and this is what I think is really interesting. The poster says: a model that hallucinates 91% of the time, how can that possibly be safe? And he says: it's not. And then this is the really interesting part. If you deploy either of these open-source models, you'd get fired faster than a recruiter slides into your DMs, right after it suggests users eat a rock a day, like geologists recommend. And then, like, the OpenAI team knew this and made it a feature, not a bug. They built models that reach for tools instead of hallucinating facts.
So then he goes on to cite some stats on how well this thing performs when you give it tools. So I thought that was really interesting. And I don't know enough technically to know if there's an intentional tradeoff thing here, but I thought it was fascinating. And I didn't realize until I read this that, apparently, this GPT open-source model is really heavily reliant on tools to make it useful.
So, yeah, what are your initial thoughts or reactions to that?
Clearly this is the beginning of Skynet. It'll go out and learn on its own. We can't have this.
The tools thing is a big thing, obviously. You know, even at the place I work, it's a large part of how we have our agents work: there's a set of tools they can go and call on.
I think it also makes me have a little bit of a side laugh at all the people who are like, LLMs can do everything. And it's like, yeah, you still need some deterministic stuff in control. In a lot of ways, there are forms of this where the LLM is not the primary hub of what's making decisions; there's probably some deterministic app, or whatever you want to call it, controlling it, and the LLM is more of the interface or the translation for a lot of things. And I think that's probably the case, especially for these things where you want to be able to say, oh, it's going to be able to do all these different things. Well, yeah, you're going to have to use tools for that. And the tools are deterministic in a lot of ways.
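To make that concrete, here's a minimal sketch of the pattern being described: the LLM only picks which tool to run, and deterministic code does the actual work. Everything here is hypothetical, the tool names, the JSON contract, the stubbed model call, and not any vendor's actual API.

```python
# A minimal sketch of the "LLM routes to deterministic tools" pattern.
# The tool names, JSON contract, and stubbed model call are all
# hypothetical, purely for illustration.
import json
from datetime import date

def days_between(start: str, end: str) -> int:
    """Deterministic tool: exact date math the model shouldn't guess at."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days

def lookup_capital(country: str) -> str:
    """Deterministic tool: a lookup table instead of a recalled 'fact'."""
    capitals = {"france": "Paris", "japan": "Tokyo"}
    return capitals.get(country.lower(), "unknown")

TOOLS = {"days_between": days_between, "lookup_capital": lookup_capital}

def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a tool request as JSON."""
    return json.dumps({"tool": "days_between",
                       "args": {"start": "2025-01-01", "end": "2025-08-20"}})

def run_agent(prompt: str):
    request = json.loads(fake_llm(prompt))  # the model picks the tool...
    tool = TOOLS[request["tool"]]           # ...deterministic code runs it
    return tool(**request["args"])

print(run_agent("How many days from Jan 1 to Aug 20, 2025?"))  # 231
```

The point of the pattern is that the answer comes out of date math or a lookup table, not out of the model's recall.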
Yeah.
Again, this is a long post. But at the end of the post, he says: what if we made a model that's confident, fast, and wrong about everything unless you give it a calculator? And he essentially says they nailed this: a fast, tool-first model that you don't need to run in a data center.
Really, and I don't have a lot of first-hand experience yet with this model, but it's a really interesting approach, because when I'm watching GPT-5 or these others, the content coming out around them, it's all, we've reduced hallucinations by X percent. There's all this interesting work being done on the commercialized products. And then it's interesting that on the open-source side they swing the exact opposite way, coming from, you know, the same companies.
Right.
So I thought that was...
Because a lot of the OpenAI GPT-5 announcement was around fewer hallucinations; that was part of the pitch.
Right.
So it was fascinating.
I think it's also going to expose you a little bit to, like, okay, what if there's a bug in one of your tools, or what if a tool's down? The one thing you can say about some of these large foundation models is at least they can pull from their own, quote-unquote, knowledge set, you know, from their training data. If you don't have that, you could still run into some problems if you get some silent failures in the background.
Yeah. We're always having this human analogy thing, right, where it comes into play with agents. People are like, oh, think about it like a junior. I've heard that a hundred times, as have you. Like, think about it as an intern. Think about it as a specialized researcher. So applying that here is scary. Think about it as somebody that has access to all your systems and super powerful tools, but in and of itself can't do anything outside of the tools. Like, it's a little scary.
Oh, it's a 23-year-old at McKinsey.
Good. Good to know. Okay.
Oh, man.
All right.
So this is the Data Stack Show. I've got to ask this question, related to, you know, these latest models coming out.
How do you think this relates to data? How do you think it relates to the model that we've been under, which is some form of, if you're a data lake person or a data warehouse person: hey, let's get all the data in one spot, or at least kind of accessible from one spot. Versus the other paradigm: oh, MCP and tools, that's the future. Like, let's just have the AI reach out to where the data lives, in its home system, and it's responsible for gathering everything; it can do all the collection and analysis, and we don't have to worry about all this other data stuff.
Oh, is that a question for me?
Yeah, I guess I'm curious.
I'm happy to react to it as well, but I'm curious: from what you've seen, what do you think?
Do you think there's enough out there to think that we're moving in the one direction, essentially: tools, MCP, data's just going to live in home systems and the AI can take care of it? Or do you think there's still a compelling warehouse, lakehouse component to people's stacks?
I can reject your premise and say: this is going to be the next big data fight over the coming years. This will be Python versus R for AI. Because my bet is, and I know from us talking that I think you have a similar view, it's going to depend on your use case. It's going to depend on how much data you have. It's going to depend on how you're going to use it. Because there are some situations where I think it makes a ton of sense to leave it where it is, have an agent go get it, give it tools and let it go. And there are other ones where it's like, no, everything really needs to be in one spot, for a variety of reasons. And of course we will have no subtlety on this. We'll have team warehouse and we'll have team agents, and they will just battle on LinkedIn about this.
Yeah.
Yeah, we talked about this before the show. I think I agree on some level here. But I like to think about it like: who are these actual people? I think there are data people who are most comfortable with SQL; the warehouse is their happy place, or the lakehouse is their happy place; and they're going to tend to opt for that solution. And there's going to be pros and cons. The one I can think of right now is, with current technology, AI included, I don't know of any practical way to do a good job of taking millions of records from multiple systems, all in-flight, in-memory, doing complex analysis and transformations, applying all this business logic, and getting something useful. I haven't seen that. If that's out there, man, that'd be cool to see, but I have not seen that. And I think there are some just practical limits, like we were talking about recently around context windows. You can't do it all in context, I know that much. Maybe there's some clever vectorization, RAG stuff you could do in memory, but that still seems fairly out of reach. And even if you pulled all that stuff and threw it in memory, it's going to get very expensive, very quickly.
Yeah. Right. Right. And the blessing occurs of AI is people are pushing hard to get on the democratization of it, which is great on a lot of levels. But like, say you did get all that working with some like magical cool and memory, you know, vectorization stuff. And then you like let a bunch of people loose on it. Like, say it works, that's been crazy expensive. And nobody will have a quantitative idea. Like, I just asked the question. Like, I didn't know that was going to sell the $1,000 dance.
for that question.
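To put rough numbers on that, here's a quick back-of-envelope sketch. The record count, tokens-per-record, and price are assumptions for illustration, not any vendor's actual tokenizer behavior or pricing.

```python
# Back-of-envelope math for why "just put the records in context" gets
# expensive fast. All numbers below are illustrative assumptions.
records = 5_000_000        # rows pulled from source systems
tokens_per_record = 50     # assumed: a modest JSON row tokenizes to ~50 tokens
price_per_m_tokens = 3.00  # assumed input price, USD per million tokens

total_tokens = records * tokens_per_record
cost_per_question = total_tokens / 1_000_000 * price_per_m_tokens

print(f"{total_tokens:,} input tokens, about ${cost_per_question:,.2f} per question")
# 250,000,000 input tokens, about $750.00 per question -- and that's one
# user asking one question.
```

And that's before the hard wall: 250 million tokens is far beyond the context windows being discussed, so the all-in-context version isn't just expensive, it doesn't fit.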
Right.
Well, I think that's also because it gets caught up in that whole, we want real time, we want real time, we want real time. And if you want real time, the idea of agents, that we can just pull it whenever we need it, and it's the most recent, and it'll be great and wonderful, will fool you. Partially because, as you said, if you're pulling millions of records and then trying to somehow stick them together and do something with them from different systems, real time is not going to be real time at that point either.
Right.
Right.
It's going to take a long time for that to process, relatively speaking. And then you look at the thing that's supposed to be real time. What's happening here? Why did this take three minutes to run? Because you pulled 12 billion records from three different systems, and they had to be joined together and have all this other stuff done to them.
Yeah.
That's funny.
Yeah.
So, next to that data persona, I like to think of a developer persona who, if they need data, maybe even historically, reaches for Python, or whatever language they like, and just hits an API. That's where they're most comfortable. If that's my persona, then I think, oh man, MCP is so cool, and anytime I have to do any analysis, I'm going to reach for an AI tool and MCP. If it gets kind of complicated, I'll just have it dump out Python and I'll run the Python again, you know, if I need it.
I think that's a valid way to do it from an analysis standpoint. And I'm imagining, you know, a more technical team that's like, yeah, we haven't hired any analysts yet or something. That's a persona. And then the third one that's interesting to me is kind of almost like an integrations person, what used to be an integrations engineer, like a data engineer, that persona. They're mainly concerned with moving data around. How are they going to feel about this problem? They're very familiar with APIs, moving data that way. They're also very familiar with databases and moving it that way. What are they going to reach for? So I think that's going to be a really interesting evolution.
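Schematically, the two patterns those personas reach for look something like this. It's a hypothetical sketch: sqlite stands in for a warehouse, and a stubbed function stands in for a source-system API or MCP-style tool.

```python
# A schematic sketch of the two access patterns in the "warehouse vs.
# agents" debate. Names, schema, and endpoints are hypothetical.
import sqlite3

# Pattern 1: "get it all in one spot" -- the agent issues SQL against
# a centralized store (an in-memory sqlite DB standing in for a warehouse).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 120.0), (2, 75.5)])

def warehouse_tool(sql: str):
    """Tool the agent calls when data is centralized."""
    return conn.execute(sql).fetchall()

# Pattern 2: "leave it where it lives" -- the agent calls a tool that
# hits the source system's API directly (stubbed here).
def source_api_tool(endpoint: str):
    """Tool the agent calls when data stays in its home system."""
    fake_responses = {"/orders/summary": {"count": 2, "total": 195.5}}
    return fake_responses[endpoint]

print(warehouse_tool("SELECT COUNT(*), SUM(amount) FROM orders"))  # [(2, 195.5)]
print(source_api_tool("/orders/summary"))  # {'count': 2, 'total': 195.5}
```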
I'm going to point out, you just defined the two sides of my LinkedIn war right there. And you described why you're going to have it. Because you're going to have one side that's like, I like databases, this is what I'm used to, therefore we should do it. You're going to have another side that's like, APIs are all you need, why are you doing anything else? And they're going to just talk past each other in greater escalating posts and conferences and stuff like that.
Yeah. That's how you know there's an escalation. It starts with social media and posts. And then the ultimate escalation is literally separate million-dollar sponsored conferences with essentially opposing views.
Where they just take shots at each other just because. Yeah. Right. Right. Oh, man.
We're going to take a quick break from the episode to talk about our sponsor, RudderStack. Now, I could say a bunch of nice things as if I found a fancy new tool, but John has been implementing RudderStack for over half a decade. John, you work with customer event data every day, and you know how hard it can be to make sure that data is clean and then to stream it everywhere it needs to go.
Yeah, Eric, as you know, customer data can get messy. And if you've ever seen a tag manager, you know how messy it can get. So RudderStack has really been one of my team's secret weapons. We can collect and standardize data from anywhere, web, mobile, even server side, and then send it to our downstream tools.
Now, rumor has it that you have implemented the longest-running production instance of RudderStack, at six years and going.
Yes, I can confirm that. And one of the reasons we picked RudderStack was that it does not store the data, and we can live-stream data to our downstream tools.
One of the things about the implementation that has been so common over all the years, and with so many RudderStack customers, is that it wasn't a wholesale replacement of your stack. It fit right into your existing tool set.
Yeah, and it even works with technical tools, Eric, things like Kafka or Pub/Sub, but you don't have to have all that complicated customer data infrastructure.
Well, if you need to stream clean customer data to your entire stack, including your data infrastructure tools, head over to rudderstack.com to learn more.
All right, so I want to leave plenty of time for this. I want to talk risk now, which I always think of as a fun topic when it comes to some of this data and AI stuff. But you specifically read a really interesting book. And this isn't actually risk like, I think, people hear risk and go, oh, security and PII, you know, privacy and stuff. Not that kind of risk. So I want to tee you up for your cynical data guy recommendation here, on a book that you read recently.
Yes.
So in case you guys don't know, twice a year I put out, on my own stuff, basically what I read and some of the highlights of it. And my big highlight from the first half of this year was that I read On the Edge by Nate Silver. If you've never heard of Nate Silver, he's a guy who started with kind of baseball stats and predictions and stuff like that. He's probably most well known at this point for when he moved over to do election predictions; he was the founder of 538, which did all of the predictions of who's going to win the elections, who's going to win the Senate, all those types of things. He is also a semi-professional poker player; he used to make his living that way for a short period, too. So this is his book basically looking at the world of risk-taking and how people quantify it, how they work through it, how they live with it, specifically through the lens of professional poker players, who have a very high risk tolerance but also pride themselves on being very good at quantifying risk and chances and everything.
I thought it was an interesting book in that sense. If you like poker, you will like a lot of the stories that come out of it.
One of my biggest takeaways from this, and this is why I wanted to give it as a recommendation: one of the things he talks about is how people don't take enough risks in their own careers and jobs. And that's something that I feel like, in my time, I have seen in the data world. You have these people who, in theory, are supposed to be helping businesses make better decisions and quantify risk, and for some reasons, and we could get into what we think they might be, I have noticed a lot of data people are extremely risk-averse, to the point where they don't like to recommend things that have risk to them. They don't like to do stuff in their own lives that they feel has risk associated with it. And I think it's one of these things where it's hurting the individuals, and I think it's also hurting data teams and companies, too, that we're going through this. So as part of this book recommendation, it's kind of my pitch to people to take some more risk. You cannot live a risk-free life. And you're probably quantifying risk wrong anyway in your attempts to do so. So, yeah.
So that's kind of the overview I would have from it.
Like I said, it's a good book in the sense that it's a narrative you get to go through. But I think for a lot of people, the big idea is that you're actually being riskier than you think in your attempts to minimize risk, if that makes sense.
Yeah, I'd like to drill in on that point, because this is something that I've thought about, though not in a little while. I've read some of Nate Silver's other books, and some other books that touch on this topic. And I think it's really interesting to look at it, and I'll give an example in a minute, from the business perspective: measurable versus unmeasurable risk, and then perceived risk versus actual risk. I think there are probably more axes than that. But the most interesting one to me is actually how companies treat cybersecurity and security.
Yeah.
I think that's a really fascinating one. There's one company I've interacted with where, in my opinion, the actual biggest risk for the company was the company failing, essentially imploding, due to such a high cost of ever getting anything done, from layers of process. And I've seen this happen when smaller companies fall into the trap of hiring tons of employees who are used to being at multi-billion-dollar companies, and they start running the $20 million company like a multi-billion-dollar company. It doesn't typically go well.
So from the outside: okay, the risk here, in my opinion, is that you don't land clients and the business shuts down. You don't retain clients, you can't move fast enough to satisfy the needs of your existing clients, and your business shuts down. That is actually the biggest risk.
Yeah.
But a lot of the conversations internally are all about minutiae around security, very specific security protocols and this, that, and the other, because they had a cyber event several years ago that was a big deal, and it caused a lot of disruption for the company. So it's so interesting how these pendulums can swing. You've got a $20 million company, rough numbers, that three years ago was operating wildly, probably wildly insecure, not thinking about it at all, but with maybe, you know, a little bit better go-to-market motion, a little bit better speed, getting things done for customers, right?
And then there was this major cybersecurity thing, a black eye for the whole organization, lots of drama, and you fire a layer of management, you fire a bunch of people. Then you bring in the risk-free team. You know, you bring in, oh, they worked at Fortune 500 company X and company Y, and they're going to eliminate all the risk. And they do the job that you hired them to do, in a very real sense, and eliminate all the risk.
But what you don't realize can happen is you essentially break your go-to-market machine. You break the service where you used to be fast and responsive, and now everything's layers of ticketing systems, and it takes three months to onboard a client because of the security protocols. So you break all that. In a sense it's more organized, and in a sense it's more secure, for sure, but you break the thing that your customers loved about you.
And then you can accidentally, essentially, kill the whole company. So, great, you've got this locked-up, secure, process-driven thing that's going to die.
Yeah. No, and it's also that idea of not being able to tell the difference between when risk is okay and when risk isn't okay. Because you get into situations where it's like, oh man, if we do this, as a company, it might hurt us or something like that. And it's like, yeah, we are already slowly dying. It does not matter at this point whether we hold it off for six months or something like that. You need to do something different there, versus other situations where it's like, well, no, we don't need to take as wild of a swing. But you're always going to take on some level of risk. You can't go anywhere if you're not taking on some level of risk.
Right.
And that's where I've seen that with, you know, data teams that have analysts on them, or when you have engineers recast as analysts, which is usually not a good role for them anyway. And there's this complete reluctance and refusal to make a recommendation, because the recommendation could be wrong, or there could be this, or there are pros and cons to each choice. And so they try to hide behind this, well, here's what the data says. And all you're basically doing is showing a bunch of information, but you're not telling people what makes sense to do. And now you get into a situation where they're like, well, but I could be wrong, I don't want to be wrong, this sense of, I'm going to lose something from doing that. When the reality is that everyone's just getting pissed at you, because all they feel like you're doing is compiling spreadsheets and handing out charts.
Right, right.
Like, you're not of any value if you're not actually pushing something forward.
Yeah.
Well, but the problem is the incentives. Go back to the security thing. If I'm the security officer, or the person that got tasked with security and it's only my part-time job, which is how it is at a lot of companies in this market space, and I have a major security incident on my watch, I'm held responsible and bad things happen to me. Maybe I get fired. Maybe I get demoted. Whatever. Really bad things happen. Let's flip it the other way: if I'm going to have a massive security budget, spend tons of money on it every year, lock everything down super tight, and get in the way of everybody working, I can just say, well, it's in the name of security. And then, say I'm the typical CEO. I don't know who's right. I don't know if, with less security, we'd still be fine. Like, how am I supposed to know?
Right.
And in one sense, how is anybody supposed to know? Because cybercriminals are getting better and more crafty every day. There are new attack vectors. There is a real sense, and this is one of my favorite topics when it comes to risk, in which it is nearly impossible, or just impossible, to fully quantify: hey, what's my cyber exposure at my company? It's like, well, do you have people that work there? Then you have exposure. Even good people. People are the biggest weakness you're going to have in any company. And then obviously there are some really great tools out there, layering in AI solutions, right, and all sorts of things, from your inbound communication, from your network perspective, from your desktops, whatever. So there are tons of good solutions in the space, and people that are good at implementing them and such.
But to me, I think it's a fascinating space, because the people that are able to nail the "hey, let people still do their jobs" part of it are the ones that can really take on so much of the market, because they can balance it reasonably with risk. And it's not an either-or. There are plenty of secure solutions out there, I think, that don't necessarily have to make people's world impossible. But there are at least occasional trade-offs, occasional small trade-offs, and I think, at least for a while, maybe this changes, the people that can successfully quantify, hey, this trade-off makes sense, we're going to make it, that's valuable and very hard to do. And because it's hard to quantify, it's easier to opt toward, well, let's just be safe. Right?
Yeah.
I think you're also getting toward the two things that I see from that. There's this quest for certainty, and there is no certainty. You are always bearing a certain level of risk one way or the other. You cannot have certainty. You can have clarity on what your strategy is going to be and the risks you're willing to take, but you cannot have certainty.
And to go with that: when you're in that position as a cybersecurity person, and I would say it also applies to a lot of data teams, you're in kind of a wicked learning environment, where cause and effect are not always going to be coupled. Even more so, you can do everything "right," quote unquote, and it could still not go well.
Yeah.
Right. You're a security person, you can find the perfect balance, and you still have a breach. And now it's your fault.
And that's where I really feel for security people and teams on this. Because you could have a security breach that's literally a one-in-a-million thing, an immediate exploit of a bug nobody knew about, and they got in through this one thing, and you had always, historically, patched your firewall every week. You could be totally on it, and then there's this breach, and it's literally not your fault. And the exact same thing could happen to somebody that's completely lax and doesn't know what they're doing. I mean, as a technical person you could probably suss that out, but downstream, to customers of customers, nobody cares, right? It happened, you know.
Yeah.
And to bring it back to the book, this is one of the things that, if you're going to be a good poker player, you have to learn to do: can I quantify the risk based on the knowledge I have? And am I okay with the fact that, yeah, I should win this hand 75% of the time, but that still means one out of four times I'm going to lose? It's not about the result, necessarily. It's about the process. And I think for a lot of data teams, and I've written about this before, there's this idea that you've got about two years if you're a new data team, you know?
Yep.
And there is a chance that you will do everything right. You'll work to build the right culture, get the right foundation, and you will not get a project that will actually get you what you need. And you're going to be out in two years. And how are you going to handle that? Is it something where you can look at it and say, you know what, I know this will work, and I was just unlucky in this situation? Or are you going to overreact and be like, okay, I don't care about any of that, we just need to get things to the people right now, as fast as we can, and we will just bubble-gum-and-duct-tape it for as long as we have to?
Yeah.
Well, since we're coming up on fall, and we've talked about this before, it's that football analogy, right? You come in as a head football coach, and you've got, I don't know, maybe a year nowadays, maybe two years, depending. It's a similar thing, where it's like, okay, recruiting's broken, I don't have the talent I need, I've got a number of coaches that I need to fire, but I have to keep them for a certain amount of time because my boss told me to. You could have so many variables, because they just fired the previous head coach and they're going to be paying him for the next four years, so they don't want to do that with any other coaches.
Yeah, they don't have money. You don't have money to recruit the coaches you need. And I think people end up in the data equivalent of that.
Yeah. And part of the people that are successful, to be quite honest, are the people that suss out the situation ahead of time and don't take the job. Honestly, that's part of it.
And then the other part of it is the people that, in that context, can develop the skills, tools, and processes to be successful, and realize in two years: all right, I was essentially set up for failure. I couldn't have been successful. But I can go somewhere else, and I learned what I needed to, refined what I needed to, and I can be successful somewhere else. Both of those are options.
Yeah, and sometimes it's going to be a thing, because the football analogy is my favorite one for that. It's like, what's one of the most important things if you're going to be successful in the long term as a football coach? You've got to have the right culture in place that's going to sustain you. And you know what doesn't win right away? Having the right culture in place. And so you get this weird balance between those. And you can look at really good coaches, and it's a little bit of: the pieces had to fall together. And they didn't fall together the first time, but they fell together the second time. Or they were there the first time, but not the second time. And they're still good coaches, still great coaches; they still have it. But there was that one situation where it didn't work.
You can even look at the former head coach of the Carolina Panthers, Matt Rhule, right? Very successful before, and he's been successful since. There were things that got in the way of him being successful there, but it wasn't like he was necessarily the one who completely screwed it all up or something. You can still see some of the same good qualities there. But it didn't work out, and he didn't have time.
And there are a bunch of things that go into that. And I think with a lot of data teams, you get that too. And kind of like what you said, you've got to be willing to have that idea of: I may do everything right, and in two years it won't work.
Right. And even in the situation where you're like, this is a slam dunk, it doesn't always work.
Yeah. And can you handle that? Are you okay with that? And can you tell the difference between, here's what I need to change, and, here's what just didn't work out this time?
Yeah.
And for data teams, you're in a support role, right? It's a support role. And imagine you were working on a political campaign, helping to try to get somebody elected. It's like, okay, cool: I can nail it as the data team here, but if we lose, we lose, and it's not really my fault. I can contribute to winning by doing whatever data teams at political organizations do, not super familiar with the space, but I can't really affect winning by any direct contribution, even if I'm over data for the whole thing. I mean, I can affect it to some extent, but there can be things working against me where, literally, this won't happen regardless of how hard I work or how good I am.
You can have the best understanding of the electorate. You can have the best targeting out there. You can know where all the persuadables are. If your candidate is a terrible speaker, or if their policies are just not popular in that cycle, it doesn't matter at that point.
Yeah, exactly. Right. Can you tell I actually worked on a campaign?
Yeah, I brought that up when I was about halfway through this. I was like, oh yeah, Matt worked on one of these. I'm glad he's able to speak to it.
Awesome, man. We could go forever, but I think we're almost at the buzzer. Any other little tidbits from your kind of six-month summary here? Any learnings or tidbits from your six months of reading and learning?
I will. So this is always my plug to tell people to read more fiction, because you will learn more, especially if you want to be a leader in any space. You have to know people, and you're not going to learn that from a textbook. You've got to learn it from actual people, and fiction and novels and things like that are a very good way of doing that. So every time I talk about this, it's always one of the plugs I put in there. I think that's always a good one.
Nice.
So, yeah.
So there you go.
On the Edge by Nate Silver.
And read some fiction.
Get away from all of the pseudoscience crap out there.
Yeah.
All right.
Awesome.
Thanks for coming on, Matt.
And we will catch you next time.
The Data Stack Show is brought to you by RudderStack.
Learn more at rudderstack.com.