The Data Stack Show - 260: Return of the Dodds: APIs, Automation, and the Art of Data Team Survival
Episode Date: September 3, 2025

This week on The Data Stack Show, the crew welcomes Eric Dodds back to the show as they dive into the realities of integrating AI and large language models into data team workflows. Eric, Matt, and John discuss the promise and pitfalls of AI-driven automation, the persistent challenges of working with APIs, and the evolution from big data tools to AI-powered solutions. The conversation also highlights the risks of over-reliance on single experts, the critical importance of documentation and context, and the gap between AI marketing hype and practical implementation. Key takeaways for listeners include the necessity of strong data fundamentals, the hidden costs and risks of AI adoption, the importance of balancing efficiency gains with long-term team resilience, and so much more.

Highlights from this week's conversation include:

Eric is Back from Europe (0:37)
AI and Data: Jurisdiction and Comfort Level (4:00)
APIs, Tool Calls, and Practical AI Limitations (5:08)
Scaling, Big Data, and AI's Current Constraints (9:16)
Stakeholder-Facing AI and Data Team Risks (13:20)
Self-Service Analytics and AI's Real Impact (16:04)
AI Hype vs. Reality and Uneven Impact (20:27)
Cost, Context, and AI's Practical Barriers (25:25)
AI for Admin Tasks and Business Logic Complexity (29:13)
Tribal Knowledge, Documentation, and Context Engineering (32:07)
AI as a Productivity Accelerator and the "Gary Problem" (35:10)
Healthy Conflict, Team Dynamics, and AI's Limits (39:15)
Back to Fundamentals: Good Practices Enable AI (41:47)
Lightning Round: Favorite AI Tools and Workflow Integration (45:56)
AI in Everyday Life and Closing Thoughts (48:14)

The Data Stack Show is a weekly podcast powered by RudderStack, customer data infrastructure that enables you to deliver real-time customer event data everywhere it's needed to power smarter decisions and better customer experiences. Each week, we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
Hi, I'm Eric Dodds.
And I'm John Wessel.
Welcome to The Data Stack Show.
The Data Stack Show is a podcast where we talk about the technical, business, and human challenges involved in data work.
Join our casual conversations with innovators and data professionals to learn about new data technologies and how data teams are run at top companies.
Before we dig into today's episode, we want to give a huge thanks to our presenting sponsor, RudderStack. They give us the equipment and time to do this show week in, week out, and provide you the valuable content. RudderStack provides customer data infrastructure and is used by the world's most innovative companies to collect, transform, and deliver their event data wherever it's needed, all in real time. You can learn more at rudderstack.com.
Welcome back to The Data Stack Show.
This is Eric Dodds reporting from the United States. I've been in Europe for the entire summer and, because of the time zone offset, missed a bunch of shows, but I am here IRL with John and the Cynical Data Guy. We thought a little reunion episode would be great now that I'm back in the States. So, guys, good to be back. Thank you for just granting your presence to us.
You're welcome. You know, we were just lost for two months without you here to just guide us through this wilderness.
Sometimes, to understand the value of something, it needs to be taken away for a time.
We're glad to have you back, Eric.
It's good to be back.
It's good to be back.
Okay, I was in on a couple episodes,
but, okay, I have an AI question
that I want to lead the show off with,
but what did you guys talk about?
Like, what were the main topics that emerged over the summer?
I don't know if I remember anything after we stopped talking about it.
I said, John, did you start? You just, your brain, it's vacuumed. You're here, you do it, you walk out the door.
It's like severance, you know.
Yeah, it's like severance, yeah.
Shout out to severance.
A lot about data and AI and a lot of AI.
A lot of data and AI, a lot of AI.
That's the summary.
I don't know, we've got a lot of good topics for today
that I think are at least tangentially
related. But one of the ones I'm excited about is, well, you know what? This actually perfectly
relates. We talked about AI and guardrails. That came up several times. And we're going to talk
more about that today, I think. Yeah. We also, we did some LinkedIn stuff as we always do. And then
I think we got desperate and started messing with the format just to make it different for ourselves.
Yeah, a little bit of commiserating, right? We spent some time doing that. Okay. All right. So good
summer. Yeah. Good summer. Okay. Well, here.
Here's something that I've been thinking about.
It's, of course, data and AI.
But one thing that I think we're seeing emerge now,
and this hopefully transcends a lot of the marketing promises
and sort of the projections about where things will be,
which, of course, the models are going to get better.
But generally we know there are areas
where LLMs create dramatic and unprecedented
improvement in a process, a task, you name it. And then other areas where, okay, maybe there's a long
way to go, right? And so a good example would be math, right? That's sort of one of the
notorious things, you know, LLMs and math and, you know, that whole side of things. And so
when it comes to data, one thing I'm interested in, you know, especially to ask both of you
because you will bring the mindset of someone who's, you know, run a department,
run an entire, like, technical organization in a company.
Where is the jurisdiction of AI and where is your personal comfort level
in terms of incorporating it into processes that generally were almost exclusively deterministic,
right?
You know, let's say with the exclusion of machine learning,
where you're intentionally introducing
like a probabilistic practice
and this is layered, and so there are a number of facets to this question, but
one thing that's interesting is like
if LLMs are not good at math
they are getting better and better at writing code
code can generate math
and so you can sort of back into it that way
but you're also piling in
a lot of you know sort of process lineage
there that can be difficult to trace down
So when you think about what you were delivering, you know, owning a data organization
and where you would be comfortable with layering non-deterministic elements into that,
how do you think about the jurisdiction there?
I'll go first.
This is actually, this is going to get really specific really quick.
I think one of the things that we were talking about before this show was the power of these models,
with tool calls, right?
MCP, servers, tool calls,
whatever you want to call it.
And the, which is interesting,
and we were talking about
some of the limits and frustrations
around context length and stuff
before the show.
But there's another component here
that I think people failed to realize
for sure in the broader business context
and maybe even somewhat in the data team,
which is typically more dealing
with databases versus APIs.
Because practically like the MCP layer,
tool call layer is just on top of an API.
Right. And it's how bad most APIs are. They're bad. They're bad in, like, two or three specific ways. One, if you want a bunch of data from an API, that's usually a problem.
Yeah, yeah. Number one.
Number two, if you want to send a bunch of data.
Yeah, yeah.
Number three, if you want webhooks that work all the time, every time. Like, that's actually a problem too. So there's, like, three major problems. But with MCP or tool calling in front, using AI, you can do a ton of things and a ton of cool workflows.
But then there's a practical of like,
but most of these APIs are not very good.
And this doesn't fix that problem.
Yep.
And now you've got two layers.
One, the AI, like, hallucinates or screws something up.
But two, the more practical layer of like timeouts, issues.
Sure.
Like whatever various like, you know, errors from the API itself.
Yep.
So I don't know if that answers your question, but I think that's, like, the first practical thing: I've got to get past that to fully answer that question, if that makes sense.
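For illustration, here is a minimal sketch of the two failure layers being described when an agent's tool call sits on top of a real API. The endpoint, payload shape, and retry policy are hypothetical; the point is that transient API failures (timeouts, rate limits, server errors) need handling that is entirely separate from validating what the model asked for.

```python
import time
import requests

# Hypothetical tool-call wrapper; the URL is a placeholder.
def call_tool_api(payload, url="https://api.example.com/v1/orders",
                  max_retries=3, timeout=10):
    for attempt in range(max_retries):
        try:
            resp = requests.post(url, json=payload, timeout=timeout)
            if resp.status_code == 429 or resp.status_code >= 500:
                time.sleep(2 ** attempt)  # rate limit / server error: back off, retry
                continue
            resp.raise_for_status()
            return resp.json()
        except requests.Timeout:
            time.sleep(2 ** attempt)  # plain old API flakiness
    raise RuntimeError("API unavailable after retries")

# The other layer is separate: even a 200 can be a "successful" wrong answer
# if the model hallucinated a filter or field name, so validate the request too.
```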
Because, like, what I hear you saying,
if I had to summarize it, would be,
I need to actually put some use cases into production
in order to start to answer that question,
but it's really hard to do that.
Right.
With APIs that are not.
Because there's a bunch of really neat,
let me just give one example,
like Shopify has an MCP server now.
If you're an e-commerce company, like,
neat, let's hook that up, let's do all this cool analysis.
this would be great.
And Shopify actually has a pretty decent API,
so maybe this is a bad example.
But then it's like, okay, do I really just want to give somebody, like, Claude and this thing? And then, like, they get just right before the answer, and then Claude runs out of, like, context, and you have to start over.
Like, is that what I want to give to people?
Even if the Shopify MCP is great
because their API layer is great,
you're introducing another set of dependencies
that actually,
interestingly enough is blurry isn't the right word, but like it's hard for Shopify to control
that, depending on the user experience they want.
And I said it was pretty good. I didn't say it was great.
And it's, okay. Anyone from Shopify listening, please come on the show.
It really is good. And because of the MCP stuff, they have all these like extra scale problems,
extra demand on the API that like. Totally. But then the question is, what do you hand off to the user?
Right. And so like, are they going to handle it for you?
Or do you sort of bring your own, you know? It's like, that's tricky.
Well, it's also, I think what you're talking about is one of those,
that it's not like you can, well, we'll POC this and then if that goes well,
because it's this kind of, it's not that it's going to go wrong
or something's going to, you're going to have issues like the first time you use it,
it's that it's going to be, it's good, it's good, wait, it's broken,
wait, it's not, it's returning a bad call or something like that.
And those are much harder.
I mean, that's like an overall data problem, I feel like in a lot of these,
is that scaling
and that kind of like
long-term usage of it.
Yep.
The idea of like,
I can make it work
in a condensed time frame
with a limited amount of data
or like a limited number of uses.
But once you start putting it out,
we don't really do this like gradual thing.
We don't go 10, 20, 30.
We tend to go like
1, 5, 10, a thousand.
Yeah.
Well, that causes these problems.
That's where you find a lot of these problems.
That's the perfect illustration. Because what I'm finding, and it's such an interesting trend in data, is, like, I don't know when you would say the spreadsheet moment is, but, you know, years ago when spreadsheets became ubiquitous, there's a moment there. And then we've got this arc in data for, like, 10-plus years of, like, big data, like Hadoop and Snowflake and things like that. And then when you introduce the AI stuff, it's like, that does not work well with big data. The context is super limited, the compute... It's super expensive.
But when we've been on this big data trend
and people are like, well, why doesn't it work?
Like, I've got millions of records here.
And I want to do like this.
And you're like, it doesn't work.
I just, I want to take a moment to pause here.
And maybe this is just because I've been away for a little bit.
But it really struck me that you were just like, you know,
Hadoop and Snowflake.
You just lumped them together.
Basically the same thing.
Lumped them into this.
Not at all.
That's all this is.
Snowflake.
Yeah.
Snowflake team, I apologize.
No, my humblest apologies.
Like 10, 15 years of history.
Of course, you know, we know
that people at Snowflake and they're great
and hopefully they get a chuckle out of that.
But I think in the best way possible,
that's kind of a good thing
in that they are like a massive
data, you know, enterprise data cloud, right?
Which is, so maybe that's a sign
of their success.
Anecdotal examples of, like, the big data movement, which we probably wouldn't call big data anymore.
But the point being that, like, we're trending toward more and more data, more and more compute-optimized, working on a lot of data. And then, like, we throw in AI, and it's like, this is not compute-optimized or able to work on a lot of data.
Right. But would you, so the interesting thing is, let's think about Hadoop and Snowflake, right? The things that you're talking about around API reliability and scale and all those sorts of things, like, what's interesting is, if you think about the Hadoop and Snowflake ecosystem,
you know, and if you, let's talk about Hadoop first, right, is that, yeah, you still face, like,
similar issues. It's not like anything you mentioned as new is a new problem, right? But in the Hadoop
era, you have all the knobs at your disposal, right? And so you can actually, like, deep, you know,
you can sort of do whatever you want with the, you know, on top of the core system, right? I mean, you know,
you have, like, large providers, you know, Hortonworks and others that are, like, productionizing
a lot of that and sort of making things easier, right?
Then you go to the Snowflake era, and the promise of the cloud is that they're doing a lot of that for you, but then they also give you APIs and other things. But then they have these, you know, they are, they're actively solving those problems, right?
And so it's
not a new problem, but sort of
the, they are
contained, right? And I think
that's what's interesting about
what you're talking about is we're introducing
problems from different sources,
right? It's not like the
system is contained, and you can sort of address those problems.
Well, I'd say, I think I can speak for most data teams. Most data teams are like, yeah, we have to interact with APIs, but we do it way upstream, and we don't want to do any of our work through direct interaction. We want to, like, abstract that, like, put this here and, like, hire Fivetran. If we do it ourselves, we want to still, like, highly abstract that out.
Yep.
Get it into our data cloud as fast as possible and then not mess with APIs. But then if you're going to introduce, like, tool calling and MCP and all that other stuff, then you're like, oh, this is a lot more messy.
Yep. Right.
I don't know. What do you think, Matt?
I think you got some good points there. Like, it was already not clean in a lot of ways, and so I think it's just kind of dealing with that.
And, you know, I don't know, APIs have always been one that like you can see the power of
APIs.
But from a data standpoint, they're a pain a lot of the time.
They just are.
And so you're going to run into those problems.
I mean, I think if we go back to the original question and kind of like, where do we see it?
I think I looked at it slightly differently from the way you did. I was thinking of, where would you put AI in, kind of, like, a data team? And my distinguishing line at the moment is probably, like, is it stakeholder-facing or is it in the back? And I would not make it stakeholder-facing for a data team. So I think you're going to run into problems with that.
And I think it's also, like, it's going to sound a little weird, but you're feeding into, like, bad assumptions if you start, because everyone's going to want to see it.
Yep.
But it's like, if you're going to have it there and it's going to be like, you know, oh, we could have it, like, present to the user or something like that. It's like, no, I don't really want to do that. I don't know if it needs to be there in that part of the stack.
I totally agree with you in principle because you are, you're introducing a huge amount
of risk for the data team itself.
Right.
Right.
which is problematic
but at the same time
it almost seems
inevitable
which is a very strong word
but there's so much
AI being crammed into
analytics tools
where it's like well does the
stakeholder just have this expectation
that they can ask
an anonymous, you know, chatbot or agent a question
and they get an answer
you know which a lot of those are
self-contained
user-facing, you know,
analytics tools where they control the data model.
I mean, right, there's a very large amount of
control there, which is way harder
to achieve, you know, when you're
just producing
analytics, right?
Right. Running sort of the whole end-to-end stack.
Well, I think there's also that difference
between, are we talking about, like, a data team where it's for, like, a product that is going to go out into the wild to customers? Like, most of the work that I did with teams, we were pretty internal.
Right.
Your customers are other teams, and the customers are, like, the marketing director or the product team or something like that.
And so I think even there, you know, it kind of has that same pull to like, oh, well, we'll make BI self-service and that'll make our jobs easier.
And it's like it doesn't really work.
And I think when you try to do it, you end up turning the AI into kind of this little parrot, a very expensive parrot at the end of it.
It's like you're doing all the work in the back anyways.
And then it's just there to summarize what you've done nicely for them.
But it's not going to be the one that's doing the analysis a lot of the time.
And it's not going to be the one that really is like, oh, it found the insight.
Right.
No, it's probably not going to.
Okay, so I have a question for both of you, because we've talked about self-service, you know, self-service analytics before.
A couple times.
A couple times. How do you see it? Is AI going to be a big step forward in that, or not?
Here's the funny thing
that came to mind.
So I was fairly involved
in DevOps stuff
four or five years ago.
And one of the things
that is coming to mind is that there's this chatbot era.
Right.
Did you ever use Hubot, Matt?
No.
So like the idea here
is it's the same concept
as a lot of these AI bots
but it is
but it's like deterministic.
Right.
You could give the thing
a command, and we'd do this all the time
at a previous company like, hey,
well, like, log me out of
this thing. It was like a thing that
like your session would get stuck
and you'd like tell it to log you out or tell it
to like reset this thing or do this thing.
So it was a little bot, and it was deterministic, with commands.
So like that version, I mean, it's interesting.
Like a lot of what people want
out of the AI bots, like you could
probably index
10 queries.
Yeah. And do like a little
search match based off of the words and like get the query to run and like almost simulate like
what the AI would be doing. The AI still provides value in the like, you know, in the parsing.
But like that version, I think a lot of companies would actually really like and be impressed with
just that, which therefore is actually eminently possible with AI. And, you know, it's a little more
sophisticated and more accurate than doing just, like, what I described.
Yeah, I mean, I think it's kind of one of those, is it going to make a big step forward? I don't know. Is everyone going to try to make it make a big step forward? Yeah. I think this self-service, it has this, like, siren's call to it in a lot of ways, because in theory it should add a bunch of value. It should make life easier for everyone involved. But reality never really quite gets there.
And so I think this is another one.
It's the same type of thing as trying to go, like, text-to-SQL and stuff like that, where there's this idea that, if we could just figure it out, think of how great this would be. But the practicalities of it are always kind of off a little bit. And a lot of times you're still not dealing with the problem of, like, the zero-to-one step. And if you're not getting that zero-to-one step, because they don't understand what the data is, or it's like, you know, you're just giving them a blank screen
and you're like, ask, like, what do you want to know?
Well, I don't know.
You know, maybe if you're looking at a dashboard,
they can get a little further along that,
but it's still one of those where, like, you know, even for a human analyst, this is hard sometimes.
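As a concrete sketch of the deterministic, Hubot-style version John described, here is roughly what "index ten queries and do a search match on the words" could look like. The query names, keywords, and SQL are invented for illustration.

```python
import re

# Hypothetical index of a team's ~10 standardized, pre-approved queries.
CANNED_QUERIES = {
    "weekly_revenue": {"keywords": {"revenue", "sales", "week"},
                       "sql": "SELECT ... FROM revenue_weekly"},
    "churned_customers": {"keywords": {"churn", "customers", "lost"},
                          "sql": "SELECT ... FROM churn_monthly"},
}

def match_query(question):
    """Pick the canned query whose keywords best overlap the question."""
    words = set(re.findall(r"\w+", question.lower()))
    best, best_score = None, 0
    for name, entry in CANNED_QUERIES.items():
        score = len(words & entry["keywords"])
        if score > best_score:
            best, best_score = name, score
    return CANNED_QUERIES[best]["sql"] if best else None

print(match_query("what was revenue last week?"))  # -> the weekly_revenue SQL
```

An LLM can replace the keyword match with better parsing, but the answer set stays finite and deterministic, which is the appeal.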
Well, in my two, like, data roles where the teams were most successful, in both of them we had standardized reports, and there were a few of them, like, less than 10, seven or eight. People knew what they were, and they knew what the columns meant.
Yeah.
Like, if you can get there and, like, everybody's on the same page, then the AI stuff would be amazing, because it's, like, finite. Like, we know exactly what it is, we know what everything means, we can train the AI on what it means.
Totally.
Super cool stuff with that.
I mean, on one of the most successful teams I worked on, we had one query that did 80% of all the work.
Sure.
And so onboarding wasn't as hard either, because it was literally just, this is your query. Yes, it's a very gnarly 14-join query and all of that. But like, this is
what you're working from. And we've already renamed everything. And we've already done all that work
for you. And so you're just starting from there. And if you need to make changes around the
edges, you can go look at it or you can go ask someone. And so you could do things a lot
faster because of that.
Right. And, I mean, there are some businesses where that doesn't work. Like, it is more complicated than that, and that's valid. But even in that scenario, like, still getting to, these are the five for this team, or these are the five standards and we're always going to derive from the standard, stuff like that is still the hard work. And then after you have that done, and everybody agrees on what the definitions of things are, the AI stuff, like, I don't say it's trivial. There's still some effort, and there's tuning and stuff, but it's possible.
marketing sort of outpaces the reality,
which is not a new topic on this show.
We've been talking about this for years.
But in particular, I think,
because of the pace of change
and because of how magical it can feel,
you know, it's easier to believe that, right?
But it's not a cure-all for the foundational stuff,
especially when it comes to data, right?
You actually have to get the underlying data model.
Well, and I think especially
early on, there was a lot of thought of almost
like we can sell this because
in six months or eight months or
12 months, it'll be... And I don't really feel like that's happened, you know? Like, there's progress that's been made, but it has not caught up to what the claims were, you know, 12 months previous to it.
Well, yes.
I mean, I think, I mean, you can make an argument for, you know, the need to tell a story for the valuation of a, how do I say it diplomatically? But I think this is a tricky thing to navigate, mainly for people who are trying to figure out how to productionize this stuff, right? There's a lot of promise out there, and in some specific areas, what is possible is extraordinary, right? I mean,
Think about translation.
Yeah.
Think about, you know, there are areas where it's like, okay, this is clearly such a fundamental leap forward that translation will never be done the same again.
Right.
And it's actually already having really outsized impact on, you know, the labor force, like within translation, right?
Right.
Because it is that transformative.
Yeah.
But that is not evenly distributed, right?
There are a number of areas where the technology is so dramatic, completely changing something,
but that's not evenly distributed, right?
And I think that's part of what makes that very difficult, right?
Is that, okay, well, this could transform absolutely everything, every part of every domain, right?
And that's just not true.
Well, and I do think, because we've been mainly focused on the consumer, like, customer-facing, whatever you want to call it, piece. There's some pretty transformational stuff, I think, on the back end. Like you just mentioned translation, migrations between systems or, like, translations between languages, like, you know, Python to whatever, or R to Python or SAS to Python or whatever.
Yep.
Like, I think that's pretty transformational.
I imagine there's like a number of companies that may decide to take on some efforts that they wouldn't have if that weren't available.
Yep.
So I do think it affects data people's workflow and people that are writing semantic layers and SQL and things like that for sure.
And then the other one that I don't hear a lot about, that's possible now, is using these tools inside databases, like Snowflake, for example, and many of the others. Like, there are these foundational models inside the database. So, contrary to what I'm talking about with tool calling and all that other stuff, we are seeing, I'll say, early implementations of the foundational tools in the database, where it's like, hey, I want to categorize all these products or something. You can make a tool call or use Claude or whatever to do that. And I think there's some neat applications there with forecasting, things like that. I do think it makes an impact, but it's a little bit more hidden, and there's less, like, marketing push behind that stuff, because we kind of already had a wave of that with ML. And, like, yeah, I don't know. But maybe a little less hype.
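As one concrete flavor of what John is describing: Snowflake exposes hosted models through SQL functions under Cortex, so a categorization task can run where the data lives. A rough sketch, assuming a Snowflake account with Cortex enabled; the connection parameters and the products table are placeholders, and available model names vary by account and region.

```python
import snowflake.connector  # pip install snowflake-connector-python

# Placeholder credentials; assumes Cortex is enabled on the account.
conn = snowflake.connector.connect(
    account="my_account", user="me", password="...", warehouse="ANALYTICS_WH"
)

# SNOWFLAKE.CORTEX.COMPLETE runs a hosted model inside the warehouse.
# 'products' is a hypothetical table; the model name is illustrative.
sql = """
SELECT product_name,
       SNOWFLAKE.CORTEX.COMPLETE(
         'mistral-large',
         'Reply with a single category word for this product: ' || product_name
       ) AS category
FROM products
LIMIT 10
"""
for row in conn.cursor().execute(sql):
    print(row)
```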
I have an interesting view on that, actually. Okay. Because let's talk about AI, like, within a database. Okay, there are tools
out there that will, you know, they're like listeners, you know. And so you, like, put it on your database, and it, like, sort of runs an algorithm that says, great, there's something that seems like it changed. You know, let's call it observability.
That's been around, but let's sort of categorize
that under ML
that had the
great fortune of good market timing
so that they can just market it as
AI when it's all ML under the hood.
Right, right. So great, good, that's awesome.
Like, that is wonderful.
Okay, that's, you know, that's been around.
A chatbot inside of a database,
like, is that useful?
Okay, that's a hard question to answer
because there are probably some
instances in which it makes certain things way easier, but it's not the interface through which
you interact with the database.
It's just not, right?
And I think part of the challenge there is that what you actually want is something more
akin to the observability tool that is like constantly curating the things that would be
important or interesting or helpful to you, right? But the reason that isn't happening on a large
scale in production is because it's way too expensive. It only makes sense to incur costs when
the user explicitly says, I'm submitting a prompt, right? But what you really want, you don't
want a blank page. No. What you want is something that is presented to you that, you know, pre-
curates, like, a bunch of different stuff, right? And I mean, I'm not saying that people aren't trying that or that it doesn't exist. But that is generally not happening, and in my view, that's actually primarily a cost thing, right? I mean, the loss-leader mentality for most of these products is mind-boggling, on the scale of billions, right? And so there's a huge amount of subsidization happening.
Right. Right. With the expectation that the cost will come down over time.
Exactly. Essentially. Exactly. Right. But because of that, like, because it's already somewhat upside down, pre-incurring a bunch of that cost before you know exactly what the user wants, or whatever, trying to do that is just very expensive, right? And so it's like, okay, well, let's just use a chatbot, and we're going to incur cost when the user says, you know, we're going to incur cost.
Right. Well, I mean, there's, like,
there's opportunities in places where
because we always think of it a lot of times
as you know real time with the user
but there are opportunities where it's doing stuff
in the background and it's like you submit
something and you have to, you know, you wait and then it'll come back. But that cost part of
it does come into it. I mean, because the way you generally have to work with this stuff
because of context windows is you're essentially batching it in a lot of ways. You're summarizing
chunks of it, whether it's text or whatever, and then you're summarizing the summaries. And you
keep doing that until you get it small enough to fit in one context window. But each one of those
is incurring input token,
output token costs.
So you can look at it and say like,
oh, well, my total tokens is, you know, whatever,
and that should cost me a dollar.
Once you've gone through 16 iterations of this,
now you're spending $30, $40 on this.
And that's kind of where that all adds up.
Because, you know, it's one thing of like,
if you get it to a point where you're like,
I don't care how long it takes to do certain stuff.
Like I think that's something that, like,
legitimately could be useful with this is like,
I don't care how long it takes
as long as it's not people doing it, right? Having a computer work for 20 hours on something, not a big deal. But it's going to be that token cost that you've got to be careful of.
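To put rough numbers on the summarize-the-summaries pattern Matt described: each level re-reads the previous level's output as input tokens, so the billed total is a multiple of the raw corpus size. Every price and ratio below is a made-up assumption; real figures vary by model.

```python
# Illustrative numbers only; real prices and compression ratios vary by model.
PRICE_PER_1K_IN = 0.003    # dollars per 1K input tokens (assumed)
PRICE_PER_1K_OUT = 0.015   # dollars per 1K output tokens (assumed)
COMPRESSION = 0.25         # each summary is ~25% the size of its input (assumed)

def hierarchical_summary_cost(total_tokens, context_window=8_000):
    """Estimate the cost of summarizing chunks, then summarizing the summaries."""
    cost, level_tokens = 0.0, float(total_tokens)
    while level_tokens > context_window:
        out_tokens = level_tokens * COMPRESSION
        cost += (level_tokens / 1000) * PRICE_PER_1K_IN   # read this level
        cost += (out_tokens / 1000) * PRICE_PER_1K_OUT    # write its summaries
        level_tokens = out_tokens  # next level reads this level's summaries
    return round(cost, 2)

# A 2M-token corpus: a naive single-pass estimate is ~$6 of input,
# but under these assumptions the levels compound to roughly three times that.
print(hierarchical_summary_cost(2_000_000))
```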
This is tangentially related, and this is a business idea, legitimately. So if any of our listeners want to pursue this, it would be awesome.
I already call it. All right, Matt calls it.
I'm buying a domain right now.
That's practically a patent.
So, but you're mentioning the token cost and like the things like running in the background.
This does exist.
It's just upside down.
It does.
But like here's a really interesting one that I have not seen nearly enough products in this space is things with a GUI where you pay an admin to do things.
Salesforce admin, for example, or, like, Marketo admin.
I don't know.
Like all these like complex enterprisey tools that have APIs.
Like, why do we not have, like, a chat interface with some kind of, like, MCP thing to do that? And there's some things out there, but, like, that feels like...
Hold on, explain that more.
So there's a lot of companies that pay, like, a full-time Salesforce admin.
Oh, right, right, right.
Like, and it's because, I don't know where the stuff is in the menus. Like, normal people could do it, you just can't find it. And AI seems to...
Yeah. Or it's, like, so obscure. And Salesforce admin is one that's, like, pretty complicated, but there's others that, like, are not hard to do once you can find it, but you can't find how to do the thing. So it's almost like a search problem.
Yeah.
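A tiny sketch of what treating this as a search problem could look like: index the menu paths you know about and fuzzy-match a plain-English request against them. The entries are invented; a real version would index the tool's actual documentation, with or without an LLM doing the matching.

```python
import difflib

# Hypothetical index mapping tasks to where they live in a complex admin UI.
MENU_INDEX = {
    "reset a user's password": "Setup > Users > [user] > Reset Password",
    "add a custom field to leads": "Setup > Object Manager > Lead > Fields > New",
    "export a report to csv": "Reports > [report] > Export > CSV",
}

def find_menu_path(request):
    """Fuzzy-match a plain-English request to a known menu path."""
    hits = difflib.get_close_matches(request.lower(), list(MENU_INDEX),
                                     n=1, cutoff=0.3)
    return MENU_INDEX[hits[0]] if hits else "No match; ask the admin (or the LLM)."

print(find_menu_path("how do I add a custom field on leads?"))
```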
Yeah, I don't know, though, because
I will tell you my experience with that, okay?
Is that, no, I think this will change, but the problem is, okay, Salesforce and Marketo are outstanding examples, right? Because their APIs are horrible. Their user interface is a Frankenstein that's been changed over multiple decades, many years. And so, of course, that makes it more complicated, right? But they're very powerful tools. But those, to me, are not even close to the real issue. The real issue is that there is business logic that has been represented within, like, the customization options within these tools, that's very difficult to, that's probably undocumented, right?
Right.
That is a mixture of three things: like, some sort of custom field names, some sort of workflow in whatever workflow builder. Right, four things: custom field names, some sort of custom workflow in the workflow builder, some sort of custom code in the tool, so in Salesforce that would be Apex, and then some sort of integration, whether that's a custom integration or, like, a quote-unquote native, like, daisy-chained integration, whatever, right?
And making changes to a system like that
is not easy.
Yeah, for sure.
That's why you have these people in these admin roles.
But I think the challenge
is that,
okay, so I fully believe
that if you gave
an LLM
all of the needed context
it could certainly navigate through that, right? But that's the big challenge, is that all of those intricacies of the business process, the history of, like, oh, well... Because an LLM would run into a situation, and it would conclude, this doesn't really make sense, because three years ago someone made this change, and, like, blah, blah, blah. And it's like, well, you can't change that, right? We've just accepted that this is debt within the system that we live with in perpetuity.
Well, yeah, I mean, that would be the interesting thing, right? If you had the full change log of everything that's ever happened, full context, like, maybe you had to do post-training or something to really, like, cram everything in there.
It seems like that would be an application that would be possible.
I totally agree. But, I mean, actually maybe the business idea is that we will come in and document everything as context. I mean, that's the funny thing. It's like, context engineering is the actual, like...
Yeah, that's the hard problem with this stuff.
Yeah, yeah, yeah.
I mean, you just got me thinking, and it's like, well, think of the amount of money that gets spent on companies whose whole purpose of existing is, like, we just help you with implementing Salesforce or other products. If no one else, I would think they would be the ones looking at this.
And that's probably the answer, right? It is, like, a lot of these implementation companies will figure out how to leverage AI for their teams, to, like, be effective.
You know, it's a situation where enough tools are exposed to where.
you can use an LLM to actually generate
the documentation.
Yeah.
That's going to be another summary of summaries, though.
That's going to be your big thing.
Yeah, totally, totally.
Totally.
But it is, the tribal knowledge
is really hard to replace.
It's really hard to replace.
Yeah.
Well, this means, I mean, I could see some people thinking, like, okay, now is our chance to get off of, pick your tool. Because we're going to throw an LLM at it to try to, like, figure it out and pull it out, so we can then go to this tool that actually does what we want it to do, and we can clean-start it, kind of, without completely erasing everything and building from scratch.
Yeah, I mean...
Someone will try it.
I don't know if it'll work,
but someone will try it.
I think that's a totally legitimate pathway.
Just to say like, you know,
because I think the challenge
and we go back to like the promise
versus the reality.
The challenge is in a lot of those situations,
especially in an enterprise,
like you have to have someone
who can make meaning of the tribal knowledge and like all the whatever, right? But if you just
say, great, we're just going to keep that as a baseline assumption and just start building
stuff on top of it, then...
Yeah, well, and that's the funny thing, right? Like, we're talking about all these, you know, progressions toward AGI or whatever. But, to contradict my idea, honestly, there's a context in some of these where, like, you could throw the smartest person on this, with everything around it, and they would have no clue, no clue what to do, because they lack the, like...
Yeah, the tribal knowledge.
But I would say like
okay, this is something
that I've thought about a lot
and especially, you know,
it was a slightly leading question
in the beginning with saying, you know,
what's the jurisdiction of LLMs
within, you know, the data domain?
But the person
who was already
predisposed to be,
to like use the advanced technology
to be like a great Salesforce admin.
Like AI is going to be like crack cocaine
for their productivity.
It's just going to be awesome, right?
Yeah.
Well, and I think that's the right place
to like where it's going to get implemented.
And then the question is like,
does it stay there?
Like maybe it does.
Totally.
Yeah.
Totally.
But, like, it's, in the current state, a dramatic accelerant to that person, which probably makes other people on the team much less necessary,
right? A team of four becomes a team of one, like whatever. Sure. Right? Because if I have the tribal
knowledge, like I can use LLMs to do all these different interesting things like integrations. I mean,
those are domains that are extremely well documented. Even the tooling within, like, iPaaS tools or integration tools like Zapier, they're all getting better. These are all, like, publicly documented APIs where there's, like, not weird stuff, right? But at that point, I'm using a set of tools to, like, just
accelerate things that would have been very
manual, like very difficult to build
in the past. So I think
you're right on that people will
be doing that. I'm going to
give you the shadow side of that though.
That's why I had to. Which is, on the one hand, in the short term, you're going to be like, man, we cut the team down, we're doing it. It's so much more efficient. You have just made your Gary problem 100x worse.
Totally.
When Gary leaves, you are going to be more screwed than you ever were in the past.
Because now it's going to be...
Reading all of Gary's chat logs. That'll be the crucial thing.
And now you're going to have people who are... You hit the context window when trying to, like... You're going to have them manually searching through catalogs and trying to figure it out. It'll be all this thing where he's developed all of these LLM shortcuts to get the things that he wants done. You're going to be having to read, right, what was the, you know, what was the system prompt he used when he was doing this? It's going to be like, yeah, short term it's going to be lovely, and long term you're just...
It's a great point, because that's true of a lot of AI things. Like, there's a practical low threshold that makes sense, where, like, let's say we had a team of three and now it's enough work for one person. But now it's just one person, and, like, we really know that we should probably have more than one person that can do this, like, really core thing.
Yeah, yeah.
Have you solved that problem?
Yeah, it's tricky.
You know, one other thing is, as we think about that, especially on data teams, right? And so if we talk about systems integration or, you know, of course, like, a huge thing that data teams deal with is, you know, it's like the operational side, where it's like, okay, we're producing data products, but those need to get into other systems.
Right.
You know, whether that's for an internal customer or whether we're delivering a data product to a customer that actually needs to be in a client side, you know, part of the application.
Right.
It's, you know, maybe an analytics product or whatever that is.
So, yeah, certainly agree there where it's like, okay, well, the attraction of like 75% headcount savings there, you know, even if it's possible, is that wise?
I think that's a really interesting question.
But the other really interesting thing is if you think about, okay, let's go back to something that both of you said, which I've had the same experience.
Like, the most effective teams are like, okay, you have less than 10 reports, you're driving the business, the department, whatever
the scope is, off of a limited number of reports, they're really tightly scoped. There's
generally, like, good documentation around that, right? You didn't get there by everyone agreeing
with you, right? Like, what actually happened was there were a lot of disagreements that
forged, like, really strong opinions that, like, forced everyone to figure out, like, what is
our actual goal? What are the compromises that we need to make? What are we willing to give up, right?
Like, what are, you know, you can't have your cake and eat it too, right, in those
situations, right?
And so there tends to be a lot of healthy conflict that leads to an outcome like that
where you say, okay, the most effective team I was on, like, blah, blah, blah.
But one individual with an LLM cannot, in the current state, reproduce that level of
healthy conflict, right?
Because they're generally very agreeable, right?
And so, like, that's another interesting dynamic
of, you know, even if you think about founding a company
and there's the whole movement around, you know,
the first, you know, what is it,
the trillion dollar single founder, you know, whatever.
I think the number goes up every year.
Yeah, for sure.
Like billion trillion, I don't know.
With the electric bill.
Yeah, right.
But it is interesting to think about that.
The conflict often leads to the best possible outcome,
and that's actually way harder to achieve.
It's so much work to achieve that with an LLM.
Conflict sometimes, because I can think of times
where I was like, okay, I want to do
something, right? And I'm like, all right, let me go to the LLM and let me ask it. How would I do this?
And how would I structure it? Or whatever it is, right? I'm going to try something. And then I go and
talk to someone else about it. And they're like, why don't you just do like this other thing
over here? And it's like, look, oh, okay, yeah, that is the way I should do it. I shouldn't even
spend the time. The LLM's never going to do that. I agree with that. Like, I think the conflict is one thing, but you can tell it to be adversarial. And if you pick the right LLM, it will be. The other thing
is what you said, that, like, it tends to struggle with simple solutions. Like, just do this very easy, simple thing. It tends to overcomplicate things. And, again, maybe you can tell it, like, simplify, you know.
Totally.
Whatever. If you give it a complicated thing and you're like, I have a great idea, and we're going to set up a server that does this thing that has a webhook, it's like, that's great, let's get going.
Yeah, yeah.
Right. Talk to someone else and they'll be like, why are you doing that? It's just built into the product.
Yeah, it's already there.
Totally.
Yeah.
No, yeah, I agree.
Although, okay, yes, you can set it up to be adversarial, but then you're embedding your own...
Your bias is still embedded.
The reason that conflict is good is because, in healthy situations, it's generally rooted in deep conviction about something.
Sure.
Right.
You know, and that's like where it emerges.
And like from a technical standpoint, by the time you've put in, I want
to do x what's happening underneath the hood is that entire network and knowledge web has now
rearranged itself to be those things that are most probable around what you've just said it's
stuck in that local minimum a lot of times it's not going to pull itself out yeah totally right totally
One other thing. I think we probably have time for one more question.
Okay. Here's another thing that's really interesting. So if we think about a lot of the things that we've talked about, it goes back to, and this is, like, a really great
irony, I think, of this age that we live in, which is amazing. I mean, how fun is this?
Like, this is amazing to be living in the age that we are. But there's sort of this dynamic of
forcing people to go back to the fundamentals and do them well. So I'll give you one example.
So Linear has done such cool things with their MCP server, their agent technology, all those things, right? Where it's like, okay, if you, you know, and then if you think about tools like ChatPRD from the product side, right, where you can generate product requirements, et cetera, and then you can actually scope those into, like, different tasks, you know, combine those into a sprint. And then now, even with, you know, with Linear agents, like, you could actually automate some of the feature development just based on, you know,
what you've outlined as the tasks that need to be done, you know, in the project in Linear.
And which is crazy. I mean, it's pretty wild, right? But what is the prerequisite to that?
It's like writing really good, detailed, like, tickets, you know, or issues in Linear format, you know, breaking a project out into, like, logical sequences.
Like, you know, all of that stuff where it's like, okay, well, that is just generally good
practice anyways. And so one of my friends who's an engineer is like, oh, like the best
possible thing that happened for forcing our engineering team to like start writing really
good issues was AI, right? Because now they realize like, oh, well, if I want to like leverage this
to help, you know, increase my productivity, I actually have to go back and do the thing that I
should have been doing before. Right. But there was just less, you know, it didn't create as much...
And zoomed out even further, it also, I hope, puts more pressure on doing the right things to begin with. Because doing the right things, like, assuming we're working on the right things from, like, a feature roadmap or whatever, then there's a cascade down, which is really good. But hopefully that frees up cycles to work on, like, hey, are we doing the right things, all the way up at the beginning. And then you've got, like, less friction to do the right thing as far as, like, implementation with PRDs and requirements and issues, and, you know...
Yeah, for sure. And even if you think about documentation, that could, yeah, there's another one, internal or external, right? Or, like, in the context of a data team, like, you know, you have stuff that's really well defined. Is your internal documentation really good? Well, great, like, AI can be really helpful for that. If it doesn't exist, like, you're going to have to write it, and, right, sometimes AI can help with that. But, you know, sometimes, full circle, maybe, too, for teams it provides, like, more robust tooling. Like, APIs actually get better, because, like, we go through this process and, like, we go ahead and, like, build out the edge cases that we would have maybe skipped over before.
Totally.
Okay, lightning round, really quickly here, because I think we're close to the buzzer. Best AI tool you, like, used this summer, new AI or, like, feature within an existing tool. Hopefully within the data space, but if not...
here's a hack
that is pretty obvious, but
I think it's super interesting to look
at the top tools.
Like you mentioned, Linear. It's a great tool. And then go trace out their integrations, and then go look at those tools and trace out their integrations. You can have some great combined experiences. It's like we were just talking about Linear, like, with ChatPRD. Cool, like, neat tool. That's another one. Like, Linear's got a neat, like, Cursor thing where you can assign things in Cursor.
Yep. Yep.
So, like, just doing that, I think is interesting.
And watching these, like, even, like, who does OpenAI have on the stage with them? I'm like, go look up those people.
Like, for me, this is more on the discovery side, not, like, specific feedback on a tool.
But that's been a good thing for me trying to, like, keep up with this stuff
because it's so hard to even know, like, what to pay attention to.
Yep.
Matt?
Matt doesn't use AI tools.
Yeah.
I'm just kidding.
It's not like I work for an AI company or anything.
That actually makes it hard, because, like, literally I'm dealing with our internal stuff, you know, and, like, trying to help build these things. So it's like, I don't know, I don't have a good answer. I mean, in my current role we've done some cool stuff, but, like, you know, kind of consumer-based things. Trying to think if there's anything I can really come up with, because, I mean, I use stuff every day. I don't know how much newish stuff I've used, though. I've tried a couple different things and kind of gone back to the old stuff.
Yeah, it is interesting.
What about you?
Well, I mean, of course I'm a Raycast fanboy,
and so I feel like they have just continually
made the general experience
so integrated into the core workflow.
You know, integrating with, you know,
apps on your computer and tools.
So I would say that's a little bit of a cop-out
on my own question because it's not like a brand-new thing,
but it's such a good example.
I think of just a tool where like
I just generally use Raycast
because it's such a good experience
even you know
even before like you know whatever I have
GBT and cloud and all the you know
all the individual stuff but just their
attention to detail on integrating it into the core
workflow that feels like an operating system level
thing is incredible.
You know, I do wish there was more stuff that had that in there. Because, I mean, that's to me one of the big things that I would like to do, especially, like, just even in my current role, where it's like, it's me, there's no team, right?
Yeah.
It's like, there's stuff I'd like to be able to do where I'd like to be able to set it up, to be able to say, hey, go do this, go do that. You know, and a lot of the tools, it's still really hard to, like, string it together. But I feel like Raycast knits it together in a nice way. But a lot of them don't really, even, they'll say, like, oh yeah, we can do tasks and stuff, and then you'll get into it and it's like, oh, we can do this one task. Yeah, only...
It's still hard, yeah. It's still hard, yeah. And then, like, you know,
it's still not accessible to sort of non-power users. Okay, one anecdote that's really funny.
So my dad, he owns an automotive shop, and they do general stuff, but, like, he's worked on transmissions his whole life. And so, like, you know, we rebuilt a car together. So he's very mechanical. And so my lawnmower died before the summer. And now that I'm back, it's like,
okay, well, I need to buy a lawnmower, you know, because, of course, I'm
one of those people, I have to mow my own lawn. I've hated paying someone to do it.
I bought a robot to mow mine over the summer.
Like a hustle.
Well, in my backyard, you have to go up stairs. And so if the robot can go up stairs, then I'm in.
Otherwise, I have to buy two robots.
Yeah.
But either way, I was researching, like, researching lawnmowers and all my other equipment's
gas powered.
And so it was just easier to, whatever, buy a gas-powered lawnmower. Although, you know, people are trying to convince me to go electric, but maybe I'm just a traditional guy. Either way, I'm looking at the gas-powered lawnmowers
and, you know, trying to get the Labor Day sale and blah, blah, blah. And so I asked my dad,
I was like, hey, well, like, help me, you know, who makes, like, the best motor? I haven't shopped for a lawnmower in a long time. And he just immediately said, did you ask ChatGPT? I'm like, come on. You're supposed to know. Like, this is your domain. That's
like, internal combustion. And he was just like, why didn't you just ask ChatGPT?
Raycast, come on.
Exactly, yeah.
Well, my wife has gone from where she would look at me like, are you just going to ask ChatGPT that? To now she'll be like, I don't know the answer to this. Can you go ask chat?
Nice.
Yes.
Awesome.
I love it.
All right.
Well, thanks for having me back.
Lots of fun shows.
Maybe we'll have you back on our show.
Yeah.
Great. That would be a great privilege.
All right. Thanks for listening, and we will catch you on the next one.
Stay cynical.
The Data Stack Show is brought to you by RudderStack. Learn more at rudderstack.com.