The Data Stack Show - Re-Air: Ringing Out the Old: AI's Role in Redefining Data Teams, Tools, and Business Models
Episode Date: October 15, 2025

This episode is a re-air of one of our most popular conversations from this year, featuring insights worth revisiting. Thank you for being part of the Data Stack community. Stay up to date with the latest episodes at datastackshow.com.

This week on The Data Stack Show, Eric and John explore the transformative impact of artificial intelligence (AI) on technology and business. They discuss AI's rapid advancements, drawing parallels to historical shifts like e-commerce. The conversation explores the future of roles within companies, particularly in data management and SaaS products, and considers the broader implications for business operations. They also touch on the changing landscape of data roles, the accessibility of AI-driven services, the potential for AI to democratize high-value services and reshape industries, and more.

Highlights from this week’s conversation include:
The Impact of AI (1:25)
Historical Context of Technology (2:31)
Pre-existing Infrastructure for Change (4:42)
AI as a Personal Assistant (7:10)
Future of Company Roles (9:13)
Managing Teams in a Dystopian AI Future (12:31)
Business Architecture Choices (15:52)
Integration Tool Usage (18:07)
AI's Impact on Data Roles (21:53)
AI as an Interface (24:04)
Trust in AI vs. SQL (27:12)
Snowflake's Acquisition of Datavolo (29:54)
Regression to the Mean Concept (33:49)
AI's Role in Data Platforms (37:04)
User Experience in Data Tools (44:41)
Future of Data Tools (46:57)
Environment Variable Setup (51:10)
Future of Software Implementation and Parting Thoughts (52:10)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Transcript
Hey, everyone. Before we dive in, we wanted to take a moment to thank you for listening
and being part of our community. Today, we're revisiting one of our most popular episodes in the
archives, a conversation full of insights worth hearing again. We hope you enjoy it and remember
you can stay up to date with the latest content and subscribe to the show at datastackshow.com.
Hi, I'm Eric Dodds. And I'm John Wessel. Welcome to The Data Stack Show.
The Data Stack Show is a podcast where we talk about the technical, business,
and human challenges involved in data work.
Join our casual conversations with innovators and data professionals to learn about new data
technologies and how data teams are run at top companies.
Before we dig into today's episode, we want to give a huge thanks to our presenting sponsor,
RudderStack.
They give us the equipment and time to do this show week in, week out, and provide you
the valuable content.
RudderStack provides customer data infrastructure and is used by the world's most innovative companies
to collect, transform, and deliver their event data wherever it's needed all in real time.
You can learn more at rudderstack.com.
Welcome back to The Data Stack Show.
Today, you get me and John talking about data topics.
We thought it would be fun for us to just shoot the breeze on a bunch of data stuff.
And so we're going to talk about a number of different things on
the show today. John, we don't have a guest, so I'm just going to say welcome to you. Welcome to you
as well, Eric. Thank you. I feel so welcome. Okay, I feel like we're wringing the rag of AI on the show.
I actually didn't plan on that joke. I don't know. That seems planned. I really didn't. I was thinking
ringing a jibe. Oh, it worked really well. Yes. Thank you. Thank you. Talking about AI, but
really we have to because we're living through
such a fundamental shift in so many things and it's
happening in real time. I feel like when this stuff
first started coming out with the first couple iterations of GPT, it was
really over-rotating on so many podcasts and news articles
about like, okay, this is crazy, but it's come so far
that it really is, I think, the big topic, right? You know what I've wondered
recently? What other thing,
if we had as much hype about it as AI, would progress really fast if billions and billions of dollars got put into it, right?
Because part of the success of AI is not like, yeah, there's a lot of advancement there that's cool.
But it's also the crazy amount of investment from like every major technology company to make it progress.
Yep. Yes.
So I don't know. I don't know the answer to that.
Yeah. Yeah, it's an interesting question.
It's also interesting to think back on the history of technology,
even the conversations around data specifically.
And there just aren't, it seems like there aren't that many fundamental changes
as significant as this.
I would agree, and I don't think we've talked about this before,
but one of my original attractions to data
was that it wasn't going to change so much, right?
So, like, it was on the list
because, like, you get into tech and you're like,
oh, front end, like, web frameworks,
the joke's always like, oh, that changes every, like,
five minutes.
There's always a new one.
There's always something new out there.
And then it's like, well, databases and SQL.
Those have been around a long time.
Like, that's not going to fundamentally change.
And there's practical reasons as far as on the front end stuff,
like you can change things with like a lot less consequences typically than the back end.
And then you get into like the world today.
You're like, that might not be true anymore.
I love that. I mean, I love that generally that you were like, this data is not going to change that much.
Right. Right. And the pace of change, even outside of AI is accelerated.
Yeah. Even tooling, right? Like the tooling was like a couple major vendors all using the
exact same language. Yep. Yep. One comparison
that comes to mind
is e-commerce.
So the, and the reason that came to mind
as an example of the fundamental shift
is that it was an entirely new way
to do something, right?
It fundamentally changed the way that people shopped.
You're talking like,
we went from in-person or a catalog
to being able to shop online.
To being able to shop online, right?
And I mean, I would say there are generally some interesting parallels
for any sort of major change like that, right?
The pre-existing, like the environment needs to be there, right?
And so internet and browsers and other things like that,
there was a lot of infrastructure that preceded being able to shop online, right?
So there are a number of interesting parallels there, right?
Where it's like, okay, you have compute power and you have the advances
in the actual large language model technology themselves.
You have the transformers.
There are a number of things that were precursors
to create the environment in which this could happen.
You know what's really interesting around the compute?
I had been thinking about this too.
So crypto first, big boom, lots around that,
and then the AI stuff.
I don't actually know enough about the background here.
How much of the stuff that was maybe provisioned, or thought
of like, yeah, we're going to use this for crypto,
was like, oh, you can throw that at an AI problem instead?
Yeah.
Yeah, that's interesting, because there's this huge boom.
And we know that it's true from a like power consumption and like
baseline stuff of like things.
Yeah, that is interesting.
Yeah, yeah, sure.
I wonder if some of that is and like we'll continue to.
Yeah.
I mean, crypto is still a thing, obviously, but we'll continue to shift to AI.
Yep.
But it's, I was thinking about the, my grandparents, my grandmother's still alive.
She's 97.
Wow.
My grandfather passed away a couple years ago.
And they didn't use a computer, really.
They had a computer, but they just, they didn't use it, right?
They just, it wasn't part of their day-to-day life.
Yeah.
And my grandmother shopped a lot on QVC, right?
Like, call this 800 number.
Oh, yeah, okay, yeah.
But what's so interesting is that when e-commerce comes about,
there was a very
large group of people, I mean, it's kind of crazy to think about this, who didn't really have
a personal computer, right? They went to work, they used a computer at work, but like they didn't
really have a personal computer. Right. And so the very concept of shopping online and not
having to use a phone is a wild concept, right? Yeah. And of course, that changed very rapidly,
but it feels the same because it's hard to even imagine that you would have an AI agent doing like
these operational things. Like, it's just, whoa, that's kind of... it's just fundamental. There's
sort of this fundamental shift, and it kind of seems like e-commerce. I don't know, that was the main
thing that came to mind. And the interesting thing about the AI operational agents, let's just say
for personal things, like, hey, go book a reservation or book travel or whatever,
the interesting thing about that to me is that absolutely exists as a pattern today. It's like
a personal assistant or whatever but from an accessibility
standpoint, you just opened up
kind of a luxury service,
right? Like not like, not
everybody's going to like be able to afford
or want to use a travel agent or not everybody's
going to have like a personal shopper or whatever.
But you've got this like what's historically been this
like kind of luxury thing
that if the AI gets there
to be able to do those things, like you open it up
for a ton of people. So that's an
interesting like
space thing versus like opening up
something that's more like mundane
that like yeah, we used to like do
this manually now it's more automated. Like that's one thing, but taking something that like
people valued highly enough to pay a lot of money for and commoditizing it, I think it's fascinating.
Yep. I think one of the really fascinating things about this technological shift is how many things
it's impacting. Simultaneously. Simultaneously. Which is in, in, I mean, completely unrelated,
it's probably an overstatement, but let's just say completely unrelated spaces. So for example,
e-commerce, okay, it changes the way that you shop. I mean, there are a lot of things, right? The way
that payments, and I mean, there's so much that grew sort of out of that, right? But if you think
about the example that you just gave of, okay, I can use AI to book a reservation at a restaurant,
it's also fundamentally changing the way that we think about developing software, right? Yeah.
It's fundamentally changing the way that people are even thinking about finding information
generally, right? Which has an impact on the way that you even think about searching the internet
at all, right? And I mean, it's changing the form factor there to some extent. And so you have all
these different areas where it's having sort of disruptive impact. We had that conversation
the other day of like Google it, right? It's like we'd say that all the time. But you're like
Perplexity. Like what's the AI? Perplex it. Just like fundamental things like that. I don't
know what the verb will be, but there will be one.
Specifically with data,
I had this honestly horrifying
thought the other day.
We were talking about this. I need this.
We were talking about this before the show.
And it's like, all right, you take all this stuff out
five years or ten years or however many years.
Like, what do companies look like?
One of my horrifying thoughts was
that it might really drastically
change a bunch of roles, which is like, okay,
a lot of people think that. But it might turn into
essentially most companies have sales people that sell things to other people and then like
operations people that operate and like and beyond that like of course there's still going to be
some levels of specialization and you probably still have some kind of finance accounting things
but I think a lot of companies will consolidate down into less divisions and we're talking about
SaaS specifically I mean the dream like me being from like a technical background the dream is like
Yeah, like we're going to put that landing page out there, drive some inbound traffic to it.
People put the credit card into Stripe, but I've got a SaaS product and we're going to scale it and grow it.
You don't have to interact with people.
Like, because of the fundamental change of like this barrier to entry, I think probably continuing to lower into being able to create a SaaS.
Yep.
A product, I just think it's going to fundamentally change the growth trajectory of most of them.
And it's going to be, sure, you'll still have the viral stuff. You'll still
have, like, influencer-driven stuff.
But I think the other part is, like, well, you're going to have to sell a lot.
Like, probably have a lot more people selling, which if you're a founder, somebody that's
technical, like, oh, this is terrible.
Like, I don't know.
I don't like this future.
Yeah.
I think it's a real, like, possibility.
I think it's already happening.
Okay.
Let's, can we explore this topic by, by digging into this dystopian future?
Yeah.
And, by the way, I want royalties if this turns into a book that you write.
Perfect.
A book that you
use AI to write. But, okay, let's dig into this dystopian future. So I'm thinking about
our listeners who are very similar to you. They are managing data at company XYZ, right,
which are roles that you've had. Okay. In this dystopian future, let's actually make
it specific. Okay, so when you were running data at the large e-commerce company,
rough team structure, like rough team structure, how did, what did your team look like? Okay. Yeah. So,
analysts, a couple of analyst type roles.
Yep. Several engineer type roles and some specialization in that, like one that's more
focused on like pipelines and integrations, one that's more focused on like front end stuff.
Yeah. Some, a team we had a like an offshore group that we used for like certain parts of the tech
stack. What else? Yeah. So that was the basic like tech side of the house. What am I... I feel like I'm
missing something? No, but yeah, I think that was the basic tech side of the house. And then like on the
digital side, marketing, paid advertising, agency.
Right, because you managed all that side.
Yeah, you manage that side of it as well eventually.
Part of it.
So, yeah.
That's interesting.
Okay, you manage both of those teams.
Okay, so.
Which is not very common.
That's not very common, but it's actually interesting because it'll make the dystopian
future spicier.
Right.
Okay.
In this dystopian future, what is, well, and you have the sales team that's, yeah,
that's taking calls from customers who want to order 10,000 of this particular part
that you were selling through the that you were selling right yeah okay in the dystopian future
let's start with the data team what does that actually look like right and let's just say let's just say
for example because we need a protagonist that you're still a human in this equation right as the
leader of data right what other humans are there and what do they do and what's been replaced by
AI. So I, one, I think there will be more merging of like, like in this particular thing,
there's like maybe a digital ops type thing. And then maybe the, because we did like
warehousing and actual physical cause, maybe there's like physical ops. So there could be
divisions. But I think it's like this digital ops thing. And you, so you, someone reports to
you their title is digital ops something. Let's say director of digital ops. Yeah. Yeah. It's a very
flat company at this point. Yeah. Right. But director of digital ops.
And you essentially, like, have what people that used to be in, like, marketing type roles,
like, rolling up to that.
You essentially have people that used to be in, like, more specialized technical roles.
And that director of digital ops, like, maybe it should be manager of digital ops.
Because, like, it might only be a couple people, like, one person that's, like, a little bit
more on the, like, marketing creative side that's, like, managing an army of, like, AI, more
creative type things.
And maybe some, like, some outside help as well on.
on areas that still require like specialization or deeper knowledge.
Yep.
And then somebody that's a little bit more technical leaning that's doing like data movement
and like transformations of data and things like that.
But I think the spread of like, hey, like right now, like the graphic designer could never
do the database administrator's job.
I think your spread is going to be a lot tighter.
You're going to have the ability to have people that are way more down the middle that
like, yeah, like this person is better.
they're more creative,
they're more on the marketing side.
This person is a little better
on the technical side,
but it's going to be way less extreme than it is.
Yep.
Is what I suspect.
So I want to know a couple things.
I want a couple practical,
I want to take a practical look at a couple of things
in the dystopian future,
just as far as like managing data
and the data stack itself.
Okay.
And let's just say, okay,
also do you have an analyst?
So you have the director of digital ops.
you have a creative, you have a creative person.
Right.
A creative person.
And then just a generic, like technical person.
And then a generic technical person.
So they're essentially a team of three.
Right.
The digital ops team.
Right.
Okay.
Potentially, yeah.
There's a lot of things that would have to go right or wrong.
I'm not sure which for this to happen, but yeah, potentially.
Okay.
And what was the team size previously just across?
Man, order of magnitudes more like, like, so we're at three, I don't know, call it like eight.
Wow. Okay. Yeah, so that's significant, right? Okay. And each level of... and it will drastically depend,
I think, on what you're doing, and it will depend on how you want to architect the business. I think
there'll be a lot of businesses that you can architect like, hey, we want to optimize for the least people
possible. Yeah. Some people will opt for that model. Others will opt toward like, hey, human touch, like, this is a really
big part for us. Yep. We're going to optimize for the other. Yeah. Yeah, I agree with that. I actually think
I mean, the new, there are already entirely new business models and like ways to think about operating a company.
Yeah, we talked to a founder recently, like single one person founding a SaaS company doing extremely well, doing all of it.
Yeah, Mike Drogalis, ShadowTraffic.
Yeah, ShadowTraffic.
Yeah, yeah.
And it can probably scale quite a while.
Yeah, yeah.
Just him.
Okay.
On the practical things, so you need to set up a new data pipeline.
So you get into work on Monday, okay, there's a meeting with, I guess, a small number of people.
Everyone fits in a company.
Yeah.
Whatever.
Okay, you need to create a data pipeline to do something, okay?
A feed of inventory to some system to do something, right?
Let's just say update inventory in real time or something of that nature, right?
So, okay, a new pipeline needs to be deployed.
What does that process look like?
So you leave this executive meeting because it's a new pipeline. In
an AI world, like the notes and action items are already materialized for all these people.
And so you set up a quick meeting with your digital ops team. Who does what? And like,
what does that process look like? And kind of how are they in this dystopian future creating
the pipeline, all that stuff? Yeah. So I think you're going to buy tools of like, hey, this is,
we have an integration tool. We bought like a tool. So I think that's like number one. They'll have a lot of
abilities for inputs and outputs of various formats
and probably a storage component
or work with the storage component you want
and then on this technical person
I think that would be the person
is essentially going to go
okay vendor A
like show me your docs
what are your specs
okay the origin and then destination
like vendor B or thing like show me your docs
show me specs and then like all right
we use X like middle layer tool
and essentially like feeds all three of those
with a little bit of guidance to some kind of AI type tool
or maybe that will get built into the integration layer
eventually and says, okay, go build this thing.
And I do not think it will be perfect
the first time for a long time.
But I do think somebody slightly technical
will be able to coach it through a few quick iterations
and get to something that is pretty good.
And then it's going to depend on like how important is this?
like should we have some kind of code review step?
Can AI do the code review?
I think that is, right now that's a really tricky step
because you can get pretty far with this, like, vibe coding concept.
But it doesn't make sense.
And I wouldn't be comfortable.
I think most people are not comfortable for like true production use.
Yep.
Of a lot of this stuff.
Yep.
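The coaching loop described above (hand the tool the vendor docs, let it build, correct it over a few quick iterations, then decide whether a review step is needed) can be sketched roughly in Python. This is a minimal illustration under stated assumptions, not anything the hosts built: `coach_pipeline`, `generate`, `find_problems`, and `review` are hypothetical names, and the stand-in functions at the bottom take the place of a real AI tool and a real reviewer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Draft:
    code: str         # latest generated pipeline code
    iterations: int   # how many generation attempts were made
    approved: bool    # True only if coaching and review both passed


def coach_pipeline(
    docs: List[str],
    generate: Callable[[List[str], List[str]], str],
    find_problems: Callable[[str], List[str]],
    review: Optional[Callable[[str], bool]] = None,
    max_iterations: int = 3,
) -> Draft:
    """Hypothetical coaching loop: generate, collect feedback, regenerate."""
    feedback: List[str] = []
    code = ""
    for attempt in range(1, max_iterations + 1):
        # The generator sees the docs plus whatever the human flagged last time.
        code = generate(docs, feedback)
        feedback = find_problems(code)
        if not feedback:
            # Optional review gate for pipelines that matter in production.
            approved = review(code) if review is not None else True
            return Draft(code, attempt, approved)
    # Still has open problems after the allowed number of attempts.
    return Draft(code, max_iterations, False)


# Stand-ins for an AI tool and a slightly technical human coach (assumptions).
def fake_generate(docs: List[str], feedback: List[str]) -> str:
    return "sync inventory" + (" with retries" if feedback else "")


def fake_find_problems(code: str) -> List[str]:
    return [] if "retries" in code else ["no retry handling"]


draft = coach_pipeline(
    ["vendor A docs", "vendor B docs", "integration tool docs"],
    fake_generate,
    fake_find_problems,
    review=lambda code: True,
)
print(draft)
```

Keeping `review` optional mirrors the point made in the conversation: whether a code review step is needed depends on how important the pipeline is.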
But I mean, actually there's a business model out there already.
I think it's pullrequest.com
where you just like have a third party review all of your pull requests.
So you can imagine like something like that integrated into your system.
And you've got one person kind of coaching AI like,
docs, docs, docs, integration, and then you send it off to pullrequest.com.
They review it. They happen to have a specialist who's an expert in whatever tool that you're
using to integrate. And they're like, all right, it looks good. And they use a human maybe or AI
in the loop with a human. So I don't think that's, I don't think that's now, but I don't think
that's forever from now. I agree. Yeah, I totally agree. Okay, the, so you need to generate
analytics based on this new pipeline that you set up. Same basic flow?
Generate like the... okay, yeah. So we got all the data flowing and, like, we want some insights
on the orders we have flowing through. Yeah. Yeah. Yeah, I think so. I think it would start with some kind
of, like, okay, what does X executive want to see, or what does whoever want to see? And then there's
probably, again, somebody that's responsible for feeding that into the system and then, like,
coaching it through a couple iterations to get something that they know is the right thing.
Yep. And then saving it off for the executive to look at. And then theoretically
the executive has an easy way to tweak it a little bit more if they want to.
Okay, what we're kind of getting at here, one of the interesting threads,
thank you for giving me a glimpse into your dystopian future.
I do think there's another version of this that also is likely to happen
where we have lots of horror stories that come out where, like, AI
really screwed stuff up.
We've got stories, and this will happen,
it just depends on at what rate and who's to blame,
of security breaches because people are just like
slinging code, and
of things going horribly wrong,
and companies will probably react to that
and like pull way back
at least for some industries, maybe everyone,
and that would definitely drastically
decrease I think adoption
and depending just how rocky that gets
I could definitely see another version of this
where like the future is actually
five years from now is not that different
but the reason
I think that is less likely to happen than I would have believed previously is how much money
is tied up in all the major tech companies for this to succeed.
Yeah.
And if Microsoft pushes it, it tends to happen.
If fill in the blank with some other like company, things just tend to happen.
Yep.
Because that's who the big companies trust with like, what should we do with our tech and
IT?
Yeah.
Right, right, right.
Fascinating point.
So what areas, that's a great, that's actually where I wanted to go next is,
which areas do you see in data
AI having the most impact
and where is it going to have the least impact
and I mean
like a couple specific examples of that right
like one of the
one of the things we talked about was
producing the analytics around this right
and so you don't need a team of analysts anymore
AI is, it's going to get to a point where it can generate
good SQL, whatever, right?
That sort of concept, right?
That it's going to have a huge impact there, but there may be other areas.
Probably still have analysts depending on what it is,
but they'll actually be analyzing.
Most analysts don't actually analyze anything.
So I do think, yeah, it's true, right?
They clean data, they move data around, they copy data.
So I actually think you probably still have analysts in some form or fashion.
It's either combined with another job.
because it doesn't need to be a full-time job
or to your point
or to what I was saying earlier
like they actually start analyzing data
versus just moving it around.
Yep.
And what do you think,
what are examples that come to mind
of things that won't change?
Things that won't change.
I mean, what's going to be largely the same
in five years?
I mean, I know the actual answer.
I, largely the same in five years from now.
There's one, there's one glaring one:
storage. Essentially, we will probably still use similar commodity storage for data.
Yep. And there'll be a lot of like noise happening above the storage, but like I don't think
the storage changes fundamentally. I agree. The spreadsheet. Oh, sure. It will never die. It will
never. The spreadsheet will probably never die. But I also think... That's the one thing that AI can't kill.
Yeah, but I also think since, like, essentially S3 or an S3 equivalent is behind every one
of these things still, that probably doesn't change. Yeah, I agree with that. But on the user-facing
side, yeah, some version of a spreadsheet, I'd highly, highly bet that will still be around. Yep. And that's
like a, is it the Lindy principle? There's a principle around, like, how long something's
been around drastically impacts how long it will be around in the future. Okay, we'll look that up
and put it in the show notes. My laptop, the way we're sitting, it's too far away for me. It's too
far away to Google that live, to perplex it live. Sorry. So,
one of the big things that I think is going to be really interesting as we think about
where AI is going to have impact and where it's not is the dynamic of
how do we want to frame this, AI essentially being an interface to all sorts of other platforms
and tools. Yeah. Right? Which is really interesting. So here's an
extreme example, right? How often, if you could essentially manage your infrastructure, let's just
say Snowflake or whatever it is, if you could just manage all of that through AI in the same way
that you talked about, right? Okay, here's some documentation, here's whatever, like just go do this
thing. Right, right? How much are you logging in to Snowflake? Right. Sure. How much are you logging
and logging into these platforms, right?
It's just interesting to think about the interface for that changing, right?
Yeah, whereas essentially to all these platforms we used to directly interface with essentially
like you never look at anymore or you only look at if there's a problem.
Right, right.
Well, yeah, it essentially becomes the platform's utility becomes troubleshooting and other things like that.
I mean, visibility, observability, those sort of things.
But that's really interesting.
I mean, one of the ways this is already having an impact is like general decreases in website traffic in certain areas.
I was talking with a friend who works in and around that industry.
And it's like, okay, it's like, oh, wow.
Like the actual inquiries that people are making, I think are dramatically increasing.
It's just that some of that is shifting over to GPT, right?
Instead of going into Google and then going to a website.
right? It's just that it's delivering
it's delivering that
end user visit directly to you.
Right. Well, I mean, I think it's fascinating
because we've talked recently just
in the customer data space,
and attribution specifically,
like a tool like RudderStack
doing server-side attribution and looking
through, like, kind of raw data around
that, seeing OpenAI
and seeing these other tools
pop up in the attribution is really
interesting. Yeah. Like, and I'm
actually seeing that
happen, especially in, like, SaaS land, of like, wow, there's a decent amount of traffic being
driven from these AI tools. Yeah. Yeah. Yeah. It's super interesting. I do wonder though what the
impact of the trust factor is going to be, right? Especially when you think about things like production
pipelines. Yeah. Where I mean, it's just hard to trust, right? Like you just, right? It's really
hard to trust as opposed to writing SQL, right? Which is sort of exploratory by nature in many
cases and iterative. And I think that's the thing, right? If it's, if you can get to a result
that's like abundantly clear if it works to a human, AI is actually really great for that.
And I found like we talked about this too, like visual like front end stuff. Oh. Like it's pretty
cool for that. Like when you're working on a website tweaking visual front end stuff because it's more
evident that like, oh, cool, it's in the right place. It's the right size.
Now, like, you can still have some bad stuff going on in the console. You can still have
a security problem. Like, there's things that could happen. But I do think it can be less
evident, like, further down the stack of like, oh, like, we made a major problem here. And
it's an edge case that we'll find eventually, but no idea when. Yep. And then the question
becomes like, well, that's true of humans too, right? They're going to make mistakes that show up
later to bite us. And the question becomes, is it more frequent with AI? Is it, could we use the
AI to try to catch those? Yeah. And a separate, separate tool that doesn't have the knowledge of
the original tool, just like you would with humans, like a double blind like audit. Right, right,
right. Yeah. Yeah. Super interesting. We're going to take a quick break from the episode to talk about
our sponsor, RudderStack. Now, I could say a bunch of nice things as if I found a fancy new tool,
but John has been implementing RudderStack for over half a decade.
John, you work with customer event data every day and you know how hard it can be to make sure
that data is clean and then to stream it everywhere it needs to go.
Yeah, Eric, as you know, customer data can get messy.
And if you've ever seen a tag manager, you know how messy it can get.
So RudderStack has really been one of my team's secret weapons.
We can collect and standardize data from anywhere, web, mobile, even server side, and then send
it to our downstream tools.
Now, rumor has it that you have implemented the
longest running production instance of RudderStack at six years and going. Yes, I can confirm that.
And one of the reasons we picked RudderStack was that it does not store the data and we can
live stream data to our downstream tools. One of the things about the implementation that has been
so common over all the years and with so many RudderStack customers is that it wasn't a wholesale
replacement of your stack. It fit right into your existing tool set. Yeah, and even with technical
tools, Eric, things like Kafka or Pub/Sub, you don't have to have all that complicated
customer data infrastructure. Well, if you need to stream clean customer data to your entire
stack, including your data infrastructure tools, head over to rudderstack.com to learn more.
Okay, well, continuing on with the discussion about AI, Snowflake made a big acquisition,
Datavolo. Is that how you pronounce it? Yeah, Datavolo. They announced it late last year,
and it seems like they're starting to kind of roll it out and have customers using it more
this first quarter going into second. Yep. Okay, industry-wise, we'll just do industry pundit.
We'll do an industry pundit segment. What's the move by Snowflake there? How are you reading it?
Yeah, I think... Well, also, I guess we should establish, like, what does Datavolo actually do?
Yeah, I'm not, like, deep in the tool, but just from, like, reading about it some, it is one of
these data pipeline tools that is kind of marketed toward, and maybe specifically adapted toward,
people that want to pull data and then do AI things with it. Like, at a really high level.
So the specialization is unstructured data. Right. So you can, yeah, you can move data that is in
various different formats that you would want to run AI workloads over, which is different than
a typical, yeah, than a typical, like, ETL tool, for example. Right, exactly, exactly. It's
interesting because I've seen there's a number of tools in the space, and then there's some
interesting ones. Like, Alteryx is one. There's several others that have been around a while.
Alteryx is, like, more graphical. Like, you can... Yep. And they've got all these little modules.
I haven't super kept up with what they're doing, but I believe they're getting into the AI piece
as well. And it's interesting with that space, where there's going to be these, like, all-in-one
tools of, like, data, AI use case, great, ETL, great, like, whatever you want to do. Yep. And they're
more of a graphical interface. Then I think there's these more specialized tools, like,
hey, I want to move customer data around, I want to move, like I said, AI data around, I want to move
structured data around. And so it's like there's two axes. There's the no-code, low-code versus, like,
as-code axis, like, hey, I want drag and drop, I don't want to do any code, or, hey,
I don't want an interface at all, I want it all to be, like, YAML or something. Yep. So there's that
axis, and then there's a specialization axis as well, as far as, like, really attuned to, like,
the customer data movement problem, we really nail that use case, versus, like, we're
a generalist tool, like, we'll move data wherever you want it. Yeah. It'd be interesting to see what
happens there. Yeah, yeah, yeah. It really will be interesting. I mean, it is going
to be fascinating to see how, because the cost is decreasing as well. And if we go back to our
previous example of like an interface on top of this tool, and you think about Snowflake,
and actually, interestingly, we were talking, who was I talking with? Someone, they were running
Neo4j inside of Snowflake, which is really cool. A graph database. We were talking about some
identity resolution type use case. Anyways, they were
talking about running it. It's super cool, right? You can run Neo4j inside of Snowflake and actually
you can sort of, like, push stuff out as tables or views, a couple other things. I don't know all
the specifics, but just, like, oh, that's cool. But I heard that, and then you think
about, like, Datavolo doing unstructured data for AI workloads, et cetera. And this is
clearly, like, Snowflake's long-term intent: okay, what do you want to do? Right. Right. Right. Yeah.
What do you want to do? You mentioned customer data, right? Do you want to build an
identity graph? Do you want to do something with generative AI over this or whatever, right? And it's
got the pipelines. It's got the query engines, like all that sort of stuff. It's pretty fascinating.
Yeah. Yeah, for sure. And this actually reminds me of, you're familiar with Thinking, Fast and Slow,
the book? I haven't read it. Oh, you haven't read it? Okay. It's a good read. And I am probably going to
butcher this, but it got me thinking. There's this concept that he talks about in the book,
not specific to this book, but, like, regression to the mean, right? So essentially you've got,
like, outliers, and there's just, like, a principle that things regress toward the middle.
Yep. The example he gives in the book, which I think is fascinating, is about coaching.
They're coaching you up. Say you're a baseball player and you, like, strike out and you've got a guy
on third base, and it's like, oh, come on, Eric. And the coach yelled at you, right?
Yeah. You go up, and, like, next time you play better. It's like, oh, that must have worked.
Opposite, too. Like, you go up, you hit a home run. It's like, oh, good job, Eric.
Yeah. And the next time, like, you do worse. So you're the coach, and it's like, oh,
I need to be harder on Eric. He does better every time. But the actual principle here is you
regressed to the mean. So you hit a home run, and then odds are you're going to do worse next
time. You struck out, odds are, or whatever, struck out two times. Right, right, right. Yeah,
yeah. This is an interesting thing. You're going to eventually reach
your batting average. Yes, exactly. Exactly. So, like, the way it relates to, like, this and
AI that I've been thinking about is I think there's going to be a stronger pull to the mean
if people are using AI tools. Because if you're thinking about this, like, AGI concept and stuff,
and this is kind of a pushback on, like, the generalist concept that we, like, launched off with,
there is, I think, going to be these unique scenarios where there's such a pull to the
mean of, like, oh, we should solve this in this one common way, where people are going to be like,
no, actually, there's this novel way that is much better that's not pulled
to the mean. And that's where I think a lot of the engineers, like, really good engineers, are going
to gravitate toward those problems. Where it's, like, ETL, of like, oh, okay, cool,
98% of the time, 90% of the time, yeah, use this generic ETL tool. It's the right tool.
And there's going to be a stronger pull there, where that could have been 60% of the time before,
maybe it becomes 80 or 90% of the time. Yep. But the last 15 or 20%, I think,
will exist for a long time, and then engineers will work on those, like, really interesting problems,
because I don't think that goes away completely. And I don't think that strong pull to the mean,
like, well, you might get to 80 percent or some high percentage, is going to go to 100 percent. Yeah.
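The batting example above can be checked with a small simulation. This is only a sketch under stated assumptions, not something taken from the episode or the book: the hitter's true skill is fixed at a made-up .270 average, every game has four at-bats, and the function names are hypothetical. Because skill never changes, the game after a hitless day and the game after a three-hit day should both land near the same long-run average, with no coaching involved.

```python
import random


def simulate_games(batting_avg=0.270, at_bats=4, games=100_000, seed=0):
    """Simulate hits per game for a hitter whose true skill never changes."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < batting_avg for _ in range(at_bats))
        for _ in range(games)
    ]


def mean(values):
    return sum(values) / len(values)


def next_game_averages(hits, bad_max=0, good_min=3):
    """Average hits in the game right after a bad game and right after a great one."""
    pairs = list(zip(hits, hits[1:]))  # (this game, next game)
    after_bad = [nxt for cur, nxt in pairs if cur <= bad_max]
    after_great = [nxt for cur, nxt in pairs if cur >= good_min]
    return {
        "overall": mean(hits),
        "after_bad_game": mean(after_bad),
        "after_great_game": mean(after_great),
    }


if __name__ == "__main__":
    for label, value in next_game_averages(simulate_games()).items():
        print(f"{label}: {value:.3f} hits per game")
```

A coach who yells after every hitless game and praises after every great one would see improvement after yelling and decline after praise in this data, even though nothing the coach did is modeled at all.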
Yeah, super interesting. I need to borrow that book from you. Yeah, it's a good read. It's just got a lot...
I should read it again. It's got a lot packed into it. Okay, I want to dig into that
topic a little bit more in terms of specialization and generalization, right? So, okay,
there's kind of an accepted narrative, right? And we'll take
Snowflake, for example. We'll praise them and then we'll pick on them a little bit here, right?
Okay, they've acquired a number of companies, right? I mean, they've been very
acquisitive, which makes sense. Streamlit, Datavolo, a number of other companies, right?
And so, and which makes total sense, right?
Because they're clearly building towards the scenario that we talked about, right?
Which is like, what do you want to build?
I mean, you can do whatever you want.
Right.
Do stuff in real time with Streamlit, do, you know, whatever.
AI stuff.
And it's, and it just becomes this like really big platform to like,
right.
You said to build something.
Right.
And so the narrative that's sort of generally accepted is as that happens, though,
that it becomes a big platform, right?
That you can do anything in.
And that's actually part of the problem.
and what creates the opportunity for a smaller specialized company to disrupt.
And so in the world of data, like, I mean, it clearly, the storage aspect and the things
that we've talked about with Snowflake and with Databricks, and there's consolidation
there because they want to be like large cloud platforms, right?
Right.
What are the other tools that you think are going to get generalized like that?
In the data space?
Yeah, yeah.
I don't know.
I mean, Microsoft is already, from a marketing perspective, approaching it like, hey, Microsoft Fabric.
And, like, that's the marketing thing. This is one thing. You have all these components to it.
In reality, it's essentially just a bunch of different components that they branded as Fabric.
But I think that happens for Databricks, for Snowflake, for others. Yeah, where it's like, cool,
data stuff, do it in our platform. You can do it all. You can have the viz layer. You can have
the pipelines. You can have the storage. You can have the AI LLM built into it. You can do
all the things. And the question in my mind is, does these companies' ability to use AI internally
change the equation? Where it used to be like, oh, well, yeah, that happens.
You become a generalist as a company.
You grow big, great.
But then opens up a bunch of doors for specialization to do X piece better.
Yep.
Is that still true when these companies have these sophisticated AI models where maybe they can juggle more?
I think probably, yeah, it is.
Because guess who else has the AI models and similar technology?
The innovator has the same thing, right? I don't know that it's a competitive advantage.
Right. Yeah, that isn't, that's a great point. Right. That's not actually a competitive
advantage. Right. Which that comes up all the time with like people like in security. Like,
oh, like all these thieves are going to have all this AI and it's going to be terrible. It's like,
yeah, sure. But so will all the security companies. Right. Right. Right. Right. So it's kind of like
there's like equal force both directions. Yep. But other than like those companies like
consolidation, for good or
for bad, and actually probably a little bit more for bad, I do think you end up
picking more mainstream winners, where you have a bigger gap between, like, the mainstream
winner for most use cases, which is, like, down the middle, and that's a really big chunk of
the market. And then you still have, like I was saying, the people that really nail
a specific painful problem on the sides. But I do think it probably makes
that gap bigger. Like, CRM, for example. In CRM,
Salesforce does not have 90% of the CRM market. I don't remember the number, but
it's low. Interesting. I think it's below half. It's really low. I might be wrong about that.
It might be, like, a little bit more, but it's not, like, 90%. Interesting. I'm going to
reach over to my computer and Google this. One reason is you've got somebody like HubSpot.
Second is industry-specific CRMs that pop up. And third is a number
of companies that don't really use CRMs still.
I am perplexifying it. I'm not, oh, I actually should use Perplexity. Sorry, I just, I use
Raycast and it defaulted to GPT-4o. But look, Raycast is doing a really nice web lookup.
Very cool.
21.7 to 21.8
21.8
No
Yeah, it's tiny.
Wow.
Less than a fourth.
Wow.
I am processing this.
Yeah.
Less than a fourth of the market.
That seems so crazy.
Yeah.
Now, the real question will be, though, I say, like, there's going to be more, like, regression
to the mean and more of that middle lane. Maybe it gets less congested. But it could just get
more competitive and not necessarily combined in, like, one product. Yep. Because you could
still have three or four major players that are in that middle lane that are essentially the
same, but they're still competitors, and they still do fine, and they're selling in different
ways, and people just prefer one over the other. I mean, think about clothing brands. That's all
just clothes, and we've got a ton of those. Or cars, same with cars, right? It could become more
like a car shopping experience of, like, does it have four wheels? Yes. Does it have four doors?
Can it track? Like, they're all... and car people are going to be super like, no, they're not.
But, like, from a transportation standpoint, they're all pretty much the same. Yep. But
technology could become more of the car brand thing of, like, yeah, well, you've got four
options, like, down the middle in this, like, 89% lane. Yeah.
Then you just go with your preference. Yeah. Essentially. Yeah, it is. You know what's interesting
to think about? This was very early on in the show. But Seattle Data Guy, Ben, he's been on the show
multiple times. I don't remember the specific episode, but we were kind of asking, I mean, he was
at Meta. Yeah, right, right, doing data stuff. And then he does consulting on different projects
or whatever, right? And so we kind of asked him, okay, what's your go-to tool set? This was
several years ago. I think this was, like, early in the life of the show. Okay. Yeah. Like, what
are your go-to tools? What do you use if you're building a data stack, blah, blah, blah?
And he mentioned Snowflake.
He's like, obviously if you're building out a data stack,
you have to have a data store, blah, blah, blah, right?
He's like, so you get a data warehouse for data warehouse stuff.
And he's like, I like Snowflake.
And he kind of paused.
He said, yeah, I like Snowflake just because I like it.
And he's like, there's just something about it.
Like, I just, I like it.
Yeah, I think it's already starting, where, like, a lot of these, there's definitely not
feature parity, so I'm not saying that between all of them, but I think they will continue
to get closer to feature parity. And I really think it's going to be more of a car thing.
There will definitely be differences, like there are with cars, about, like, I want to optimize
for this use case. I like off-roading, or, I don't know, I want good highway miles. Like, there's
obviously going to be that. But it's going to be, like, a really strong, like, I'm on this team.
I think it'll be more of that. Yep. Yeah. For sure. For sure. I mean, I think the other thing that's
going to be really interesting, circling back to regression to the mean, if we think about
different data tools, right? And you mentioned, like, Alteryx. There are a couple of,
yeah, there's a couple in that space. Yeah. Tools that have been around for, like, a long time.
Talend is a good one. It's been around a long time. Anyways, one of the things that
modern companies, a common tool in their toolkit in terms of creating competitive advantage for
themselves against giant incumbents, is a dramatically better user experience.
Yeah.
And I mean, I actually think Fivetran is a great example of this.
I mean, they just have a phenomenal, it's just so easy to use, so easy to set up.
It's great, right?
It just is great, like, compared to a lot of other tools, and you sort of end up paying
them money because you're just like, this is just a great tool.
Yeah.
That's in the data space. Another classic example is Linear, in the project management space, right?
It's just like, okay, wow.
I mean, and that's not the only thing.
That's not the thing that made Fivetran successful or the thing that made Linear successful.
But it was a big part of it and sort of reflects like the DNA of the company.
But what's so interesting is it's getting so much easier if you think about these different data tools to deliver an absolutely phenomenal user experience.
Right.
Which is super interesting, right?
And I'm really fascinated because I think you're going to get a stronger and harder split between audiences here. Because for data tools, for me, I'm gravitating really heavily toward fill-in-the-blank as code: BI as code, data pipelines as code, all that stuff.
Because the productivity increase is drastic.
Yep.
And it will continue to increase, I think, with AI tools because, guess what?
AI tools are good at text.
There's some neat stuff out there, like ChatGPT's Operator thing, where it can,
like, browse the web and stuff, but that is nowhere near where it is with text. Yeah, yeah, yeah.
Knows that. But from a human perspective, humans are like, no. I've yet, and maybe this day is
coming, I've yet to say, like, man, this product is just, like, a killer user experience, just so ergonomic,
and it's, like, all command line. Like, I've never had that feeling. Maybe we'll get there.
We'll get there.
But so, yeah, that'll be a really interesting path of, like, how do you handle that?
Or is somebody going to be able to really nail, like, the as-code piece and then also
just build, like, a beautiful interface, and you can seamlessly switch back
and forth? Which seems possible, but way more work than, like, just nailing one or the other.
Yeah, for sure.
That's a really interesting point.
It's, I kind of think about Postman as an interesting example there. Okay.
Because you can do a bunch of different stuff in a command line. There's so many
niceties that they provide for doing all sorts of different things, right?
Like graphical organization. Right, right.
Yeah, and I think there'll be more of that. Yeah.
And I think that's when you can switch into like YAML mode
and like quick stuff and like switch back.
Totally. Yeah. Yeah, that's a good example.
Totally. Yeah. Yeah, that is. Yeah, it's super interesting.
All right, any last AI thoughts
before we turn off the recording? I don't know.
I think in conclusion.
I really am torn between like, does the future look like that generalist future we talked
about? Or does it look like that like regression to the mean where there's like the X percent
that is like generalist, but there's like a ton of stuff on the edges where you actually get
more hyper-specialized because like the general problems are solved and like technical,
really gravitating hard toward the edges. I think that's a real possibility too. And they're not
necessarily mutually exclusive. Sure. What's, okay, last question. We'll both answer
this. I feel like I did kind of interview you on this. Yeah. Yeah. Old habits die hard, I guess.
It was supposed to be, like, a whatever. Yeah, back and forth. What's the craziest thing you've done with
AI lately? Or, like, the thing where you're like, whoa, that was crazy. Yeah, I think front-end
stuff. Like, messing with, like, hey, here's a landing page. Like, really vague. New Agreeable Data
website. Yeah. Yeah, that's true. Yeah. Launched that. Definitely used that on some of the
front-end layouts. But just, yeah, like, this general, like, very vague, because I'm not a
designer by any stretch of the imagination, of, like, hey, make this landing page look good. Like, very vague
language. Yeah. And then, like, being pretty surprised with the outcome. Yep, yep.
Super interesting. Yeah. My turn. I think the coolest, oh, this is basically
building prototypes, which we do an immense amount of, different things like that. But today
I actually did something new. So there's a tool out there that I was looking at, like, oh, I wonder
if we should get this tool to use, like, in the RudderStack platform. Yeah.
To sort of accelerate a feature, whatever, right?
And so, I mean, it's like a component. It's a set of APIs, et cetera, right?
And so I thought, okay, like I go create a test account for this thing.
That's great.
I get the API key.
And I was like, you know what I'm going to do, actually?
I'm just going to spin up, like, a dummy thing and I'm going to actually try to install
this and actually kind of see what it's like to use
this thing and see how it actually works on the back end and see the...
Whereas before you'd have to get on a call with engineering, like, hey, let's do it.
Totally.
Like isolated environment.
Right.
And I'm not a software engineer.
Right.
I mean, I'm not a software engineer.
I know enough to like make...
Create problems for others, right?
Yeah.
Okay, but this is what's astounding.
And this to me was just...
I think we were talking about this the other day.
So I...
I create an account with this thing.
I get the API key.
I just hop over into Vercel, which we use.
We've used a number of different tools,
but we've deployed a number of different things
on the Vercel platform.
And so we have an account
and you can add v0 to the account.
And it does a number of nice things.
v0 is, like, their AI agent,
essentially, for generating...
Yeah, generating apps, software, websites,
all that sort of stuff.
Absolutely astounding, by the way, if you haven't.
It's neat.
It is a really cool tool.
Okay, but this is what was like,
this is what was so wild.
This to me, I was like, this is amazing.
I go in there and I
click create new project
in Vercel. And I was like, okay, I'm just going to grab any, they have templates, right?
It's just like, I'm going to grab anything or I'll just create something. It doesn't matter, right?
So I go to create a new project. It's like, oh, there's a template thing. And I was like, I wonder what
templates are in there. So I go and look, and this product that I had signed up for has a starter
kit. Okay. Right. And it's a fully functioning Next.js app. Okay, right.
And so I was like, oh, that's great. So I grab that. I create a Git repo from Vercel because
my app account is connected.
Right.
It creates a Git repo for me.
It does everything, right?
And then the thing that I had to do to get it running locally was create a .env.
Right, yeah.
Just fill in some environment variables.
Literally the environment variables to get it running locally.
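For anyone who has not set one up, a .env file is just a list of KEY=value lines that get loaded into the process environment when the app starts. Frameworks such as Next.js read these files for you, so the Python below is only an illustrative sketch of the idea, not what the starter kit runs. The product is not named in the episode, so EXAMPLE_API_KEY, EXAMPLE_BASE_URL, and the function names are all hypothetical.

```python
import os


def parse_env(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env(text, environ=None):
    """Copy parsed values into the environment without overriding existing ones."""
    environ = os.environ if environ is None else environ
    for key, value in parse_env(text).items():
        environ.setdefault(key, value)
    return environ


if __name__ == "__main__":
    # Hypothetical contents of a .env file for a vendor starter kit.
    sample = (
        "# Hypothetical variable names\n"
        'EXAMPLE_API_KEY="replace-with-your-test-key"\n'
        "EXAMPLE_BASE_URL=https://api.example.test\n"
    )
    env = load_env(sample, environ={})
    print(sorted(env))
```

Values already set in the real environment win over the file here, which is a common convention, and the file itself normally stays out of version control because it holds the API key.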
I pull the repo and, using Cursor, I'm literally, like, I'm using this tool, right?
And I'm seeing how the API works, and, I mean, it was just totally astounding to
actually go through with that, right? And then I can push it and Vercel will deploy it
and I can share it with people on the team and, like, have a discussion about it, right? And everything's
fully transparent and we can sort of see how this thing works. To me, that just felt like, that is a product
demo. I mean, holy cow. Because we talked about this too. This is my last take, or last
hot take, on this. There is this future where, like, everybody in software now is selling generic
things for people to use their imagination to implement in their company. I think there's a future
where one of the major human value-adds is, like, hey, we looked up what your company does.
We imagined for you what it can do. And here's a demo of it, like,
exactly what it would do for your company.
Yep. That's huge.
Totally. That is really big. Totally. Yeah.
It's wild. All right. We're at the buzzer.
Thanks for joining the show. John. Thanks for
joining the show. All right. We'll catch you next time.
The Data Stack Show is brought to you by RudderStack, the warehouse-native customer data platform.
RudderStack is purpose-built to help data teams turn customer data into competitive advantage.
Learn more at rudderstack.com.