The Data Stack Show - Re-Air: Bridging Gaps: DevRel, Marketing Synergies, and the Future of Data with Pedram Navid of Dagster Labs
Episode Date: November 26, 2025

This episode is a re-air of one of our most popular conversations from this year, featuring insights worth revisiting. Thank you for being part of the Data Stack community. Stay up to date with the latest episodes at datastackshow.com.

This week on The Data Stack Show, John and Matt welcome Pedram Navid, Chief Dashboard Officer at Dagster Labs. During the conversation, Pedram shares his career evolution from consulting to his current role, where he oversees data, developer relations (DevRel), and marketing. The discussion delves into the synergies between DevRel and marketing, emphasizing the importance of understanding developers' learning preferences. Pedram explains data orchestration, highlighting its role in managing and automating data workflows. He also discusses Dagster's unique asset-based approach, which enhances visibility and control over data processes, catering to users from novices to experts, and so much more.

Highlights from this week's conversation include:

Pedram's Background and Journey in Data (0:47)
Joining Dagster Labs (1:41)
Synergies Between Teams (2:56)
Developer Marketing Preferences (6:06)
Bridging Technical Gaps (9:54)
Understanding Data Orchestration (11:05)
Dagster's Unique Features (16:07)
The Future of Orchestration (18:09)
Freeing Up Team Resources (20:30)
Market Readiness of the Modern Data Stack (22:20)
Career Journey into DevRel and Marketing (26:09)
Understanding Technical Audiences (29:33)
Building Trust Through Open Source (31:36)
Understanding Vendor Lock-In (34:40)
AI and Data Orchestration (36:11)
Modern Data Stack Evolution (39:09)
The Cost of AI Services (41:58)
Differentiation Through Integration (44:13)
Language and Frameworks in Orchestration (49:45)
Future of Orchestration and Closing Thoughts (51:54)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Transcript
Hey everyone, before we dive in, we wanted to take a moment to thank you for listening
and being part of our community. Today, we're revisiting one of our most popular episodes in the
archives, a conversation full of insights worth hearing again. We hope you enjoy it and remember
you can stay up to date with the latest content and subscribe to the show at datastackshow.com.
Hi, I'm Eric Dodds. And I'm John Wessel. Welcome to The Data Stack Show.
The Data Stack Show is a podcast where we talk about the technical,
business, and human challenges involved in data work.
Join our casual conversations with innovators and data professionals to learn about new data
technologies and how data teams are run at top companies.
All right, welcome back to The Data Stack Show.
We're here with Pedram Navid from Dagster, the chief dashboard officer.
Pedram, welcome to the show.
Great to be here. Thank you.
Yeah, so I think it's your second time on the show.
It's been a little over a year.
We'd love a quick kind of update and then tell us a little bit about your current role.
Yeah, I think last time I was here, I was enjoying consulting life, which meant lots of birdwatching,
lots of looking outside, being outside.
Since then, I joined Dagster Labs about a year and a half ago, initially to run data and DevRel,
and now also marketing, so far less time to do birdwatching.
That's too bad.
Yeah, it's too bad.
Back to the grind, as it were.
Okay.
So we're going to spend a few minutes chatting. We've been spending a few minutes chatting, preparing for the show.
I'm excited to kind of get into, like, how you've gotten to this point, and orchestrators in general.
What are you looking forward to chatting about? Yeah, I mean, I can always talk about orchestration.
We'll talk about data platforms, how we've gotten to where we are. Could be kind of a fun story.
We can always talk about AI, we can talk about data engineering and how you somehow
accidentally end up running marketing. It could all be fun, right? I'm, like, I'm
excited. Let's do it. Pedram, excited to have you. Let's talk a little bit about how you ended up
at Dagster. So you were doing consulting, had some time to kind of work as you pleased, and now
you're back at a startup. So tell us about that process. Yeah, I think what happened is I was actually
consulting for Dagster initially. We had a great relationship and Pete and Nick, CEO and founders,
asked me if I wanted to join. Initially, I said no, because I was enjoying my freedom too much.
But one thing I found with consulting is your scope of work is often limited and you don't get to see things, you know, fully end-to-end.
And I also kind of missed the camaraderie of having, you know, people to work with and teams.
And so after some thinking about it, I reached out back to them and I said, hey, you know, if that offer is still on the table, I'd love to chat about joining.
And so we talked about a role, which was initially just a DevRel role with, I believe, maybe data on the side as well, a small team of two or three people.
And that was almost June of last year.
And so I've been a year and a half now since I've been here.
A couple months ago, we took on marketing as well as part of Devrel,
which initially I wasn't so sure about,
but now that I've seen it operate,
it makes a ton of sense for DevRel and marketing to be close together and working together.
Yeah, that's really interesting.
So I've had a previous experience too where I ended up having a data team in marketing as well.
Tell me about maybe some of the unexpected
synergies there. You've got DevRel, marketing, data kind of on the side. Like,
what's come about? You're like, wow, this is cool, this is unified. Yeah. If you had told me initially
that I would be on a Devrel team reporting to marketing, I probably wouldn't have taken the job
because I've always felt like marketing didn't quite get DevRel. But this way, it's kind of
flipped. It's like marketing and DevRel reporting to me, and I'm okay with that. So what I found is,
like, the DevRel side of the house is like the content arm. Dagster's a technical product. We
target technical people.
And so we just need technical people
who have experience in the field
to create the content.
For me, content's a broad term.
It's not just blog posts.
It's tutorials, workshops, webinars,
how-tos, actual integrations.
Our team, our DevRel team
has built integrations that have, like, won deals.
So DevRel is, like, the producers
of, like, the marketing arm of Dagster.
And then the rest of the marketing org
is really in support of the distribution of that, right?
Where the DevRel team probably doesn't have expertise
is how to get their content out into the world,
whether that's through like paid ads or events
or campaigns, that type of thing.
And so having the two teams together,
it's like really, actually a lot of synergies.
I hate to use that word, but it is exactly that.
Yeah, yeah.
Where they sit together, we're on the same meetings.
Every meeting, every week we talk about what we're working on.
And, you know, the events person picks up on something
the DevRel team's working on.
And so does like the campaign manager.
And then three of them together,
they're like, all right, let's go build something more holistic around that rather than
just as one-off content that you created. Yeah, that makes a lot of sense. So I've got Matt here
co-hosting today in place of Eric. Matt, you've been that technical data audience before,
you know, looking to purchase products or things like that. I'm curious. And I'll have to ask you
the same thing, Pedram. Like, what really clicks with you if you think back about, like, content
or maybe even just interacting with people around these technical products.
Like, what mediums or what, like, what can you think of that really, like, clicked with you in the past?
Yeah, I think anything that lets you kind of see how the product actually works in a real kind of way
and not just the super trivial kind of look, one plus one equals two type of manner.
Right.
I think that helps, especially because, previous to a lot of this, it was very marketing-y.
So it was everyone feeling like they were trying to bait you into giving them your information or something or whatever.
So things that kind of give you that ability to see it.
And I think they have that credibility of professionals who've used it and who can show you.
This is what it's actually going to help you with.
So not like the 10 tips for personalization
in your marketing using data.
But yeah, so same question to you, Pedram.
Like, what have you found that works?
Because, at least in my opinion,
that data technical audience is a tricky one.
It's a tricky one to find,
tricky one to, you know, resonate with.
It is.
And it is like this meme almost of like developers hate being marketed to.
And I don't think it's true.
I think developers need a certain type of marketing that works for them.
Yeah.
Along their journey.
And their journey often might look different than, you know,
someone like a leadership role, for example.
And it just has to because a developer is going to,
going to sit down. And what they want to do is almost every single time is they want to try the
product. I want to figure out, is this thing, what it says it is, is it useful for me? Will it work
in the way that I need it to? And so a lot of the Devrel focus and the focus of Degster's
marketing arm is to enable developers to be successful in that entire journey from like becoming
aware of the product, trying it out, learning about it. And so things like docs matter a lot more
to a developer than they might get to a technical like leader even. As a director,
of data, for example, you probably aren't going to sit down and try Dynxter. You might care more
about what are the features, benefits, how is it solving, you know, the five things my CEO keeps yelling
to me about. But your data engineer is going to want to actually try the product and make sure
it hits the things that they actually care about. Well, and that can also be a little tricky
just because the technical ability of data people, there's a pretty wide spectrum you can fall on
there. There's some that are very, like they came from software engineering. And then there's
others that are very self-trained and might be coming more from the, I'm doing data engineering
or whatever because I have to and no one else here to do it. And so I'm scraping together
YouTube tutorials and stuff like that. How do you kind of, do you guys have a specific part of
that you're targeting or do you try to kind of have content more of a wider swath of that
spectrum? We definitely do the whole, we try as much as possible the whole range. You have to. There
Or, like, what I've learned is that not everyone is me.
And, like, I, like, a certain way of learning that other people don't.
And people, like, ways of learning that I refuse to use.
A great example is, like, Dagster University.
It's something we spun up last year.
It's, like, an online course.
It's, like, structured.
You go through lessons.
And that's the last thing in the world I would ever want.
And when they suggested it, I'm like, I don't know about this, guys.
All right, we'll try it.
And people love it.
They love it.
We get five out of five.
Like, if you look at our ratings, it's like 4.8 out of five.
And we get weekly emails about how much they enjoy it.
And it was, like, completely foreign to me, because that's not how I learn.
What I've learned is like you have to provide scope for everyone.
There's people who want structured training.
There's people who want to just read the docs.
Some people just want to install it and look at the source code.
That's like the range that you have to deal with.
And all of that has to be good.
Your source code, your documentation, your app, your code.
Your code has to look good
in a way that people can understand
and interact with it,
all the way up to your tutorials and video
and people want to sometimes sit down
and watch like a 30-minute training video
on the product as well.
And so we do all of it.
And we hire for that too, right?
We have people on the DevRel team
who are much more focused on the earlier stage,
for people who are, like,
just getting started,
and we have people focused on a much
deeper understanding of the product as well.
Yeah, I'm definitely one of those
that, like, I do not want to watch a video
for any reason. I cannot sit through a 30-minute technical video. I'm one of those
that wants to pull up the code. We'll reference the docs when needed. We'll struggle through it,
and then I'll think to myself, I should have just watched that video. So, yeah, I mean,
that's a really wide persona. And Dagster is a super flexible tool. You can use it a lot of different ways. And that's got to be a challenge as well, where I come to it with, like, oh, I have this specific problem, and you've got a tool that's like, well, we can solve lots of problems. Like, how do you bridge that gap?
That is also a great question. We are looking to bridge it through product improvements. So we have something coming up called Dagster Components. I don't know if I'm allowed to leak it yet, but it is coming. It will be more focused on providing almost like building blocks to develop the data platform. And so it'll be a command-line-based tool initially, but you'll have, like, your YAML schema, you'll have very easy ways to plug and play different integrations. That's, like, our approach to sort of addressing that, while always being able to expose the underlying Dagster framework, which, as you said, is extremely flexible, which has both pros and cons. The pro is, like, you'll never really be constrained. If you can do it in Python, you can do it in Dagster; that's essentially your limitation. The con can be, like, for a very simple setup, it can often feel like a lot to go through if you just want to orchestrate, like, one simple task.
Yeah, that makes sense. So let's zoom
out a little bit for people that have no idea what Dagster is, maybe have never even heard
of orchestration, like that kind of analyst persona. How would you describe just the general
field that you all are in, the data orchestration field to someone that was like, I have
no idea what this is? Yeah, it's a great question. Everyone orchestrates. They just might not do it
intentionally or they might not know that they're doing it, right? Orchestration could be as simple
as you log into your computer once a week and you click on a button and you kick off a process.
It's a very manual orchestration, but it's totally fine, and often it's the right decision for you.
It can become a little bit more complicated when you start to use something like a Cron scheduler that runs every single day or every single week at a certain time.
And that's often enough for many tasks.
When things start to get a little bit complicated is when you need to add dependencies or you need to be resistant to failures, essentially.
Once those two things come into play, like you want to make sure that A runs before B every single time, you
can't rely on cron. You sometimes can, like, fudge it. You'll say, you know, you start this one at 12
and we'll start that one at three. And I'll hope it never takes more than three hours.
Exactly. And it will always succeed. And if that's true, you probably don't need an orchestrator.
But often what happens is, I think people realize they need an orchestrator a little too late.
What they thought was true no longer becomes true. You can't really observe cron that well.
Your tasks take too long. Something fails. Or even worse, your vendors, like, oh, by the way,
that thing we sent you two months ago, it was wrong.
Here's an update.
I go and fix that.
And it's like, well, I can't rewind time.
And my cron schedule doesn't know how to rewind.
And so once you start to get into these types of things,
that's where orchestrators come into play.
And they start to manage some of these more complexities for you.
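To make the dependency point concrete, here is a minimal, hypothetical sketch of the "A runs before B" idea expressed as Dagster assets rather than two cron entries offset by a guess at the runtime. The asset names and data are made up for illustration:

    from dagster import asset

    @asset
    def raw_orders():
        # "A": pull data from the source system (an API, a vendor file, a database)
        return [{"order_id": 1, "amount": 42.0}]

    @asset(deps=[raw_orders])
    def cleaned_orders():
        # "B": declared to depend on raw_orders, so it only runs after "A"
        # has materialized successfully; no guessing how long "A" takes
        return "cleaned"

If the upstream asset fails, the downstream one simply doesn't run, which is the failure-resistance piece that offset cron times can't really give you.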
I feel like you said there that, you know,
you bring it in probably later than you should.
I feel like that's a recurring theme for a lot of successful data things, where,
you know, if you would have brought this in two months ago,
this is a five-minute fix.
Now we're very limited in what we can do
and that type of a thing.
But I also don't know if there's a way around that.
Yeah, I mean, you don't know what you don't know, right?
And especially if you're doing something for the first time,
it's like, oh, like this works.
And then, like my favorite.
Because I think most people, if you're in a data role,
get to that, at least get to that time gap thing,
where I'm going to have this run at midnight,
this run at 2 a.m., this run at 5 a.m.,
everything's fine.
And then usually, if you get in that world,
you have some bad mornings where, like, the first one failed,
and then it's, like, kind of a house of cards.
And then because some of these take, you know, maybe hours to run,
you're kind of sunk.
Like, you basically lost a full day of having the data correct.
I think that's experience, right? Like, you get burned,
hopefully only once, and you learn your lesson.
Or you work with people who have been burned
and they've learned their lessons and they'll impart that on you.
Or you'll listen to The Data Stack Show
and you'll, like, learn about, you know, things not to do. It's also human nature. I think it's so much
easier. I mean, this is why ice cream tastes good. You don't really think about the consequences,
right? Running pipelines on cron feels good because you don't have to think about the consequences,
right? Until it's too late. So we try to educate people about how it's probably easier. It's not
that hard to set up a pipeline in Dagster with just cron. Like, you can do it just with cron, you don't have
to use any of our advanced features. We have a cron scheduler. We have it in Dagster. And you'll get a
pretty UI, which is already more than you normally get out of cron. And that's worth its weight in gold.
And then from there, you can evolve as you need to. You don't have to go and build these complex
dependencies if you don't want to. But get started with something when it's simple, when it's just
like a few tasks, a simple dbt pipeline. Very easy to do in Dagster. We've got a great integration.
Or do it in a different orchestrator, too. It doesn't have to be Dagster. There's others out there. But
get it in something that you can observe
because I think every engineer
knows observability and logging are like critical
to any system.
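As a rough sketch of the "just cron, but with a UI" starting point described above, assuming current Dagster APIs; the job name and cron string here are placeholders:

    from dagster import AssetSelection, Definitions, ScheduleDefinition, asset, define_asset_job

    @asset
    def nightly_report():
        # stand-in for a simple task: a dbt run, an export, a small transform
        return "done"

    # Materialize everything on an ordinary cron cadence, but with logs,
    # run history, and a UI on top of it.
    nightly_job = define_asset_job("nightly_refresh", selection=AssetSelection.all())

    defs = Definitions(
        assets=[nightly_report],
        schedules=[ScheduleDefinition(job=nightly_job, cron_schedule="0 0 * * *")],
    )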
Yeah, that makes a lot of sense.
I've used Dagster for a couple of projects,
and this was kind of interesting.
It was last weekend.
So anyway, no, it was two weeks ago;
it was around kind of that New Year's, Christmas holiday.
And I got an error. I had set up
alerting, I got an error, which was handy.
And I thought, like, you know,
what is this?
It's like I better check on it.
And sure enough, you know, it's an API, like, access denied type error, because I was
pulling data from an API.
So like, what happened, you know, figure out like, do I need to, did the credentials expire
or what happened?
It was funny.
So essentially what happened is I was pulling data from, there's like 28 different locations
on this project.
And essentially one of the locations had closed at the end of the year.
But since I had everything like separated out, it was like, okay, cool.
I can just like turn that location off and like everything keeps going and it's not a big deal.
I think those are the types of things that like, you know, had it been the other way where essentially everything like cascades through and you're like, oh, like I'm going to have to like rewrite a bunch of stuff, et cetera.
Those are the fun moments.
So I guess I'm curious from your perspective, obviously there's lots of different orchestrators out there.
What's special about Dagster, and maybe even what's special about Dagster for analytics orchestration specifically?
Yeah, orchestration has been around for a long time. I think Cron is like the classic, right?
From there, I think Airflow is probably the next biggest orchestrator most people have heard of.
And that's a task-based orchestrator, right? So you've got a thing you want to do, you tell it, and it runs, and it's like a black box, and you sort of hope everything in the box continues the way you wanted it to.
But you have no ability to, like, peer into the box. What Dagster sort of said is, like, what if we split that or reversed that?
And instead of telling us about the task, tell us about the things you actually care about, or let us discover those for you.
So a great example is, I think, a dbt project; everyone sort of kind of gets what that is.
It's a collection of, like, tables that you want to materialize at some, you know, regular cadence.
The traditional Airflow way would be to have a dbt task that just runs your dbt project, and then you sort of assume all those models in there are completed.
In Dagster, what we do is we flip that around and we actually expose every single model
as an asset. And so Dagster is what we call an asset-based orchestrator, because everything you care
about is now represented in this big graph of things that you can sort of follow all the way through to
their logical conclusion. And so you can see all your dbt models within the Dagster view.
And you can actually be kind of clever about it. You could run the whole thing at once every single day
if that's what you want. Or you can say, you know what, my stakeholders care about these five
models. Run everything that depends on those on a five-minute schedule because they really want those
things to be updated. And then these other models over here, those put them in a group that runs
once a day whenever you feel like it doesn't really matter to me as long as they're refreshed daily.
That's something you can start to do with Dagster. And then because you have this, like, asset
view, you can start to connect things outside of dbt as well in a really intuitive way.
Maybe you have a BI dashboard in Sigma. Maybe you have, you know, some stuff happening in
RudderStack that you want to connect to some files dropping in an S3 bucket, FTP. All these things start
to connect and you build lineage on them. And so you can be really, really
clever about the full end-to-end orchestration of this thing, rather than just focusing on a
specific task. And so Dagster has really been, I think, the next level of where we are going with
orchestration. And in fact, Airflow is even starting to move in this direction, which I find
really validating that, like, this is really the future of where orchestration's going.
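For the dbt example Pedram walks through, the dagster-dbt integration is roughly the shape below: each model in the project shows up as its own asset instead of one opaque "run dbt" task. The project path is hypothetical and details vary by version:

    from pathlib import Path

    from dagster import AssetExecutionContext, Definitions
    from dagster_dbt import DbtCliResource, dbt_assets

    # Hypothetical dbt project; the compiled manifest is how Dagster discovers every model.
    DBT_PROJECT_DIR = Path("my_dbt_project")

    @dbt_assets(manifest=DBT_PROJECT_DIR / "target" / "manifest.json")
    def my_dbt_models(context: AssetExecutionContext, dbt: DbtCliResource):
        # Every dbt model becomes an asset in the graph, so it can be grouped,
        # scheduled, or refreshed independently of the rest of the project.
        yield from dbt.cli(["build"], context=context).stream()

    defs = Definitions(
        assets=[my_dbt_models],
        resources={"dbt": DbtCliResource(project_dir=str(DBT_PROJECT_DIR))},
    )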
Yeah, I think there are two benefits that I've seen from this, like, asset-style orchestration.
One is essentially what you said: time compression. Because if I have separate, like, extract
jobs that then load into a warehouse, and then I have to transform, and it's all, like,
linear, there's just a limit to the time compression to get that one report that I need to be fast, like,
fast as in, like, very up to date, right? Like, if I'm having to do all of it here,
all over here, all over here. And since everything is compute-based now,
there's also a cost implication, right? Because if I can compress some of these, like, times for
the ones that I want to be really fast, I can also do the opposite for things that, like,
I only need once a day. Before, I was running this whole thing and everything was, like, every
five minutes. I can delay this 80%, where I don't care that it's a day old. And then that's,
that's compute savings in your warehouse, potentially savings in, you know, your ETL tool.
Yeah. I think that's a big deal. You could take it even further. Because you've exposed this, like,
data lineage, you get all these side effects almost for free. And that's something we've actually
learned ourselves. It's like, now you have this data catalog essentially, right? You understand all your
data assets and you have the source of truth of where your data is defined. Well, now you can search
that and now you have a data catalog for free. Like, you don't have to go and maintain a separate
one. Yeah. Data quality becomes something you bolt on top of your actual execution. It's not an
afterthought. It's like, as part of your pipelines, you can start to emit what we call asset checks,
or data quality checks. And like you said, time compression becomes a much more interesting
problem, because we can actually be very declarative in Dagster. Instead of saying we want to run
these things every day at 5 o'clock, you can say this asset needs to be updated by this time.
Do whatever it takes to make sure that's done. Make sure like you run all its parents whenever you need
to. And now you're limited by only the chain of things that matter to that asset and not everything
that comes before it. So we get a lot of really, I think nice side benefits of this asset view
that I don't think we really knew we were going to get when we first started going down this path,
but it's become really interesting.
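A minimal sketch of the asset checks idea mentioned here, using made-up table and check names; recent Dagster releases expose an asset_check decorator along these lines, though exact signatures vary by version:

    from dagster import AssetCheckResult, asset, asset_check

    @asset
    def orders():
        # stand-in for materializing an orders table
        return [{"order_id": 1, "amount": 42.0}, {"order_id": 2, "amount": 17.5}]

    @asset_check(asset=orders)
    def orders_not_empty(orders):
        # data quality emitted alongside the pipeline itself, not bolted on afterwards
        return AssetCheckResult(passed=len(orders) > 0)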
Well, and that I think speaks to one of those things that you see,
is that a lot of teams find themselves kind of drowning in whatever their process is.
And so they can't really see what the next thing they could be doing is.
And it's only once they kind of free up that space or that mental load,
because, okay, now I've got Dagster that's running this and I don't have to think about it.
Oh, now look at these other three things that have popped up that we can do
that were never part of our initial plan of, you know, we were just trying to, like,
not have to spend three, four hours every day, you know, troubleshooting or fixing or running
whatever.
And it's like, now that's gone.
Now we can actually see more opportunities that we could have never thought of before.
100%.
There's that old cartoon of, like, two cavemen, and one has, like, a square wheel and he's trying
to push it.
And his friend with the circle wheel is like, oh, you should try a circle wheel.
And he's like, oh, I don't have time for that.
I'm spending all my time pushing this.
square wheel up the hill. And I feel like that's the same way with like orchestration. Often it feels
like just an extra like step that I have to go through. But that extra step is like going to
compound your productivity down the line. Yeah. So I'm curious about the little bit about the
software space, software stack. So we're in 2025 now. I think the modern data stack was declared
dead last year. I don't know. Last year or two. Which I think practically means
like people are seeing like consolidation essentially.
I'm curious, like some of your thoughts on where do you think that shakes out?
Because we've got so many different layers we've added into a data stack of like extraction,
observability, orchestration, transformation, you know, storage.
Like, the list really goes on. How do you see that playing out in the next few years?
Yeah.
I feel that any time, like, you're not enterprise ready until you've been declared dead.
Like, that's sort of...
Yeah, exactly.
Love that.
So the modern data stack, I think, is now enterprise ready. I think it's ready for, you know, the mass market to adopt. And what we might call dead, I see still being implemented. There's so many companies going through, like, cloud modernization efforts, for sure, and they're moving towards Snowflake, they're moving towards Databricks, they're moving towards dbt and the cloud. Like, that's not dead. So if we define the modern data stack as, like, cloud data warehouses and, like, a few really good tools, that's fine.
Yeah. I think the modern data stack, sort of, if you want to talk about the 2020s version of it, where basically every function you had to do was its own company...
Yeah.
Yeah, that's probably dead. I don't think people want 27 vendors to do three things at the end of the day, right? And so consolidation's going to happen. I mean, we're seeing it at Dagster. Like, our customers are asking for us to, like, combine catalog and quality into one thing.
Yeah.
Our catalog will never be as good as a full-featured catalog that you go out and buy and pay, like, a grand for it. That's not where we're
competing. But there's probably some elements of those things that you can combine within the
products you're already using. That's going to continue. I mean, I think Fivetran is doing this
with, like, their transforms. I know you guys at RudderStack are doing this as well. Thanks for
doing it. I think it's just natural. And what's going to happen is what happens all the time.
We see a bunch of consolidation. People get annoyed at the consolidators. Some new tool comes out.
And it's like, I'm really good at this one particular thing.
Interest rates go down again.
We get 100 of those things.
It's going to be a cycle.
And I think right now we're just in the, what is that plateau of productivity area where
I think things slowing down has actually been really good for data teams in general.
You don't have to pay attention to 500 different things.
You can kind of just put your head down and get your job done.
And the tools you're using to do that just keep getting better on their own, which is a good feeling.
Yeah, I think also during especially that peak, like,
2020-ish, 2021-ish time period, a lot of teams got very hooked on all the different tools
and kind of, you know, I mean, I saw where teams could kind of lose track of like, well, what
is this ultimately supposed to be serving, you know? Well, look, but we've got, we've got all these
different things and we've got all this data in a warehouse. And it's like, okay, but what's
happening to it? How is it actually turning into revenue or savings or profit or whatever?
Yeah. I mean, and it wasn't just data. What I realize now, I mean, I'm in marketing land a little bit, is the exact same thing was happening there. What was going on in marketing is everyone wanted a tool to solve their particular niche use case. And almost, like, nobody wanted to do the work. They just wanted to buy tools to do the work for them. And you ended up with, like, these massive marketing stacks with, like, 40, 50 different tools to do, like, three things. So it wasn't just us, it was everywhere, it felt like, at that time. But I think we're now in a better
place where, I think, interest rates solve a lot of problems, to be honest. Like, yeah, sure, yeah,
money not being free, it solved a lot of efficiency problems, anyway, I'll put it that way, right?
So we're seeing that consolidation. It might not feel good to everybody, but I think at the end of the
day businesses are operating more leanly, and they probably aren't, you know, losing a lot at that
expense either. Yeah, I think that's right. So we talked a little bit about orchestration, what that is,
Dagster's unique twist on that. I'm curious about your kind of career trajectory. You mentioned,
when we were talking earlier, data science, data engineering, now you're in DevRel, marketing, data.
Tell us about that journey. I think it's a little bit of a unique journey, and I'd be interested in
how that all played out for you. Yeah. When I did, like, I think it was in high school,
they ask you to fill out that survey that'll tell you what kind of job you should have. I don't even remember
what it was, but it was, like, a job I'd never heard of. And I never knew what I wanted to be
when I, like, grew up. I just sort of fell into different jobs based on what I was interested
in at the time. Data science was, you know, a thing that was on everyone's mind back in 2018, I think
it was. I listened to all the data science podcasts. Many of them are now defunct, RIP.
But they were, it was the next hot thing, right? And so I was like, all right, I'm going to figure out
how to become a data scientist. And I did that for a few
few years. And what I realized was the new batch of data scientists that were coming in, they weren't as
technical as I had been. I spent more of my time programming than they had. And so they were great
at building models, much better than I was, because they were trained in it. But they couldn't deploy them
at all. And so I started building like infrastructure just to make it easier for them to deploy because
their code was better than mine. So I ended up becoming a data engineer by accident. And I found that
really rewarding. It was great to, like, build something. And then the reward is, like, someone using
it, whereas with data science the reward is, like, maybe in a year you'll find out if your
experiment was correct. Yeah, right. So for me, like, that instant validation of, like, knowing I
built something that clearly, like, works or doesn't, and the person next to me is benefiting,
was super empowering. And so that's how I started in data engineering. Did that for however many
years, eventually became a head of data at a company called Hightouch, which back then was, like,
really focused on the data persona.
And as part of that,
I was also doing what we can call
DevRel, essentially talking about
the product to data people.
Ended up
starting a team there moving on
to consulting where
I thought initially I was going to do data consulting
and help people with their data problems
but almost every company that talked to me
wanted me to help them with their marketing problems
and even though I didn't think of myself
as a marketer I think they saw
the Devrel activities I was doing
and the success we were having at Hightouch
and they wanted to replicate that for them
and a lot of that was just educating
them that like
you know, copying the thing that I did
that won't work for you.
Yeah.
It's not the blog post that's successful.
I think a lot of people look at, like, dbt, for example,
and they saw their like massive community
and they thought, oh, I should open a Slack community.
And it's like, well, why?
Like how?
Like where's the value to the actual user?
Do you think people want
25 different Slack communities
or do you think they want one or two places
to hang out. That might already be like a place that's covered for them. So it was more about
talking through what were really marketing principles, but to me it was just a common sense
about how to get to data people in a way that made sense. And that, I guess, like put the mark
of marketer on my head, and eventually I joined Dagster, initially as DevRel and more recently
DevRel and marketing and also data. Yeah, that makes sense. The trajectory makes sense.
And I would imagine,
so the alternative here is, like,
let's just, you know, for Dagster,
like, well, sorry, you're a marketer, right?
yeah and there's got to be
we've already talked some about the synergies there,
but there's also got to be this like
scratch your own itch
you kind of get to market to yourself
or to your previous self which like
that has to be an advantage
I think for a company like Dagster
and for like any technical company
that markets to technical people
having a technical person
who really gets the audience and the go-to-market motion
and really gets it is critical.
And I think we've even made mistakes with this as well in the past
where like we're an open source core product,
like by our nature we are.
And so we shouldn't hide that fact.
And I think if you talk to a traditional marketer,
they might be like scared that people might use open source
because we're not capturing an email, right?
So direct them to the email form instead
and get rid of all the open source things
from our website.
And bury it deeply.
Right.
Like it doesn't exist anymore.
Kill it.
And like that's the mentality
of someone who doesn't understand
how like developers might operate.
Right.
Like a developer is not going to want to sign up
for a course or fill out a form.
They're going to want to try the product.
And they do that through open source.
And so open source to me is not a competitor
to Dagster Plus or enterprise offerings.
Open source.
It's like a channel.
It's a channel where people get to try it.
And if people go out and they're successful
with open source and they never want to talk to us.
That's totally fine by me.
That's another Dagster user out there in the wild
talking about how great Dagster is.
That's free marketing.
And so for me, open source is part of it.
And like, you really have to understand developers
to be able to market to them.
And that's really kind of why this marketing journey
between Devrel and marketing made sense to me.
At first, I was suspicious.
I think if you asked me as a DevRel person
to report into marketing,
I probably would have said no.
But if you have Devrel and marketing working together
and they're all reporting to me.
It kind of felt fine.
And I'm seeing it today.
Like, it actually works out really well.
Yeah, and I think that's also, when you get to the open source stuff, I mean, especially
when you're trying to do something at scale, it can be that most open source projects are really
hard to continue at scale.
So it gives you a way in. People like it.
They trust it.
And then they can go to, okay, how do I make this easier for myself to use over time?
Yeah, and we see that all the time.
Like, people don't want to run and maintain infrastructure.
generally.
Right.
It can't be the only thing
because often the companies
that are good enough
at eating Dexter,
they can figure out
how to deploy Dexter themselves eventually.
It's not that hard.
So you do need to have
things that are value-driven
in the enterprise offering,
hopefully that will drive people to that.
But also, it's easier to get
open source into an enterprise
than it is a vendor.
So if I work at a big company
and I really like Dagster,
will I go and try the open-source product
and prove its value?
Or will I get into this,
long, lengthy, lawyer-driven vendor negotiation thing before I've even like shown it to my peers
that it's a good idea? I'll often start with open source, I'll build some momentum. And then once we've
proven out its value, we've hit either scaling limits or I just don't want to maintain it or we want
additional features. I've proven it's useful. I can go and have that conversation, and I'll go contact
a sales team and have them start. But, like, knowing that's the journey that people go through is,
I think, critical in building out, like, technical orgs that market to technical people.
Yeah, I couldn't agree more with that. And there's this other component to where you've got,
you know, a team that's vetting a product, proving it works. Like, imagine that you're going
through a traditional enterprise sales process. And I've done multiple of these where you don't get
to see, touch, do anything with a product until basically the money has changed hands.
It's been a while since I've done one of those type deals, but I've done those before.
And those are scary as a technical person.
A lot of times, and a lot of times it's maybe driven by, like, marketing or sales, for
example.
They've got to have this product.
And then you as a technical person are stuck with, like, you've got to integrate and implement
it.
So number one, like for people who have been around a little bit, they have that in the back
of their mind as far as like the alternative and hate it.
And then number two, like you have this other practical competitor in a sense where the
open source product keeps you, I think, honest as a company,
where, like, if you ever were to, like, 10x your prices overnight,
like, people could switch to open source, for example.
But if you're like a traditional enterprise type thing
and you 10x your prices and people are kind of stuck
because it's hard to replace,
then people are stuck and they have a lot of pain to switch.
So I think that's another component that I've always appreciated about open source.
Well, and I think the other one with that is, like, when I first came on
to RudderStack on the marketing side,
one of the things that I told them was, because
I was talking to someone on the marketing team,
and they were like, well, we really want, you know,
RudderStack to be the reason you get your next promotion.
And my reply to that was,
I don't know anyone in the data side
who buys software to get promoted.
I know people who don't buy it
because they don't want to get fired.
And the open source kind of helps you bridge that gap
where we're not saying, like,
hey, I need to make a really big commitment
that's going to take time to implement.
and I really hope it goes well
or I'm not going to be here in a year.
Yeah. Yeah, I mean, the other thing we've seen is
like if you really want to get promoted
is that you build Dagster from first principles
and it takes three years and then you quit, right?
You get that staff level engineer
and then you just like, all right, I'm out of here.
Off to the next one.
And then what you built is like an in-house shitty version
of a product you could have just bought, right?
So there's two sides of that.
I think the open source just makes it easier for everyone.
There's this idea that you might be able to avoid vendor lock-in
as well,
which I think really is appealing to people.
But, I mean, there's also great software that doesn't have open source and people buy it
and love it.
There's technical things you can do with it.
But I think we all, as engineers, have seen those, like, monster implementations that
promise, like, often the best ones are the ones that promise you have no need to talk
to your engineers at all when you implement it, right?
Yeah.
In the sales process, oh, yeah.
You just, like, plug and play and click a few buttons, and you're in.
And then as soon as the deal is signed, oh, by the way, where are your
engineers? We need them to come implement this thing they've never heard of before.
Right. That's the thing I think everyone wants to avoid during these, like, vendor deals.
Yeah, or the other version of that is, oh, we're going to handle everything for you.
We're going to help you along the way. And then you sign the deal. And you say, okay, how do we
migrate this data? And they go, oh, well, it has to follow these, this standard. We don't do
anything before that. That's all on you. It's like, well, that would have been nice to have known a
month ago. Yeah. Okay. So we played this game on the show where we see how far into the show we
can get without mentioning AI. So I don't know where we're clocking in today. I think we did okay.
But I want to talk a little bit about AI and we got to talk about orchestration. You know,
I think Dagster is a tool you can also use to orchestrate when you're, you know, pulling data
together for AI or doing other things. I'm curious, like, what are people actually doing? So maybe, you know,
people using Dagster that are more on the cutting edge of using LLMs and, you know, maybe
AI agents.
Well, what are people actually practically doing with AI and orchestrators?
Yeah, we see a lot of data prep for AI within Dagster itself.
We even see some companies building foundational models and doing experimentation, but that is like
I would say cutting edge.
But bread and butter use cases, at the end of the day, I think AI engineering is
data engineering, and we even believe data engineering is software engineering. So if you follow
this logical conclusion, it's all really the same thing. You're moving around data, you're transforming
it, you're storing it, you're converting it, you're embedding it, you're calling APIs. Is that data
engineering or is that, you know, working with open AI and LLMs? Like that's one and the same.
Often, what we find is actually AI engineering is a little bit easier with Dagster than ML engineering,
because you're relying a lot on these like third-party providers, for example, for embedding,
You're experimenting. Like, you're not doing a lot of training models, right? It's done for you. You're
really just experimenting and, like, putting things out. And so we've seen a lot of companies do things like,
I mean, RAG is the big one, right? Everyone's trying to use it. Like, AI is great, but it needs context.
Without context, it's often garbage. If you go to OpenAI or Claude today and you ask it to write a
Dagster pipeline, it's often going to write really terrible code, because it was trained on, like,
Dagster code from three years ago, which probably isn't valid anymore. But what we've done is
we've built internally a RAG model that uses our documentation, our GitHub issues, our GitHub
discussions to power what we call Ask AI. It's a Slack bot in our Slack community. And it does
really good. Is it perfect? No, but it's like a lot better than nothing. And so...
Yeah, I've used it. It's pretty great. It's pretty good, right? Yeah. Not bad for a POC.
And, you know, we could always make it better. Sometimes it gets confused. But
it's better than not getting an answer, which is always what I tell people.
So context is everything, I think, in AI.
And so what is context?
Context is data, right?
So ingesting our data, transforming it, picking the right ones, adding metadata,
running experimentation on those different context windows, on different models.
That's really where Dagster, I think, shines.
It's just like running these pipelines.
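To ground the RAG description in something concrete, here is a minimal, hypothetical sketch of that kind of context pipeline as Dagster assets. The fetch_docs and embed helpers are made-up stand-ins for a documentation loader and a hosted embeddings API, not anything from Dagster's actual Ask AI implementation:

    from dagster import asset

    def fetch_docs() -> list[str]:
        # hypothetical loader for docs pages, GitHub issues, discussions, etc.
        return ["Dagster is an asset-based orchestrator.\n\nAssets form a graph."]

    def embed(text: str) -> list[float]:
        # hypothetical stand-in for a call to a hosted embeddings API
        return [0.0] * 8

    @asset
    def doc_chunks() -> list[str]:
        # ingest and transform: split source documents into retrievable pieces
        return [piece for doc in fetch_docs() for piece in doc.split("\n\n")]

    @asset
    def doc_embeddings(doc_chunks: list[str]) -> list[dict]:
        # attach vectors and keep the text so a downstream retrieval step (the Slack bot) can search them
        return [{"text": chunk, "vector": embed(chunk)} for chunk in doc_chunks]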
So help me out with this.
There is basically a clone of the data stack,
where for the modern data stack from 2021,
there's a clone of almost every single component
that's, like, AI-focused, right?
Like, there's an orchestration tool,
an ETL tool, specific databases.
And I'm personally not super knowledgeable
about each of those components when it comes to AI.
Do you think that stays or do you think it all gets consolidated back
because it's not that different?
Yeah, it's a good question.
Maybe the vector databases stay.
Yeah.
If they're lucky, is my best guess.
Or do they? But I don't know technically how hard that would be to implement, you know,
for Snowflake and Databricks to implement that.
Most of the databases implement some type of embedding already.
Yeah, right.
Snowflake already has a vector version of their database.
Postgres has vector embeddings now.
I think even MotherDuck, DuckDB have it.
Is it that hard to store a vector of numbers?
Probably not.
There might be other benefits to using, like, a dedicated vector database, for, I don't know.
Sure.
Those are going to become specialized cases that you run into.
That's my guess.
And outside of that, the ETL stuff, I think we love reinventing things.
My guess is most people who are getting into AI today,
they're not coming into it from a background of data engineering.
Yes.
And so they just don't know the tools.
So if you don't know the tools, you think you have to invent things, right?
Or maybe you just want to build new things because old things are boring.
Yep.
Some of those will probably stick around because they'll be good enough that everyone uses
them and they evolve. But I think a lot of them will fall by the wayside when we realize
AI problems are actually data problems and we have data tools to solve that already.
Right.
Well, I think a lot of people still, like, there was this confusion I feel like I still hear
out there, which is this idea of we should be replacing all of our deterministic processes
with AI.
Yeah.
It's like, but I don't need it to give me seven different answers to it.
I just want the one answer that's right every time.
Yeah, I mean, it's people using AI as a calculator, and it's like, well, a very expensive way to warm up the world.
So I don't know.
Maybe we don't need to do that.
I don't know.
Sometimes all you need are if statements and a regex, and maybe AI can replace that.
But at the end of the day, like, whatever is faster is what's going to work for people.
Right.
I think on that one, AI is just going to replace me having to look up how to write the regex.
Yeah, that is a decent application.
So, yeah, along the AI kind of
questioning. I mean, you just kind of alluded to this. I mean, it's still very expensive.
And the billions of dollars being poured into these companies mask the expense for now.
Like just, you know, just this week it came out that the $200 a month plan still loses money
for OpenAI. And I think they weren't even necessarily expecting that. So what? I mean,
and of course, like the thought here is like, okay, we're going to keep investing money in this and
we'll have better hardware, this can drive cost down,
we'll have better models that don't have to be trained
in the same way, to reduce cost.
Well, I mean, this is just speculation at this point,
but it'll be interesting, and I'm curious your take,
what does that curve look like?
Because eventually, like, the money, I think,
could run out before we get to that spot.
I mean, I don't know, what do you think?
Just speculation on what might happen there.
I mean, there's already some evidence of plateauing.
If you remember the great VC-funded days of Uber and DoorDash, where it didn't cost anything to use these tools, and if you were smart, you would just abuse them as much as you could.
You would get the referrals and the $100 here, the credits there, and it was, like, $0.50 to cross the city.
You could get free food pretty much every single day, and that was wonderful.
And then the companies went public, and now it costs, like, $50
to go five miles, right?
I know, yeah, exactly.
Anywhere near an airport, it's like at least $50,
even if you're just going across the street.
Yeah.
It was supposed to be better.
It was supposed to be this utopia,
and it ended up just being a company
that makes money off people, right?
And they did so at the expense
of like killing their competitors.
So will AI be the same way?
I don't know, probably.
Yeah.
People need to make margins at some point.
Cash is not infinite.
Right now, it's really driven off
massive amounts of funding.
At some point, that'll change.
It'll come down for sure,
but when the margins go down,
like the research also slows down.
And so they will probably plateau
and we'll probably find them useful
in some limited capacity
that's probably not going to fundamentally
solve AGI, for example.
And I think we're seeing also
that like having the best model
is not really much of a moat at this point.
So it's not like you can say, well, yeah, we're going to spend billions, but once we get it there, we're going to capture everything.
And it does sound a bit like that Uber time of, it's like, profits don't matter, we just need to capture the market. And then eventually, once we capture the whole market, we'll make money off of it.
Yeah.
Yeah, it's tough to capture the market when really it's a commodity, too. So I think where AI differentiates, it's through product, actually.
So anyone can build a model these days, right?
A lot of them are good.
It's great open source models out there.
Integrating that model in a workflow is where differentiation, I think, really happens.
And great companies who really understand that can make it a lot better.
So I think Anthropic and Claude, for example, do a really good job with, like, their projects
and the way they've sort of structured Claude to make it very, like, useful in particular contexts
for solving these, like, problems and discussions.
I use it all the time.
OpenAI, maybe not as good,
I would say, product-wise, as Anthropic these days.
They have more features that I don't end up using,
but purely as, like, a chat agent with a documentation store,
I think Claude does a better job.
Yeah.
I imagine in a few years we're going to find companies
that, like, really get the product perspective, right?
And they built really cohesive products,
which are really powered by AI,
rather than just like an AI chatbot,
that is really good at generating responses,
which I think we've sort of hit a peak on,
regardless of how much better they get.
Yeah, the other one it makes me think of a little bit
is like a satellite telephone stuff
where it costs a whole lot of money
to get the satellites up
and to get the infrastructure there.
And once you had done all of that,
it was really hard to make money off of it.
But then when the next people came around
and were just using the infrastructure
that was already out there,
you could make a profitable,
like, business model off of it.
Like with GPS, for example.
Yeah.
Even satellite phones.
It's like, they're still around and the companies are more profitable with them.
Right.
Because they didn't have to pay to put all the satellites out there.
Yeah.
That's interesting.
Yeah.
So we have a few minutes left here.
I'll throw this to Matt.
So Matt, you've spent a little bit of time with Dagster recently.
And I'm curious.
And you've got a data background.
You know, Matt worked for a publicly traded company in data.
I'm curious.
Yeah.
How does Dagster and the orchestration landscape strike you, compared with what you used in some of your previous roles?
Like, how is it different? What's the evolution like?
Well, so most of the places I worked, we didn't really have an orchestrator.
So we had some more, like, pipeline-related things, but we didn't have, like, a dedicated orchestrator for a lot of it.
So it's been an interesting little journey having to get to know it a little bit more and, you know, try to sometimes wrap my brain around the concepts.
I think that's usually it.
Because a lot of, I mean, there's a lot of stuff that when you get into, like, okay, I'm planning things, I'm putting them in sequence or in parallel, those types of ideas.
A lot of it then comes down to what's the framework that they're using to talk about these things?
What's the language they're using?
What do they label this stuff?
Makes sense.
Yeah, so, I mean, overall, it's been, I have the added twist of I'm also including RudderStack into this with some new stuff.
So that has, that's thrown some interesting
frustrations at times,
just learning
the two things
at the exact
same time.
But, I mean,
overall, it's been,
it's one of those things
that I can look at
and I can see, like,
oh, here's how I could have
used it.
Yeah.
Oh yeah,
when I had a team of 15,
this is how we could
have used this.
Right.
The one thing, though,
I always
had to think about
back then
was, kind of to go
back to a point
that you made
much earlier,
in that there's this,
like, newer
generation of people who are data scientists or whatever, and they got taught a very applied way of
doing things, which typically was very software-centric, like, how do I, you know, how do I call the function
to train a model or whatever. And so when you get into that broader, kind of closer to software
engineering world, they sometimes get a little scared. So you really had to pick stuff
that you knew you could quickly get them in
and get them learning with.
So remember we had a software engineer
as a contractor once
and he was going to show us
how to modernize our stuff
and he did this whole thing of just
basically tearing things apart,
building it from scratch
and trying to show it how great it was.
And I was like, okay, that's great,
but no one but you can run this.
Right.
I got a team of people
that when you're not here,
I need you to run it,
whereas something like Daxter
is definitely one that you could see,
okay, I can get a team of people
to be up and running with this.
I think that's a really big deal.
Two things I thought of
from my previous experiences.
Because I'd used, it's actually funny,
I'd used a product called Rundeck.
Pedram, I don't know if you're familiar with that one.
It's like a little bit more than a Windows task scheduler,
but before like we had like that, you know,
kind of DAG type concept.
But it's interesting when you go through,
what you would do every day
and now you have words and language for it.
I think that's the most interesting thing
about finding a good, like, a Dagster or, like, a good framework
for, oh, I didn't know I was doing orchestration.
Like I just, you know, schedule this around this and this.
I think that's one of the things.
And then the second one, which Matt just touched on,
which I talk a lot about.
And I think orchestration is a big deal here.
When you move, when your data team moves
from, like, one, maybe two people to
more of a team, it's three, four, five, however many people, that conversion from what I call
single-player mode to multiplayer mode, it's a really big deal. The tooling becomes a bigger deal,
the version control, the, you know, and I think, like, dbt, for example, is one thing that I think
is a big deal. If you're moving into multiplayer mode for your data team, like, dbt and people in that
transformation layer, having a solution there is a big deal. In orchestration, same thing,
where you're now using the same framework.
There's less esotericness around, like, how do we schedule a job; it's defined.
Like, we use this.
It has specs and documentation.
And I think knowing that, because I've been a part of at least one company where orchestration had a name, and it was an employee named Gary.
And so he ran everything.
And when he left, nothing could run. And then we were
scrambling, a whole group of us, to try to get things back together, right? But we also didn't have any, like,
we didn't have the language, because this was almost 10 years ago now, to be able to be like, okay, no,
what we need to do is get this into an orchestrator so that we're not dealing with this anymore.
And even just, I think, the language of how do I talk about these things, okay, these are assets and
stuff like that. Giving language to that can be very helpful in just helping, I think, a lot of times, people
get out of the kind of limited mind frame they're in,
if that makes sense.
Especially when you're talking about things like,
what does data as a product mean?
Well, to a lot of data scientists who are very new,
it means the model I built. And explaining to them,
well, no, you have to have this:
the end-to-end, collection to delivery, is the product,
not just this little part that you build.
So one last take.
Pedram, where do you, maybe specifically for
Dagster or generally for orchestration,
where do you think this goes in the next couple of years?
What are the core problems in the space to solve for
orchestrators such as Dagster?
Yeah, it's a good question.
I think one of them is something we just touched on, which is that
not everyone knows what an orchestrator is and when they need it.
And so I think at Dagster we have, like, two sort of big priorities.
One is just helping generate awareness of what orchestrators are, what a data platform is,
the fact that you probably already have one and how to think about observing and having a single
place to look at these things, right?
You can't just go to Gary every single time.
And so having one place where you can understand where everything is supposed to run.
That's, I think, a big piece of it.
And the other is also just, like, lowering that adoption barrier for people.
So finding ways to make it easier, more plug and play, to use Dagster with existing,
you know, playbooks that you already have
and that are pretty common across the industry.
Building those out without losing sight
of sort of the power of Python and Dagster itself
is kind of where we're focused.
Yeah, makes a ton of sense.
Well, thanks for being on the show.
It's been really fun.
Matt, thanks for being here.
And we'll catch everybody in the next episode.
Thank you.
All right, thank you.
The Data Stack Show is brought to you by RudderStack,
the warehouse native customer data platform.
RudderStack is purpose-built to help data teams turn customer data into competitive advantage.
Learn more at rudderstack.com.
