The Data Stack Show - 241: Marketing Meets Data: Measuring Impact and Driving Results with Pedram Navid of Dagster Labs
Episode Date: May 12, 2025. Highlights from this week’s conversation include: Pedram’s Background and Journey in Data (1:13), Marketing vs. Data Engineering (2:30), Understanding Marketing Pressures (4:16), Attribution Models and Accountability (8:13), Balancing Marketing and Team Management (12:25), Introduction to Dagster Components (15:00), AI Integration with Data Engineering (19:05), Challenges in Data Support (22:05), Self-Service Data Access (26:07), AI in Data Management (28:25), Organizing Data in Technical Teams (31:25), Challenges in Real-Time Data (33:28), Final Thoughts and Takeaways (37:01). The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.
Transcript
For the next two weeks as a thank you for listening to the Data Stack show,
RudderStack is giving away some awesome prizes.
The grand prize is a LEGO Star Wars Razor Crest 1,023-piece set.
They're also giving away Yeti mugs, Anker power banks, and everyone who enters will get a RudderStack swag pack.
To sign up, visit rudderstack.com slash TDSS-giveaway.
Hi, I'm Eric Dodds.
And I'm John Wessel.
Welcome to the Data Stack Show.
The Data Stack Show is a podcast where we talk about
the technical, business, and human challenges
involved in data work.
Join our casual conversations with innovators
and data professionals to learn about new data technologies
and how data teams are run at top companies.
["Data Work"]
Before we dig into today's episode,
we want to give a huge thanks
to our presenting sponsor, RudderStack.
They give us the equipment and time to do this show
week in, week out,
and provide you with valuable content.
RudderStack provides customer data infrastructure
and is used by the world's most innovative companies
to collect, transform, and deliver their event data
wherever it's needed, all in real time.
You can learn more at rudderstack.com.
We are here on site at Data Council in Oakland, California.
And our first guest this week is Pedram Navid of Dagster.
And Pedram, I feel like we intercept you
at sort of these major points where your job changes
or your role changes.
So we're gonna hear about the latest iteration of that.
But thanks for joining us. Yeah, thanks for having me. All right. Give us just the quick
background and you can go back maybe just to the beginning of when you were on the show of,
you know, your involvement in data and tell us what you're doing today. Yeah, I feel like I can
just track my career progression just by following the Data Stack Show. I think last time we spoke,
I had just joined Dagster as head of data engineering and DevRel. More recently, back in November,
I took on the marketing function as well,
so we can maybe talk a little bit about that.
Yeah, because I have to be on here.
Awesome, so, Pedram, we were talking before the show
about Dagster components.
I want to dig in on that.
And what are some topics you want to cover?
Yeah, we'll talk about Dagster components.
We can talk about how data engineering
does YAML engineering as well.
Love that. All right, well, let's dig in. Sounds good. All right, Pedram, I'm super excited to
have you on the show today and for those of you that are like hey who's this new
voice I'm Brooks producer of this show usually behind the scenes but here out
in the field with Eric and John at Data Council and excited to bring some
special recorded in-person content
to you all. So, Pedram, I want to start out: you have kind of, as maybe your old
friends in the data space might say, gone to the dark side, and you work in
marketing now? Can we switch places? I was in marketing and, you know. Yeah.
And I am in the industry.
Yeah.
Wow.
Who made the wrong call?
Well, I don't know.
Yes.
I don't know.
To be determined, we will check back in on the data stack
show in a year maybe.
We'll have an answer.
Pedram, tell us how your background in data
has kind of uniquely influenced the way you're approaching
running marketing at Dagster today.
Yeah, it's a good question.
So I think one thing that helps in running marketing
at a company like Dagster is I love Dagster.
I used Dagster before I joined
and I'm very deeply familiar with the product
and what it's trying to accomplish.
So I think for me, I thought marketing would be
just bringing a little bit of that knowledge
into things like the website and helping update that.
Turns out there's a thousand more things going on
in marketing beyond that.
I think having some ability to understand data,
having built attribution models in the past
and their pros and cons, being able to run SQL.
All of that helps because you can start to self-serve
a little bit as a marketing leader when you have questions.
It can also maybe be dangerous.
I think you could become sort of what you hated in the past,
which is
a leader who thinks they know a little bit too much.
Dig into that a little bit and
tell us about, okay, you say this is what you hated in the past, but now that you're feeling some pressure, you kind of see why it works the way it does.
Can you just share some of your perspective now that you've been
behind the curtain, or in the leadership position in a marketing role?
Maybe share, for folks who haven't been there,
here's actually the pressures
that are on marketing and why sometimes
maybe the perception is,
oh, marketing's doing a bunch of stupid stuff or, you know,
whatever. Just share your learnings
now that you've kind of been in the marketing role. Yeah, the way I think of marketing is that its real goal is to help sales be effective.
And the goal of sales is to help drive revenue in the company.
And so if you work at a company and you have equity, I think you would want marketing and
sales to be effective for many obvious reasons.
And I find that when you tell the story of what you're enabling to even engineers, they
tend to get it.
I think where marketing can often fail internally in an org is when they're working in a silo,
they're not really connected with product, with R&D, or with sales, and they're just doing a bunch of things.
I call them random acts of marketing. No one really understands why they're being done. It's a blog post.
Some person just reaches out and says, hey, do you have time to write a blog post on this random topic I just thought of, with no context?
That's when I think marketing is at its worst, when it doesn't really seem connected to the rest of the org.
I think marketing, when it's running really well, just connects these things together.
It shows the story you're trying to tell, and often what I like to do is go out and look at a big customer
win and tell the story from beginning to end of how the entire org impacted that win.
And it's not the seller, you know,
making that call three weeks before.
It's 18 months before they joined a community Slack
and they asked the question.
Someone on DevRel, an engineer popped in
and answered that question for them.
That's like the first point of contact they had.
18 months ago. Being able to tell that story,
I think, is really what marketing is powerful at
and excels at.
Tell me a little bit how, I'm sure some of this happens organically just because of your
background and kind of how you moved into marketing from being like a deeply knowledgeable
user of Dagster.
I think some of this probably happens more organically for you, but especially as you're
growing a team, what are the ways that you are structuring things
so that you don't lose that?
Because it can't all come from you, right?
You have this unique position of I know
and really get the product, I get the vision,
I understand our kind of ICP,
but you can't scale in just your brain.
How are you growing the team
and making sure that you don't lose that
as there are pressures to scale and grow
both the team and the company?
Yeah, the marketing structure at Dagster
is kind of interesting.
A lot of the content actually comes from the DevRel team,
and our DevRel team is staffed by people who use Dagster,
love Dagster, know Dagster really well.
They contribute to the core platform.
They've built pipelines on it previous to their role here.
And so we found that having really technical people
on the content side is really non-negotiable.
We've tried, you know, bringing in people
who don't have that technical background.
And what ends up happening is they end up
relying on DevRel anyway, and we're like,
well, why don't we just write the content from scratch?
Like, it's gonna be much more compelling.
And so I think that's been very helpful to the overall org
is being able to access people who know the product.
I write some of our content,
but I would say most of what we do today
comes from the DevRel team.
And then we still do the marketing side.
It's like they have the expertise and we have the distribution.
I kind of feel like they're the engine
that powers the DevRel team and gets the message out there.
That part has to be very specialized
when it comes to like product marketing,
when it comes to digital channels, ads, all that.
I'm not an expert in any of this stuff.
And so having these two like almost distinct teams
working very closely under one roof
has been super powerful.
On the attribution side,
I'm interested in your perspective
now being accountable, right? To sort of show, as a marketing leader, generally
you have some sort of resource allocation.
You have people, you have a budget, and you're responsible for saying, okay, I'm going to
take this, I'm going to do these things, and then what's going to happen is hopefully revenue
increases.
And so you've been on the other side of that,
building attribution models that are trying to
aggregate this data and run some sort of model
in order to prove that and give that number to someone else.
But now you have this really interesting perspective of,
okay, well now you're accountable for that
and you also have the underlying knowledge
of how to do that.
Can you speak to that a little bit?
What's that experience been like?
Yeah, it's great. You just have your data team attribute everything to me.
Yes. And you look great. Yeah.
It's an interesting problem because you are still over the data team. Right. Because I've actually
been in this role for a minute too, where I was responsible for marketing budget and spend and
over the data team. Yes, that's right. So there's a cool thing: you're your own
customer. There's also a little bit of, like, we've got to be really
precise here and not fudge too much. The founders trust you,
which is great. Yeah, right, but there's no reason they wouldn't. But
yeah, being your own customer is interesting. It is. It's actually really nice on the data
team, not because you make stuff up, but we don't spend too much time on that confusion, to be honest. We don't try to divvy it
up between sales and marketing. Every deal is won together. We know that. That's cool. You don't
really win by yourself. We win it together. Yeah. Our sales reps can't do outbound to someone who's
never heard of Dagster ever in their life with no marketing support. We find the deals come in
through 10,000 touchpoints. So we win together.
And so we don't even bother accounting for
who sourced this or who assisted that.
That makes things easy.
What we wanna know in the marketing team is like,
are our campaigns effective?
That is a data problem.
That's not a, you know, who gets credit problem.
And so working with our data team,
we brought a couple of great data people there
who have deep marketing experience.
They can help me understand like, you know, these campaigns, these conferences, these events we went to,
did they influence purchases, were the people there in the same accounts as the deals we won, how do the sales cycles compare?
Are there differences between these things? Those are the kind of data questions you can start to ask to become better at marketing.
But we've never been in this place
where we have to, like, justify marketing's existence. I think at the end of the day, if you're not
creating opportunities, sales will know and they'll tell you. It doesn't matter how
we keep the books; if deals aren't coming through, we're not closing deals. I love how it's
refreshingly simple you know. Yeah I mean not that you don't, you know, after you go and run different models and things like that, but it is, it's
great to hear your perspective on, okay, I mean, the business has a goal and
these are the different ways that we can measure whether what we're doing is
effective, which is great.
Yeah.
And for us, it's always been like, we, you have to also remember the things
you measure are not the only things that work.
And so I have to repeat this a lot of the time. It's very easy to focus on
the campaigns you have in Salesforce and the digital channels and the conferences where you
get views. But you don't record every single person you talk to, you don't record the power
of your brand. There's all these things that happen outside of marketing's, you know, review sometimes,
and you just have to have some trust that you allocated,
you know, some money for this brand exercise.
It's like people getting to know about your company
and there's ways you can measure that,
but at the scale of our budget,
you probably aren't going to measure it with much precision.
And so also what I do when I work with finance
is like I allocate experimental budget.
Like here's money, we're gonna just go
and try a bunch of things and see what works
and what doesn't, what resonates with audiences.
I won't be able to attribute anything to this.
Yep, we just accept that.
And here's my campaign budget,
it's two line items.
And here's the data I have that backs up the value
of these things.
And you have to do both.
You can't just rely on one or the other.
Yep.
Does it surprise you how complicated that can be since you moved
into marketing and you're like, oh now I'm like, I got my hands dirty in this and marketing's actually
pretty complex. When you start getting into it, it's a lot more complex than I wanted it to be.
Pete, our CEO, had this like lovely idea, and I think when I was going to do this, that I would
be, you know, part IC, part running marketing, and I would have time to do both of these things.
And I tried that and I failed very quickly.
So a lot of my work now has been just enabling the team,
making sure they know what the priorities are, and having them go out and execute. Where I can find the
most impact is on the review side and getting feedback.
I can't go create content as much as I used to or as I wanted to, but I'm still very much
in the weeds
when it comes to seeing the content they produce,
being there on the calls, reading the content
and getting feedback on where I think they're nailing it
and where I think they could improve.
Yep.
Yep.
Let's help you out and create some Dagster content right now.
And I've got your name.
Yeah, free content.
You're super excited to talk about Dagster components.
I've got one more marketing question.
I want to know, so like how long has it been? Since, like, November, and I was
also off for two months of paternity leave right in the middle.
At least a couple of months.
So I want to know one thing that you guys have done, maybe in that experimental
category, where you're like, I did not think this would have worked, like kind of
a surprise success.
And then one thing on the flip side of like,
I swore this was going to be great,
and like maybe it wasn't as good as I wanted it to be.
That's a good question.
I would say the conferences have been more impactful
than I thought they would be.
Okay.
We did re:Invent, for example, that was quite a good one.
Prior to that we did like Snowflake, Databricks.
We always knew, like, vibes-wise, those were good. But it's only been a couple of weeks since we've had the data to actually
show and prove that. One problem is sales cycles are so long that it takes
time for the data to come through. So those have been, I think, bread and butter for us
and we'll tend to keep doing them.
On the experimental side, it's really hard to know right now. I don't think I have an
answer today. We're running a bunch of experiments to try to see what works, but it's still early days.
Cool.
All right, so Dagster components.
I said before the show, I haven't used the tool.
So you can give us a better view into that than I can.
But I've used some tools in the past.
I'm talking about the YAMLification
of data engineering.
YAMLification.
YAMLification. YAMLification.
Oh, and we were even sharing with the team some articles coming out about that.
So I've actually done this with kind of a traditional team over the past year that is
not used to YAML.
That is, like, writes SQL, writes stored procedures, not even really a data engineering team.
And it's been fascinating around the little usability things
that everybody struggles with.
But I think it's been so long, you forget.
And the most simple one is spacing.
Literally, the YAML fails in different parts
because the spaces are wrong.
And they don't know about linters.
But they just have a text editor open.
They're like, how do you configure this file?
And they're trying to like figure it out
So anyways, that's been on my mind, and then thinking about like how you guys are solving this problem, thinking about how AI...
Like, I mean, a linter is a primary solution,
but even AI, more so, could be a solution for those people.
So I'll hand it over to you. Tell us about components, and tell us about this bridge between
stakeholders and, you know, data engineers or data platform engineers, whatever we want to call them.
Yeah, for sure. So Dagster components is something we're working on
actively right now.
It's an early product line, and it was built to solve a specific problem we
heard from customers, which is
often our Dagster champion comes in, loves Dagster, implements it, they're really happy.
They get the framework, they know what it's all about,
and then they go to like their team,
their stakeholders and they're a little bit confused
by how to use or operate on this thing.
And so often what they end up doing is they create,
you know, YAML for their team.
They do factories, they parse YAML,
they create assets themselves.
And that is fine, but it's a lot of work for those people
to maintain and everyone's doing it a little bit differently
because they're all solving this problem independently.
And so we thought, why not just make that easier for everyone,
build it in-house, and have something
that's sort of backed by the framework itself rather than
kind of bolted on.
And so that was our initial phase of components.
We started building this out.
It can be YAML.
It doesn't have to be Python.
It's a mix of both: you create components,
the Dagster framework will recognize them.
It'll actually do schema validation for you as well.
So if you put in wrong spaces or a bad key,
it'll have a little highlighter in your text editor.
Just like classic YAML editing stuff.
Yeah, it helps you build more quickly.
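To make that concrete, here is a minimal, hypothetical sketch of the kind of YAML a component might be defined in. The field names are illustrative assumptions, not Dagster's actual component schema; the point is simply that because the file has a known schema, a bad key or wrong indentation can be flagged right in the editor rather than failing at runtime.

```yaml
# Hypothetical component YAML (illustrative field names only, not
# Dagster's real schema). Because the shape is schema-validated,
# an unknown key or a mis-indented block would be highlighted in
# the editor instead of blowing up later.
type: ingest_table
attributes:
  source: postgres
  table: public.orders
  schedule: "0 * * * *"   # hourly cron; misspelling this key would be flagged
```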
But as we were doing this, we're like,
oh, we just created a structured format
for building pipelines, which is really where we find tools that are AI powered tend to work really well.
You put AI in front of the whole framework on its own, and it's just like an angry three-year-old who doesn't know what's going on.
It's like banging pots together, everything's falling apart, there's like spaghetti on the floor.
Like, you don't know what's going on, but if you say, hey, AI person, robot, whatever, what if instead of having
access to the entire framework of Dagster, which is like too much for your
little brain to handle, what if we gave you a little box to play in? That
box is a YAML schema; there's descriptions of each field so you understand what
those are. We created that for people, but it turns out the same people who benefit
from a very constrained box
and a nice sandbox to play in,
LLMs also really thrive in that.
They love context and they love structure.
And so what we realized is we can actually now use LLMs
to create these pipelines with Dagster.
We have like an MCP server that's coming up.
And so, through your AI bot,
you can say, hey,
build me a pipeline that takes data from Snowflake and puts it into DuckDB, and it
has examples, it has documentation to draw on. It runs the tests, makes sure
that the YAML is valid; if not, it can fix it itself, which is really cool, and then
eventually it'll build the pipeline for you. It builds the bones, uses the entire
framework, the platform itself, and data engineers continue to build and maintain that.
But our view is that data engineering is changing, and we're going to bring more people in to
the ecosystem.
So this is one way we help lower that gap.
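As a purely illustrative sketch (assumed field names, not Dagster's real schema or what the MCP server actually emits), the YAML an assistant might generate for that "Snowflake to DuckDB" request could look roughly like this, with schema validation acting as the check-and-fix loop described above.

```yaml
# Hypothetical, assistant-generated component for the spoken example:
# "build me a pipeline that takes data from Snowflake and puts it into DuckDB".
# All keys and values are assumptions for illustration, not Dagster's actual schema.
type: snowflake_to_duckdb
attributes:
  source:
    database: ANALYTICS
    schema: RAW
    table: EVENTS
  destination:
    path: ./warehouse.duckdb   # local DuckDB file
    table: raw_events
  schedule: "0 * * * *"        # hourly
# The tooling validates this against the component schema; if a key is wrong
# or missing, the validation error is fed back to the model and it retries.
```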
What are the biggest things?
So working on creating these environments that are more contained, that have guardrails,
what are the main things that people are trying to do within those, right? Like as you saw this need, so you have your champion, and then they go talk with
their stakeholders. What are the main things that the stakeholders are trying to do?
Very simple stuff often, right? Like these are not complicated things. They want to add
maybe a table, they want to move some files around. Maybe they want to unzip something
and then take that data and put it somewhere else. If they're a data science team, they
might want to run some models.
They might have a SQL query, they want to persist that data somewhere.
And then they want to run some Python script against it.
Very simple stuff to you and me, but, like, being able to connect these things for them,
it's not always obvious, right?
Like your framework can be a little bit complicated.
And so if you're not familiar with that, you almost have to, like, learn Dagster first before
you can get the job done.
And so what we're trying to do is just reduce the orchestration bit of Dagster as much as possible,
so you can just focus on the business logic pieces as much as you can.
So from the engineering side, I really like the, like, essentially limited blast radius for people and AI, right?
And I don't know, were people the initial audience,
and it was like, oh, this works great for AI,
did that come after, or were they kind of
both coexisting when you guys thought about it?
It was 100% people first,
we weren't thinking about AI too much,
I mean, not for this particular product,
we were thinking of specific customers who said,
it takes too long for us to deliver value,
or we wanna reduce that gap.
And we thought, okay, well, we'll help you do that.
We were doing that.
And at the same time, we're all exploring different AI tools and MCP servers and playing
them locally on our laptop.
And one of our engineers said, you know, we should try adding an MCP server to this.
It seems like that will work.
But I think everyone was like very hesitant and didn't believe it would work at all,
like, this is stupid, who uses AI for anything?
Whenever I ask the marketing team or the sales team, they're all like, yes,
let's go. When I ask the engineers, they're like, oh, I don't think so. Well, one engineer was like, we'll just try it,
let's see what it does. Did a demo, showed the demo to the team. Everyone was like sold immediately.
You literally talk in English to it. You say, hey, build me a pipeline.
First time, maybe it makes a mistake,
it's not perfect, but once you inject context
into something, it's so much more powerful.
Yeah, well, I mean, it's put on your marketing hat
or your sales hat.
And I mean, that's where a lot of the requests
are coming from.
And a lot of these companies is like,
marketing wants to connect A and B, or sales
wants to connect A and B, and they'll be fully capable of, like you said, just dictating in plain
English, like, hey, I want to connect these tools. And essentially, let's say the AI gets it like 98
or 95 percent right, and you have a PR to review, like, oh, I don't think I meant
that. But it's like, instead of going through the whole process of, like,
requirements or whatever, you know, what's the process, it's just so fast.
Yeah.
I think the future of data engineering is changing.
I think we kind of have to just accept it.
There's a powerful, I was just going to say, I want to dig into that really quickly
or maybe not really quickly, but to say it more directly.
So we talked about stakeholders.
John, you just mentioned sales and marketing
can connect A and B, which is creating a data pipeline.
Yeah, right.
If I'm a listener to the show,
I'm immediately thinking everything about this
sounds like a bad idea based on
almost all of my prior experience. Okay, so we just have to
accept it, but like, what are we accepting here, this inevitable future? This is what
happens when you move from a data team to a marketing team. Yes, you become an
optimist. So how long is that process? This is kind of an un-jading in a way.
An un-jading.
It's been great to be in marketing because you're now accountable to results and that
just changes your whole perspective of life.
Whereas running a data team, we all have to pretend we're accountable to the results,
but every few months there's a new thought piece about what's the value of a data team.
I think that's just like internal soul-searching, like,
what is my purpose?
I mean, your purpose is to support the rest of the org.
Marketing and sales.
Mostly marketing, but also something.
But it gets a little meta, right?
We started this with marketing as really supporting sales.
So if data is supporting marketing and supporting sales,
you could get really, really low.
And that's why the value is so hard to measure
because you're like supporting like three levels down.
Yeah, you need like a semantic layer to sort of,
you know, interpret this.
We're gonna take a quick break from the episode
to talk about our sponsor, RudderStack.
Now I could say a bunch of nice things
as if I found a fancy new tool,
but John has been implementing RudderStack
for over half a decade. John, you work with customer event data every day,
you know how messy it can get. Yes, I can confirm that.
to your existing tool set. Yeah, and even with technical tools, Eric,
things like Kafka or PubSub,
but you don't have to have all that complicated customer data infrastructure.
So I am truly interested in this. So data engineering is changing. We have stakeholders coming in. So let's just say a couple of years from now,
what does this future look like, with components,
and let's say AI running
inside of components, and stakeholders using that?
What does that future look like?
I mean, like example workflows or other things like that.
Yeah, I think it's that you look at the backlog
of work today, for example, on the sales and
marketing side, or across other entities, similar questions, I think.
And most data teams today are already small and resource constrained.
I don't know a single data team that's able to meet the capacity of questions.
And so today what happens is if you're on one of those teams and you know SQL or Python,
you can go get your answers today.
But if you don't, you're sort of stuck.
And there's different tools that try to address this
in different ways, right?
There's BI tools that are saying,
we can put AI in front of your BI,
and now your stakeholders can ask questions,
pull data that way.
I think that works great for some use cases,
but there might be cases where the data's not quite there
yet, or you're transforming it in the pipeline.
You want to combine some things.
Maybe that belongs in your data platform. I don't see why that's that different. Is it scary? Yes,
but it's also, you know, I think it comes back to data engineers meeting their stakeholders where
they are. I think if you give them access, you can make it easier, but with guardrails
and governance, no one's saying free for all, right? But like, that doesn't mean saying no to
everything as well. I think there's a middle ground there somewhere.
Yep.
But I would think there's this like data platform
engineering role that like somebody should understand
how all of this works.
Like, I don't think that goes away.
No.
And then you have, it's just a better interface,
really, because I don't know any data engineers
that love using project management tools
or love managing requests from stakeholders.
Somebody loves that.
So if you can automate that problem...
There are also teams that are more technical that are still having problems today.
Data science teams, analyst teams, BI teams, they understand SQL and they understand Python,
but they might still have trouble or they might rely on you to build these pipelines.
I'm interested, Pedram, I think part of the inevitability of this,
and it's so cool to see Dagster
sort of enabling this within components, but the appetite is going to grow
very rapidly, right? Because even within the customers that I talk to,
even within our organization, if you know SQL or Python, you can go self-serve
in a way, like, you know, to do
certain things.
Maybe you can't build an entire pipeline or, you know, join these disparate data sets.
And so, you know, components can help with that.
But more and more people have the disposition that I don't have to know SQL or Python because
I can use an LLM to do that.
Right? And so, and of course, I mean, certainly there is a false sense of actual
domain authority, but truly someone who doesn't know SQL can do way more than they used to
be able to do because you just give it a bunch of context and say, okay, I'm trying to answer
this question and it can generate SQL for you.
And so the appetite to be able to do technical things without the underlying technical knowledge,
I think is increasing really rapidly.
100%. And I think that's what we all have to start building towards, right?
So when we talk to data engineers or, you know, analytics engineers, they often say,
oh, there's too many stakeholder requests.
I can't go and build high value stuff anymore. Well,
we're seeing there's a way out of that mess. Go build that high value stuff. That high value stuff
is always going to be how do you enable the rest of the team to self-serve. That's what we always talk
about. So the way you enable that team now maybe changes a little bit, but I think it also
becomes easier. You don't have to go and do a bunch of SQL courses anymore as a stakeholder.
What you do is your analyst now goes to create tables that are well documented and well structured
and easy to consume, put those in a place that LLMs have access to, hide all the other
shit so it doesn't go the wrong way.
And now you have a great place for a stakeholder plus their AI to go and answer questions.
Yeah.
John, we were talking about this recently with Vercel, and they're doing some interesting things with AI, but one of the most fascinating things...
You know, there's tons of buzz around okay the prompts and what are they doing and all this sort of stuff, right?
But the real story is that they have these frameworks in this entire system
That makes it really powerful in context within this larger system, which
is exactly what you're talking about.
It's like, okay, you do have to have the platform engineer, to your point, and all of this context,
but that is actually what makes it super powerful within a component, for example, where it's
like you can do things that feel like magic, but it's actually backed by a very well-thought
system and framework
and documentation.
Yeah, so people who think AI is going to take all our jobs, I keep pushing back on that
because AI is actually very dumb.
It's not very good at taking people's jobs.
What it's good at is very constrained systems that are well-designed.
It takes people to build those systems.
Totally.
I might be wrong, AI comes and takes all of our jobs.
Who knows?
But for the foreseeable future,
the future I care about at least,
I think AI is just going to make it easier for us
to deliver value if we choose to allow it to.
I think some people will fight this stuff tooth and nail
and they'll say never let AI deliver anything to anyone, it's going to hallucinate and it'll be wrong.
I got news for you, it's already wrong.
It's all right.
Exactly.
Have you seen the SQL
people are writing? It's not that great today.
So yeah.
But there's a clear path to success here.
You can choose to partake in it or ignore it.
When AI comes for your marketing job, that's when you go heavy on
brand. Correct. Like AI will never really truly get brand. That's true. That's true. There will always be some value for me to add somewhere.
I've tried using AI for marketing and it's actually really bad at it. It generates the exact same stuff
as everyone else, and if your marketing is undifferentiated, then it's not good marketing.
So I think there was this primary narrative here
that makes a lot of sense.
We were bridging, essentially, lightly technical users
to a technical data engineering team or non-technical users.
I think the other maybe a sub-narrative here
is I think there's still a use for this.
Let's say, like, in a SaaS company where, like, everybody's technical, or almost
everybody's technical, almost everybody knows like a little bit of SQL, maybe a
little bit of Python. Like, even in that context, it's still actually super
valuable to have everything organized, because when it's unorganized, like
back, let's say, five years ago, there were these practical constraints
of like, all right, the data warehouse is like X big
and that's a constraint.
Now, if it's just compute-based,
unorganized is actually really expensive,
because all these people just build their own things
and they all run and they all consume compute.
And, I mean, storage is still kind of cheap with S3, but at least on the compute
side, it's actually fairly expensive to do it that way, let alone like data not
agreeing and all the practical problems you have. So I'm just curious,
like, if you were talking to a really technical type of
organization, what are some ways that they might use components? I don't
think it changes, right? Like, components just simplifies things for everyone, even
yourself. So if you find you're creating the same thing
over and over again, you might want to use components. We're going to use
components for all our integrations, so that instead of having to learn how the
dbt integration works and setting it all up, you can just use a Dagster-
created component for dbt. And now it's like five lines of code that are all
YAML. And if you choose to go back into Python, you can, but you don't have to. So people love
being lazy. I love being lazy. The fewer brain cells I've got to use to
get a pipeline working, the better.
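For illustration only, the "five lines of YAML" for a dbt component might look roughly like this; the keys are assumptions, not the actual Dagster dbt component spec.

```yaml
# Hypothetical dbt component: point it at the project and the integration
# wiring is handled for you. Key names are illustrative, not the real spec.
type: dbt_project
attributes:
  project_dir: ./dbt/jaffle_shop
  profiles_dir: ./dbt
  select: "tag:daily"   # optional: limit which models run
```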
Yeah, I love that. Do we have time for a lightning round? Nine minutes. Nine minutes, okay, that's good. It's more than
enough time to cover the topic of streaming systems and orchestration.
So we'll end early.
So one thing that's interesting is,
a lot of the big cloud providers
are increasingly enabling streaming workflows, right?
So that you can deliver data into,
let's just say a traditional sort of analytics warehouse in near real time,
right? Which is interesting, right? Because this sort of, most of the world when you talk about
streaming is, they think Kafka, right? Okay, you have a streaming use case, you're using some
eventing system like Kafka or Kinesis or something like that. And there are really established
workflows around, you know, where you stream that data and then the jobs that you run
in order to stream that data into some sort of warehouse.
But even then, generally you're running off of
a scheduled batch, a series of scheduled batch jobs
that sort of run as a pipeline and you orchestrate it
and you sort of do all of that, right?
So how, at Dagster, I'm interested, how are you thinking about orchestration in the context
of now streaming into these analytics databases essentially in real time?
Yeah, it's a good question.
I'm still trying to figure it all out, to be honest.
So like you said, a lot of these cases are we got to stream this data in and then wait
for some job to run for an hour.
And then, you know, it updates the dashboard, I guess, every day.
Right.
Well, do we really need streaming for that?
Sometimes it's like a stream already exists and you want to consume that.
Dagster works well in those use cases.
We have users who do that.
Yep, they've got a Kafka stream.
It runs every hour, every five minutes.
It pulls from the latest and it does a bunch of processing and inserts into the DB.
Snowflake, yeah, it tends to work really well,
and I think that's just an established pattern.
Snowflake has their dynamic tables.
If you wanna be a little bit speedier with things,
you could stream inserts
and have it update all the things that matter.
There's certainly use cases for that.
For us, internally we use Kafka as well,
but more on the application side,
so we drive a lot of our application analytics, all the user-facing analytics.
Telemetry and everything.
Exactly.
Back to the user, but that's through Kafka.
We actually had it in Snowflake previously, took forever to run, it was very expensive.
It was delayed by 24 hours, which is not that useful.
So streaming, it's like it replaced that entire pipeline.
It doesn't even run through Dagster, it just completely bypasses it.
So I think we're still trying to figure out where this stuff fits and what we mean by
streaming.
Some people mean like a Kafka stream, others mean event-based stuff.
So if you're waiting for a file to show up in S3 and run, Dagster's got a great solution
already in place for that.
We've got sensors that wait for stuff to show up or for an event to come to pass.
It really depends use case by use case.
I think today we don't really have,
not even a dinosaur, but broadly in the data space,
the right way to connect these things together.
Yeah, I agree.
Especially when you get into like a single stream
is whatever, but you start to get into joins, transformations,
bringing multiple systems together,
backfills, like what do you do there?
All this stuff gets pretty complicated really quickly.
So I think, to just call it out, there's a gap,
I think, at the moment.
Yeah, yeah, I agree.
I agree with it.
And there's been a lot of people I've worked with,
like, oh, I'd like to have this streaming,
I was like, okay, cool, what do you mean?
And then like you said, you get down to the use case
and you're like, okay, so in your use case, a couple of hours is fine, and I'll go, oh, by the way,
you're paying for compute when the compute is on, like the bill's running.
It's gonna be like half the price if we do an hour, right?
Or whatever the number is, and they're like, oh yeah, I'll have that. And especially when cost comes into it,
almost nobody's like, no, it must be real time, when not real time is significantly cheaper.
Yeah.
For a lot of analytics you see,
it's like what's the decision you're gonna make?
There are cases where, you know,
we need a real time decision
because it's costing us money to wait.
Yeah.
Right?
Like that's when you really need to start building systems.
But often those are like bespoke, purpose-built systems
outside of a traditional analytics stack.
Yeah, yeah, totally.
Totally.
Yeah, yeah.
AB testing is one we've seen with our customers, right?
Where you, if a test wins and you can get a 20% increase in conversion,
like that is serious money.
And so great, like I'm willing to stream data to understand that immediately
because as soon as I can, I point, you know, a hundred percent of the traffic there.
But yeah, you get into user-facing analytics
and it's like, well, that's actually
just a completely different system.
Yeah, how many times have you got an ad for something
after you bought it, you're like,
yeah, maybe they need a streaming platform.
One really interesting one is auto approvals for things,
like loans. There's different things
where people are doing really complex,
neat systems around that.
They're like, okay, the money is, I'm sure, definitely worth it to do streaming for those specific use cases.
So where can people go to learn more about components and kick the tires on them? That's a great question.
We just had a webinar, it's on our YouTube.
Go to dagster.io, there's probably a little bar
at the very top.
You can go to docs.dagster.io.
You'll see components under Labs
if you wanna try it out yourself.
It's still in preview, actually.
I guess the last place you could go in
is our community Slack channel.
We have a DG components channel on there.
Come check us out in there as well.
Awesome.
Pedram, this has been great.
Thanks for coming back on the show.
Thanks for having me.
Oh, it's what?
Wait, wait, wait.
All right, well we are wrapping up with Pedram.
We will be bringing more in real life interview content
to you from Data Council.
So stay tuned and thanks for joining us.
The Data Stack Show is brought to you by RudderStack,
the warehouse native customer data platform.
RudderStack is purpose-built to help data teams turn customer data into
competitive advantage. Learn more at rudderstack.com.