The Data Stack Show - 241: Marketing Meets Data: Measuring Impact and Driving Results with Pedram Navid of Dagster Labs

Episode Date: May 12, 2025

Highlights from this week’s conversation include:

- Pedram’s Background and Journey in Data (1:13)
- Marketing vs. Data Engineering (2:30)
- Understanding Marketing Pressures (4:16)
- Attribution Models and Accountability (8:13)
- Balancing Marketing and Team Management (12:25)
- Introduction to Dagster Components (15:00)
- AI Integration with Data Engineering (19:05)
- Challenges in Data Support (22:05)
- Self-Service Data Access (26:07)
- AI in Data Management (28:25)
- Organizing Data in Technical Teams (31:25)
- Challenges in Real-Time Data (33:28)
- Final Thoughts and Takeaways (37:01)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we’ll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

Transcript
Starting point is 00:00:00 For the next two weeks, as a thank you for listening to The Data Stack Show, RudderStack is giving away some awesome prizes. The grand prize is a LEGO Star Wars Razor Crest 1,023-piece set. They're also giving away Yeti mugs, Anker power banks, and everyone who enters will get a RudderStack swag pack. To sign up, visit rudderstack.com slash TDSS-giveaway. Hi, I'm Eric Dodds. And I'm John Wessel. Welcome to The Data Stack Show.
Starting point is 00:00:36 The Data Stack Show is a podcast where we talk about the technical, business, and human challenges involved in data work. Join our casual conversations with innovators and data professionals to learn about new data technologies and how data teams are run at top companies. Before we dig into today's episode,
Starting point is 00:00:58 we want to give a huge thanks to our presenting sponsor, RudderStack. They give us the equipment and time to do this show week in, week out, and provide you with valuable content. RudderStack provides customer data infrastructure and is used by the world's most innovative companies to collect, transform, and deliver their event data
Starting point is 00:01:16 wherever it's needed, all in real time. You can learn more at rudderstack.com. We are here on site at Data Council in Oakland, California. And our first guest this week is Pedram Navid of Dagster. And Pedram, I feel like we intercept you at sort of these major points where your job changes or your role changes. So we're gonna hear about the latest iteration of that.
Starting point is 00:01:44 But thanks for joining us. Yeah, thanks for having me. All right. Give us just the quick background, and you can go back maybe just to the beginning of when you were on the show, of, you know, your involvement in data, and tell us what you're doing today. Yeah, I feel like I can just track my career progression by following The Data Stack Show. I think last time we spoke, I had just joined Dagster as head of data engineering and DevRel. More recently, back in November, I took on the marketing function as well, so we can maybe talk a little bit about that. Yeah, because I have to be on here.
Starting point is 00:02:12 Awesome. So, Pedram, we were talking before the show about Dagster components. I want to dig in on that. And what are some topics you want to cover? Yeah, we'll talk about Dagster components. We can talk about how data engineering does YAML engineering as well. Love that. All right, well, let's dig in. Sounds good. All right, Pedram, I'm super excited to
Starting point is 00:02:30 have you on the show today. And for those of you that are like, hey, who's this new voice? I'm Brooks, producer of this show, usually behind the scenes, but here out in the field with Eric and John at Data Council, and excited to bring some special recorded in-person content to you all. So, Pedram, want to start out: you have kind of, as maybe your old friends in the data space might say, gone to the dark side, and you work in marketing now? Can we switch places? I was in marketing, and, you know... Yeah. And I am in the industry.
Starting point is 00:03:05 Yeah. Wow. Who made the wrong call? Well, I don't know. Yes, I don't know. To be determined. We will check back in on The Data Stack Show in a year, maybe.
Starting point is 00:03:14 We'll have an answer. Pedram, tell us how your background in data has kind of uniquely influenced the way you're approaching running marketing at Dagster today. Yeah, it's a good question. So I think one thing that helps in running marketing at a company like Dagster is I love Dagster. I used Dagster before I joined
Starting point is 00:03:32 and I'm very deeply familiar with the product and what it's trying to accomplish. So I think for me, I thought marketing would just be bringing a little bit of that knowledge into things like the website and helping update that. Turns out there's a thousand more things going on in marketing beyond that. I think having some ability to understand data,
Starting point is 00:03:50 having built attribution models in the past and knowing their pros and cons, being able to run SQL. All of that helps because you can start to self-serve a little bit as a marketing leader when you have questions. It can also maybe be dangerous. I think you could become sort of what you hated in the past, which is a leader who thinks they know a little bit too much.
Starting point is 00:04:12 Dig into that a little bit and tell us about, okay, you say this is what you hated in the past, but now that you're feeling some pressure of, like, okay, I kind of see why it works the way it does, can you just share some of your perspective? Like, now that you've been
Starting point is 00:04:37 behind the curtain, or in the leadership position in a marketing role, maybe share, for folks who haven't been there, here's actually the pressures that are on marketing, and why sometimes maybe the perception is like, oh, marketing's doing a bunch of stupid stuff, or, you know, whatever. Just share your learnings now that you've kind of been in the marketing role. Yeah, the way I think of marketing is, really, their goal is to help sales be effective. And the goal of sales is to help drive revenue in the company. And so if you work at a company and you have equity, I think you would want marketing and sales to be effective for many obvious reasons. And I find that when you tell the story of what you're enabling, even to engineers, they tend to get it.
Starting point is 00:05:05 I think where marketing can often fail internally in an org is when they're working in a silo. They're not really connected with product, with R&D, or with sales, and they're just doing a bunch of things. I call them random acts of marketing. No one really understands why they're being done. It's a blog post: some person just reaches out and says, hey, do you have time to write a blog post on this random topic I just thought of, with no context. That's when I think marketing is at its worst, when it doesn't really seem connected to the rest of the org. I think when marketing is running really well, it connects these things together. It shows the story you're trying to tell. And often what I like to do is go out and look at a big customer win and tell the story, beginning to end, of how the entire org impacted that win.
Starting point is 00:05:44 And it's not the seller, you know, making that call three weeks before. It's 18 months before, when they joined a community Slack and asked a question. Someone on DevRel, an engineer, popped in and answered that question for them. That's the first point of contact they had, 18 months ago. Being able to tell that story
Starting point is 00:06:00 is what I think marketing, when it's really powerful, excels at. Tell me a little bit how, and I'm sure some of this happens organically just because of your background and how you moved into marketing from being a deeply knowledgeable user of Dagster. I think some of this probably happens more organically for you, but especially as you're growing a team, what are the ways that you are structuring things so that you don't lose that?
Starting point is 00:06:28 Because it can't all come from you, right? You have this unique position of, I know and really get the product, I get the vision, I understand our ICP, but you can't scale just your brain. How are you growing the team and making sure that you don't lose that as there are pressures to scale and grow
Starting point is 00:06:48 both the team and the company? Yeah, the marketing structure at Dagster is kind of interesting. A lot of the content actually comes from the DevRel team, and our DevRel team is staffed by people who use Dagster, love Dagster, know Dagster really well. They contribute to the core platform. They built pipelines on it previous to their role here.
Starting point is 00:07:06 And so we found that having really technical people on the content side is really non-negotiable. We've tried, you know, bringing in people who don't have that technical background, and what ends up happening is they end up relying on DevRel anyway, and we're like, well, why don't we just write the content from scratch? It's gonna be much more compelling.
Starting point is 00:07:21 And so I think that's been very helpful to the overall org, being able to access people who know the product. I write some of our content, but I would say most of what we do today comes from the DevRel team. And then we still need the marketing side. It's like DevRel has the expertise, and marketing has the distribution. I kind of feel like they're the engine
Starting point is 00:07:39 that powers the DevRel team and gets the message out there. That part has to be very specialized when it comes to product marketing, when it comes to digital channels, ads, all that. I'm not an expert in any of this stuff. And so having these two almost distinct teams working very closely under one roof has been super powerful.
Starting point is 00:07:57 On the attribution side, I'm interested in your perspective now being accountable, right? To sort of show, as a marketing leader, generally you have some sort of resource allocation. You have people, you have a budget, and you're responsible for saying, okay, I'm going to take this, I'm going to do these things, and then what's going to happen is hopefully revenue increases. And so you've been on the other side of that,
Starting point is 00:08:26 building attribution models that are trying to aggregate this data and run some sort of model in order to prove that and give that number to someone else. But now you have this really interesting perspective of, okay, well now you're accountable for that and you also have the underlying knowledge of how to do that. Can you speak to that a little bit?
Starting point is 00:08:43 What's that experience been like? Yeah, it's great. You just have your data team attribute everything to me. Yes. And you look great. Yeah. It's an interesting problem because you are still over the data team. Right. Because I've actually been in this role for a minute too, where I was responsible for marketing budget and spend, and over the data team. Yes, that's right. So there's a cool thing: you're your own customer. There's also a little bit of, like, we gotta be really precise here and not fudge too much. The founders trust you,
Starting point is 00:09:15 which is great. Yeah, right. But there's no reason to, yeah, they wouldn't. But yeah, being your own customer is interesting. It is. It's actually really nice on the data team, not because we make stuff up, but we don't spend too much time on attribution, to be honest. We don't try to divvy it up between sales and marketing. Every deal is won together. We know that. That's cool. You don't really win by yourself. Yeah. Our sales reps can't do outbound to someone who's never heard of Dagster ever in their life with no marketing support. We find the deals come in through 10,000 touch points. So we win together. And so we don't even bother being accountable
Starting point is 00:09:48 to who sourced this or who influenced that. That makes things easy. What we wanna know in the marketing team is, are our campaigns effective? That is a data problem. That's not a, you know, who-gets-credit problem. And so working with our data team, we brought in a couple of great data people there
Starting point is 00:10:04 who have deep marketing experience. They can help me understand, like, you know, these campaigns, these conferences, these events we went to: did they influence purchases, were the people there in the same accounts as the ones you won, how do the sales cycles compare? Are there differences between these things? Those are the kind of data questions you can start to ask to become better at marketing. But we've never been in this place where we have to justify marketing's existence. I think at the end of the day, if you're not creating opportunities, sales will know, and they'll tell you. It doesn't matter how we keep the books; if deals aren't coming through, we're not closing deals. I love how it's
Starting point is 00:10:38 refreshingly simple, you know. Yeah. I mean, not that you don't go and run different models and things like that, but it's great to hear your perspective on, okay, the business has a goal, and these are the different ways that we can measure whether what we're doing is effective. Yeah. And for us, it's always been, like, you have to also remember the things you measure are not the only things that work. And so I have to repeat this a lot of the time. It's very easy to focus on
Starting point is 00:11:10 the campaigns you have in Salesforce and the digital channels and the conferences where you get views. But you don't record every single person you talk to. You don't record the power of your brand. There's all these things that happen outside of marketing's purview sometimes, and you just have to have some trust that you allocated, you know, some money for this brand exercise. It's like people getting to know about your company, and there's ways you can measure that, but at the scale of our budget,
Starting point is 00:11:36 you probably aren't gonna measure it with great precision. And so also, what I do when I work with finance is I allocate experimental budget. Like, here's money, we're gonna just go and try a bunch of things and see what works and what doesn't, what resonates with audiences. I won't be able to attribute anything to this. Yep, we just accept that.
Starting point is 00:11:52 And here's my campaign budget. It's two line items. And here's the data I have that backs up the value of these things. And you have to do both. You can't just rely on one or the other. Yep. Does it surprise you how complicated that can be, since you moved
Starting point is 00:12:05 into marketing and you're like, oh, now I've got my hands dirty in this, and marketing's actually pretty complex? When you start getting into it, it's a lot more complex than I wanted it to be. Pete, our CEO, had this lovely idea, and I think I did too when I was going to do this, that I would be, you know, part IC, part running marketing, and I would have time to do both of these things. And I tried that and I failed very quickly. So a lot of my work now has been just enabling the team: making sure they know what the priorities are and having them go out and execute. Where I can find the most impact is on the review side and giving feedback.
Starting point is 00:12:39 I can't go create content as much as I used to or as I want to, but I'm still very much in the weeds when it comes to seeing the content they produce, being there on the calls, reading the content, and giving feedback on where I think they're nailing it and where I think they could improve. Yep. Yep.
Starting point is 00:12:55 Let's help you out and create some Dagster content right now. Yeah, free content. I know you're super excited to talk about Dagster components, but I've got one more marketing question. I want to know, so, like, how long has it been? It's been since, like, November, and I was also off for two months of paternity leave right in the middle. At least a couple of months, then.
Starting point is 00:13:15 So I want to know one thing that you guys have done, maybe in that experimental category, where you're like, I did not think this would work, and it was kind of a surprise success. And maybe one thing on the flip side, where you swore this was going to be great, and maybe it wasn't as good as you wanted it to be. Good question. I would say the conferences have been more impactful
Starting point is 00:13:36 than I thought they would be. Okay. We did re:Invent, for example. That was quite a good one. Prior to that, we did, like, Snowflake, Databricks. We always knew, vibe-wise, those were good, but it's only been a couple of weeks since we've had the data to actually show and prove that. One problem is sales cycles are so long that it takes time for the data to come through. So those have been, I think, bread and butter for us, and we'll tend to keep doing them.
Starting point is 00:13:57 On the experimental side, it's really hard to know right now. I don't think I have an answer today. We're running a bunch of experiments to try to see what works, but it's still early days. Cool. All right, so Dagster components. I said before the show, I've used tools, so you can give us a better view into that than I can. But I've used some tools in the past. I'm talking about the YAMLification
Starting point is 00:14:22 of data engineering. YAMLification. YAMLification. Oh, and we were even sharing with the team some articles coming out about that. So I've actually done this with kind of a traditional team over the past year that is not used to YAML. That is, like, writes SQL, writes stored procedures, not even really a data engineering team. And it's been fascinating around the little usability things
Starting point is 00:14:48 that everybody struggles with. But I think it's been so long, you forget. And the most simple one is spacing. Literally, the YAML fails in different parts, or the spaces are wrong. And they don't know about linters. They just have a text editor open. They're like, how do you configure this file?
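The spacing trap described here is easy to see in a toy YAML fragment. All of the keys below are invented for illustration; they are not from any real tool:

```yaml
# Correct: consistent two-space indentation, spaces only.
pipeline:
  name: orders_sync
  steps:
    - load_raw
    - transform

# Broken (commented out so this file still parses): "name" is indented
# three spaces while "steps" uses two, and a tab before a list item
# would be rejected outright, because YAML forbids tabs in indentation.
# pipeline:
#    name: orders_sync
#   steps:
#     - load_raw
```

A linter or an editor with YAML support flags both mistakes before anything runs, which is exactly what a team editing YAML in a bare text editor is missing.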
Starting point is 00:15:04 And they're trying to, like, figure it out. So anyway, that's been on my mind, and then thinking about how you guys are solving this problem, thinking about how AI, I mean, a linter is one solution, but AI even more so could be a solution for those people. So I'll hand it to you: tell us about components, and tell us about this bridge between stakeholders and, you know, data engineers, or data platform engineers, or whatever we want to call them. Yeah, for sure. So Dagster components is something we're working on actively right now.
Starting point is 00:15:32 It's already a product line, and it was built to solve a specific problem we heard from customers, which is: often our Dagster champion comes in, loves Dagster, implements it, they're really happy. They get the framework, they know how it all works, and then they go to their team, their stakeholders, and those folks are a little bit confused by how to use or operate this thing. And so often what they end up doing is they create,
Starting point is 00:15:54 you know, YAML for their team. They do factories, they parse YAML, they create assets themselves. And that is fine, but it's a lot of work for those people to maintain, and everyone's doing it a little bit differently, because they're all solving this problem independently. And so we thought, why not just make that easier for everyone, build it in-house, and have something
Starting point is 00:16:12 that's sort of backed by the framework itself rather than kind of bolted on. And so that was our initial phase of components. We started building this out. It can be YAML; it doesn't have to be Python. It's a mix of both. You create components,
Starting point is 00:16:25 the Dagster framework will recognize them, and it'll actually do schema validation for you as well. So if you put in bad spaces or a bad key, it'll have a little highlighter in your text editor. Just like classic YAML editing stuff. Yeah, it helps you build more quickly. But as we were doing this, we were like, oh, we just created a structured format
Starting point is 00:16:43 for building pipelines, which is really where we find tools that are AI-powered tend to work really well. You put AI in front of the whole framework on its own, and it's just like an angry three-year-old who doesn't know what's going on. It's banging pots together, everything's falling apart, there's spaghetti on the floor. You don't know what's going on. But if you say, hey, AI, person, robot, whatever: what if, instead of having access to the entire framework of Dagster, which is too much for your little brain to handle, we gave you a little box to play in? That box is a YAML schema. There are descriptions of each field so you understand what those are. We created that for people, but it turns out the same people who benefit
Starting point is 00:17:24 from a very constrained box and a nice sandbox to play in, LLMs also really thrive in that. They love context and they love structure. And so what we realized is we can actually now use LLMs to create these pipelines with Dagster. We have an MCP server that's coming up. And so you can validate through your AI bot. You can say, hey,
Starting point is 00:17:44 build me a pipeline that takes data from Snowflake and puts it into DuckDB. And it has examples, it has documentation to draw on. It runs the test, makes sure that the YAML is valid, and if not, it can fix it itself, which is really cool. And then eventually it'll build a pipeline for you. It builds the bones, uses the entire framework, the platform itself, and data engineers continue to build and maintain that. But our view is that data engineering is changing, and we're going to bring more people into the ecosystem.
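As a rough sketch of what that constrained YAML surface might look like for the Snowflake-to-DuckDB example: here is a hypothetical component definition. The field names and the `type` string are invented for illustration and are not Dagster's actual component schema:

```yaml
# Hypothetical component config: a typed, documented YAML interface
# that a framework (or an LLM behind an MCP server) can validate.
type: example/snowflake_to_duckdb
attributes:
  source:
    account: my_snowflake_account   # placeholder identifier
    database: ANALYTICS
    table: ORDERS
  destination:
    path: ./warehouse.duckdb
    table: orders
  schedule: "0 6 * * *"             # cron: daily at 06:00
```

Because every field would carry a declared type and description in the schema, the same definition can support editor autocomplete for a human and structured validation feedback for an AI assistant.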
Starting point is 00:18:11 So this is one way we help close that gap. What are the biggest things? So working on creating these environments that are more contained, that have guardrails: what are the main things that people are trying to do within those, right? As you saw this need, you have your champion, and then they go talk with their stakeholders. What are the main things that the stakeholders are trying to do? Very simple stuff, often, right? These are not complicated things. They want to add maybe a table, they want to move some files around. Maybe they want to unzip something and then take that data and put it somewhere else. If they're a data science team, they
Starting point is 00:18:44 might want to run some models. They might have a SQL query and want to persist that data somewhere, and then run some Python script against it. Very simple stuff to you and I, but being able to connect these things, for them, it's not always obvious, right? A framework can be a little bit complicated, and so if you're not familiar with it, you almost have to learn Dagster first before you can get the job done.
Starting point is 00:19:05 And so what we're trying to do is reduce the orchestration bit of Dagster as much as possible, so you can just focus on the business logic pieces as much as you can. So from the engineering side, I really like the essentially limited blast radius for people and AI, right? And I don't know, was the initial audience people, and then it was like, oh, this works great for AI too? Did that come later, or were they kind of both coexisting when you guys started?
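The guardrails idea, schema validation over a parsed component config, can be sketched in a few lines of Python. The field names and error messages here are illustrative only, not Dagster's implementation:

```python
# Toy schema check of the kind a framework can run over a parsed
# component config before building anything. A real system would use
# typed schemas with per-field descriptions; this just shows why a
# small, validated config surface helps both humans and LLMs.

REQUIRED_KEYS = {"type", "source", "destination"}  # hypothetical fields
ALLOWED_KEYS = REQUIRED_KEYS | {"schedule"}

def validate_component(config: dict) -> list[str]:
    """Return human-readable errors; an empty list means valid."""
    errors = []
    for key in sorted(REQUIRED_KEYS - config.keys()):
        errors.append(f"missing required key: {key!r}")
    for key in sorted(config.keys() - ALLOWED_KEYS):
        errors.append(f"unknown key: {key!r} (typo?)")
    return errors

good = {"type": "ingest", "source": "snowflake", "destination": "duckdb"}
bad = {"type": "ingest", "sorce": "snowflake", "destination": "duckdb"}

print(validate_component(good))  # []
print(validate_component(bad))   # flags the missing and misspelled keys
```

The same error list that powers editor highlighting can be fed back to an LLM so it can repair its own output, which is the loop described in the conversation.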
Starting point is 00:19:35 It was 100% people first. We weren't thinking about AI too much, I mean, not for this particular product. We were thinking of specific customers who said, it takes too long for us to deliver value, or we wanna reduce that gap. And we thought, okay, we'll help you do that. So we were doing that. And at the same time, we were all exploring different AI tools and MCP servers and playing
Starting point is 00:19:53 with them locally on our laptops. And one of our engineers said, you know, we should try adding an MCP server to this. It seems like it would work. But I think everyone was very hesitant and didn't believe it would work at all. Like, this is stupid, who uses AI for anything? Whenever I show the marketing team or the sales team, they're all like, yes, let's go. When I show the engineers, they're like, oh, I don't think so. Well, one engineer was like, we'll just try it, let's see what it does. Did a demo, showed the demo to the team. Everyone was sold immediately.
Starting point is 00:20:21 You literally talk in English to it. You say, hey, go build me a pipeline. First time, maybe it makes a mistake. It's not perfect, but once you inject context into something, it's so much more powerful. Yeah, well, I mean, put on your marketing hat or your sales hat. I mean, that's where a lot of the requests are coming from.
Starting point is 00:20:41 In a lot of these companies, it's like, marketing wants to connect A and B, or sales wants to connect A and B, and they'd be fully capable of, like you said, just dictating in plain English: hey, I want to connect these tools. And let's say the AI gets it, like, 98 or 95 percent right, and you have a PR to review, like, oh, I don't think I meant that. But instead of going through the whole process of, like, requirements or whatever, you know, whatever the process is, it's just so fast. Yeah.
Starting point is 00:21:13 I think the future of data is changing. I think we kind of have to just accept it. It's powerful. I was just going to say, I want to dig into that really quickly, or maybe not really quickly, but to say it more directly. So we talked about stakeholders. John, you just mentioned sales and marketing can connect A and B, which is creating a data pipeline. Yeah, right.
Starting point is 00:21:36 If I'm a listener to the show, I'm immediately thinking everything about this sounds like a bad idea based on almost all of my prior experience. Okay, so we just have to accept it, but what are we accepting here? This inevitable future? This is what happens when you move from a data team to a marketing team. Yes, you become an optimist. How long does that process take? This is kind of an un-jading, in a way. An un-jading.
Starting point is 00:22:11 It's been great to be in marketing because you're now accountable to results, and that just changes your whole perspective on life. Whereas running a data team, we all have to pretend we're accountable to the results, but every few months there's a new thought piece about what's the value of a data team. I think there's just this internal struggle of, what is my purpose? I mean, your purpose is to support the rest of the org. Marketing and sales. Mostly marketing, but also sales.
Starting point is 00:22:33 But it gets a little meta, right? We started this with marketing as really supporting sales. So if data is supporting marketing and supporting sales, you could get really, really deep. And that's why the value is so hard to measure, because you're supporting, like, three levels down. Yeah, you need, like, a semantic layer to sort of,
Starting point is 00:22:50 you know, interpret this. We're gonna take a quick break from the episode to talk about our sponsor, RudderStack. Now, I could say a bunch of nice things as if I found a fancy new tool, but John has been implementing RudderStack for over half a decade. John, you work with customer event data every day. You know how messy it can get. Yes, I can confirm that.
Starting point is 00:24:05 to your existing tool set. Yeah, and even with technical tools, Eric, things like Kafka or Pub/Sub, but you don't have to have all that complicated customer data infrastructure. So, I am truly interested in this. So data engineering is changing. We have stakeholders coming in. Let's just say a couple of years from now: what does this future look like as components, maybe with AI running inside of components, and stakeholders using that? What does that future look like?
Starting point is 00:24:37 I mean, like, example workflows or other things like that. Yeah, I think you look at the backlog of work today, for example, on the sales and marketing side, or across other teams with similar questions. And most data teams today are already small and resource-constrained. I don't know a single data team that's able to keep up with the volume of questions. And so today, what happens is, if you're on one of those teams and you know SQL or Python, you can go get your answers today.
Starting point is 00:25:04 But if you don't, you're sort of stuck. And there's different tools that try to address this in different ways, right? There's BI tools that are saying, we can put AI in front of your BI, and now your stakeholders can ask questions, pull data that way. I think that works great for some use cases,
Starting point is 00:25:18 but there might be cases where the data's not quite there yet, or you're transforming it in your pipeline. You want to combine some things. Maybe that belongs in your data platform. I don't see why that's that different. Is it scary? Yes, but I think it comes back to data engineers meeting their stakeholders where they are. I think if you can give them access, you can make it easier, but with guardrails and governance. No one's saying free-for-all, right? But that doesn't mean saying no to everything, either. I think there's a middle ground there somewhere.
Starting point is 00:25:46 Yep. But I would think there's this data platform engineering role where somebody should understand how all of this works. I don't think that goes away. No. And then you have, it's just a better interface, really. Because I don't know any data engineers
Starting point is 00:26:03 that love using project management tools or love managing requests from stakeholders. Somebody loves that. So if you can automate that problem... And there are teams that are more technical that are still having problems today: data science teams, analyst teams, BI teams. They understand SQL and they understand Python, but they might still have trouble, or they might rely on you to build these pipelines.
Starting point is 00:26:40 I'm interested, Pedram. I think part of the inevitability of this, and it's so cool to see Dagster sort of enabling this with components, is that the appetite is going to grow very rapidly in-house, right? Because even within the customers that I talk to, even within our organization, if you know SQL or Python, you can go self-serve in a way, like, you know, to do certain things. Maybe you can't build an entire pipeline or, you know, join these disparate data sets, and so, you know, components can help with that.
Starting point is 00:27:13 But more and more people have the disposition that I don't have to know SQL or Python because I can use an LLM to do that, right? And of course, there is certainly a false sense of actual domain authority, but truly someone who doesn't know SQL can do way more than they used to be able to do, because you just give it a bunch of context and say, okay, I'm trying to answer this question, and it can generate SQL for you. And so the appetite to be able to do technical things without the underlying technical knowledge, I think, is increasing really rapidly.
Starting point is 00:27:50 100%. And I think that's what we all have to start building towards, right? So when we talk to data engineers or, you know, analytics engineers, they often say, oh, there's too many stakeholder requests, I can't go and build high-value stuff anymore. Well, we're seeing there's a way out of that mess. Go build that high-value stuff. That high-value stuff is always going to be how do you enable these teams to self-serve. That's what we always talk about. So the way you enable that team now maybe changes a little bit, but I think it also becomes easier. You don't have to go and do a bunch of SQL courses anymore as a stakeholder.
Starting point is 00:28:24 What you do is your analyst now goes and creates tables that are well documented and well structured and easy to consume, puts those in a place that an LLM has access to, hides all the other shit so it doesn't go the wrong way. And now you have a great place for a stakeholder plus their AI to go and answer questions. Yeah. John, we were talking about this recently with Vercel, and they're doing some interesting things with AI. You know, there's tons of buzz around, okay, the prompts and what are they doing and all this sort of stuff, right? But the real story is that they have these frameworks and this entire system
Starting point is 00:29:00 that makes it really powerful in context within this larger system, which is exactly what you're talking about. It's like, okay, you do have to have the platform engineer, to your point, and all of this context, but that is actually what makes it super powerful within a component, for example, where it's like you can do things that feel like magic, but it's actually backed by a very well-thought-out system and framework and documentation. Yeah, so people who think AI is going to take all our jobs, I keep pushing back on that
Starting point is 00:29:33 because AI is actually very dumb. It's not very good at taking people's jobs. What it's good at is very constrained systems that are well-designed. It takes people to build those systems. Totally. I might be wrong, maybe AI comes and takes all of our jobs. Who knows? But for the foreseeable future,
Starting point is 00:29:49 the future I care about at least, I think AI is just going to make it easier for us to deliver value if we choose to allow it to. I think some people will fight this stuff tooth and nail and say never let AI deliver anything to anyone, it's going to hallucinate and it'll be wrong. I got news for you, it's already wrong. It's all right. Exactly.
Starting point is 00:30:12 Have you seen the SQL people are writing? It's not that great today. So, yeah. But there's a clear path to success here. You can choose to partake in it or ignore it. When AI comes for your marketing job, that's when you go heavy on brand. Correct. Like AI will never really truly get brand. That's true. That's true. There will always be some value there, I hope. I've tried using AI for marketing and it's actually really bad at it. It generates the exact same stuff
Starting point is 00:30:38 as everyone else, and if your marketing is undifferentiated, then it's not marketing. So I think there's this primary narrative here that makes a lot of sense: we were bridging, essentially, lightly technical or non-technical users to a technical data engineering team. I think the other, maybe a sub-narrative here, is that there's still a use for this.
Starting point is 00:31:04 Let's say in a SaaS company where everybody's technical, or almost everybody's technical, almost everybody knows a little bit of SQL, maybe a little bit of Python. Even in that context, it's still actually super valuable to have everything organized, because when it's unorganized, like back, let's say, five years ago, there were these practical constraints of, all right, the data warehouse is X big and that's a constraint. Now, if it's just compute-based,
Starting point is 00:31:36 unorganized is actually really expensive, because all these people just built their own things and they all run and they all consume compute. I mean, storage is still kind of cheap with S3, but at least on the compute side it's actually fairly expensive to do it that way, let alone data not agreeing and all the practical problems you have. So I'm just curious, if you were talking to a really technical type of organization like that, what are some ways that they might use components? I don't
Starting point is 00:32:01 think it changes, right? Components just simplify things for everyone, even yourself. So if you find you're creating the same thing over and over again, you might want to use components. We're going to use components for all our integrations so that instead of having to learn how the dbt integration works and setting it all up, you can just use a Dagster-created component for dbt. And now it's like five lines of code that are all YAML. And if you choose to go back into Python, you can, but you don't have to. So people love being lazy. I love being lazy. The fewer brain cells I've got to use here to
Starting point is 00:32:32 get a pipeline working, the better. Yeah, I love that. Do we have time for a lightning round? Nine minutes. Nine minutes, okay, that's good. It's more than enough time to cover the topic of streaming systems and orchestration. So we'll end early. So one thing that's interesting is a lot of the big cloud providers are increasingly enabling streaming workflows, right? So that you can deliver data into,
Starting point is 00:33:02 let's just say a traditional sort of analytics warehouse in near real time, right? Which is interesting, right? Because most of the world, when you talk about streaming, they think Kafka, right? Okay, you have a streaming use case, you're using some eventing system like Kafka or Kinesis or something like that. And there are really established workflows around, you know, where you stream that data and then the jobs that you run in order to load that data into some sort of warehouse. But even then, generally you're running off of a scheduled batch, a series of scheduled batch jobs
Starting point is 00:33:38 that sort of run as a pipeline and you orchestrate it and you sort of do all of that, right? So I'm interested, at Dagster, how are you thinking about orchestration in the context of now streaming into these analytics databases essentially in real time? Yeah, it's a good question. I'm still trying to figure it all out, to be honest. So like you said, a lot of these cases are, we've got to stream this data in and then wait for some job to run for an hour.
Starting point is 00:34:06 And then, you know, it updates the dashboard, I guess, every day, right? Well, do you really need streaming for that? Sometimes it's like a stream already exists and you want to consume that. Dagster works well in those use cases. We have users who do that. Yep, they've got a Kafka stream. It runs every hour, every five minutes. It pulls from the latest and does a bunch of processing and inserts into the database.
Starting point is 00:34:21 Snowflake, yeah, it tends to work really well, and I think that's just an established pattern. Snowflake has their dynamic tables. If you want to be a little bit more fancy with things, you could stream inserts and have it update all the things that matter. There's certainly use cases for that.
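The "runs every hour, pulls from the latest" pattern described above is just scheduled micro-batching with a persisted offset. Here is a hedged pure-Python sketch of that pattern, not Dagster's or Kafka's actual API: an in-memory list stands in for the stream, and a dict stands in for the orchestrator's persisted cursor.

```python
# Sketch of the scheduled micro-batch pattern: each run resumes from a
# stored offset, processes only the new records, and advances the offset.
# A real pipeline would read from Kafka/Kinesis and insert into the
# warehouse; the names here are illustrative stand-ins.

stream = [{"user": "a", "amount": 10}, {"user": "b", "amount": 5}]
state = {"offset": 0}  # in practice, persisted between runs as the cursor

def run_micro_batch():
    """Process records appended since the last run and advance the offset."""
    new_records = stream[state["offset"]:]
    total = sum(r["amount"] for r in new_records)  # stand-in for the warehouse insert
    state["offset"] = len(stream)
    return len(new_records), total

print(run_micro_batch())  # → (2, 15): first run sees both records
stream.append({"user": "c", "amount": 7})
print(run_micro_batch())  # → (1, 7): second run sees only the new record
```

The point of the cursor is idempotency: a run that fires with nothing new simply processes zero records instead of double-counting.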
Starting point is 00:34:39 For us, internally we use Kafka as well, but more on the application side, so we drive a lot of our application analytics, all the user-facing analytics. Telemetry and everything. Exactly. Back to the user, but that's through Kafka. We actually had it in Snowflake previously, took forever to run, it was very expensive. It was delayed by 24 hours, which is not that useful.
Starting point is 00:34:59 So streaming replaced that entire pipeline. It doesn't even run through Dagster, it just completely bypasses it. So I think we're still trying to figure out where this stuff fits and what we mean by streaming. Some people mean like a Kafka stream, others mean event-based stuff. So if you're waiting for a file to show up in S3 and then running, Dagster's got a great solution already in place for that. We have sensors that wait for stuff to show up or for some condition to come to pass.
Starting point is 00:35:25 It really depends, use case by use case. I think today we don't really have, not even at Dagster, but broadly in the data space, the right way to connect these things together. Yeah, I agree. Especially when you get into, like, a single stream is whatever, but you start to get into joins, transformations, reading multiple streams together,
Starting point is 00:35:44 backfills, like what do you do there? All this stuff gets pretty complicated really quickly. So I think, to be honest, that's a gap at the moment. Yeah, yeah, I agree. I agree with that. And there have been a lot of people I've worked with, like, oh, I'd like to do this with streaming.
Starting point is 00:35:59 I was like, okay, cool, what do you mean? And then like you said, you get down to the use case and you're like, okay, so in use cases, and a couple a couple hours is fine and i'll go over like oh by the way You're paying for a compute when the computer on like the bill's running It's gonna be like half the price if we do an hour, right? And the or whatever the number is like, oh, yeah, i'll have that like almost especially when cost comes into it Like almost nobody's like no, it must be real time when like not real time is like significantly cheaper. Yeah.
Starting point is 00:36:26 For a lot of analytics use cases, it's like, what's the decision you're gonna make? There are cases where, you know, we need a real-time decision because it's costing us money to wait. Yeah.
Starting point is 00:36:37 Right? Like that's when you really need to start building systems. But often those are like bespoke, purpose-built systems outside of a traditional analytics stack. Yeah, yeah, totally. Totally. Yeah, yeah. A/B testing is one we've seen with our customers, right? Where if a test wins and you can get a 20% increase in conversion, like that is serious money.
Starting point is 00:36:55 And so great, like I'm willing to stream data to understand that immediately, because as soon as I can, I want to point all of the traffic there. But yeah, you get into user-facing analytics and it's like, well, that's actually just a completely different system. Yeah, how many times have you gotten an ad for something after you bought it? You're like, yeah, maybe they need a streaming platform.
Starting point is 00:37:14 One really interesting one is auto-approvals for things, like loans, or there's different things where people are doing really complex, neat systems around that. They're like, okay, the money is, I'm sure, definitely worth it to do streaming for those specific use cases. For folks who want to check out components and kick the tires, where can they go? That's a great question. We just had a webinar, it's on our YouTube. Go to dagster.io, there's probably a little bar
Starting point is 00:37:47 at the very top. You can go to docs.dagster.io. You'll see components under Labs if you want to try it out yourself. It's still actually in preview. I guess the last place you could go is our community Slack channel. We have a DG components channel on there.
Starting point is 00:38:02 Come check us out in there as well. Awesome. Pedram, this has been great. Thanks for coming back on the show. Thanks for having me. Oh, is that it? Wait, wait, wait. All right, well, we are wrapping up with Pedram.
Starting point is 00:38:11 We will be bringing more in real life interview content to you from Data Council. So stay tuned and thanks for joining us. The Data Stack Show is brought to you by Rudder Stack, the warehouse native customer data platform. Rudder Stack is purpose built to help data teams turn customer data into competitive advantage. Learn more at rudderstack.com.
