The Data Stack Show - Re-Air: Bridging Gaps: DevRel, Marketing Synergies, and the Future of Data with Pedram Navid of Dagster Labs

Starting point is 00:00:00 Hey everyone, before we dive in, we wanted to take a moment to thank you for listening and being part of our community. Today, we're revisiting one of our most popular episodes in the archives, a conversation full of insights worth hearing again. We hope you enjoy it and remember you can stay up to date with the latest content and subscribe to the show at datastackshow.com. Hi, I'm Eric Dodds. And I'm John Wessel. Welcome to The Data Stack Show. The Data Stack Show is a podcast where we talk about the technical, business, and human challenges involved in data work. Join our casual conversations with innovators and data professionals to learn about new data

Starting point is 00:00:37 technologies and how data teams are run at top companies. All right, welcome back to The Data Stack Show. We're here with Pedram Navid from Dagster, the chief dashboard officer. Pedram, welcome to the show. Great to be here. Thank you. Yeah, so I think it's your second time on the show. It's been a little over a year. We'd love a quick kind of update and then tell us a little bit about your current role.

Starting point is 00:01:05 Yeah, I think last time I was here, I was enjoying consulting life, which meant lots of birdwatching, lots of looking outside, being outside. Since then, I've joined Dagster Labs about a year and a half ago, initially to run data and DevRel and now also marketing, so far less time to do birdwatching. That's too bad. Yeah, it's too bad. Back to the grind, as it were. Okay.

Starting point is 00:01:27 So we've been spending a few minutes chatting, preparing for the show. I'm excited to kind of get into, like, how you've gotten to this point and orchestrators in general. What are you looking forward to chatting about? Yeah, I mean, I can always talk about orchestration. We'll talk about data platforms. How we've got to where we are could be kind of a fun story. We can always talk about AI. We can talk about data engineering and how you somehow accidentally end up running marketing. It could all be fun. All right, I'm excited. Let's do it. Pedram, excited to have you. Let's talk a little bit about how you ended up at Dagster. So you were doing consulting, had some time to kind of work as you pleased, and now

Starting point is 00:02:12 you're back at a startup. So tell us about that process. Yeah, I think what happened is I was actually consulting for Dagster initially. We had a great relationship and Pete and Nick, CEO and founders, asked me if I wanted to join. Initially, I said no, because I was enjoying my freedom too much. But one thing I found with consulting is your scope of work is often limited and you don't get to see things, you know, fully end-to-end. And I also kind of missed the camaraderie of having, you know, people to work with and teams. And so after some thinking about it, I reached back out to them and I said, hey, you know, if that offer is still on the table, I'd love to chat about joining. And so we talked about a role, which was initially just a head of DevRel role with, I believe, maybe data on the side as well, a small team of two or three people. And that was almost June of last year.

Starting point is 00:02:59 And so it's been a year and a half now since I've been here. A couple months ago, we took on marketing as well as part of DevRel, which initially I wasn't so sure about, but now that I've seen it operate, it makes a ton of sense for DevRel and marketing to be close together and working together. Yeah, that's really interesting. So I've had a previous experience too where I ended up having a data team in marketing as well. Tell me about maybe some of the unexpected

Starting point is 00:03:26 synergies there. You've got DevRel, marketing, data kind of on the side. Like, what's come about where you're like, wow, this is cool, this is unified? Yeah. If you had told me initially that I would be on a DevRel team reporting to marketing, I probably wouldn't have taken the job because I've always felt like marketing didn't quite get DevRel. But this way, it's kind of flipped. It's like marketing and DevRel reporting to me, and I'm okay with that. So what I found is, like, the DevRel side of the house is like the content arm. Dagster's a technical product. We target technical people. And so we just need technical people

Starting point is 00:03:58 who have experience in the field to create the content. For me, content's a broad term. It's not just blog posts. It's tutorials, workshops, webinars, how-tos, actual integrations. Our team, our DevRel team, has built integrations that have, like, won deals.

Starting point is 00:04:13 So DevRel is, like, the producers of, like, the marketing arm of Dagster. And then the rest of the marketing org is really in support of the distribution of that, right? Where the DevRel team probably doesn't have its expertise is how to get their content out into the world, whether that's through, like, paid ads or events or campaigns, that type of thing.

Starting point is 00:04:32 And so having the two teams together, it's like, really, actually a lot of synergies. I hate to use that word, but it is exactly that. Yeah, yeah. Where they sit together, we're in the same meetings. Every meeting, every week we talk about what we're working on. And, you know, the events person picks up on something the DevRel team's working on.

Starting point is 00:04:48 And so does, like, the campaign manager. And then the three of them together, they're like, all right, let's go build something more holistic around that rather than just a one-off piece of content that you created. Yeah, that makes a lot of sense. So I've got Matt here co-hosting today in place of Eric. Matt, you've been that technical data audience before, you know, looking to purchase products or things like that. I'm curious. And I'll have to ask you the same thing, Pedram. Like, what really clicks with you if you think back about, like, content or maybe even just interacting with people around these technical products?

Starting point is 00:05:22 Like what mediums or what, like, what can you think of that really, like, clicked with you in the past? Yeah, I think anything that lets you kind of see how the product actually works in a real kind of way and not just the super trivial kind of, look, one plus one equals two type of manner. Right. I think that helps, especially because previous to a lot more of this, it was very marketing-y. So it was everyone feeling like they were trying to bait you into giving them your information or trying to thing or whatever. So things that kind of give you that ability to see it. And I think they have that credibility of professionals who've used it and who can show you.

Starting point is 00:06:02 This is what it's actually going to help you with. So not like the "10 tips for personalization in your marketing using data." But yeah, so same question to you, Pedram. Like, what have you found that works? Because, at least in my opinion, that data technical audience is a tricky one. It's a tricky one to find,

Starting point is 00:06:21 tricky one to, you know, resonate with. It is. And it is like this meme almost of like developers hate being marketed to. And I don't think it's true. I think developers need a certain type of marketing that works for them. Yeah. Along their journey. And their journey often might look different than, you know,

Starting point is 00:06:40 someone in, like, a leadership role, for example. And it just has to, because a developer is going to sit down, and what they want to do almost every single time is they want to try the product. I want to figure out, is this thing what it says it is? Is it useful for me? Will it work in the way that I need it to? And so a lot of the DevRel focus and the focus of Dagster's marketing arm is to enable developers to be successful in that entire journey from, like, becoming aware of the product, trying it out, learning about it. And so things like docs matter a lot more to a developer than they might to a technical, like, leader even. As a director

Starting point is 00:07:14 of data, for example, you probably aren't going to sit down and try Dagster. You might care more about what are the features, benefits, how is it solving, you know, the five things my CEO keeps yelling at me about. But your data engineer is going to want to actually try the product and make sure it hits the things that they actually care about. Well, and that can also be a little tricky just because of the technical ability of data people. There's a pretty wide spectrum you can fall on there. There's some that are very, like, they came from software engineering. And then there's others that are very self-trained and might be coming more from the, I'm doing data engineering or whatever because I have to and no one else is here to do it. And so I'm scraping together

Starting point is 00:07:54 YouTube tutorials and stuff like that. Do you guys have a specific part of that you're targeting, or do you try to kind of have content for more of a wider swath of that spectrum? We definitely try as much as possible to do the whole range. You have to. Like, what I've learned is that not everyone is me. And, like, I like a certain way of learning that other people don't. And people like ways of learning that I refuse to use. A great example is, like, Dagster University. It's something we spun up last year.

Starting point is 00:08:27 It's, like, an online course. It's, like, structured. You go through lessons. And that's the last thing in the world I would ever want. And when they suggested it, I'm like, I don't know about this, guys. All right, we'll try it. And people love it. They love it.

Starting point is 00:08:40 We get five out of five. Like, if you look at our ratings, it's like 4.8 out of five. And we get weekly emails about, like, how much they enjoy it. And it was, like, completely foreign to me because that's not how I learn. What I've learned is, like, you have to provide scope for everyone. There's people who want structured training. There's people who want to just read the docs. Some people just want to install it and look at the source code.

Starting point is 00:09:02 That's, like, the range that you have to deal with. And all of that has to be good. Your source code, your documentation, your app. Your code has to look good in a way that people can understand and interact with it, all the way up to your tutorials and videos. And people want to sometimes sit down

Starting point is 00:09:16 and watch, like, a 30-minute training video on the product as well. And so we do all of it. And we hire for that too, right? We have people on the DevRel team who are much more focused on the earlier stage, for people who are, like, just getting started,

Starting point is 00:09:29 and we have people focused on a much deeper understanding of the product as well. Yeah, I'm definitely one of those that, like, I do not want to watch a video. For any reason, I cannot sit through a 30-minute technical video. I'm one of those that wants to pull up the code. I'll reference the docs when needed. I'll struggle through it, and then I'll think to myself, I should have just watched that video. So, yeah, I mean, that's a really wide persona. And Dagster is a super flexible tool. You can use it a lot of different

Starting point is 00:10:00 ways. And that's got to be a challenge as well, where I come to it with, like, oh, I have this specific problem, and you've got a tool that's like, well, we can solve lots of problems. Like, how do you bridge that gap? That is also a great question. We are looking to bridge it through product improvements. So we have something coming up called Dagster Components. I don't know if I'm allowed to leak it yet, but it is coming. It will be more focused on providing almost like building blocks to develop the data platform. Yeah. And so it'll be a command-line-based tool initially, but you'll have, like, your YAML schema, you'll have very easy ways to plug and play different integrations. That's like our approach to sort of addressing that while

Starting point is 00:10:41 always being able to expose the underlying Dagster framework, which, as you said, is extremely flexible, which has both pros and cons. The pro is, like, you'll never really be constrained. If you can do it in Python, you can do it in Dagster; that's essentially your limitation. The con can be that, for a very simple setup, it can often feel like a lot to go through if you just want to orchestrate, like, one simple task. Yeah, that makes sense. So let's zoom out a little bit for people that have no idea what Dagster is, maybe have never even heard of orchestration, like that kind of analyst persona. How would you describe just the general field that you all are in, the data orchestration field, to someone that was like, I have

Starting point is 00:11:22 no idea what this is? Yeah, it's a great question. Everyone orchestrates. They just might not do it intentionally or they might not know that they're doing it, right? Orchestration could be as simple as you log into your computer once a week and you click on a button and you kick off a process. It's a very manual orchestration, but it's totally fine, and often it's the right decision for you. It can become a little bit more complicated when you start to use something like a cron scheduler that runs every single day or every single week at a certain time. And that's often enough for many tasks. When things start to get a little bit complicated is when you need to add dependencies or you need to be resistant to failures, essentially. Once those two things come into play, like you want to make sure that A runs before B every single time.

Starting point is 00:12:06 You can't rely on cron. You sometimes can, like, fudge it. You'll say, you know, you start at 12 and we'll start this one at three. And I'll hope it never takes more than three hours. Exactly. And it will always succeed. And if that's true, you probably don't need an orchestrator. But often what happens is, I think people realize they need an orchestrator a little too late. What they thought was true no longer becomes true. You can't really observe cron that well. Your tasks take too long. Something fails. Or even worse, your vendor's like, oh, by the way, that thing we sent you two months ago, it was wrong. Here's an update.

Starting point is 00:12:39 Go and fix that. And it's like, well, I can't rewind time. And my cron schedule doesn't know how to rewind. And so once you start to get into these types of things, that's where orchestrators come into play. And they start to manage some of these complexities for you. I feel like you said there that, you know, you bring it in probably later than you should.
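The two requirements named above, "A runs before B every single time" and being resistant to failures, can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Dagster or any real orchestrator implements it; the task names (`extract`, `transform`, `report`), the `run_in_order` helper, and the retry count are all assumptions made up for the example.

```python
from graphlib import TopologicalSorter


def run_in_order(tasks, deps, max_attempts=3):
    """Run every task after its dependencies, retrying a failing task.

    tasks: maps a task name to a zero-argument callable (hypothetical tasks).
    deps:  maps a task name to the set of task names that must finish first.
    Returns the task names in the order they completed.
    """
    completed = []
    # static_order() yields each task only after all of its dependencies.
    for name in TopologicalSorter(deps).static_order():
        for attempt in range(1, max_attempts + 1):
            try:
                tasks[name]()
                completed.append(name)
                break
            except Exception:
                # Give up (and stop everything downstream) after the last attempt.
                if attempt == max_attempts:
                    raise
    return completed


# Hypothetical pipeline: "extract" fails once, then succeeds on retry.
calls = {"extract": 0}


def extract():
    calls["extract"] += 1
    if calls["extract"] == 1:
        raise RuntimeError("transient failure")


def transform():
    pass


def report():
    pass


tasks = {"extract": extract, "transform": transform, "report": report}
deps = {"transform": {"extract"}, "report": {"transform"}}

completed = run_in_order(tasks, deps)
print(completed)
```

Unlike three cron entries spaced a few hours apart, nothing here depends on guessing how long a step takes: `transform` starts only once `extract` has actually succeeded, and a failure that survives all retries stops the downstream steps instead of letting them run on stale data.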

Starting point is 00:12:57 I feel like that's a recurring theme for a lot of successful data things are, you know, if you would have brought this in two months ago, this is a five-minute fix. Now we're very limited in what we can do and that type of a thing. But I also don't know if there's a way around that. Yeah, I mean, you don't know what you don't know, right? And especially if you're doing something for the first time,

Starting point is 00:13:17 it's like, oh, like this works. And then, like my favorite. Because I think most people, if you're in a data role, get to that, at least get to that time gap thing, where I'm going to have this run at midnight, this run at 2 a.m., this run at 5 a.m., everything's fine. And then usually, if you get in that world,

Starting point is 00:13:35 you have some bad mornings where, like, the first one failed, and then it's like kind of a house of cards. And then because some of these take maybe, you know, hours to run, you're kind of sunk. Like, you basically lost a full day for having the data correct. I think that's experience, right? Like, you get burned, hopefully only once, and you learn your lesson, or you work with people who have been burned

Starting point is 00:13:59 and they've learned their lessons and they'll impart that on you, or you'll listen to The Data Stack Show and you'll, like, learn about, you know, things not to do. It's also human nature. I think it's so much easier. I mean, this is why ice cream tastes good. You don't really think about the consequences, right? Running pipelines on cron feels good because you don't have to think about the consequences, right, until it's too late. So we try to educate people about how it's not that hard to set up a pipeline in Dagster with just cron. Like, you can do it just in cron. You don't have to use any of our advanced features. We have a cron scheduler. We have it in Dagster. And you'll get a

Starting point is 00:14:35 pretty UI, which is already more than you normally get out of cron. And that's worth its weight in gold. And then from there, you can evolve as you need to. You don't have to go and build these complex dependencies if you don't want to. But get started with something when it's simple, when it's just, like, a few tasks, a simple dbt pipeline. Very easy to do in Dagster. We've got a great integration. Or do it in a different orchestrator, too. It doesn't have to be Dagster. There's others out there. But get it in something that you can observe, because I think every engineer knows observability and logging are, like, critical

Starting point is 00:15:04 to any system. Yeah, that makes a lot of sense. I've used Dagster for a couple of projects and this was kind of interesting. It was last weekend. No, it was two weeks ago, it was around kind of that New Year's, Christmas holiday. And I got an error. I had set up

Starting point is 00:15:22 alerting, I got an error, which was handy. And I thought, like, you know, what is this? It's like, I better check on it. And sure enough, you know, it's an API, like, access denied type error, because I was pulling data from an API. So, like, what happened? You know, figure out, like, did the credentials expire or what happened?

Starting point is 00:15:40 It was funny. So essentially what happened is I was pulling data from, there's like 28 different locations on this project. And essentially one of the locations closed at the end of the year. But since I had everything, like, separated out, it was like, okay, cool. I can just, like, turn that location off and, like, everything keeps going and it's not a big deal. I think those are the types of things that, like, you know, had it been the other way where essentially everything, like, cascades through and you're like, oh, like, I'm going to have to, like, rewrite a bunch of stuff, et cetera. Those are the fun moments.

Starting point is 00:16:13 So I guess I'm curious from your perspective, obviously there's lots of different orchestrators out there. What's special about Dagster and maybe even what's special about Dagster for analytics orchestration specifically? Yeah, orchestration has been around for a long time. I think cron is like the classic, right? From there, I think Airflow is probably the next biggest orchestrator most people have heard of. And that's a task-based orchestrator, right? So you've got a thing you want to do, you tell it, and it runs, and it's like a black box, and you sort of hope every box continues the way you want it to. But you have no ability to, like, peer into the box. What Dagster sort of said is, like, what if we split that or reverse that? And instead of telling us about the task, tell us about the things you actually care about, or let us discover those for you. So a great example is, I think, a dbt project. Everyone sort of kind of gets what that is.

Starting point is 00:17:04 It's a collection of, like, tables that you want to materialize at some, you know, regular cadence. The traditional Airflow way would be to have a dbt task that just runs your dbt project, and then you sort of assume all those models in there are completed. In Dagster, what we do is we flip that around and we actually expose every single model as an asset. And so Dagster is what we call an asset-based orchestrator because everything you care about is now represented in this big graph of things that you can sort of follow all the way through to their logical conclusion. And so you can see all your dbt models within the Dagster view. And you can actually be kind of clever about it. You could run the whole thing at once every single day if that's what you want. Or you can say, you know what, my stakeholders care about these five

Starting point is 00:17:47 models. Run everything that depends on those on a five-minute schedule because they really want those things to be updated. And then these other models over here, put them in a group that runs once a day whenever you feel like it. It doesn't really matter to me as long as they're refreshed daily. That's something you can start to do with Dagster. And then because you have this, like, asset view, you can start to connect things outside of dbt as well in a really intuitive way. Maybe you have a BI dashboard in Sigma. Maybe you have, you know, some stuff happening in RudderStack that you want to connect it to, some files dropping in an S3 bucket, FTP. All these things start to connect and you build lineage on them. And so you can be really

Starting point is 00:18:23 clever about the full end-to-end orchestration of this thing, rather than just focusing on a specific task. And so Dagster has really been, I think, the next level of where we are going with orchestration. And in fact, Airflow is even starting to move in this direction, which I find really validating that, like, this is really the future of where orchestration's going. Yeah, I think two benefits that I've seen from this, like, asset-style orchestration have been, essentially what you said, one, time compression. Because if I have separate, like, extract jobs that then load into a warehouse and then I have to transform, and it's all, like, linear, the time compression to get that one, essentially, one report that I need to be fast, like

Starting point is 00:19:02 fast as in, like, very up to date, there's just a limit, right? Like, if I'm having to do all of it here, all over here, all over here. So there's a time compression. But since everything is compute-based now, there's also a cost implication, right? Because if I can compress some of these, like, times for the ones that I want to be really fast, I can also do the opposite for things that, like, I only need that once a day. Before, I was running this whole thing and everything was, like, every five minutes. I can delay this 80%, which I don't care that it's a day old. And then that's compute savings in your warehouse, potentially savings in, you know, your ETL tool. Yeah. I think that's a big deal. You could take it even further. Because you've exposed this, like,

Starting point is 00:19:46 data lineage, you get all these side effects almost for free. And that's something we've actually learned ourselves. It's like, now you have this data catalog essentially, right? You understand all your data assets and you have the source of truth of where your data is defined. Well, now you can search that and now you have a data catalog for free. Like, you don't have to go and maintain a separate one. Yeah. Data quality becomes something you bolt on top of your actual execution. It's not an afterthought. It's like, as part of your pipelines, you can start to emit what we call asset checks or data quality of things. And like you said, time compression becomes a much more interesting problem because we can actually be very declarative in Dagster. Instead of saying we want to run

Starting point is 00:20:22 these things every day at 5 o'clock. You can say this asset needs to be updated by this time. Do whatever it takes to make sure that's done. Make sure like you run all its parents whenever you need to. And now you're limited by only the chain of things that matter to that asset and not everything that comes before it. So we get a lot of really, I think nice side benefits of this asset view that I don't think we really knew we were going to get when we first started going down this path, but it's become really interesting. Well, and that I think speaks to one of those things that you see is that a lot of teams find themselves kind of they're drowned in whatever their process is.
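The point that a refresh is "limited by only the chain of things that matter to that asset" can be illustrated with a small lineage graph. The sketch below is plain Python and is not Dagster's API; the asset names (`raw_orders`, `stg_orders`, `revenue_report`, and so on) and the `plan_refresh` helper are hypothetical, chosen only to show how a target asset pulls in its own parents and nothing else.

```python
from graphlib import TopologicalSorter


def plan_refresh(parents, target):
    """Return the target asset plus all of its upstream assets, in run order.

    parents: maps an asset name to the list of assets it is built from.
    target:  the asset that needs to be up to date.
    """
    # Walk upstream from the target to collect only the assets it depends on.
    needed = set()
    stack = [target]
    while stack:
        asset = stack.pop()
        if asset not in needed:
            needed.add(asset)
            stack.extend(parents.get(asset, ()))

    # Order that subgraph so every parent is refreshed before its children.
    subgraph = {asset: list(parents.get(asset, ())) for asset in needed}
    return list(TopologicalSorter(subgraph).static_order())


# Hypothetical lineage: two independent chains sharing one warehouse.
parents = {
    "stg_orders": ["raw_orders"],
    "revenue_report": ["stg_orders"],
    "stg_events": ["raw_events"],
    "usage_dashboard": ["stg_events"],
}

plan = plan_refresh(parents, "revenue_report")
print(plan)
```

Refreshing `revenue_report` touches only the orders chain; the events chain can sit on a slower, cheaper schedule, which is the cost side of the time-compression argument made earlier in the conversation.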

Starting point is 00:20:57 And so they can't really see what the next thing they could be doing is. And it's only once they kind of free up that space or that mental thing because, okay, now I've got Dagster that's running this and I don't have to think about it. Oh, now look at these other three things that have popped up that we can do that were never part of our initial plan of, you know, we were just trying to, like, not have to spend three, four hours every day, you know, troubleshooting or fixing or running whatever. And it's like, now that's gone.

Starting point is 00:21:26 Now we can actually see more opportunities that we could have never thought of before. 100%. There's that old cartoon of, like, two cavemen, and one has, like, a square wheel and he's trying to push it. And his friend with the circle wheel is like, oh, you should try a circle wheel. And he's like, oh, I don't have time for that. I'm spending all my time pushing this square wheel up the hill. And I feel like that's the same way with, like, orchestration. Often it feels

Starting point is 00:21:48 like just an extra, like, step that I have to go through. But that extra step is, like, going to compound your productivity down the line. Yeah. So I'm curious a little bit about the software space, the software stack. So we're in 2025 now. I think the modern data stack was declared dead last year. I don't know. Last year or two. Which I think practically means, like, people are seeing, like, consolidation essentially. I'm curious, like, some of your thoughts on where do you think that shakes out? Because we've got so many different layers we've added into a data stack of, like, extraction, observability, orchestration, transformation, you know, storage.

Starting point is 00:22:34 Like, the list goes on. Like, how do you see that playing out in the next few years? Yeah. I feel that, like, you're not enterprise ready until you've been declared dead. Like, that's sort of... Yeah, exactly. Love that. So the modern data stack,

Starting point is 00:22:49 I think, is now enterprise ready. I think it's ready for, you know, the mass market to adopt. And what we might call dead, I see being implemented

Starting point is 00:22:57 still. There's so many companies going through, like, cloud modernization efforts. For sure. And they're moving towards Snowflake, they're moving towards

Starting point is 00:23:05 Databricks, they're moving towards dbt and cloud. Like, that's not dead. So if we define modern data stack as, like, cloud data warehouses and, like, a few really good tools, that's fine. Yeah. I think modern data stack, sort of,

Starting point is 00:23:19 if you want to talk about the 2020s version of it, where basically every function you had to do was its own company. Yeah. Yeah, that's probably dead. I don't think people want 27 vendors to do three things at the end of the day, right? And so consolidation's going to happen. I mean, we're seeing it at Dagster. Like, our customers are asking for us to, like, combine catalog and quality into, yeah, one thing. Yeah. Our catalog will never be as good as a full-featured catalog that you go out and buy and pay, like, a grand for. That's not where we're competing. But there's probably some elements of those things that you can combine within the products you're already using. That's going to continue. I mean, I think Fivetran is doing this

Starting point is 00:23:55 with, like, their transforms. I know you guys at RudderStack are doing this as well. Dagster's doing it. I think it's just natural. And what's going to happen is what happens all the time. We see a bunch of consolidation. People get annoyed at the consolidators. Some new tool comes out. And it's like, I'm really good at this one particular thing. Interesting to go down again. We get 100 of those things. It's going to be a cycle. And I think right now we're just in the, what is that, plateau of productivity area where

Starting point is 00:24:23 I think things slowing down has actually been really good for data teams in general. You don't have to pay attention to 500 different things. You can kind of just put your head down and get your job done. And the tools you're using to do that just keep getting better on their own, which is a good feeling. Yeah, I think also during especially that peak, like, 2020-ish, 2021-ish time period, a lot of teams got very hooked on all the different tools and kind of, you know, I mean, I saw where teams could kind of lose track of like, well, what is this ultimately supposed to be serving, you know? Well, look, but we've got, we've got all these

Starting point is 00:24:59 different things and we've got all this data in a warehouse. And it's like, okay, but what's happening to it? How is it actually turning into revenue or savings or profit or whatever? Yeah. I mean, and it wasn't just data. What I realize now, I mean, I'm in marketing land a little bit. And the exact same thing was happening there. What was going on in marketing is everyone wanted a tool to solve their particular niche use case. And almost, like, nobody wanted to do the work. They just wanted to buy tools to do the work for them. And you ended up with, like, these massive marketing stacks with, like, 40, 50 different tools to do, like, three things. So it wasn't just us, but it was everywhere, it felt like, at that time. But I think we're now in a better place where I think interest rates solved a lot of problems, to be honest. Like, yeah. Sure. Yeah. Money not being free. Yeah. It solved a lot of efficiency problems anyway. I'll put it that way. Right. So we're seeing that consolidation. It might not feel good to everybody, but I think at the end of the day, businesses are operating more leanly and they probably aren't, you know, losing a lot at that expense either. Yeah, I think that's right. So we talked a little bit about orchestration, what that is,

Starting point is 00:26:06 Dagster's unique twist on that. I'm curious about your kind of career trajectory. You mentioned when we were talking earlier, data science, data engineering, now you're in DevRel, marketing, data. Tell us about that journey. I think it's a little bit of a unique journey and I'd be interested in how that all played out for you. Yeah. I think it was in high school, they asked you to fill out the survey. It'll tell you what kind of job you'd have. I don't even remember what it was, but it was, like, a job I'd never heard of. And I never knew what I wanted to be when I, like, grew up. I just sort of fell into different jobs based on what I was interested in at the time. Data science was, you know, a thing that was on everyone's mind back in 2018. I think

Starting point is 00:26:52 it was. I was listening to all the data science podcasts. Many of them are now defunct, RIP. But it was the next hot thing, right? And so I was like, all right, I'm going to figure out how to become a data scientist. And I did that for a few years. And what I realized was the new batch of data scientists that were coming in, they weren't as technical as I had been. I spent more of my time programming than they had. And so they were great at building models, much better than I was, because they were trained in it. But they couldn't deploy them at all. And so I started building, like, infrastructure just to make it easier for them to deploy, because their code was better than mine. So I ended up becoming a data engineer by accident. And I found that

Starting point is 00:27:31 really rewarding. It was great to, like, build something. And then the reward is, like, someone using it, whereas with data science the reward is, like, maybe in a year you'll find out if your experiment was correct. Yeah. Right. So for me, like, that instant validation of knowing I built something that clearly works or doesn't, and the person next to me is benefiting, was super empowering. And so that's how I started in data engineering. Did that for however many years. Eventually became head of data at a company called Hightouch, which back then was really focused on the data persona. And as part of that

Starting point is 00:28:08 I was also doing what we can call DevRel, essentially talking about the product to data people. Ended up starting a team there, then moving on to consulting, where I thought initially I was going to do data consulting and help people with their data problems,

Starting point is 00:28:24 but almost every company that talked to me wanted me to help them with their marketing problems. And even though I didn't think of myself as a marketer, I think they saw the DevRel activities I was doing and the success we were having at Hightouch, and they wanted to replicate that for them. And a lot of that was just educating

Starting point is 00:28:40 them that, like, you know, copying the thing that I did won't work for you. Yeah. It's not the blog post that's successful. I think a lot of people looked at, like, dbt for example, and they saw their massive community and they thought, oh, I should open a Slack community.

Starting point is 00:28:54 And it's like, well, why? Like how? Like where's the value to the actual user? Do you think people want 25 different Slack communities, or do you think they want one or two places to hang out? That might already be, like, a place that's covered for them. So it was more about talking through what were really marketing principles, but to me it was just common sense

Starting point is 00:29:13 about how to get to data people in a way that made sense. And that, I guess, like, put the mark of marketer on my head, and eventually I joined Dagster, initially as DevRel and more recently DevRel and marketing and also data. Yeah, that makes sense. That trajectory makes sense. And I would imagine, so the alternative here is like, let's just, you know, for Dagster, like, well, sorry, you're a marketer, right? Yeah. And there's got to be,

Starting point is 00:29:39 we've already talked some about the synergies there, but there's also got to be this, like, scratch-your-own-itch thing. You kind of get to market to yourself, or to your previous self, which has to be an advantage. I think for a company like Dagster, and for like any technical company

Starting point is 00:29:55 that markets to technical people having a technical person who really gets the audience and the go-to-market motion and really gets it is critical. And I think we've even made mistakes with this as well in the past where like we're an open source core product, like by our nature we are. And so we shouldn't hide that fact.

Starting point is 00:30:16 And I think if you talk to a traditional marketer, they might be, like, scared that people might use open source because we're not capturing an email, right? So direct them to the email form instead and get rid of all the open source things from our website. And bury it deeply. Right.

Starting point is 00:30:33 Like it doesn't exist anymore. Kill it. And like that's the mentality of someone who doesn't understand how developers might operate. Right. Like a developer is not going to want to sign up for a course or fill out a form.

Starting point is 00:30:43 They're going to want to try the product. And they do that through open source. And so open source to me is not a competitor to Dagster+ or enterprise offerings. Open source is like a channel. It's a channel where people get to try it. And if people go out and they're successful

Starting point is 00:30:59 with open source and they never want to talk to us, that's totally fine by me. That's another Dagster user out there in the wild talking about how great Dagster is. That's free marketing. And so for me, open source is part of it. And like, you really have to understand developers to be able to market to them.

Starting point is 00:31:13 And that's really kind of why this marketing journey between DevRel and marketing made sense to me. At first, I was suspicious. I think if you asked me as a DevRel person to report into marketing, I probably would have said no. But if you have DevRel and marketing working together and they're all reporting in to me,

Starting point is 00:31:30 it kind of felt fine. And I'm seeing it today. Like, it actually works out really well. Yeah, and I think that's also, when you get to the open source stuff, I mean, especially when you're trying to do something at scale, most open source projects are really hard to continue at scale. So it gives you a way where people like it. They trust it.

Starting point is 00:31:49 And then they can go to, okay, how do I make this easier for myself to use over time? Yeah, and we see that all the time. Like, people don't want to run and maintain infrastructure. generally. Right. It can't be the only thing because often the companies that are good enough

Starting point is 00:32:03 at using Dagster, they can figure out how to deploy Dagster themselves eventually. It's not that hard. So you do need to have things that are value-driven in the enterprise offering; hopefully that will drive people to that.

Starting point is 00:32:14 But also, it's easier to get open source into an enterprise than it is a vendor. So if I work at a big company and I really like Dagster, will I go and try the open-source product and prove its value? Or will I get into this

Starting point is 00:32:28 long, lengthy, lawyer-driven vendor negotiation thing before I've even, like, shown my peers that it's a good idea? I'll often start with open source, I'll build some momentum. And then once we've proven out its value, we've hit either scaling limits or I just don't want to maintain it or we want additional features. I've proven it's useful. I can go and have that conversation, and I'll go contact a sales team and have them start. But knowing that's a journey that people go through is, I think, critical in building out technical orgs that market to technical people. Yeah, I couldn't agree more with that. And there's this other component too, where you've got, you know, a team that's vetting a product, proving it works. Like, imagine that you're going

Starting point is 00:33:10 through a traditional enterprise sales process. And I've done multiple of these, where you don't get to see, touch, do anything with a product until basically the money has changed hands. It's been a while since I've done one of those type deals, but I've done those before. And those are scary as a technical person. A lot of times it's maybe driven by, like, marketing or sales, for example. They've got to have this product. And then you as a technical person are stuck with, like, you've got to integrate and implement

Starting point is 00:33:37 it. So number one, like for people who have been around a little bit, they have that in the back of their mind as far as like the alternative and hate it. And then number two, like you have this other practical competitor in a sense where the open source product keeps you, I think, honest as a company. where like if you ever were to like 10x your prices overnight, like people could switch to open source, for example. But if you're like a traditional enterprise type thing

Starting point is 00:34:04 and you 10x your prices and people are kind of stuck because it's hard to replace, then people are stuck and they have a lot of pain to switch. So I think that's another component that I've always appreciated about open source. Well, and I think the other one with that is, like, when I first came on to RudderStack on the marketing side, one of the things that I told them was, because I was talking to someone on the marketing team,

Starting point is 00:34:27 and they were like, well, we really want, you know, RudderStack to be the reason you get your next promotion. And my reply to that was, I don't know anyone on the data side who buys software to get promoted. I know people who don't buy it because they don't want to get fired. And the open source kind of helps you bridge that gap,

Starting point is 00:34:47 where we're not saying, like, hey, I need to make a really big commitment that's going to take time to implement, and I really hope it goes well or I'm not going to be here in a year. Yeah. Yeah, I mean, the other thing we've seen is, like, if you really want to get promoted, you build Dagster from first principles,

Starting point is 00:35:03 and it takes three years and then you quit, right? You get that staff level engineer and then you just like, all right, I'm out of here. Off to the next one. And then what you built is like an in-house shitty version of a product you could have bought in for, right? So there's two sides of that. I think the open source just makes it easier for everyone.

Starting point is 00:35:19 There's this idea that you might be able to avoid vendor lock-in as well, which I think really is appealing to people. But, I mean, there's also great software that doesn't have open source, and people buy it and love it. There's technical things you can do with it. But I think we all, as engineers, have seen those, like, monster implementations that promise, like, often the best ones are the ones that promise you have no need to talk

Starting point is 00:35:42 to your engineers at all when you implement it, right? Yeah. In the sales process, oh, yeah. You just, like, plug and play and click a few buttons, and you're in. And then as soon as the deal's signed, oh, by the way, where are your engineers? We need them to come implement this thing they've never heard of before. Right. That's the thing I think everyone wants to avoid with these vendors. Yeah, or the other version of that is, oh, we're going to handle everything for you.

Starting point is 00:36:03 We're going to help you along the way. And then you sign the deal. And you say, okay, how do we migrate this data? And they go, oh, well, it has to follow this standard. We don't do anything before that. That's all on you. It's like, well, that would have been nice to have known a month ago. Yeah. Okay. So we play this game on the show where we see how far into the show we can get without mentioning AI. So I don't know where we're clocking in today. I think we did okay. But I want to talk a little bit about AI, and we've got to talk about orchestration. You know, I think Dagster is a tool you can also use to orchestrate when you're, you know, pulling data together for AI or doing other things. I'm curious, like, what are people actually doing? So maybe, you know,

Starting point is 00:36:47 people using Dagster that are more on the cutting edge of using LLMs and, you know, maybe AI agents. What are people actually, practically doing with AI and orchestrators? Yeah, we see a lot of data prep for AI within Dagster itself. We even see some companies building foundational models and doing experimentation, but that is, I would say, cutting edge. But bread and butter use cases, at the end of the day, I think AI engineering is data engineering, and we even believe data engineering is software engineering. So if you follow

Starting point is 00:37:22 this to its logical conclusion, it's all really the same thing. You're moving around data, you're transforming it, you're storing it, you're converting it, you're embedding it, you're calling APIs. Is that data engineering or is that, you know, working with OpenAI and LLMs? Like, that's one and the same. Often, what we find is AI engineering is actually a little bit easier than ML engineering, because you're relying a lot on these third-party providers, for example, for embedding. You're not doing a lot of training models, right? It's done for you. You're really just experimenting and, like, putting things out. And so we've seen a lot of companies do things like, I mean, RAG is the big one, right? Everyone's trying to, like, AI is great but it needs context.

Starting point is 00:38:04 Without context it's often garbage. If you go to OpenAI or Claude today and you ask it to write a Dagster pipeline, it's often going to write really terrible code because it was trained on, like, Dagster code from three years ago, which probably isn't valid anymore. But what we've done is we've built internally a RAG model that uses our documentation, our GitHub issues, our GitHub discussions to power what we call Ask AI. It's a Slack bot in our Slack community. And it does really well. Is it perfect? No, but it's like a lot better than nothing. And so... Yeah, I've used it. It's pretty great. It's pretty good, right? Yeah. Not bad for a POC. And, you know, we could always make it better. Sometimes it gets confused. But

Starting point is 00:38:46 it's better than not getting an answer, which is always what I tell people. So context is everything, I think, in AI. And so what is context? Context is data, right? So ingesting our data, transforming it, picking the right ones, adding metadata, running experimentation on those different context windows, on different models. That's really where Dagster, I think, shines. It's just like running these pipelines.
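
The context steps listed here (ingest, transform, add metadata, pick the right pieces) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: every function and class name below is hypothetical, the word-overlap scoring stands in for a real embedding search, and none of it is Dagster's API or the actual Ask AI implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One piece of context plus the metadata describing where it came from."""
    text: str
    metadata: dict = field(default_factory=dict)


def ingest(sources):
    """Flatten {source_name: [documents]} into (source_name, text) pairs."""
    return [(name, doc) for name, docs in sources.items() for doc in docs]


def transform(records, max_words=40):
    """Split each document into word-bounded chunks and attach metadata."""
    chunks = []
    for source, text in records:
        words = text.split()
        for start in range(0, len(words), max_words):
            piece = " ".join(words[start:start + max_words])
            chunks.append(Chunk(piece, {"source": source, "offset": start}))
    return chunks


def select(chunks, question, top_k=2):
    """Pick the chunks sharing the most words with the question (toy scoring)."""
    wanted = set(question.lower().split())

    def overlap(chunk):
        return len(wanted & set(chunk.text.lower().split()))

    return sorted(chunks, key=overlap, reverse=True)[:top_k]


def build_prompt(question, context):
    """Assemble the selected chunks and the question into one prompt string."""
    lines = [f"[{c.metadata['source']}] {c.text}" for c in context]
    return "Context:\n" + "\n".join(lines) + f"\n\nQuestion: {question}"
```

In an orchestrator, each of these functions would typically become its own scheduled, observable step, which is the sense in which the context problem is a pipeline problem.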

Starting point is 00:39:08 So help me out with this. There is basically a clone, if you think about a data stack, the modern data stack from 2021, there's a clone of almost every single component that's, like, AI focused, right? Like there's an orchestration tool, ETL tool, database specific. And I'm personally not super knowledgeable

Starting point is 00:39:30 about each of those components when it comes to AI. Do you think that stays, or do you think it all gets consolidated back because it's not that different? Yeah, it's a good question. Maybe the vector databases stay. Yeah. If they're lucky, that's my best guess. Or did they, but I don't know technically how hard that would be to implement, you know,

Starting point is 00:39:50 for Snowflake and Databricks to implement that. Most of the databases implement some type of embedding already. Yeah, right. Snowflake already has a vector version of their database. Postgres has vector embeddings now. I think even MotherDuck, DuckDB have it. Is it that hard to store a vector of numbers? Probably not.
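
To make the "store a vector of numbers" point concrete, here is a minimal in-memory sketch, assuming cosine similarity as the ranking measure. The `TinyVectorStore` class and its methods are hypothetical names invented for this illustration; they are not the API of Snowflake, Postgres, DuckDB, or any vector database, and a real system would add indexing rather than scan every row.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Keeps vectors in a dict and ranks them by brute-force comparison."""

    def __init__(self):
        self._rows = {}

    def add(self, key, vector):
        """Store a copy of the vector under the given key."""
        self._rows[key] = list(vector)

    def nearest(self, query, k=1):
        """Return the k keys whose vectors are most similar to the query."""
        ranked = sorted(
            self._rows,
            key=lambda key: cosine_similarity(query, self._rows[key]),
            reverse=True,
        )
        return ranked[:k]
```

Storing and scanning is the easy part; what a dedicated vector database would have to justify is fast approximate search once the row count makes a full scan too slow.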

Starting point is 00:40:10 There might be other benefits to using a, like, dedicated vector database for, I don't know. Sure. Those are going to become specialized cases that you run into. That's my guess. And outside of that, the ETL stuff, I think we love reinventing things. My guess is most people who are getting into AI today, they're not coming into it from a background of data engineering. Yes.

Starting point is 00:40:34 And so they just don't know the tools. So if you don't know the tools, you think you have to invent things, right? Or maybe you just want to build new things because old things are boring. Yep. Some of those will probably stick around because they'll be good enough that everyone uses them and they evolve. I think a lot of them will fall by the wayside when we realize AI problems are actually data problems and we have data tools to solve that already. Right.

Starting point is 00:40:55 Well, I think a lot of people still, like there was this confusion, I feel like I still hear around there, which is this idea of we should be replacing all of our deterministic processes with AI. Yeah. It's like, but I don't need it to give me seven different answers to it. I just want the one answer that's right every time. Yeah, I mean, it's people using AI as a calculator, and it's like, well, a very expensive way to warm up the world. So I don't know.

Starting point is 00:41:22 Maybe we don't need to do that. I don't know. Sometimes all you need are if statements and a regex, and maybe AI can replace that. But at the end of the day, like, whatever is faster is what's going to work for people. Right. I think on that one, AI is just going to replace me having to look up how to write the regex. Yeah, that is a decent application. So, yeah, along the AI, kind of,

Starting point is 00:41:45 questioning. I mean, you just kind of alluded to this. I mean, it's still very expensive. And the billions of dollars being poured into these companies mask the expense for now. Like just, you know, just this week it came out that the $200 a month plan still loses money for OpenAI. And I think they weren't even necessarily expecting that. So what? I mean, and of course, like, the thought here is, okay, we're going to keep investing money in this and we'll have better hardware that can drive cost down, we'll have better models that don't have to be trained in the same way, to reduce cost.

Starting point is 00:42:23 Well, I mean, this is just speculation at this point, but it'll be interesting, and I'm curious your take, what does that curve look like? Because eventually, like, the money, I think, could run out before we get to that spot. I mean, I don't know, what do you think? Just speculation on what might happen there. I mean, there's already some evidence of plateauing.

Starting point is 00:42:46 If you remember the great VC-funded days of Uber and DoorDash, where it didn't cost anything to use these tools, and if you were smart, you would just abuse them as much as you could. You would get the referrals and the $100 here, the credits there, and it was like $0.50 to cross the city. You could get free food pretty much every single day, and that was wonderful. And then the companies went public, and then it would cost like $50 to go five miles, right? I know, yeah, exactly. Anywhere near an airport, it's like at least $50, even if you're just going across the street.

Starting point is 00:43:19 Yeah. It was supposed to be better. It was supposed to be this utopia, and it ended up just being a company that makes money off people, right? And they did so at the expense of, like, killing their competitors. So will AI be the same way?

Starting point is 00:43:31 I don't know, probably. Yeah. People need to make margins at some point. Cash is not infinite. Right now, it's really driven off massive amounts of funding. at some point, that'll change. We'll come down for sure,

Starting point is 00:43:46 but when the margins go down, like the research also slows down. And so they will probably plateau and we'll probably find them useful in some limited capacity that's probably not going to fundamentally solve AGI, for example. And I think we're seeing also

Starting point is 00:44:05 that, like, having the best model is not really much of a moat at this point. So it's not like you can say, well, yeah, we're going to spend billions, but once we get there we're going to capture everything. And it does sound a bit like that Uber time, of, it's like, profits don't matter.

Starting point is 00:44:23 We just need to capture market. And then eventually, once we capture the whole market, we'll make money off of it. Yeah. Yeah, it's tough to capture the market when really it's a commodity too. So I think where AI differentiates,

Starting point is 00:44:38 it's through product, actually. So anyone can build a model these days, right? A lot of them are good. There are great open source models out there. Integrating that model in a workflow is where differentiation, I think, really happens. And great companies who really understand that can make it a lot better. So I think Anthropic and Claude, for example, do a really good job with, like, their projects

Starting point is 00:45:02 and the way they've sort of structured Claude to make it very, like, useful in particular contexts for solving these, like, problems and discussions. I use it all the time. OpenAI, maybe not as good, I would say, product-wise as Anthropic these days. They have more features that I don't end up using, but purely from, like, a chat agent with documentation store, I think Claude does a better job.

Starting point is 00:45:25 Yeah. I imagine in a few years we're going to find companies that, like, really get the product perspective, right? And they've built really cohesive products, which are really powered by AI, rather than just, like, an AI chatbot that is really good at generating responses, which I think we've sort of hit a peak on,

Starting point is 00:45:42 regardless of how much better they get. Yeah, the other one it makes me think of a little bit is, like, the satellite telephone stuff, where it cost a whole lot of money to get the satellites up and to get the infrastructure there. And once you had done all of that, it was really hard to make money off of it.

Starting point is 00:45:59 But then when the next people came around and were just using the infrastructure that was already out there, you could make a profitable model, like business model off of it. Like with a GPS, for example. Yeah. Even satellite phone.

Starting point is 00:46:10 It's like it's still around and the companies are more profitable with it. Right. Because they didn't have to pay to put all the satellites out there. Yeah. That's interesting. Yeah. So we have a few minutes left here. I'll throw this to Matt.

Starting point is 00:46:24 So Matt, you've spent a little bit of time with Dagster recently. And I'm curious. And you've got a data background. You know, Matt worked for a publicly traded company in data. I'm curious. Yeah. How does Dagster and the orchestration landscape strike you, compared with what you used in some of your previous roles? Like, how is it different? What's the evolution like?

Starting point is 00:46:44 Well, so most of the places I worked, we didn't really have an orchestrator. So we had some more like pipeline-related things, but we didn't have like a dedicated orchestrator and a lot of it. So it's been an interesting little journey having to get to know it a little bit more and, you know, try to sometimes wrap my brain around the concepts. I think that's usually it. Because a lot of, I mean, there's a lot of stuff that when you get into, like, okay, I'm planning things, I'm putting them in sequence or in parallel, those types of ideas. A lot of it then comes down to what's the framework that they're using to talk about these things? What's the language they're using? What do they label this stuff?

Starting point is 00:47:23 Makes sense. Yeah, so, I mean, overall, it's been, I have the added twist of I'm also including RudderStack into this with some new stuff. So that's thrown some interesting frustrations at times, just learning the two things at the exact

Starting point is 00:47:39 same time but I mean overall it's been it's one of those things that I can look at and I can see like oh here's how I could have used it

Starting point is 00:47:48 yeah oh yeah when I had a team of 15 this is how we could have used this right the one thing though I always

Starting point is 00:47:57 I had to think about back then was kind of to go back to a point that you made much earlier in that there's this like newer

Starting point is 00:48:05 generation of people who are data scientists or whatever, and they got taught a very applied way of doing things, which typically was very software centric, and how do I, you know, how do I call the function to train a model or whatever. And so when you get into that broader, kind of closer-to-software-engineering world, they sometimes get a little scared. So you really had to pick stuff that you knew you could quickly get them into and get them learning with. So I remember we had a software engineer as a contractor once,

Starting point is 00:48:41 and he was going to show us how to modernize our stuff, and he did this whole thing of just basically tearing things apart, building it from scratch, and trying to show us how great it was. And I was like, okay, that's great, but no one but you can run this.

Starting point is 00:48:57 Right. I've got a team of people that, when you're not here, I need to run it. Whereas something like Dagster is definitely one where you could see, okay, I can get a team of people to be up and running with this.

Starting point is 00:49:08 I think that's a really big deal. Two things I thought of from my previous experiences. Because I'd used, it's actually funny, I'd used the product called Rundeck. Pedram, I don't know if you're familiar with that one. It's like a little bit more than a Windows task scheduler, but before, like, we had that, you know,

Starting point is 00:49:27 kind of DAG type concept. But it's interesting when you go through what you would do every day and now you have words and language for it. I think that's the most interesting thing about finding a good, like a Dagster or like a good framework, for, oh, I didn't know I was doing orchestration. Like I just, you know, scheduled this around this and this.

Starting point is 00:49:47 I think that's one of the things. And then the second one, which Matt just touched on, which I talk a lot about. And I think orchestration is a big deal here. When your data team moves from, like, one, maybe two people to be more of a team, it's three, four, five, however many people, that conversion from what I call single player mode to multiplayer mode, it's a really big deal. The tooling becomes a bigger deal,

Starting point is 00:50:13 the version control, the, you know, and I think, like, dbt, for example, is one thing that I think is a big deal. If you're moving into multiplayer mode for your data team, like, dbt and people in that transformation layer, having a solution there is a big deal. In the orchestration, same thing, where you're now using the same framework. There's less esotericness when, like, how do we schedule a job is defined. Like, we use this. It has specs and documentation. And I think knowing that, because I've been a part of at least one company where orchestration had a name, and it was an employee named Gary.

Starting point is 00:50:51 And so he ran everything. And when he left, nothing could run. And then we were scrambling, a whole group of us, to try to get things back together, right? But we also didn't have, like, the language, because this was almost 10 years ago now, to be able to be like, okay, no, what we need to do is get this into an orchestrator so that we're not dealing with this anymore. And even just, I think, the language of how do I talk about these things, okay, these are assets and stuff like that, giving language to that can be very helpful in just helping people, I think, a lot of times get out of the kind of limited mind frame they're in,

Starting point is 00:51:34 if that makes sense. Especially when you're talking about things like, what does data as a product mean? Well, to a lot of data scientists who are very new, it means the model I built and explaining to them, well, no, you have to have this. It's the end-to-end collection to delivery is the product, not just this little part that you build.

Starting point is 00:51:54 So one last take. Pedram, maybe specifically for Dagster or generally for orchestration, where do you think this goes in the next couple of years? What are the core problems in the space to solve for orchestrators such as Dagster? Yeah, it's a good question. I think one of them is something we just touched on, which is that

Starting point is 00:52:17 not everyone knows what an orchestrator is and when they need it. And so I think at Dagster we have, like, two sort of big priorities. One is just helping generate awareness of what orchestrators are, what a data platform is, the fact that you probably already have one, and how to think about observing and having a single place to look at these things, right? You can't just go to Gary every single time. And so having one place where you can understand where everything is supposed to run. That's, I think, a big piece of it.

Starting point is 00:52:45 And the other is also just, like, lowering that barrier to adoption for people. So finding ways to make it easier, more plug and play, to use Dagster with existing, you know, playbooks that you already have and are pretty common across the industry. Building those out without losing sight of sort of the power of Python and Dagster itself is kind of what we're focused on. Yeah, makes a ton of sense.

Starting point is 00:53:09 Well, thanks for being on the show. It's been really fun. Matt, thanks for being here. And we'll catch everybody in the next episode. Thank you. All right, thank you. The Datastack Show is brought to you by RudderStack, the warehouse native customer data platform.

Starting point is 00:53:25 RudderStack is purpose-built to help data teams turn customer data into competitive advantage. Learn more at rudderstack.com.
