The Data Stack Show - The PRQL: Are Marketers the Worst Data Quality Offenders?

Episode Date: October 7, 2022

In this bonus episode, Eric and Kostas preview their upcoming conversation with Gleb Mezhanskiy of Datafold. ...

Transcript
Starting point is 00:00:00 Welcome to the Data Stack Show prequel, where we talk about the show that we just recorded to give you a teaser. Kostas, this was a super interesting one talking about data quality. I guess the first question I have is actually related to an accusation that you made, which is that marketers are generally one of the root causes of data quality issues in the data stack. What say you? Yeah, dude. Just think about how many tools and products have been built out.
Starting point is 00:00:33 They're just for, like, the needy marketers. That says everything. You can never satisfy us. Yeah, and you cannot get satisfied. So I just speak the truth, you know that. No, it's so true. It is so true. I do think Gleb made a really good point, though, in that he was really forward about data being messy, right? You know, especially things like event data, which marketing relies on super heavily. And one of the concepts that he talked about that I think was really helpful, and for anyone of our listeners who's interested in data
Starting point is 00:01:13 quality, you're going to want to catch this show to hear Gleb talk about classifying data quality issues in terms of we messed something up or they messed something up, which I thought was super interesting. They messed something up, or things out of your control, right? So you have like a scheduling challenge in your orchestration tool or something along those lines, right? Which is sort of the they problem. And I think Gleb did a really good job of articulating why a lot of the issues are actually we problems, like we made some sort of mistake, you know, whether that was not doing a good enough job on QA or building an extensible enough system or whatever. And that really informs, I think, the way that he has built
Starting point is 00:01:59 Datafold as a tool. So that really stuck out to me. But what do you think? Yeah, absolutely. I think it's a very interesting framework to use in order to capture the complexity of a problem like data quality, because you need some kind of model and framework to tackle it. Okay, maybe it's a little bit dangerous in some environments. The we and them situation might get a little bit spicy. But it definitely works well for him, who is building a product and a company and needs to create, let's say, at the end, a product experience that is going to be consumed by very different people, from data engineers to people who are consuming the data at the end. And probably they're not technical
Starting point is 00:02:57 at all, right? But they are part of the problem and part of the solution too, of data quality. So yeah, it makes a lot of sense. I don't know, I really enjoyed the conversation. For me, it also gives more joy that they have a very developer-focused approach to the products that they're building. So that's something that I always find very interesting. Yep. I mean, at the end, yeah, okay, data quality is a very hot space.
Starting point is 00:03:30 We have quite a few vendors, and probably we brought most of them here. Yeah. A lot of the big ones. Yeah. And I think we should do another iteration in a couple of months, like bring all of them again and see what's changed with each one of them. Yeah, absolutely.
Starting point is 00:03:50 Well, you're definitely going to want to catch this one. Super interesting if you're working on anything related to data quality. If you don't like it, don't worry. That's a we problem, not a they problem. And of course, we will catch you on the next show. Thanks for joining us and subscribe if you haven't.
