The Data Stack Show - The PRQL: Will Data Quality Always Require a Human in the Loop?
Episode Date: December 21, 2021
Eric and Kostas preview the upcoming show by talking about data quality. ...
Transcript
Welcome to the Data Stack Show prequel.
We just recorded an episode with one of the founders of LightUp.ai talking about data
quality.
It's a really interesting episode.
Before we give a little preview though, this is a special episode because you and I are
in the same place.
I'm in San Francisco, which is really fun, especially fun to record a podcast here. Thank you for bringing your mic because I forgot
mine. I'm also wearing a button-down, which I think is the first time I've ever worn a
button-down, definitely on the podcast, maybe at RudderStack. But I thought it'd be a
good idea to look more professional for some reason on a business trip.
I don't know. Unfortunately, I didn't do the same, but we'll keep that for next time.
Okay. This episode is great. And I'll just give you a little preview. So we try to define data
quality. We talk about team structures for data quality, which is really interesting.
But I think the most fun part of it was imagining an economics experiment inside of an organization where
someone has to buy data.
I'm really passionate about the idea of using Monopoly money to simulate that.
Maybe we'll see if we can do that.
That leads me into my question.
If you, as a consumer of data, had to purchase it, and you only had enough money in this sort of internal economy
inside this company to buy like one or two data sets
from the data engineering team.
What would you buy?
What's my role right now?
What am I supposed to do?
Am I marketing?
Yes, of course you're in marketing.
Of course you're in marketing.
What I would buy, I mean, we need to at least start with signups, I guess,
and make sure that these signups reach Salesforce. Yeah, that's a hard question because also, like,
where's the data coming from, right? Because if I think about like a payload where I could get a
page view with an initial referrer. Yeah.
That's like a two for one deal, but I don't know if, am I selling you the two for one
deal or?
I don't know.
Yeah.
I mean, as a marketeer, I think I'm fine with getting an email.
So I have the leads.
I've done my job.
I can go back home.
Right.
Now, if I'm sales, then we have to decide.
Yeah, exactly.
What is an MQL, right?
Yeah. Well, we actually talked about that on the episode, which is really interesting. And like,
that's a huge part of data quality.
Yeah. And actually, I think one of the most interesting parts of the conversation is like
how important semantics are when we are talking about data. And there are some very interesting topics there and, like, not concerns exactly, but like some
points that Manu brought up that are, I think, like extremely, extremely interesting, especially in
terms of things like: do we always need to have like a man in the loop for that? Can this change, and how can we reuse like best practices from
other disciplines. Data quality is still like something that we try to define as a category,
right? Like it's not, it's not there yet. It's very new. We're working towards that, but Manu
is like one of the people that are doing that. So I think it's like super interesting to hear what his opinion is on this.
And I have a feeling this is, no, this is the second data quality company that we're hosting, right?
So we have Bigeye. Bravo.
And Iteratively, actually.
Ah, yeah.
So we have more.
Yeah, we have, yeah, we should do like a roundup.
Yeah.
Okay, we're at time. Brooks is telling us we're at time, but last question for you.
So if we've never met, we just meet at a party, we talk for a few minutes, would you think
that I'm more of like an MQL or an SQL? You're definitely a PQL.
Perfect. All right. Well, it's a great episode talking with Manu from LightUp.ai.
Subscribe if you haven't so you get notified when the episode comes out. We'll talk about
data quality and Monopoly money. You're going to love it.