The Data Stack Show - The PRQL: What is Data Discovery?
Episode Date: October 21, 2022
In this bonus episode, Eric and Kostas preview their upcoming conversation with Shinji Kim of Select Star. ...
Transcript
Welcome to the Data Stack Show prequel, where we talk about the show that we just recorded to give you a teaser.
Kostas, one of the phrases that was used towards the end of the show was "a lot of our customers initially find this tool to act like a Google for their own data," which was a really interesting concept.
What was your take on that?
Yeah, I mean, I think the complexity of working with data explodes really fast. If you start collecting data from a couple of sources, and then you start joining the data together and creating pipelines and all that stuff, it's super, super hard for someone who hasn't been there since the data started to get collected to figure out what data there is, what a piece of data means, what to trust and what not to trust, what's used by which tool, and all that stuff.
And this is true even in small companies; you don't need to have thousands of tables for this to happen. But yeah, it grows, I don't know, probably in a very exponential kind of way.
So yeah, having the ability to just go there and search for something and come up with a table that might be helpful for what you're looking for, I think, makes a lot of sense. A lot of it is this kind of Google simplicity: searching the data of the company and then figuring out how to connect it together. So I think that's very, very good feedback to get for your product.
Yeah, I totally agree.
And I think one of the things that was coming up in the conversation is that it's easy to think about the steps of data flow as distinct things that happen in a particular order, as very clear steps, right? So it's like, okay, we've got to ingest data, we need to transform it somehow, then you model it, and then you have BI. And it's like one, two, three, four.
And the reality is that transforming, modeling, and BI are all highly iterative processes.
And you have to go through a long life cycle until you get things that are durable enough that they don't change a lot, even in a small company.
And I think it was just a helpful reminder, and you actually mentioned this in the introduction, that understanding your data, knowing what data you have, really is a key first step in actually enacting governance, which is ultimately what Shinji of Select Star is helping people do.
And she dug into that on the show.
You got pretty technical, which is really interesting.
How do you do that and automate lineage and metadata and all that sort of stuff?
So if you are interested in anything around lineage, data governance, metadata, and data discovery, you'll definitely want to check this one out. And we will catch you on the next one.